Published November 1, 2021 Updated December 23, 2022
Ruby Structs

Ruby’s Struct is one of several powerful core classes that are often overlooked and underutilized compared to the more popular Hash class. This is a shame and I’m often surprised when working with others who don’t know about structs or, worse, abuse them entirely. I’d like to set the record straight by sharing the joy of structs with you and how you can leverage their power to improve your Ruby code further. 🚀

Overview

Structs are a hybrid between a class and a hash where, by default, they are a mutable container for data. They are best used to give a name to an object which encapsulates one to many attributes. This allows you to avoid using arrays or hashes which leads to Primitive Obsession code smells.

To illustrate further, let’s consider an object which is a point on a graph and consists of x and y coordinates. Here’s how you might define that object using an Array (tuple), Hash, Struct, Class, and OpenStruct:

# Array
point = [1, 2]
point.first                        # 1
point.last                         # 2

# Hash
point = {x: 1, y: 2}
point[:x]                          # 1
point[:y]                          # 2

# Struct
Point = Struct.new :x, :y
point = Point.new 1, 2
point.x                            # 1
point.y                            # 2

# Class
class Point
  attr_accessor :x, :y

  def initialize x, y
    @x = x
    @y = y
  end
end

point = Point.new 1, 2
point.x                            # 1
point.y                            # 2

# OpenStruct
require "ostruct"
point = OpenStruct.new x: 1, y: 2
point.x                            # 1
point.y                            # 2

Based on the above you can immediately see the negative effects of Primitive Obsession with the Array and Hash instances. With the Array tuple, you have to use #first (or index 0) to obtain the value of x and #last (or index 1) to obtain the value of y. Those are terrible method names to represent a point object because the methods are not self-describing. With the Hash, the keys are more readable but you have to message #[] with the key of the value you want to obtain which isn’t ideal when having to type the brackets each time you send a message.

When we move away from the Array and Hash by switching to the Struct, you can see the elegance of messaging our point instance via the #x or #y methods to get the desired values.
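To make the hybrid nature of a struct more concrete, here is a minimal sketch which uses the same Point struct. It only relies on methods that Struct provides out of the box; the values are made up for illustration:

```ruby
Point = Struct.new :x, :y

point = Point.new 1, 2

# Hash-like access by symbol, string, or index.
point[:x]      # 1
point["y"]     # 2
point[0]       # 1

# Class-like access, including a generated setter since structs are mutable by default.
point.x = 10
point.x        # 10

# Conversion back to primitives when needed.
point.to_a     # [10, 2]
point.to_h     # {x: 10, y: 2}

# Freezing an instance is one way to opt out of mutability.
point.freeze
point.frozen?  # true
```

Once frozen, any attempt to message a setter, like point.x = 1, raises a FrozenError.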

The Class example has the same capabilities as the struct but at the cost of being more cumbersome. Plus, the class is more heavyweight compared to the struct. There are additional reasons why a Struct is better than a Class, like performance, but I’ll expand upon this more later.

Finally, we have OpenStruct which ships with Ruby’s standard library rather than Ruby Core (hence the need to require it). At this point, you might be thinking: "Hey, looks like OpenStruct is better in terms of melding Struct and Class syntax and functionality." Well, you’d be very wrong in terms of performance but, as mentioned with Class usage, I promise to expand upon this more later.

History

Before proceeding, it’s important to note that this article is written using modern Ruby syntax so the following highlights the differences between each Ruby version in order to reduce confusion.

3.0.0 (and earlier)

Earlier versions of Ruby required you to choose between two ways of defining a struct: one for positional arguments and one for keyword arguments. For example, here’s the same struct defined first with positional arguments and then with keyword arguments:

# Positional
Point = Struct.new :x, :y
point = Point.new 1, 2

# Keyword
Point = Struct.new :x, :y, keyword_init: true
point = Point.new x: 1, y: 2

The difference is using keyword_init: true to construct a struct via keywords instead of positional parameters. For astute readers, you’ll recognize this as a Boolean Parameter Control Couple code smell which caused a lot of grief and confusion.

3.1.0

Warnings were added when initializing a struct, defined without the keyword_init flag, using only keyword arguments so people would be aware that this behavior was going to change in the next release.

3.2.0

Use of the keyword_init: true flag was no longer required which means you can define structs using the same syntax but initialize using either positional or keyword arguments. Example:

# Construction
Point = Struct.new :x, :y

# Initialization (positional)
point = Point.new 1, 2

# Initialization (keyword)
point = Point.new x: 1, y: 2

Construction

Now that we’ve gone over what a struct is along with historical context, let’s delve into construction.

New

You can construct a struct multiple ways. Example:

# Accepts positional or keyword arguments.
Point = Struct.new :x, :y

# Accepts keyword arguments only.
Point = Struct.new :x, :y, keyword_init: true

# Accepts positional arguments only.
Point = Struct.new :x, :y, keyword_init: false

# Accepts positional arguments only.
Point = Struct.new :x, :y, keyword_init: nil

Being restricted to only using positional arguments is no longer recommended so you should avoid using keyword_init when constructing your structs and only use the first example shown above. Here’s a closer look:

Point = Struct.new :x, :y

# Positional
Point.new 1           # <struct Point x=1, y=nil>
Point.new nil, 2      # <struct Point x=nil, y=2>

# Keyword
Point.new y: 2, x: 1  # <struct Point x=1, y=2>
Point.new y: 2        # <struct Point x=nil, y=2>

While keyword arguments require more typing to define your key and value, you are free of positional constraints and can even construct with a subset of attributes which isn’t possible with positional arguments unless you fill in all positions prior to the position you desire to set. Even better — once you’ve constructed your struct, you can use positional or keyword arguments freely during initialization.

Subclass

So far you’ve only seen class construction using .new but you can use a subclass as well. Example:

class Inspectable < Struct
  def inspect = to_h.inspect
end

Point = Inspectable.new :x, :y

# Positional
Point.new(1, 2).inspect        # "{:x=>1, :y=>2}"

# Keyword
Point.new(x: 1, y: 2).inspect  # "{:x=>1, :y=>2}"

Subclassing can be useful in rare cases but you can also see it’s not as much fun to use due to the additional lines of code (even with the overwritten #inspect method). Additionally, you can’t add more attributes to your subclass like you can with a regular subclass. For example, trying to add a z subclass attribute in addition to your existing x and y superclass attributes won’t work. The reason is that once a struct is defined, it can’t be resized.
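Here is a small sketch of what that limitation looks like in practice. Point3D is a hypothetical subclass, used only for illustration, which attempts to add a z attribute via attr_accessor:

```ruby
Point = Struct.new :x, :y

# Hypothetical subclass which attempts to add a third attribute.
class Point3D < Point
  attr_accessor :z
end

point = Point3D.new 1, 2
point.z = 3

point.z         # 3 (a plain instance variable, not a struct member)
point.members   # [:x, :y]
point.to_a      # [1, 2]

# Equality ignores z because only struct members are compared.
other = Point3D.new 1, 2
other.z = 99
point == other  # true
```

The z value exists on the object but the struct knows nothing about it, so it’s missing from #members, #to_a, and equality checks. Passing a third positional argument, as in Point3D.new 1, 2, 3, raises an ArgumentError because the struct size differs.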

While subclassing is good to be aware of, use with caution because inheritance carries a lot of baggage with it. You’re much better off using Dependency Inversion which is the D in SOLID design. So compose your objects rather than inheriting them.
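As a rough sketch of what composition might look like, consider a hypothetical Circle struct (not part of the earlier examples) which holds a Point as its center instead of inheriting from it:

```ruby
Point = Struct.new :x, :y

# Hypothetical Circle which is composed of a Point instead of inheriting from one.
Circle = Struct.new :center, :radius do
  def area = Math::PI * radius**2

  def contains?(point) = Math.hypot(point.x - center.x, point.y - center.y) <= radius
end

circle = Circle.new Point.new(0, 0), 5

circle.center.x                   # 0
circle.contains? Point.new(3, 4)  # true (distance from center is exactly 5)
circle.contains? Point.new(4, 4)  # false
```

Each struct keeps a single responsibility and the Circle only depends on its center responding to #x and #y.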

Initialization

Now that we know how to construct a Struct, let’s move on to initialization. We’ll continue with our Point struct for these examples.

New

As shown earlier — and with nearly all Ruby objects — you can initialize a struct via the .new class method:

# Positional
point = Point.new 1, 2

# Keyword
point = Point.new x: 1, y: 2

Keep in mind that even though you can initialize an instance of your Point struct using positional or keyword arguments, you can’t mix and match them. Example:

point = Point.new 1, y: 2

point.x   # 1
point.y   # {y: 2}

Notice {y: 2} was assigned to y when the value should have been 2 so use either all positional arguments or all keyword arguments to avoid this situation. Don’t mix them!

Anonymous

You can anonymously create a new instance of a struct by constructing your struct and initializing it via a single line. Example:

# Positional
point = Struct.new(:x, :y).new 1, 2

# Keyword
point = Struct.new(:x, :y).new x: 1, y: 2

The problem with the above is anonymous structs are only useful within the scope they are defined as temporary and short lived objects. Worse, you must redefine the struct each time you want to use it. For anything more permanent, you’ll need to define a constant for improved reuse. That said, anonymous structs can be handy in a pinch for one-off situations like scripts, specs, or code spikes.
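To illustrate the cost of redefining the struct each time, here is a brief sketch. The variable names are only for illustration:

```ruby
# Each call to Struct.new builds a brand new anonymous class.
first = Struct.new(:x, :y).new 1, 2
second = Struct.new(:x, :y).new 1, 2

first.class.name             # nil
first.class == second.class  # false
first == second              # false, despite identical members and values
first.to_a == second.to_a    # true
```

Since equality also takes the class into account, two instances built from separate anonymous structs are never equal even though they look the same.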

Inline

While anonymous structs suffer from not being reusable, you can define inline and semi-reusable structs using a single line of code as follows (pay attention to the string which is the first argument):

# Positional
point = Struct.new("Point", :x, :y).new 1, 2

# Keyword
point = Struct.new("Point", :x, :y).new x: 1, y: 2

You can even use shorter syntax but I don’t recommend this because it’s harder to read and a bit too clever:

# Positional
point = Struct.new("Point", :x, :y)[1, 2]

# Keyword
point = Struct.new("Point", :x, :y)[x: 1, y: 2]

To create new instances of the above struct, you’d need to use the following syntax:

Struct::Point.new x: 3, y: 4  # <struct Struct::Point x=3, y=4>
Struct::Point.new x: 5, y: 6  # <struct Struct::Point x=5, y=6>

The downside is you must keep typing Struct::Point which isn’t great as a constant you’d want to reuse on a permanent basis. Regardless, the difference between an inline struct and the earlier anonymous struct is that the first argument we pass in is the name of our struct which makes it a constant and reusable. To illustrate further, consider the following:

# Anonymous
point = Struct.new(:x, :y).new 1, 2
point = Struct.new(:x, :y).new 1, 2
# No warnings issued.

# Constant
point = Struct.new("Point", :x, :y).new 1, 2
point = Struct.new("Point", :x, :y).new 1, 2
# warning: redefining constant Struct::Point

With anonymous initialization, we don’t get a Ruby warning stating a constant has been redefined. On the other hand, with constant initialization, we do get a warning that the Struct::Point constant has already been defined when we try to define it twice.

While it can be tempting to define a struct via a single line — and sometimes useful in one-off scripts — I would recommend not using this in production code since it’s too easy to obscure finding these constants within your code.

Brackets

As hinted at earlier, there is a shorter way to initialize a struct and that’s via square brackets:

point = Point[x: 1, y: 2]

This is my favorite form of initialization and for two important reasons:

  1. Brackets require three fewer characters to type. ⚡️

  2. Brackets signify, more clearly, you are working with a struct versus a class which improves readability. Calling out structs like this when reading through code makes a big difference over time and I encourage you to do the same.

Defaults

Structs, as with classes, can set defaults. The way to do this is to define an #initialize method. Here are a couple examples:

# First
Point = Struct.new :x, :y do
  def initialize x: 1, y: 2
    super
  end
end

# Second
Point = Struct.new :x, :y do
  def initialize **arguments
    super

    self[:x] ||= 1
    self[:y] ||= 2
  end
end

With each of the above, any Point instance will default to x = 1, y = 2:

point = Point.new

point.x  # 1
point.y  # 2

There are a few important aspects of the above to call out:

  1. You must call super to ensure incoming arguments are forwarded to the superclass.

  2. The first example leverages concise syntax to define defaults and is recommended if your parameter list has three or fewer parameters.

  3. The second example allows you to forward all keyword arguments to super and then define defaults by using self with memoization. This is the recommended approach if you have more than three parameters to prevent your parameter list from getting long and unwieldy.

I use either of these examples when needing an instance of a struct with safe defaults. This can also be abused so keep your defaults simple and without side effects. If you don’t need a default or can’t think of a safe default, then don’t override the initializer unnecessarily.
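Both examples above use keyword arguments. If you’d rather initialize positionally, a similar sketch, which is my own variation and assumes you only ever initialize with positional arguments, uses optional positional parameters instead:

```ruby
Point = Struct.new :x, :y do
  # Optional positional parameters; bare `super` forwards the current x and y values.
  def initialize x = 1, y = 2
    super
  end
end

Point.new       # x=1, y=2
Point.new 5     # x=5, y=2
Point.new 5, 6  # x=5, y=6
```

The trade-off is that this initializer no longer accepts keyword arguments in a meaningful way, so pick one style of defaults per struct and stick with it.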

Transformations

Along the same lines as initialization is the ability for structs to transform an incoming data type to itself. This is a variant of the Adapter Pattern but instead of having a second object which adapts one object into an instance of your struct, you have the struct do the transformation. For example, consider the following:

module Graphs
  POINT_KEY_MAP = {horizontal: :x, vertical: :y}.freeze

  Point = Struct.new(*POINT_KEY_MAP.values) do
    def self.for(location, key_map: POINT_KEY_MAP) = new(**location.transform_keys(key_map))
  end
end

With the above, you can now transform an incoming Hash into the Struct we need:

location = {horizontal: 1, vertical: 2}
point = Graphs::Point.for location
point.inspect                            # <struct Graphs::Point x=1, y=2>

This is a lot of power for a small amount of code because you can now convert one data type — which looks roughly similar to your struct but has the wrong keys — into your struct which is properly named and has a better interface. Let’s break this down further:

  1. The Graphs module gives you a namespace to group related constants (i.e. POINT_KEY_MAP and Point).

  2. The POINT_KEY_MAP constant allows you to define — in one place — the mapping of keys you need to transform. The hash keys are the foreign keys to transform while the hash values are the keys used to define your struct’s attributes.

  3. The .for class method allows you to consume the location hash along with an optional key map for transforming the foreign keys. Since location is a hash, we can ask it to transform its keys using the provided key map. The result is then used to initialize the struct with the newly transformed keys and values of the original hash.

The reason this is powerful is because, in Domain Driven Design, you have a single method — .for in this case — serving as a boundary for converting a foreign type into a struct with more flexibility and reuse with minimal effort. This is handy in situations where you might be dealing with an external API or any kind of similar data which is almost shaped the way you need but isn’t quite right.

I should point out that if .for isn’t to your liking, you can use .with, .for_location, .with_location, and so forth for the class method name. I tend to stick with short and simply named transforming method names like .for or .with until I find I need something more specific.

You can take all of this too far and put too much responsibility on your struct. Should that happen, consider crafting an adapter class that consumes and converts the incoming data into an instance of your struct. Otherwise, for simple situations like the above example, this is a nice way to give your struct an extra superpower with concise syntactic sugar.
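For comparison, here is a minimal sketch of what such an adapter class might look like. LocationAdapter, its KEY_MAP constant, and the #call method are hypothetical names chosen for illustration and the struct is initialized positionally to keep the sketch simple:

```ruby
Point = Struct.new :x, :y

# Hypothetical adapter which converts a foreign hash into a Point.
class LocationAdapter
  KEY_MAP = {horizontal: :x, vertical: :y}.freeze

  def initialize key_map: KEY_MAP, model: Point
    @key_map = key_map
    @model = model
  end

  def call location
    attributes = location.transform_keys key_map

    # Pick values in member order so the struct can be initialized positionally.
    model.new(*attributes.values_at(*model.members))
  end

  private

  attr_reader :key_map, :model
end

point = LocationAdapter.new.call({horizontal: 1, vertical: 2})

point.x  # 1
point.y  # 2
```

The struct stays a plain data container while the adapter owns the knowledge of the foreign keys, which is easier to extend once the conversion logic grows.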

Whole Values

Another superpower of structs is that they are whole value objects by default. This is lovely because you can have two or more structs with the same values and they’ll be equal even though their object IDs are different. Here’s an example where, again, we reach for our Point struct:

a = Point[x: 1, y: 2]
b = Point[x: 1, y: 2]

a == b      # true
a === b     # true
a.eql? b    # true
a.equal? b  # false

This is exactly what makes the Versionaire gem so powerful by being able to provide a primitive, semantic, version type for use within your Ruby applications. Example:

a = Version major: 1, minor: 2, patch: 3  # <struct Versionaire::Version major=1, minor=2, patch=3>
b = Version [1, 2, 3]                     # <struct Versionaire::Version major=1, minor=2, patch=3>
c = Version "1.2.3"                       # <struct Versionaire::Version major=1, minor=2, patch=3>

a == b && b == c                          # true

Another advantage of having a whole value object shows up when writing RSpec specs where you expect the Struct answered back to be comprised of the correct set of values. Example:

expect(client.call).to contain_exactly(Point[x: 1, y: 2])
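Whole value equality extends to #hash as well which means equal structs behave as the same hash key, set member, or unique array element. Here is a short sketch using the same Point struct with made-up values:

```ruby
require "set"

Point = Struct.new :x, :y

a = Point[1, 2]
b = Point[1, 2]

a.hash == b.hash  # true
[a, b].uniq.size  # 1
Set[a, b].size    # 1

labels = {a => "start"}
labels[b]         # "start"
```

Keep in mind that structs are mutable by default, so changing a struct after using it as a hash key changes its hash value and the lookup will no longer find it.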

Pattern Matching

I’ve written about pattern matching before, so you’ll know I’m a fan. Structs, along with arrays and hashes, natively support pattern matching. If we use the same point object, defined earlier as a keyworded struct, we can write code like this:

By Key And Value

case Point[x: 1, y: 1]
  in x: 1, y: 1 then puts "Low."
  in x: 10, y: 10 then puts "High."
  else puts "Unknown point."
end

# Prints: "Low."

By Position and Value

case Point[x: 10, y: 10]
  in 1, 1 then puts "Low."
  in 10, 10 then puts "High."
  else puts "Unknown point."
end

# Prints: "High."

By Range

case Point[x: -5, y: -1]
  in 0, 0 then puts "Neutral."
  in ..0, ..0 then puts "Negative."
  in 0.., 0.. then puts "Positive."
  else puts "Mixed."
end

# Prints: "Negative."

By Explicit Type

case {x: 1, y: 1}
  in Point[x: 1, y: 1] then puts "Low."
  in Point[x: 10, y: 10] then puts "High."
  else puts "Unknown point."
end

# Prints: "Unknown point."

In the above examples, you’d typically not inline an instance of your struct for pattern matching purposes but pass in the instance as an argument to your case expression. I inlined the instance to keep things concise. That aside — and as you can see — being able to pattern match gives you a lot of power and the above is by no means exhaustive.
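As a sketch of passing in the instance as an argument, the following wraps the case expression in a hypothetical describe method (the name and messages are mine) and also binds the matched values to local variables with a guard clause:

```ruby
Point = Struct.new :x, :y

# Hypothetical helper which answers a description instead of printing it.
def describe point
  case point
    in Point[x: 0, y: 0] then "Origin."
    in Point[x:, y:] if x == y then "Diagonal at #{x}."
    in Point[x:, y:] then "Point at #{x}, #{y}."
    else "Not a point."
  end
end

describe Point[0, 0]     # "Origin."
describe Point[2, 2]     # "Diagonal at 2."
describe Point[3, 4]     # "Point at 3, 4."
describe({x: 1, y: 1})   # "Not a point."
```

The last call falls through to the else branch because a Hash is not a Point even though the keys and values look the same.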

Refinements

Structs, as with any Ruby object, can be refined. I’ve written extensively about refinements and have a gem, of the same name, which refines several Ruby core primitives, including structs. Here’s an example of some of the ways in which we can refine our Point struct even further:

#! /usr/bin/env ruby
# frozen_string_literal: true

# Save as `snippet.rb` and run as `ruby snippet.rb`

require "bundler/inline"

gemfile true do
  source "https://rubygems.org"

  gem "refinements"
end

require "refinements/structs"

Point = Struct.new :x, :y

module Demo
  using Refinements::Structs

  def self.run
    puts Point.with_keywords(x: 1, y: 2)            # #<struct Point x=1, y=2>
    puts Point.keyworded?                           # false

    point = Point[1, 2]

    puts point.merge x: 0, y: 1                     # #<struct Point x=0, y=1>
    puts point.revalue { |position| position * 2 }  # #<struct Point x=2, y=4>
  end
end

Demo.run

If you were to run the above script, you’d see the same output as shown in the code comments. The above is only a small taste of how you can refine your structs. Feel free to check out the Refinements gem for details or even add it to your own projects.

Benchmarks

Earlier, in the overview, I hinted at additional reasons for reaching for a Struct over a Class or, worse, an OpenStruct. Well, improved performance is one of them. Consider the following benchmark script which compares the performance of an Array, Hash, Struct, OpenStruct, and Class.

#!/usr/bin/env ruby
# frozen_string_literal: true

# Save as `benchmark`, then `chmod 755 benchmark`, and run as `./benchmark`.

require "bundler/inline"

gemfile true do
  source "https://rubygems.org"
  gem "benchmark-ips", require: "benchmark/ips"
end

require "ostruct"

MAX = 1_000_000

ExampleStruct = Struct.new :to, :from

ExampleClass = Class.new do
  attr_reader :to, :from

  def initialize to:, from:
    @to = to
    @from = from
  end
end

Benchmark.ips do |benchmark|
  benchmark.config time: 5, warmup: 2

  benchmark.report "Array" do
    MAX.times { %w[Mork Mindy] }
  end

  benchmark.report "Hash" do
    MAX.times { {to: "Mork", from: "Mindy"} }
  end

  benchmark.report "Struct" do
    MAX.times { ExampleStruct[to: "Mork", from: "Mindy"] }
  end

  benchmark.report "OpenStruct" do
    MAX.times { OpenStruct.new to: "Mork", from: "Mindy" }
  end

  benchmark.report "Class" do
    MAX.times { ExampleClass.new to: "Mork", from: "Mindy" }
  end

  benchmark.compare!
end

If you save the above script to a file and run locally, you’ll get output that looks roughly like this:

Warming up --------------------------------------
               Array     2.000  i/100ms
                Hash     1.000  i/100ms
              Struct     1.000  i/100ms
          OpenStruct     1.000  i/100ms
               Class     1.000  i/100ms
Calculating -------------------------------------
               Array     24.001  (± 0.0%) i/s -    120.000  in   5.000514s
                Hash     12.345  (± 8.1%) i/s -     62.000  in   5.032527s
              Struct      3.736  (± 0.0%) i/s -     19.000  in   5.086211s
          OpenStruct      0.175  (± 0.0%) i/s -      1.000  in   5.716019s
               Class      4.066  (± 0.0%) i/s -     21.000  in   5.165897s

Comparison:
               Array:       24.0 i/s
                Hash:       12.3 i/s - 1.94x  (± 0.00) slower
               Class:        4.1 i/s - 5.90x  (± 0.00) slower
              Struct:        3.7 i/s - 6.42x  (± 0.00) slower
          OpenStruct:        0.2 i/s - 137.19x  (± 0.00) slower

Based on the benchmark statistics above, the Array is the clear winner with Hash as a runner up. No surprises there. You get great performance at the cost of readability/usage as mentioned earlier in this article.

When you ignore the Array and Hash, you are left with Struct, Class, and OpenStruct. In the run above, the Struct and Class are close (3.7 versus 4.1 iterations per second) with the Class slightly ahead, so a Struct gives you its concise syntax and whole value behavior at little performance cost. When you compare either result against an OpenStruct, which is roughly 137 times slower than the Array, you can clearly see why an OpenStruct is not advised. This is why I recommend a Struct over a Class for encapsulating data and definitely avoiding OpenStruct altogether.

Avoidances

Before wrapping up this article, there are a few avoidances worth pointing out when using structs in your Ruby code. Please don’t use these techniques yourself or, if you find others writing code this way, send them a link to this section of the article. 🙂

Anonymous Inheritance

We talked about how you can subclass a struct earlier but you can also create a subclass of an anonymous struct. Example:

class Point < Struct.new(:x, :y)
end

I didn’t bring this up earlier because the distinction is worth highlighting here due to the dangerous nature of creating a subclass from an anonymous struct superclass. The distinction might be subtle but Point < Struct.new is being used in the above example instead of class Point < Struct as discussed earlier. To make this more clear, consider the following for comparison:

# Normal Superclass
class Point < Struct
end

# Anonymous Superclass
class Point < Struct.new(:x, :y)
end

The normal superclass example is using proper inheritance as discussed earlier but the anonymous superclass example is creating a subclass from a temporary superclass which is not recommended. Even the official documentation on Ruby structs says as much:

Subclassing an anonymous struct creates an extra anonymous class that will never be used.

Ruby’s documentation goes on to state that the recommended way to use or even customize a struct is what we discussed earlier which is:

Point = Struct.new :x, :y do
  def inspect = to_h.inspect
end

OpenStruct

By now I hope I have convinced you to avoid OpenStruct usage in your code. Don’t get me wrong, they are fun to play with in your console for modeling data quickly but shouldn’t be used in any professional capacity. The reason is made clear in the official Ruby documentation:

An OpenStruct utilizes Ruby’s method lookup structure to find and define the necessary methods for properties. This is accomplished through the methods method_missing and define_singleton_method.

This should be a consideration if there is a concern about the performance of the objects that are created, as there is much more overhead in the setting of these properties compared to using a Hash or a Struct. Creating an open struct from a small Hash and accessing a few of the entries can be 200 times slower than accessing the hash directly.

This is a potential security issue; building OpenStruct from untrusted user data (e.g. JSON web request) may be susceptible to a “symbol denial of service” attack since the keys create methods and names of methods are never garbage collected.

Not only do you suffer a performance penalty but you expose a security vulnerability too. Even the RuboCop Performance gem has a Performance/OpenStruct linter to throw an error when detected in your code.
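If the appeal of OpenStruct is quickly wrapping a hash, a struct with known members can do the same job. The following is only a sketch where Settings and the payload are hypothetical and stand in for something like a parsed JSON web request:

```ruby
Settings = Struct.new :host, :port

# Hypothetical untrusted payload, e.g. parsed from a JSON web request.
payload = {host: "localhost", port: 8080, admin: true}

# Only known members are read so unexpected keys never become methods.
settings = Settings.new(*payload.values_at(*Settings.members))

settings.host                # "localhost"
settings.port                # 8080
settings.respond_to? :admin  # false
```

Because the members are fixed when the struct is defined, unknown keys in the payload are ignored instead of dynamically defining new methods.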

Conclusion

Structs come with a ton of power and are a joy to use. My Ruby code is better because of them. Hopefully, this will inspire you to use structs more effectively within your own code without reaching for classes or more complex objects. Even better, maybe this will encourage you to write cleaner code where data which consists of related attributes is given a proper name and Primitive Obsession is avoided altogether. 🎉