December 16, 2022
Code I like

Active Record, nice and blended

Active Record restates the traditional question of how to separate persistence from domain logic: what if you don’t have to?

Jorge Manrubia: Principal Programmer, Product

Persisting objects in relational databases is an intricate problem. Two decades ago, it looked like the ultimate orthogonal problem to solve: abstract persistence out so that programmers don’t have to care about it. Many years later, we can affirm that… it’s not that simple. Persistence is indeed a crosscutting concern, but one with many tentacles.

Because it can’t be fully abstracted, many patterns aim to isolate database access in its own layer to keep the domain model persistence-free: repositories, Data Mappers, or DAOs (Data Access Objects), for example. Rails, however, took a different approach: Active Record, a pattern introduced by Martin Fowler in Patterns of EAA:

An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data.

The distinctive trait of the Active Record pattern is combining domain logic and persistence in the same class, and that’s how we use it here at 37signals. At first glance, this might not look like a good idea. Shouldn’t we keep separate things, well, separate? Even Fowler mentions that the pattern is a good choice for domain logic that isn’t too complex. It’s our experience, however, that Rails’ Active Record keeps code elegant and maintainable in large and complex codebases. In this article, I will try to articulate why.
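
To make Fowler’s definition concrete, here is a minimal plain-Ruby sketch of the pattern itself, not of the Rails framework. The Invoice class, its attributes, and the in-memory TABLE hash standing in for a database table are all hypothetical, chosen only for illustration:

```ruby
# Plain-Ruby sketch of the Active Record pattern (no Rails involved).
# TABLE is a hypothetical in-memory stand-in for a database table:
# it maps an id to a row hash.
class Invoice
  TABLE = {}

  attr_reader :id
  attr_accessor :amount_cents, :paid

  def initialize(amount_cents:, id: nil, paid: false)
    @id = id
    @amount_cents = amount_cents
    @paid = paid
  end

  # Persistence: the object knows how to load and store its own row.
  def self.find(id)
    row = TABLE.fetch(id)
    new(id: id, **row)
  end

  def save
    @id ||= TABLE.size + 1
    TABLE[@id] = { amount_cents: amount_cents, paid: paid }
    self
  end

  # Domain logic: lives in the same class as the persistence code.
  def pay!
    raise "already paid" if paid
    self.paid = true
    save
  end

  def outstanding_cents
    paid ? 0 : amount_cents
  end
end

invoice = Invoice.new(amount_cents: 12_500).save
invoice.pay!
reloaded = Invoice.find(invoice.id)
reloaded.outstanding_cents # 0, because the paid flag was stored with the row
```

The caller only sees pay! and outstanding_cents; where the row lives is a detail of the class. Rails’ Active Record follows the same shape, with a real database behind it.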

An impedance-less match

Object–relational impedance mismatch is a fancy way of saying that Object-Oriented languages and relational databases are distinct worlds, and this results in friction when translating concepts between them.

I believe Active Record — the Rails framework, not the pattern — works so well in practice because it reduces this impedance mismatch to a minimum. There are two main reasons:

It looks and feels like Ruby, even when you need to go lower level to fine-tune things.
It comes with fantastic and innovative answers for recurring needs when dealing with objects and relational persistence.

A perfect Ruby companion

Let me show you an example from HEY. It shows the internals of the Contact#designate_to(box) method I referenced in the Vanilla Rails article. This method handles the logic when you select a box as the destination for emails from a given contact. The lines involving Active Record are the has_many declaration and the destroy_by, find_by, update!, and create! calls:

module Contact::Designatable
  extend ActiveSupport::Concern

  included do
    has_many :designations, class_name: "Box::Designation", dependent: :destroy
  end

  def designate_to(box)
    if box.imbox?
      # Skip designating to Imbox since it’s the default.
      undesignate_from(box.identity.boxes)
    else
      update_or_create_designation_to(box)
    end
  end

  def undesignate_from(box)
    designations.destroy_by box: box
  end

  def designation_within(boxes)
    designations.find_by box: boxes
  end

  def designated?(by:)
    designation_within(by.boxes).present?
  end

  private
    def update_or_create_designation_to(box)
      if designation = designation_within(box.identity.boxes)
        designation.update!(box: box)
      else
        designations.create!(box: box)
      end
    end
end

The persistence bits look natural and easy to follow. The code is eloquent and concise, and it reads like Ruby. It doesn’t feel like a mix of concerns that don’t belong together; you don’t see a cognitive jump between “business logic” and “persistence duties”. To me, this trait is a game-changer.

Answers for persistence needs

Active Record offers many options to persist object-oriented models into tables. When presenting the original pattern, Fowler argues that:

If your business logic is complex, you’ll soon want to use your object’s direct relationships, collections, inheritance, and so forth. These don’t map easily onto Active Record, and adding them piecemeal gets very messy.

Rails’ Active Record offers answers for those and more. To name a few: associations, single table inheritance, serialized attributes, and delegated types.

Some people recommend avoiding Rails associations at all costs. I have a hard time understanding this one: I think associations are one of the best features of Active Record, and we use them extensively in our apps. In object-oriented programming, associations between objects are a fundamental construct, just like inheritance. The same goes for relationships between tables in the relational world. What’s not to like about direct support for translating those into code and getting the framework to do all the heavy lifting for you?

Let me show you an example of associations. In HEY, an email thread internally looks like a Topic model with many entries (Entry). In some scenarios, the system needs to access aggregated data at the topic level based on the contained entries, such as the addressed contacts in the thread or the blocked trackers. We implement the bulk of this with associations:

class Topic
  include Entries

  #...
end

module Topic::Entries
  extend ActiveSupport::Concern

  included do
    has_many :entries, dependent: :destroy
    has_many :entry_attachments, through: :entries, source: :attachments
    has_many :receipts, through: :entries
    has_many :addressed_contacts, -> { distinct }, through: :entries
    has_many :entry_creators, -> { distinct }, through: :entries, source: :creator
    has_many :blocked_trackers, through: :entries, class_name: "Entry::BlockedTracker"
    has_many :clips, through: :entries
  end

  #...
end

We use other Active Record features profusely to get direct persistence support for rich Ruby object models. For example, single table inheritance to model the different kinds of boxes in HEY:

class Box < ApplicationRecord
end

class Box::Imbox < Box
end

class Box::Trailbox < Box
end

class Box::Feedbox < Box
end
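
As a rough sketch of the idea behind single table inheritance: all kinds of boxes share one table, and a type column holding the class name decides which subclass a row becomes when it is loaded. The plain-Ruby code below only illustrates that mechanism. The ROWS array, the instantiate method, and the imbox? implementations are hypothetical and are not how Rails or HEY implement it:

```ruby
# Plain-Ruby sketch of single table inheritance (no Rails involved).
# Class names mirror the article; everything else is hypothetical.
class Box
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def imbox?
    false
  end

  # Build the subclass named in the row's :type value.
  def self.instantiate(row)
    Object.const_get(row.fetch(:type)).new(row.fetch(:id))
  end
end

class Box::Imbox < Box
  def imbox?
    true
  end
end

class Box::Feedbox < Box
end

# One "table" holding rows for every kind of box.
ROWS = [
  { id: 1, type: "Box::Imbox" },
  { id: 2, type: "Box::Feedbox" }
]

boxes = ROWS.map { |row| Box.instantiate(row) }
boxes.map(&:class)  # [Box::Imbox, Box::Feedbox]
boxes.map(&:imbox?) # [true, false]
```

Each row comes back as an object of its own class, so subclass-specific behavior such as imbox? works through ordinary polymorphism.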

We also use serialized attributes to store the recurrence schedule details for Basecamp check-ins:

class Question < ApplicationRecord
  serialize :schedule, RecurrenceSchedule
end
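
For serialize to accept a class like this, the class acts as a coder: Rails accepts any object that responds to dump (object to stored value) and load (stored value back to object). The article does not show RecurrenceSchedule, so the sketch below is a hypothetical stand-in with made-up fields (days, hour) and a JSON format chosen for illustration. Recent Rails versions (7.1 onward) expect the coder to be passed with the coder: keyword rather than positionally.

```ruby
require "json"

# Hypothetical stand-in for RecurrenceSchedule; the real class and its
# fields are not shown in the article.
class RecurrenceSchedule
  attr_reader :days, :hour

  def initialize(days:, hour:)
    @days = days
    @hour = hour
  end

  # Called when writing: turn the object into a string for the column.
  def self.dump(schedule)
    return nil if schedule.nil?
    JSON.generate("days" => schedule.days, "hour" => schedule.hour)
  end

  # Called when reading: rebuild the object from the stored string.
  def self.load(payload)
    return nil if payload.nil?
    data = JSON.parse(payload)
    new(days: data["days"], hour: data["hour"])
  end

  # Domain behavior travels with the value object.
  def due_on?(day_name)
    days.include?(day_name)
  end
end

stored = RecurrenceSchedule.dump(RecurrenceSchedule.new(days: %w[mon wed], hour: 9))
schedule = RecurrenceSchedule.load(stored)
schedule.due_on?("wed") # true
```

The model then works with a rich Ruby object, while the table only stores a string.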

And we use delegated types to model the different kinds of contacts in HEY:

class Contact < ApplicationRecord
  include Contactables
end

module Contact::Contactables
  extend ActiveSupport::Concern

  included do
    delegated_type :contactable, types: Contactable::TYPES, inverse_of: :contact, dependent: :destroy
  end
end

module Contactable
  extend ActiveSupport::Concern

  TYPES = %w[ User Extenzion Alias Person Service Tombstone ]

  included do
    has_one :contact, as: :contactable, inverse_of: :contactable, touch: true
    belongs_to :account, default: -> { contact&.account }
  end
end

class User < ApplicationRecord
  include Contactable
end

class Person < ApplicationRecord
  include Contactable
end

class Service < ApplicationRecord
  include Contactable
end

The caveat here is that Active Record requires you to control the database schema to leverage what it offers fully. Assuming this is the case, the ability to seamlessly persist rich and complex object models is key to making the pattern work in large and complex codebases.

Encapsulation-friendly

Because it blends so well with Ruby, Active Record plays great with using regular Ruby goodies to hide details away. This allows you to write code that looks natural and encapsulates persistence concerns without paying the ceremony tax of a separate data-access layer.

For example, you can check the previous Contact::Designatable code. The Active Record code is wrapped in plain private methods. You can also notice how everything — domain logic and persistence — is hidden behind a method #designate_to, which is part of that natural interface we want to see from the system boundaries, as explained in this article. So persistence is mixed in but well organized and encapsulated.

In more complex scenarios, nothing prevents you from creating objects to hide complexity away. For example, to render the activity timeline for a given user, Basecamp uses Timeline::Aggregator, a PORO (plain old Ruby object) in charge of serving the relevant events. This class encapsulates the querying logic:

class Reports::Users::ProgressController < ApplicationController
  def show
    @events = Timeline::Aggregator.new(Current.person, filter: current_page_by_creator_filter).events
  end
end

class Timeline::Aggregator
  def initialize(person, filter: nil)
    @person = person
    @filter = filter
  end

  def events
    Event.where(id: event_ids).preload(:recording).reverse_chronologically
  end

  private
    def event_ids
      event_ids_via_optimized_query(1.week.ago) || event_ids_via_optimized_query(3.months.ago) || event_ids_via_regular_query
    end

    # Fetching the most recent recordings optimizes the query enormously for large sets of recordings
    def event_ids_via_optimized_query(created_since)
      limit = extract_limit
      event_ids = filtered_ordered_recordings.where("recordings.created_at >= ?", created_since).pluck("relays.event_id")
      event_ids if event_ids.length >= limit
    end

    def event_ids_via_regular_query
      filtered_ordered_recordings.pluck("relays.event_id")
    end

    # ...
end
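
The fallback chain in event_ids works because each optimized attempt returns nil when its time window does not hold enough rows, so that || moves on to a wider window. The explicit length check matters: an empty array is truthy in Ruby and would otherwise stop the chain. Here is a small plain-Ruby sketch of the same technique, with a hypothetical RECORDS array standing in for the recordings table and made-up helper names:

```ruby
# Plain-Ruby sketch of the "narrow window first" fallback.
# Record, RECORDS, and the method names are hypothetical.
Record = Struct.new(:event_id, :created_at)

NOW = Time.now
DAY = 24 * 60 * 60

RECORDS = [
  Record.new(1, NOW - 2 * DAY),
  Record.new(2, NOW - 20 * DAY),
  Record.new(3, NOW - 200 * DAY)
]

# Returns ids only when the window holds at least `limit` of them;
# returns nil otherwise so the caller can try a wider window.
def ids_since(time, limit)
  ids = RECORDS.select { |record| record.created_at >= time }.map(&:event_id)
  ids if ids.length >= limit
end

def all_ids
  RECORDS.map(&:event_id)
end

# Try one week, then about three months, then everything.
def event_ids(limit)
  ids_since(NOW - 7 * DAY, limit) || ids_since(NOW - 90 * DAY, limit) || all_ids
end

event_ids(1) # [1], satisfied by the one-week window
event_ids(2) # [1, 2], needs the wider window
event_ids(3) # [1, 2, 3], falls through to the unrestricted lookup
```

In the common case, the cheapest lookup answers the request and the wider ones never run.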

For querying, we use scopes extensively. Combining them with associations and other scopes lets you express complex queries with natural-looking code. For example, to render a collection, HEY needs to fetch all the active topics in the collection that are accessible to the acting contact (in HEY, you can have different acting users depending on the selected filter). This is how the related code looks:

class Topic < ApplicationRecord
  include Accessible
end

module Topic::Accessible
  extend ActiveSupport::Concern

  included do
    has_many :accesses, dependent: :destroy
    scope :accessible_to, ->(contact) { not_deleted.joins(:accesses).where accesses: { contact: contact } }
  end

  # ...
end

class CollectionsController < ApplicationController
  def show
    @topics = @collection.topics.active.accessible_to(Acting.contact)
    # ...
  end
end

While this is a bit of an edge case, you can see how we also used scopes for one of the HEY performance optimizations Donal described in this article:

module Posting::Involving
  extend ActiveSupport::Concern

  DEFAULT_INVOLVEMENTS_JOIN = "INNER JOIN `involvements` USE INDEX(index_involvements_on_contact_id_and_topic_id) ON `involvements`.`topic_id` = `postings`.`postable_id`"
  OPTIMIZED_FOR_USER_FILTERING_INVOLVEMENTS_JOIN = "STRAIGHT_JOIN `involvements` USE INDEX(index_involvements_on_account_id_and_topic_id_and_contact_id) ON `involvements`.`topic_id` = `postings`.`postable_id`"

  included do
    scope :involving, ->(contacts, involvements_join: DEFAULT_INVOLVEMENTS_JOIN) do
      where(postable_type: "Topic")
        .joins(involvements_join)
        .where(involvements: { contact_id: Array(contacts).map(&:id) })
        .distinct
    end

    scope :involving_optimized_for_user_filtering, ->(contacts) do
      # STRAIGHT_JOIN ensures that MySQL reads topics before involvements
      involving(contacts, involvements_join: OPTIMIZED_FOR_USER_FILTERING_INVOLVEMENTS_JOIN)
        .use_index(:index_postings_on_user_id_and_postable_and_active_at)
        .joins(:user)
        .where("`users`.`account_id` = `involvements`.`account_id`")
        .select(:id, :active_at)
    end
  end
end

A restatement of the persistence and domain logic separation problem

In theory, a rigid separation of persistence and domain logic sounds like a good idea. However, in practice, it comes with two significant challenges.

First, whatever approach you use implies adding and orchestrating extra data-access abstractions, and their number grows with the number of persisted models in your app. That translates into additional ceremony and complexity.

Second, building rich domain models becomes harder. If domain models are free of persistence concerns, how can they implement business logic that needs to access the database? In DDD, for example, the answer is adding additional domain-level elements such as repositories and aggregates. Now you have three elements to coordinate, with plain domain entities knowing nothing about persistence. How do they access each other? How do they collaborate? This is a tempting scenario to come up with a service to orchestrate everything. It is not surprising that, ironically, many implementations that aim to embrace the best design practices end up suffering the anemic domain model problem, where most entities are mere data holders without behavior.
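
As a rough illustration of that ceremony, here is a minimal plain-Ruby sketch of the separated design. Every name in it (Subscription, SubscriptionRepository, CancelSubscription) is hypothetical, and an in-memory hash stands in for the database:

```ruby
# Hypothetical layered design: entity, repository, and service are
# three separate pieces that have to be wired together.

# Plain entity: knows nothing about storage.
class Subscription
  attr_reader :id, :plan
  attr_accessor :active

  def initialize(id:, plan:, active: true)
    @id = id
    @plan = plan
    @active = active
  end
end

# Data access lives in its own class.
class SubscriptionRepository
  def initialize
    @rows = {}
  end

  def find(id)
    row = @rows.fetch(id)
    Subscription.new(id: id, **row)
  end

  def save(subscription)
    @rows[subscription.id] = { plan: subscription.plan, active: subscription.active }
    subscription
  end
end

# The behavior ends up in a service that coordinates the other two.
class CancelSubscription
  def initialize(repository)
    @repository = repository
  end

  def call(id)
    subscription = @repository.find(id)
    subscription.active = false
    @repository.save(subscription)
  end
end

repository = SubscriptionRepository.new
repository.save(Subscription.new(id: 1, plan: "pro"))
CancelSubscription.new(repository).call(1)
repository.find(1).active # false
```

Even in this tiny case, cancelling takes three collaborating classes, and Subscription itself ends up as a data holder with no behavior.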

The desire to separate persistence from domain logic exists for a reason. Merging both can easily result in code that doesn’t belong together or, in other words, that is difficult to maintain. This becomes obvious if you use raw SQL directly, but it is also the case if you use most ORM libraries because they only focus on the persistence side of the equation. Active Record, however, is designed on the premise that persistence and domain logic belong together, and it comes with almost two decades of iteration around this idea.

This insight in an article by Kent Beck blew my mind quite recently:

An element is cohesive to the degree that its subelements are coupled. A perfectly cohesive element requires all subelements to change at the same time.

In a database-powered application, domain logic is indissolubly linked to persistence. Is isolating persistence a goal in itself, or a mitigation for not having the right ORM (Object Relational Mapping) technology in place? My point of view is that if the ORM perfectly blends with the host language, and if it comes with good answers for persisting your object models, and if it offers good encapsulation mechanisms, then the original question gets restated. Instead of “how can I isolate persistence from domain logic?”, it becomes “why would I?”

Conclusion

I’ve known for a long time that using Active Record as I explained here works great in practice. I have never missed a standalone layer for persistence in the Rails apps I’ve worked on.

Like with vanilla Rails, some people argue that Active Record works for quick prototypes but that, at some point, you need to embrace an alternative that isolates persistence from domain logic. This is not our experience. We use Active Record as described here in Basecamp and HEY, which are two quite large Rails applications used by millions. Active Record is at the heart of everything these applications do, and we keep evolving them.

There is a caveat, which is common in these articles: Active Record is a tool, and of course, you can use it to write messy and unmaintainable code. In particular, as with concerns, it doesn’t replace the need to design your system properly, and it won’t help you do that if you don’t know how. If you are using Active Record and having code maintenance problems, the cause might not be the tool but that you are not getting right the thousand other things that make code maintainable.

I believe Active Record is so good that it eliminates the traditional reasons for a strict separation of persistence and domain logic. I consider that a feature to embrace proudly, not a choice to justify.
