Ecto Preloads in Phoenix Contexts

Damon Janis
4 min readOct 29, 2019

Are not as difficult as they seem

Photo by Jude Beck on Unsplash
Photo by Jude Beck on Unsplash

When I’m working with a new Phoenix app (written in Elixir), I often come across the following situation. I happened to run into it last night on a side-project I’m building called Remind Me (which you should definitely check out, it’s free and it will help you stay on top of a whole class of to-do’s that often fall through the cracks), so I decided to write up my thoughts on the subject. Here’s the issue (with fake resource names for illustration):

  • I have a few associated resources in my database, like a User which has_many TextMessageItem‘s and has_one ApiConnection
  • I have a MyApp.Users context, which when generated by default has a function like this:
def list_users do
Repo.all(User)
end
  • I need to be able to get a few User’s out of the database and send them to the front-end (or through GraphQL or whatever is needing the data) with the associated TextMessageItem's and ApiConnection preloaded.
  • I hate having to make a function like this to do the work:
def list_users_with_text_messages_and_api_connection() do
User
|> Repo.all()
|> Repo.preload([:text_messages, :api_connection])
end

Because as soon as I need another preload or I don’t need one of them, I have to make related functions like list_users_with_api_connection/0 and so on. Over time you end up with long function names and the callers are always receiving more or less data than the really need (usually more than needed, which can become a performance problem).

So here’s a more elegant solution.

In the context, e.g. users.ex:

def list_users(opts \\ []) do
preloads = Keyword.get(opts, :preloads, [])
User
|> Repo.all()
|> Repo.preload(preloads)
end

The first thing we’re doing here is adding a keyword list as a parameter, with an empty list as the default value.

Then, we’re checking the keyword list for the :preloads option, and if it doesn’t exist in opts we’ll default to an empty list.

We rely on the behavior of Ecto.Repo.preload/2, which just returns the struct(s) in the first argument if you pass an empty list as the second argument.

Modifying the function this way gives the calling function ultimate flexibility: it can choose to preload only the associations it needs, to get the minimum amount of data back as possible.

For example, in the controller or resolver, e.g. user_controller.ex:

defmodule MyAppWeb.UserController do
alias MyApp.Users
def index(conn, _params) do
preloads = [:text_message_items, :api_connection]
users = Users.list_users(preloads: preloads) render(conn, "index.html", users: users)
end
end

As a bonus, this approach even works with nested preloads! Imagine that each text_message_item had many attachment‘s, we could preload those like this:

preloads = [:api_connection, text_message_items: [:attachments]]
Users.list_users(preloads: preloads)

Caveats

In software, most decisions aren’t about the “right” or “wrong” way to do things, but rather they’re about tradeoffs.

The solution I’ve proposed here has some advantages:

  • Simplicity
  • Easy to read and reason about
  • Maintainable

But it also has a disadvantage: it requires multiple trips to the database.

Any time you use Repo.preload/2 or the Ecto.Query equivalents (more on that syntax in a minute), each item (including nested preloads) will generate a separate query. So in our example, where our list of preloads looks like [:text_message_items, :api_connection], we will have three queries: one to fetch the user’s, another to fetch the associated text_message_item’s, and another to fetch the api_connection.

With two preloaded associations, it’s probably not that bad and unless you’re developing for scale that demands optimizing every query to the hilt, it’s probably worth a couple extra trips to the database for the code readability. In addition, you’ll keep your context nice and tidy while requirements are added, and by the time the query becomes a bottleneck you’ll have a much more mature idea of what preloads you really need from the query.

Optimizations

So say you are at the point where every trip to the database makes a big difference, whether it’s about page load times or the stress on the database.

There is a great article put out by ThoughtBot that details this exact issue of preloads causing multiple database queries, and some of the solutions.

The basic idea for optimizing preloads is that you need to use the Ecto.Query syntax and join the associations yourself. Here’s one way to do it, but we’re back to the dilemma of having long function names, and it’s not nearly as pretty:

def list_users_with_text_messages_and_api_connection do
User
|> from()
|> join(:left, [user], _ in assoc(user, :text_message_items))
|> join(:left, [user, _], _ in assoc(user, :api_connection))
|> preload([_user, tmi, api], [text_message_items: tmi, api_connection: api])
|> Repo.all()
end

In Conclusion

Ecto is a very powerful tool with a whole range of functions to help you interact with a database at any level of abstraction you like.

When development speed is of the essence, Ecto has high-level abstractions to make your life easy and your code beautiful. It’s easy to maintain and extend, and will keep your development cycle short while you’re building and iterating rapidly.

But, when your app hits scale and performance becomes critical, Ecto has the tools for you to get down into the nitty gritty SQL layers while still writing safe, expressive, composable queries.

Edit:

There is a great macro solution from joshnuss in the comments that allows you to join and preload in one step: https://gist.github.com/joshnuss/9c68ad2c2649b571dd241693dad6f6f6

It looks really promising (and I tested it and it works great!) and it gives me hope that there are elegant solutions within reach to keep the maintainability and simplicity of piping in preloads from a caller, without multiple queries to the database.

--

--