Using Ecto Schemas with Database Views

Database views are a great way to provide a simplified abstraction over complicated data logic. They provide a way to treat a complex query as if it were a simple table, and with no extra work, you can build Ecto schemas on top of database views to bring the power of that abstraction into your Elixir code!

In a side project of mine, an Elixir/Phoenix app for tracking daily habits called (imaginatively) Habits, the most complicated data logic involves figuring out “streaks” of check-ins. You can check in to a habit (i.e. record that you’ve done it) once each day, and a streak is the number of days in a row that you’ve checked in. The intention—based on Jerry Seinfeld’s “don’t break the chain” motivational strategy—is that if I see that I’ve flossed 100 days in a row, I’m more likely to floss on the 101st.

I don’t store “streaks” as a unique thing in the database. I only store habits and check-ins, but the database can figure out and tell us the streaks if we ask nicely. Unfortunately, asking nicely looks like this:

WITH start_streak AS (
  SELECT
    check_ins.date,
    check_ins.habit_id,
    CASE WHEN check_ins.date - LAG(check_ins.date, 1) OVER (PARTITION BY check_ins.habit_id ORDER BY check_ins.date) > 1
      THEN 1
      ELSE 0
    END streak_start
  FROM check_ins
  ),
  streak_groups AS (
    SELECT
      date,
      habit_id,
      SUM(streak_start) over (PARTITION BY habit_id ORDER BY date) streak
    FROM start_streak
  )
SELECT
  habit_id,
  MIN(date) AS start,
  MAX(date) AS end,
  MAX(date) - MIN(date) + 1 AS length
FROM streak_groups
GROUP BY habit_id, streak

For the purpose of this post, it doesn’t really matter what the query above is doing, only that it returns data that looks like this:

habit_id |   start    |    end     | length
----------+------------+------------+--------
       5 | 2014-04-23 | 2014-04-24 |      2
       6 | 2015-12-05 | 2015-12-09 |      5
       3 | 2013-07-25 | 2013-07-25 |      1
       2 | 2015-09-23 | 2015-09-24 |      2
       8 | 2014-09-26 | 2014-09-29 |      4
       5 | 2013-09-22 | 2013-09-22 |      1
       6 | 2015-09-20 | 2015-09-24 |      5
       8 | 2014-12-04 | 2014-12-04 |      1
       8 | 2016-10-23 | 2016-10-26 |      4
       5 | 2014-02-10 | 2014-02-10 |      1

For every “streak”, it tells me which habit it belongs to (which makes it easy to query streaks for a given habit), when the streak started, ended, and how many days it lasted. Awesome! Except that if I want to use this data in an Elixir function—perhaps to query the longest streak for habit—I have to copy and paste this big ball of SQL into an .ex file, manually set up join()s, etc. Wouldn’t it be great if Ecto could just treat streak data with the same elegant interface I can use with Habits and CheckIns? Happily, it can!

Creating the View with a Migration

The first step is actually creating the view in the database. This requires some SQL like CREATE VIEW my_view AS followed by your query, which can be accessed with the name my_view. As with any database changes, we’ll make this with a migration:

$ mix ecto.gen.migration create_streaks_view

This will create a file in priv/repo/migrations. Unfortunately, there aren’t yet any great tools like Thoughtbot’s scenic library for Rails that make working with database views easier in Ecto. (Update: keep an eye on this issue.) So we’re going to have to write some SQL, but this is the only time. We can run raw SQL in migrations by using the execute function.

Because different SQL is required for creating and destroying database views, we’ll need separate up and down functions defined, instead of the default change. The migration will look something like this (see the entire file on Github):

defmodule Habits.Repo.Migrations.AddStreaksView do
  use Ecto.Migration

  def up do
    execute """
    CREATE VIEW streaks AS
      --- the SQL from above, omitted for brevity
    ;
    """
  end

  def down do
    execute "DROP VIEW streaks;"
  end
end

Finally, running mix ecto.migrate will create our streaks view in the database. With that, we can query the streaks view as if it were any other table:

SELECT habits.name, MAX(streaks.length)
FROM habits
INNER JOIN streaks ON streaks.habit_id = habits.id
GROUP BY habits.id;

          name           | max
-------------------------+-----
 Write in Journal        |   5
 Evening Stretches       |  45
 Floss                   | 607
 Morning Stretches       |  27
 Exercise                |  14

(Elixir: great for oral hygiene!) Of course, I did say above that we wouldn’t need to write any more raw SQL. Since database views behave a lot like tables, let’s see what Ecto schemas can do for us.

View-Backed Ecto Schemas

There’s actually nothing you need to do differently! By default Ecto’s schema macro can use the name of a database view just like a table, and allow the full range of type declarations and associations:

defmodule Habits.Streak do
  use Habits.Web, :model

  @primary_key false
  schema "streaks" do
    belongs_to :habit, Habit
    field :start, :date
    field :end, :date
    field :length, :integer
  end
end

I’ve set @primary_key false because my streaks view doesn’t have a primary key (such as id, which Ecto will expect by default). However, this line won’t be necessary for views with an id column.

As with any schema module, this new Habits.Streak module is a great place to encapsulate logic related to streak data, such as finding the longest streak:

defmodule Habits.Streak do
  # schema, etc.

  def longest(queryable \\ __MODULE__) do
    queryable
    |> select([s], max(s.length))
    |> Repo.one
  end
end

to be used like so:

iex> Streak.longest
607

or, after setting up a has_many :streaks, Streak association in the Habit schema:

iex> Repo.get(Habit, 1)
...> |> Ecto.assoc(:streaks)
...> |> Streak.longest
5

For context, I put the entire set of changes into a pull request, which you can see here: github.com/stevegrossi/habits/pull/4

Further Reading

Database views are effectively subqueries: they’re run each time you call them. If necessary, you can have the database cache the results of a view using something called a “materialized view”. Learn more about using those with Ecto in this post by Sylvain Kieffer.