How to Count With ActiveRecord
Counting records returned from the database by ActiveRecord is quite a bit more complicated than you’d expect, but that’s the price of Rails’ magic and the invisible efficiency it gives you behind the scenes. ActiveRecord has its own implementation of Ruby’s size
, any?
, and empty?
methods which behave differently than their array counterparts. Understanding how each works will help you write more efficient code when dealing with database records.
In General
- When counting the number of results from the database, use
size
unless you know you’ll be loading all of the results later on, then uselength
. - When checking if there are any results from the database, use
any?
or its oppositeempty?
unless you know you’ll be loading all of the results later on, then usepresent?
or its oppositeblank?
.
To understand why, read on.
Counting the Number of Records
Rails gives us three ways to count the number of records returned from the database.
count
When called on an ActiveRecord::Relation
object, count
will perform a SQL COUNT
query, which is very fast, especially on large result sets:
Post.all.count
# (87.5ms) SELECT COUNT(*) FROM "posts"
#=> 100000
But count
is also a plain Ruby method (defined on Enumerable
) which can be called on arrays:
Post.all.to_a.count
# (1396.5ms) SELECT "posts".* FROM "posts"
#=> 100000
Notice that after converting Post.all
from an ActiveRecord::Relation
to an array, count
no longer performs a COUNT
query. Instead, to_a
loads all 100,000 records from the database and initializes them as an array of Post
objects, and count
calculates the size of that array, which in this case is about 16x slower. The takeaway here is that ActiveRecord::Relation#count
is much faster than initializing an array of objects and calling Enumerable#count
on the array.
length
As mentioned above, length
is a plain Ruby method which functions like count
when called on an array. The difference is that ActiveRecord does not define its own length
method. So if you call length
on an ActiveRecord::Relation
object, it will be converted to an array, and will count the array’s objects. This is equivalent to the slower count
method above:
Post.all.length
# (1372.2ms) SELECT "posts".* FROM "posts"
#=> 100000
(As an aside, when called on an array, length
is slightly more efficient than count
. See this post for more on that.)
size
Like count
, size
works differently depending on whether you call it on an array or an ActiveRecord::Relation
object. When called on an array, size
is identical to length
: it returns the number of objects in the array. But things get interesting when you call size
on an ActiveRecord::Relation
. In this case, it behaves differently depending on whether or not the relation’s records have been loaded from the database yet. To test this, lets create an ActiveRecord::Relation
object @posts
(as we’d typically do in a controller) and ensure it’s not been loaded from the database.
@posts = Post.all
@posts.class
#=> Post::ActiveRecord_Relation
@posts.loaded?
#=> nil
Since @posts
hasn’t been loaded, size
will perform a COUNT
query:
@posts.size
# (82.1ms) SELECT COUNT(*) FROM "posts"
#=> 100000
Now, if we manually load the posts:
@posts.load
# (1368.2ms) SELECT "posts".* FROM "posts"
#=> #<ActiveRecord::Relation [#<Post id: ...
@posts.loaded?
#=> true
@posts
contains an array of Post
objects from the database, so size
will no longer perform a COUNT
query, and instead just get the length of the array:
@posts.size
#=> 100000
However, ActiveRecord::Relation#size
has one more trick up its sleeve: it can avoid the COUNT
query entirely when called on an association with a counter cache. (To learn more about counter caches, the Rails Guide is an excelent resource. For now, I’ll assume you’re familiar with the concept.) Assuming a User
model with a posts_count
column and which has_many :posts, counter_cache: true
:
user = User.find(1)
#=> #<User id: 1 ...
@posts = user.posts
@posts.size
# Look ma, no query! Instead, size returns user.posts_count
#=> 100
size
Is Your Friend
For this reason, you should almost always use size
when counting ActiveRecord relations. It will usually choose the most efficient way to count: a length
on the array if you’ve already loaded the records, a speedy COUNT
query if you haven’t, or simply reading from the available counter cache if you don’t need to. The only case where you might not want to use size
is to avoid the COUNT
query if you know you’re going to load the records, but haven’t yet (and don’t have a counter cache). For example, consider this view code:
<h2><%= @posts.size %> Posts</h2>
<ul>
<%= render @posts %>
</ul>
Assuming @posts
hasn’t been loaded yet, @posts.size
is going to perform a COUNT
query to get the number of posts, and then render @posts
is going to perform a second query to load all of the posts. Here, you could eliminate the COUNT
query by using length
to load all of the posts, since you’re going to load them anyway:
<h2><%= @posts.length %> Posts</h2>
<ul>
<%= render @posts %>
</ul>
So, that’s how to count the number of results. But what if we just want to know if there are results? We could always just do if @posts.size > 0
, but there are more elegant ways…
Checking Whether There Are Any Results
As it does with count
and size
, ActiveRecord overrides some of Ruby’s array methods in order to more efficiently check whether an ActiveRecord::Relation
will return any records.
When called on an array, empty?
returns true
if the array contains no elements:
[].empty?
#=> true
[1, 2, 3].empty?
#=> false
However, when called in an ActiveRecord::Relation
object, empty?
performs an efficient COUNT
query to check if there are any records:
Post.all.empty?
# (80.8ms) SELECT COUNT(*) FROM "posts"
#=> false
But if the results of the ActiveRecord::Relation
have already been loaded into an array of Post
objects, empty?
skips the COUNT
query and falls back to its array behavior, just like size
does:
@posts = Post.all
@posts.load
# (1364.5ms) SELECT "posts".* FROM "posts"
#=> #<ActiveRecord::Relation [#<Post id: ...
@posts.empty?
# No superfluous COUNT query
#=> false
any?
is effectively the opposite of empty?
(see note below) and its ActiveRecord version works the same way, performing a COUNT
query if the records haven’t been loaded, and falling back to its array behavior if they have.
Post.all.any?
# (81.4ms) SELECT COUNT(*) FROM "posts"
#=> false
@posts = Post.all
@posts.load
# (1364.5ms) SELECT "posts".* FROM "posts"
#=> #<ActiveRecord::Relation [#<Post id: ...
@posts.any?
# No superfluous COUNT query
#=> true
(Note: technically any?
, when called on an array, checks if any of its values evaluate to true, whereas empty?
just checks if the array has values. So [false].empty?
is false
because it contains an element, but [false].any?
is also false
because it contains no elements which evaluate to true
. So any?
is not quite the opposite of empty?
, but since ActiveRecord returns arrays containing objects which always evaluate to true
, in such cases any?
and empty?
do function as opposites.)
And like size
, any?
and empty?
can take advantage of an available counter cache in order to efficiently avoid even the COUNT
query when possible. Assuming the earlier User
model with a posts_count
column and which has_many :posts, counter_cache: true
:
user = User.find(1)
#=> #<User id: 1 ...
@posts = user.posts
# No query! empty? checks whether user.posts_count == 0
@posts.empty?
#=> false
# No query! any? checks whether user.posts_count > 0
@posts.any?
#=> true
Because it’s smart like size
, you’ll almost always want to use any?
and empty?
in order to check whether records exists in the database. any?
and empty?
will choose the most efficient way: checking the array of results if they’ve already been loaded, performing a COUNT
query if they haven’t, and checking for a counter cache if the results don’t need to be loaded. The exception to this rule is similar to size
: if you know you’re going to be loading an array of results but haven’t yet (and don’t have a counter cache), you’ll probably want to avoid the superfluous COUNT
query. A common example is when checking if there are any posts before rendering them:
<% if @posts.any? %>
<h2>All Posts</h2>
<ul>
<%= render @posts %>
</ul>
<% end %>
Here, @posts.any?
will perform a COUNT
query, which is pretty efficient, but if you expect there will almost always be posts you may want to avoid the extra query entirely. In that case, you can preemptively convert the posts to an array (@posts.to_a.any?
), but I would recommend using ActiveSupport’s present?
, which will convert @posts
to an array for you:
<% if @posts.present? %>
<h2>All Posts</h2>
<ul>
<%= render @posts %>
</ul>
<% end %>
The reason I recommend present?
over to_a.any?
has to do with the note above on the difference between any?
and empty?
. empty?
simply checks whether an array contains elements, while any?
has to check each individual element in the array to determine whether it evaluates to true. As a result, empty?
is faster than any?
, especially with large arrays. ActiveSupport’s present?
method just calls !blank?
, and blank?
internally calls empty?
, so present?
will always be as fast or faster than any?
.