Rails app
I created a Rails app with the following models:
class User < ApplicationRecord
  has_many :comments
end

class Article < ApplicationRecord
  has_many :comments
end

class Comment < ApplicationRecord
  belongs_to :article
  belongs_to :user
end
Seed
I then seeded the database with some records:
art1 = Article.create(title: 'Ruby')
art2 = Article.create(title: 'Rails')
user1 = User.create(name: 'Ola')
user2 = User.create(name: 'Hello')
user3 = User.create(name: 'Wow')
comment1 = Comment.create(article: art1, user: user1, rating: 1)
comment2 = Comment.create(article: art1, user: user2, rating: 2)
comment3 = Comment.create(article: art2, user: user2, rating: 3)
comment4 = Comment.create(article: art2, user: user3, rating: 5)
comment5 = Comment.create(article: art2, user: user1, rating: 2)
SQL views
SELECT * FROM users;
id | name
----+-------
1 | Ola
2 | Hello
3 | Wow
SELECT * FROM articles;
id | title
----+-------
1 | Ruby
2 | Rails
Implementation of queries in the User model
What I wanted to do is:
- create a method to order users by comment rating: order_comment_rating
- create a method to select users who have comments: with_comments
- chain the two methods above: with_comments_order_comment_rating
The main problem is writing the third method, because of the distinct used in the second method.
Let's dig into that a little later.
Ordering records by comment rating
This is the method that orders users by comment rating:
def self.order_comment_rating
  joins(:comments).merge(Comment.order(:rating))
end
Let's split the method to understand what it does.
Join comments
First we need to join the users and comments tables:
>> User.joins(:comments)
>> SELECT "users".* FROM "users" INNER JOIN "comments" ON "comments"."user_id" = "users"."id" LIMIT
This is an interesting joins, as it returns 5 records although we only have 3 users. What happens here?
Conceptually, the join produces these rows:
id | name | comment_id
----+-------+------------
1 | Ola | 1
2 | Hello | 2
2 | Hello | 3
3 | Wow | 4
1 | Ola | 5
It returns all the users, but some of them appear twice. We have to imagine a virtual table on the right, holding the comment each row is joined to.
Users Ola and Hello appear twice because each of them wrote two comments.
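To make the duplicated rows concrete, here is a minimal plain-Ruby sketch of what the inner join does. It does not use Active Record: seed_users, seed_comments and inner_join are hypothetical helpers introduced for illustration, and the ids assume a fresh database that numbers records from 1.

```ruby
# Plain-Ruby stand-ins for the seeded tables.
# Assumption: ids are assigned 1, 2, 3, ... in creation order.
def seed_users
  [
    { id: 1, name: "Ola" },
    { id: 2, name: "Hello" },
    { id: 3, name: "Wow" }
  ]
end

def seed_comments
  [
    { id: 1, user_id: 1, rating: 1 },
    { id: 2, user_id: 2, rating: 2 },
    { id: 3, user_id: 2, rating: 3 },
    { id: 4, user_id: 3, rating: 5 },
    { id: 5, user_id: 1, rating: 2 }
  ]
end

# Inner join: one output row per comment whose user_id matches a user.
def inner_join(users, comments)
  comments.flat_map do |comment|
    users
      .select { |user| user[:id] == comment[:user_id] }
      .map { |user| { id: user[:id], name: user[:name], comment_id: comment[:id] } }
  end
end

rows = inner_join(seed_users, seed_comments)
puts rows.size                                 # 5 rows for 3 users
puts rows.map { |row| row[:name] }.join(", ")  # Ola, Hello, Hello, Wow, Ola
```

The row count follows the comments, not the users, which is why a user with two comments shows up twice.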
Order comments by rating
>> Comment.order(:rating)
>> SELECT "comments".* FROM "comments" ORDER BY "comments"."rating" ASC
With this query we get all the comments from the database, ordered by rating, so we have 5 records in output.
Chain both
Then we have no trouble chaining the two together, as each of the 5 joined rows carries exactly one comment rating to sort by. This gives us our order_comment_rating method to list users:
def self.order_comment_rating
  joins(:comments).merge(Comment.order(:rating))
end
List users with comments
I implemented this method to return the users who wrote comments.
def self.with_comments
  joins(:comments).distinct
end
As we saw earlier, joins(:comments) returns each user once per comment they wrote.
I applied distinct so that users are not duplicated, because I only want to know which users have comments.
>> joins(:comments).distinct
>> SELECT DISTINCT "users".* FROM "users" INNER JOIN "comments" ON "comments"."user_id" = "users"."id"
So here the result would be:
id | name
----+-------
1 | Ola
2 | Hello
3 | Wow
As with the table above, we have to imagine the comments table on the right, but each user is now kept only once.
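As a rough illustration of what DISTINCT does to the joined rows, the plain-Ruby sketch below keeps only the users columns and removes duplicates. The joined_user_rows helper is a hypothetical stand-in for the result of joins(:comments); it is not Active Record code.

```ruby
# Hypothetical stand-in for SELECT "users".* after the join:
# one row per comment, carrying only the users columns.
def joined_user_rows
  [
    { id: 1, name: "Ola" },
    { id: 2, name: "Hello" },
    { id: 2, name: "Hello" },
    { id: 3, name: "Wow" },
    { id: 1, name: "Ola" }
  ]
end

# DISTINCT compares whole rows, so identical rows collapse into one.
def distinct(rows)
  rows.uniq
end

users = distinct(joined_user_rows)
puts users.size                                   # 3
puts users.map { |user| user[:name] }.join(", ")  # Ola, Hello, Wow
```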
List users with_comments and order_comment_rating
The first reflex would be to chain the two methods above, like so:
>> User.with_comments.order_comment_rating
Let's split it up to see what happens. From the method definitions in the User model, the query above is equivalent to:
-- User.with_comments.order_comment_rating
>> User.joins(:comments).distinct.merge(Comment.order(:rating))
This fails. User.joins(:comments) returns five records:
- 2 records for the User id = 1, as he wrote two comments
- 2 records for the User id = 2, as he wrote two comments
- 1 record for the User id = 3, as he wrote one comment
When we apply .distinct, the query returns three records:
- 1 record for the User id = 1
- 1 record for the User id = 2
- 1 record for the User id = 3
Then we try to merge Comment.order(:rating), which sorts by a column of the comments table, where there are 5 records (one for each comment).
So each of the three distinct user rows has to be sorted by a rating, but users 1 and 2 each have two ratings to choose from, and comments.rating is not part of the selected columns.
Traceback (most recent call last):
ActiveRecord::StatementInvalid (PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list)
LINE 1: ..." ON "comments"."user_id" = "users"."id" ORDER BY "comments"...
^
: SELECT DISTINCT "users".* FROM "users" INNER JOIN "comments" ON "comments"."user_id" = "users"."id" ORDER BY "comments"."rating" ASC LIMIT $1
PostgreSQL doesn't know which comment rating to use to sort each single user row, so it rejects the query.
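The ambiguity behind this error can be sketched in plain Ruby: once each user is reduced to a single row, there can be several candidate ratings to sort that row by. The seed_comments and ratings_by_user helpers below are hypothetical and only mirror the seed data; they are not what PostgreSQL runs internally.

```ruby
# Seeded comments as plain hashes (user ids assumed to start at 1).
def seed_comments
  [
    { user_id: 1, rating: 1 },
    { user_id: 2, rating: 2 },
    { user_id: 2, rating: 3 },
    { user_id: 3, rating: 5 },
    { user_id: 1, rating: 2 }
  ]
end

# Collect every rating attached to each user.
def ratings_by_user(comments)
  comments
    .group_by { |comment| comment[:user_id] }
    .transform_values { |group| group.map { |comment| comment[:rating] } }
end

ratings_by_user(seed_comments).each do |user_id, ratings|
  status = ratings.size > 1 ? "ambiguous sort key" : "single sort key"
  puts "user #{user_id}: ratings #{ratings.join(', ')} (#{status})"
end
```

Users 1 and 2 each have two ratings, so a single row per user has no unique value to order by.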
There is a way to handle this!
Subquery
We can handle this by doing a subquery.
# In Active Record
>> User.from(User.with_comments, :users)
>> SELECT "users".* FROM (SELECT DISTINCT "users".* FROM "users" INNER JOIN "comments" ON "comments"."user_id" = "users"."id") users
So basically, this query takes the with_comments method and wraps it in a subquery, then selects all users from that subquery.
It returns exactly 3 users, as the User.with_comments query does.
But the difference is in the SQL. User.with_comments alone generates:
>> SELECT DISTINCT "users".* FROM "users" INNER JOIN "comments" ON "comments"."user_id" = "users"."id"
With the subquery, the SELECT DISTINCT is not applied at the same moment: it runs inside the subquery, before the outer SELECT.
Now we can apply order_comment_rating to the subquery, like so:
>> User.from(User.with_comments, :users).order_comment_rating
>> SELECT "users".* FROM (SELECT DISTINCT "users".* FROM "users" INNER JOIN "comments" ON "comments"."user_id" = "users"."id") users INNER JOIN "comments" ON "comments"."user_id" = "users"."id" ORDER BY "comments"."rating" ASC
It returns 5 records listing the users ordered by comment rating.
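The same pipeline can be sketched in plain Ruby: deduplicate the users first (the subquery), join the comments again, then sort by rating. The helper names are hypothetical, and breaking ties by comment id is an assumption added here to make the output deterministic; the SQL above does not specify an order for equal ratings.

```ruby
# Plain-Ruby stand-ins for the seeded tables (ids assumed to start at 1).
def seed_users
  [{ id: 1, name: "Ola" }, { id: 2, name: "Hello" }, { id: 3, name: "Wow" }]
end

def seed_comments
  [
    { id: 1, user_id: 1, rating: 1 },
    { id: 2, user_id: 2, rating: 2 },
    { id: 3, user_id: 2, rating: 3 },
    { id: 4, user_id: 3, rating: 5 },
    { id: 5, user_id: 1, rating: 2 }
  ]
end

# Step 1 (subquery): users with at least one comment, each kept once.
def users_with_comments(users, comments)
  commenter_ids = comments.map { |comment| comment[:user_id] }.uniq
  users.select { |user| commenter_ids.include?(user[:id]) }
end

# Step 2 (outer query): join the comments again and order by rating.
def order_by_comment_rating(users, comments)
  rows = comments.flat_map do |comment|
    users
      .select { |user| user[:id] == comment[:user_id] }
      .map { |user| { name: user[:name], rating: comment[:rating], comment_id: comment[:id] } }
  end
  # Tie-break on comment_id (assumption) so equal ratings sort predictably.
  rows.sort_by { |row| [row[:rating], row[:comment_id]] }
end

distinct_users = users_with_comments(seed_users, seed_comments)
ordered = order_by_comment_rating(distinct_users, seed_comments)
ordered.each { |row| puts "#{row[:name]} (rating #{row[:rating]})" }
```

Because the deduplication finishes before the second join, every row being sorted carries exactly one rating, which is the property the failing query lacked.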
These queries were run against a PostgreSQL database.
I then changed the database configuration to use the sqlite3 adapter.
The app remained the same: I kept the users and comments relationships and did not change the Active Record queries in the User model.
I opened a rails console and wrote the exact same query as before.
>> User.with_comments.order_comment_rating
Just before, I had an ActiveRecord::StatementInvalid error and had to use a subquery to perform this query.
To my surprise, instead of the error I expected, I got 3 records back and no error.
I looked at the SQL generated:
SELECT DISTINCT "users".* FROM "users" INNER JOIN "comments" ON "comments"."user_id" = "users"."id" ORDER BY "comments"."rating" ASC
SQLite accepts this query, even though the ORDER BY column is not in the SELECT DISTINCT list.
Conclusion and further research
- Be careful with the database configuration: it affects which SQL queries are accepted.
- When switching from one database to another, queries can start to fail.
- What is the difference between sqlite3 and postgresql that explains this behaviour?