Few weeks ago I was assigned to develop a feature, generating dynamic comparisons. It was about tracking 295 columns across 80 tables, with all kind of associations and create csv/xls reports.
Firstly I tried to create ActiveRecord joins just like every rails newbie would try! Well, I've failed at polymorphic associations. So after my failure, I wrote raw sql with many left/inner joins.
It was quite long trip but in the end everything was working well. I had like 80 tables joined manually and everything was working just fine. The time came and we had to check the performance - I have to admit, my solution was not so good in cases when only few joins were needed to select specific columns, performance was bad.
Next target was to create dynamic joins, that was quite challenging but I did manage. Had to use aliases a lot, every selected column had an alias and it told me what tables should be joined. So after lets say a day I ended up with well working method that was doing its job, everything was working just fine - the speed was very good but... it was quite complicated method, not cool to read and hard to maintain. And here we are, to the main topic of this post, the SQL views.
I accidentally heard my colleagues talking about that forgotten thing in rails. First thing to do : google it, after some research I knew that I have to check it out. More google and I found scenic - great gem to maintain these views. But I dont like shorcuts, I did it normal way, the SQL way !
Firstly lets create migration.
rails g migration create_comparision_view
class CreateComparisionTable < ActiveRecord::Migration def change execute <<-SQL CREATE VIEW comparision_tables AS SELECT clients.name as client_name clients.id as clients_id order.name as order_name ... FROM clients LEFT JOIN orders ON client.order_id = orders.id ... LEFT JOIN payments ON client.payment_id = payments.id; SQL end end
To comfortable use of that view we can create a model to our created view.
rails g mode comparision_view
class ComparisionTable < ActiveRecord::Base self.table_name = 'comparision_tables' def readonly? true end end
ComparisionTable.find_by(order_name: 'Star Wars Legi')
Ok after all was set up I did some benchmarks regarding what will be best for our project. For calculations whats faster I used benchmark.ips gem. All comparisons are made on my DB, where I have over 6 000 000 records, the result limits on sql query were set on 10000.
First attempt:select 295 columns from sql view vs select 295 columns using dynamic joins method(80).
Benchmark.ips do |setup| setup.time = 20 setup.warmup = 2 setup.report('Dynamic joins') do ActiveRecord::Base.connection.execute(joins_295) end setup.report('SQL views') do ActiveRecord::Base.connection.execute(views_295) end setup.compare! end Comparison: Dynamic joins: 1.1 i/s SQL views: 1.1 i/s - 1.01x slower
Quite similar results, but the sql statement length in my case is: joins 41 013 character vs sql views 14 027 characters so quite shorter!
Second attempt: select 38 columns from sql_view vs select 38 columns using dynamic joins method(12)
Comparison: Dynamic joins: 7.6 i/s SQL views: 0.4 i/s - 17.56x slower
Third attempt: select 1 column from sql_view vs select 1 column using dynamic joins method(2)
Comparison: Dynamic joins: 96.4 i/s SQL views: 0.6 i/s - 155.18x slower
Well the results are obvious, if we do not know what columns we need in each comparisons the dynamic joins are must, otherwise I would definitely say that SQL views are better. Selecting 290 columns from just one table looks great, and we do not have to worry about joins, in my opinion it is big facilitation. We should treat views as an “alias” for whole query, but sadly it does not increase the execution time As for now I will stick with dynamic joins, but it is worth to know that SQL views exists and are very friendly to use.
Of course there are materialized views that are known to be fast, it caches all selected columns. There is a reason I do not use them in my project, the materialized SQL views are not automatically updated, so after every change in these 295 columns I would have to trigger a function to update that view. Sadly that cant happen in that project :)
I had to check it myself that the materialized views are faster, and they are ! Here is the benchmark:
Last attempt: select 295 columns from materialized view vs select 295 columns using dynamic joins method(80)
Comparison: Materialized view: 12.0 i/s Dynamic joins: 6.6 i/s - 1.81x slower