Wednesday, September 24, 2008

Improving performance with :select in Rails

For newcomers, Rails is amazing. With constituent modules such as Active Record, Action Pack and Action Mailer working behind the scenes, Rails gives developers a lot of abstraction and makes things much simpler. That simplicity comes at a price, though: some of the defaults can degrade your application's performance if you do not take care of them.

Let's have a look at Active Record. It is Active Record that makes people go crazy about Rails. It shoulders the responsibility for database operations and offers a variety of methods for working with your data. But there are some pitfalls to consider:

1. The default 'find' method fetches all columns from a table row:
Active Record works at the row level, not at the column level. Consider a table "employees" with emp_id, emp_name, emp_salary, emp_address and so on, up to 50 columns. When you want to list all the employees, you may tend to write

@employees = Employee.find(:all)

This statement fetches all fifty columns of every single employee and converts each row into an Employee object. What if you need only a few columns (say emp_id, emp_name, emp_salary and emp_address) rather than all of them? You can achieve this with the :select option of the find method, which takes the column list as an SQL fragment in a string:

@employees = Employee.find(:all, :select => 'emp_id, emp_name, emp_salary, emp_address')

This selects only the specified columns from the table and converts the rows into Employee objects. Accessing an attribute that was not selected on the resulting objects may raise an error, but you save the overhead of selecting every column and turning the rows into heavy objects.
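To see why touching an unselected column fails, here is a minimal plain-Ruby sketch (no Rails required) of an object that only knows about the columns it was loaded with. PartialRecord and MissingColumnError are hypothetical names invented for this illustration; they are not Active Record classes, and the exact error you get from Rails depends on your version.

```ruby
# Hypothetical stand-in for a model instance built from a partial row.
# This is NOT Active Record code; it only imitates the behaviour described above.
class MissingColumnError < StandardError; end

class PartialRecord
  def initialize(row)
    # row is a Hash of column name => value, holding only the selected columns
    @attributes = row
  end

  # Answer attribute readers for loaded columns; fail loudly for anything else.
  def method_missing(name, *args)
    key = name.to_s
    return @attributes[key] if args.empty? && @attributes.key?(key)
    raise MissingColumnError, "missing attribute: #{key}"
  end

  def respond_to_missing?(name, include_private = false)
    @attributes.key?(name.to_s) || super
  end
end

# Pretend the query selected only emp_id and emp_name.
employee = PartialRecord.new('emp_id' => 7, 'emp_name' => 'Asha')
puts employee.emp_name      # a selected column reads as usual

begin
  employee.emp_address      # this column was never fetched
rescue MissingColumnError => e
  puts "Error: #{e.message}"
end
```

The practical consequence is that any view or helper that later reads a column you left out will break, so it pays to keep the column list next to the code that uses the result.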

Another case is an articles table. Though it may seem to have fewer columns, it usually has a column for the article body, which may hold text as well as images (such columns are usually set to a BLOB type). Here too, to list all the articles, you can write


@articles = Article.find(:all, :select => 'article_id, article_title, author_name, published_date')

and leave out the body column if you do not need it. You are free to use all the other find options along with :select.
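As a rough illustration of how :select sits alongside the other options, the plain-Ruby sketch below assembles an SQL string from a hash of options. build_find_sql is a hypothetical helper written for this post, loosely imitating what a finder does; it is not Rails' own code and skips quoting and sanitizing entirely.

```ruby
# Hypothetical helper: turn find-style options into an SQL string.
# Simplified on purpose; real finders also quote names and sanitize conditions.
def build_find_sql(table, options = {})
  # Without :select every column is requested.
  parts = ["SELECT #{options[:select] || '*'} FROM #{table}"]
  parts << "WHERE #{options[:conditions]}" if options[:conditions]
  parts << "ORDER BY #{options[:order]}"   if options[:order]
  parts << "LIMIT #{options[:limit]}"      if options[:limit]
  parts.join(' ')
end

puts build_find_sql('articles')
puts build_find_sql('articles',
                    :select => 'article_id, article_title, author_name, published_date',
                    :order  => 'published_date DESC',
                    :limit  => 10)
```

Only the column list changes between the two statements; filtering, ordering and limiting are unaffected by :select.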

2. Eager-loaded associations that contain heavy data:

class Author < ActiveRecord::Base
  has_many :articles
end

class Article < ActiveRecord::Base
  belongs_to :author
  has_many :comments
end

class Comment < ActiveRecord::Base
  belongs_to :article
end

Assuming that you know how to eager-load associated models with the :include option, let's look at how we can fine-tune the eager-loaded models by using :select together with :include.

@author = Author.find(id, :include => [:articles])  # id holds the author's primary key

This fetches an author's record along with all the article records that belong to that author. But how do we avoid fetching the heavy 'body' column from the articles table? Can we use :select to fetch only the desired columns of the associated model loaded through :include? Not out of the box: eager loading generates its own SELECT statement, so using :select together with :include is not supported. It can be achieved with a bit of extra code:

Download the patch from http://dev.rubyonrails.org/attachment/ticket/7147/init.5.rb submitted by mrj to the Rails Trac.
Place this file in your lib directory and require it in environment.rb. For example, if you saved the file as 'include_with_select.rb' in lib, add this line to environment.rb:
require 'include_with_select'

With this setup, you can select just the desired columns of the :included associations like this:
@author = Author.find(id, :include => [:articles[:article_id, :article_title, :author_name, :published_date]])

If you also want to eager-load a set of comment attributes for every article, you can write:
@author = Author.find(id, :include => [{:articles[:article_id, :article_title, :author_name, :published_date] => :comments[:comment_text, :comment_by]}])

This way we can achieve better performance using :select, with or without :include.
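A footnote on why plain :select cannot be combined with :include: the plain-Ruby sketch below imitates the kind of column list a join-based eager load writes for itself. It assumes the eager loader joins the tables and aliases every column of every table (the t0_r0 style of alias is recalled from generated queries and may differ between Rails versions). The TABLES hash and its column names are hypothetical, and eager_select_list is a made-up helper, not a Rails method.

```ruby
# Hypothetical column lists; a real schema would have more columns.
TABLES = {
  'authors'  => %w[id name],
  'articles' => %w[id author_id title body]
}

# Build an aliased SELECT list of the form "table.column AS t<i>_r<j>",
# where i is the table's position and j is the column's position.
def eager_select_list(tables)
  tables.each_with_index.flat_map do |(table, columns), i|
    columns.each_with_index.map { |column, j| "#{table}.#{column} AS t#{i}_r#{j}" }
  end.join(', ')
end

puts "SELECT #{eager_select_list(TABLES)} FROM authors " \
     "LEFT OUTER JOIN articles ON articles.author_id = authors.id"
```

Because the loader relies on these aliases to split each joined row back into an author and an article, every column ends up in the list, including the heavy body column, and a hand-written column list has nowhere to go.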
