Developer: Frameworks
 Oracle Database
 Ruby on Rails
 Sample Code
xe, opensource, frameworks, All


Tips for Optimizing Rails on Oracle

by Luca Mearelli


Learn tips and techniques to build more efficient Rails applications on Oracle Database.

Published May 2007

Using the Ruby on Rails framework it is relatively easy to develop complex systems that are backed by an Oracle database, but this ease doesn't exempt the developer from the need to optimize the data model and the code that manipulates it.

This article is for developers working with Ruby on Rails applications on an Oracle database, with special attention to those having a live application that needs to be optimized. You will learn which options tweak the parameters that handle the connection to the database for the Oracle adapter to boost the application performance, as well as get a useful plugin to track down the query execution plans for live applications. You'll build a simple set of models and look at various typical queries on them, exploring how you can improve their efficiency.


The prerequisite to follow this article is having Ruby on Rails installed and configured properly to connect with an Oracle database. "The Ruby on Rails with Oracle FAQ" by Casimir Saternos explains the process in detail. (It's quite easy.)

All the examples for this article have been written and tested on a Windows XP machine running Instant Rails 1.5 preview 1 connecting to Oracle Database 10g Express Edition (XE) installed on Debian Linux.

For the remainder of this article I will make examples based on a Rails "version" of the HR schema that has been modified to directly comply with the common Rail conventions (as opposed to using the standard HR schema from a Rails application, explained in Saternos' "HR Schema On Rails"). This choice has been dictated by the wish to show the behavior of a standard Rails application on an Oracle database without being distracted by the specifics of handling a separate schema to map the HR schema into the Rails conventions. (Which is indeed a very clever way to handle legacy schemas!)

In the sample code file you will find the complete migrations for the schema and a (yml format) dump of the data which can be used to load them using the Active Record fixtures plugin, which is also included. (The db/db_loading.rb file does the data import and can be executed using script/runner).

The six tables are populated by the very same data that the regular HR schema holds, while they have some columns modified. (E.g., all of them now use the common Rails convention of having one id column as their primary key.)

The models also define some relations among them; you can see the full code for the models in the sample code file:

class Country < ActiveRecord::Base
    belongs_to :region                                                                       
    has_many  :locations

class Department < ActiveRecord::Base
    belongs_to :manager, :class_name => "Employee", :foreign_key => "manager_id"
    belongs_to :location                                                                   
    has_many :employees                                                                     

class Employee < ActiveRecord::Base
    belongs_to :department                                                                   
    belongs_to :manager, :class_name => "Employee", :foreign_key => "manager_id"
    belongs_to :job                                                                          
    belongs_to :department                                                        
    has_many :managed_employees, :class_name => "Employee", :foreign_key => "manager_id"
class Job < ActiveRecord::Base
  has_many :employees                                                                            

class Location < ActiveRecord::Base
  belongs_to :country                                                                     
  has_many :department  

class Region < ActiveRecord::Base
         has_many :country

All our tests will be done simply by connection with the Rails console and interactively querying the database through the model methods as we would do from within our controllers:

C:\Progetti\ArticoliROR\Oracle\project> ruby script\console
Loading development environment.

Part 1: Of configuration, Cursors, and Rows

Basic Configuration

So, you have already configured all the required prerequisites for running Rails on an Oracle database, and bootstrapping your Rails application is just a matter of opening a prompt window (terminal, xterm, or whatever) and typing:

rails myapp       

This will reward you with a skeleton application that is ready to hold your code (models, views, controller and tests). You will need to customize the config/database.yml file telling the framework basic information for each one of the databases that Rails uses.

Each entry is identified with a key that tells the system if the specific information is to be used while developing the application, while running the tests or when running the application in production mode. You could even define other database connections beyond the standard three and use them from your Rails application, but this subject goes beyond the scope of this article.

Having separate database settings means that you may modify the values in the development or testing database and conduct your tests and measures without touching the production database.

Here is an example of the basic configuration:

  adapter: oracle
  host: DEBIAN
  username: rails
  password: rails

The adapter parameter specify that we are connecting with an Oracle database and selects the OracleAdapter (see: activerecord/lib/active_record/connection_adapters/oracle_adapter.rb ). Older Rails releases used the 'oci' value to identify the oracle adapter, and this value can still be used.

With the host parameter you specify the database to which you want to connect. If you have configured TNS entry for your database:

    (ADDRESS = (PROTOCOL = TCP)(HOST = = 1521))

you may use its name for the host parameter; otherwise, you can use the Easy Connect naming convention to address the Oracle instance (host: //

The username and password are the credentials you want to use for your application. All the objects (tables, sequences) needed by the application should be accessible from this user and anything built thorough the database migrations will be built into this user's schema (at least this is the default behavior).

Models and SQL Queries

Now we will look at some of the queries that are generated by ActiveRecord, and this is made simple by the fact that while in development mode Rails writes all the queries it is performing to the log file. In this manner you can familiarize yourself with the behavior of the framework and gain greater confidence in what the tool will do.

I find that having a deeper understanding of how the framework code is going to translate your high level requests into actual SQL is a required step when working with powerful frameworks that make complex interactions with the data. You will gain a better understanding of the Rails idioms that should be used and how to write more efficient code.

One of the most common ways to get the model data is looking up an object by its id:

emp = Employee.find(202)

will get you the employee record with id = 202. Looking at the development.log we will see the queries that are getting executed:

Employee Columns (0.120000)   
 select column_name as name, data_type as sql_type, data_default, nullable,
  decode(data_type, 'NUMBER', data_precision,
  'FLOAT', data_precision,
  'VARCHAR2', data_length,
 null) as limit,
 decode(data_type, 'NUMBER', data_scale, null) as scale
 from all_tab_columns
 where owner = 'RAILS'
 and table_name = 'EMPLOYEES'
 order by column_id

Employee Load (0.070000)  SELECT * FROM employees WHERE ( = 202) 

The first query is executed once the first time a model class is loaded to pull the columns from the employee table allowing the system to dynamically generate the required methods for the model based on the database structure. (This is one of those features that make Rails development really quick!) Be warned that while doing your activity and running the server in development mode you will see this query repeated many times as the models get reloaded on each browser request. This makes the server slower but allows you to get the updates to the database structures without restarting the server.

The second query gets the data from the employees table for the employee whose id is 202.

Other examples of the models may be the following (all presented with the actual query executed on the database):

Given an employee, get its department manager.

mgr = emp.department.manager

SELECT * FROM departments WHERE ( = 20) 
SELECT * FROM employees WHERE ( = 201) 
Given an employee, get all the colleagues at the same department.
emps = emp.department.employees

SELECT * FROM employees WHERE (employees.department_id = 20) 


SELECT * FROM employees WHERE (employees.manager_id = 201)

You may use the include option to pull down in a single query from the server also objects which have some relation with our entity:

emp = Employee.find(202, :include=>[:department,:manager])

SELECT AS t0_r0, employees.commission AS t0_r1, employees.job_id AS t0_r2, 
employees.manager_id AS t0_r3,employees.salary AS t0_r4, employees.hire_date AS t0_r5, 
employees.phone_number AS t0_r6, employees.department_id AS t0_r7, 
employees.first_name AS t0_r8, employees.last_name AS t0_r9, AS t0_r10, AS t1_r0, AS t1_r1, departments.manager_id AS t1_r2, 
departments.location_id AS t1_r3, AS t2_r0, 
managers_employees.commission AS t2_r1, managers_employees.job_id AS t2_r2, 
managers_employees.manager_id AS t2_r3, managers_employees.salary AS t2_r4, 
managers_employees.hire_date AS t2_r5, managers_employees.phone_number AS t2_r6, 
managers_employees.department_id AS t2_r7, managers_employees.first_name AS t2_r8, 
managers_employees.last_name AS t2_r9, AS t2_r10 FROM employees LEFT OUTER JOIN departments 
  ON = employees.department_id LEFT OUTER JOIN employees managers_employees 
  ON = employees.manager_id WHERE ( = 202) 

This is really handy and efficient when working with objects behaving as containers for sets of other items as it avoids going to the database with a query for each and every row you want to manipulate.

Here's another example:

job = Job.find(id,:include=>:employees)

SELECT AS t0_r0, jobs.job_title AS t0_r1, jobs.min_salary AS t0_r2, 
jobs.max_salary AS t0_r3, AS t1_r0, employees.commission AS t1_r1, 
employees.job_id AS t1_r2, employees.manager_id AS t1_r3, employees.salary AS t1_r4, 
employees.hire_date AS t1_r5, employees.phone_number AS t1_r6, 
employees.department_id AS t1_r7, employees.first_name AS t1_r8, 
employees.last_name AS t1_r9, AS t1_r10 FROM jobs LEFT OUTER JOIN employees 
  ON employees.job_id = WHERE ( = 7) 

One of the points of this article is to show you the new configuration parameters introduced with the release 1.2 of Rails that may help in tuning the database-related performance of your application, but before going deeper into it let me explain how currently Rails builds the actual SQL used.

Bind Variables and Cursor Sharing

ActiveRecord uses its (dynamically built) knowledge of the models and database to create queries on the fly. It even builds many methods of the models themselves in this way allowing the developer to write things like:


but it does so by building the SQL query as a string and doing parameter interpolation before sending the query to the adapter for execution (without using bind variables). SQL injection and security problems are attacked at the adapter level by quoting the values on SQL interpolation but the performance hit remains, and it also impacts the scalability of any Rails application running on Oracle. To understand how this impacts performance and scalability, let me explain what happens when a query is executed.

When a query is first executed Oracle performs a hard parse step which verifies that it is syntactically correct and that all the objects (tables, columns, ...) being accessed really exist. If the checks pass, a new entry is created in the library cache with the parse results that may be reused for subsequent executions. The key to the library cache is the SQL text for the query itself, therefore two queries that have any literal difference are considered different for the purposes of parsing.

When a new query is executed, a hash lookup into the library cache is done to see if the query had been already parsed. If the query is found then the cached parse result is used and the hard parse step is avoided, moreover no new entries are added to the library cache. A way to minimize hard parses is to always use bind parameters.

Within the context of a pre-1.2 Rails application, without any specific database tuning this means that each time a different query is executed (even if it differs just for the parameter values), the statement is hard parsed by the database engine and a new query entry is inserted in the library cache.

You can verify what happens by looking at the v$sql view where the sql_text column holds the actual SQL used for the queries: each row in v$sql corresponds to an entry in the library cache. We may use the following query to get all the queries generated by our application (limiting ourselves to the queries that hit one of the 6 tables):

select sql_text from v$sql 
where (    lower(sql_text) like '%employee%'
        or lower(sql_text) like '%countries%'
        or lower(sql_text) like '%departments%'
        or lower(sql_text) like '%jobs%'
        or lower(sql_text) like '%locations%'
        or lower(sql_text) like '%regions%'
and   not lower(sql_text) like '%v$sql%'
order by sql_text
If we do the following:
(1..200).each do |id|
        Employee.find(id) rescue nil
which tries to get each Employee with id between 1 and 200. You'd end up with the following in v$sql:
SELECT * FROM employees WHERE ( = 1)
SELECT * FROM employees WHERE ( = 10)
SELECT * FROM employees WHERE ( = 100)
SELECT * FROM employees WHERE ( = 101)

This is not an advisable behavior from an application expected to scale, especially for the common patterns of use where you may expect many queries using the primary key (e.g. accessing to the user table for login purposes or for configuration data).

Work is under way to provide Rails with proper handling of bind variables, or at least to let ActiveRecord adapters themselves do the binding of parameters within the query. In the meantime the best option is to use the ability to handle query rewriting built into the database.

A specific configuration offered by Oracle comes to the rescue here. The CURSOR_SHARING database parameter changes the way the database behaves doing hard and soft parses in presence of queries that are not using bind parameters. The parameter can be set to exact, similar, or force (and the latter can be set databasewide or on a specific session).

  • exact is the default value where the database behaves as outlined above.
  • force tells the database to rewrite all the queries substituting the literals in the SQL text with bind parameters (e.g. :"SYS_B_0") resulting in one parsed entry for each query.
  • similar rewrites the query but also looks at the effect of such rewrite and generates different entries in library cache for queries that would end up using different execution plans—that is, if substituting the literal with a bind parameter would end up in a different plan being executed then the replacement is not done.

Before Rails 1.2, to change the cursor sharing setting you'd need to change it for the entire database instance or patch the ActiveRecord adapter. From the last major release an adapter configuration parameter has been added to do just that.

The cursor_sharing setting can be used in database.yml to choose the preferred value without messing with the global database parameters (thus being better citizens in an heterogeneous environment where other applications may require different settings).

  adapter: oracle
  host: DEBIAN
  username: rails
  password: rails
  cursor_sharing: similar

Its value is used straight in an alter session statement upon establishing a connection (in OracleConnectionFactory.new_connection):

conn.exec "alter session set cursor_sharing = #{cursor_sharing}" rescue nil

Moreover the default value has been chosen to be similar, which means that without needing to add cursor_sharing to the parameters you get a reasonable behavior: the queries are changed according to the plan that would be used depending on the values.

If you redo the query above you'll see this in v$sql:

SELECT * FROM employees WHERE ( = :"SYS_B_0")
just one library cache entry for any Employee.find(id) call.

To see the impact of similar vs. force , update the table to have some skewed data:

Employee.find(:all).each do |emp| 
        emp.salary = 3000 unless == 100

You updated all the records to have a salary of 3,000 except for the one whose id equals 100. Now you can add an index to the employee table from within the Rails project by generating a migration:

C:\Progetti\ArticoliROR\Oracle\project>ruby script/generate migration AddIndexes
      exists  db/migrate
      create  db/migrate/007_add_indexes.rb
In the generated file you should have
class AddIndexes < ActiveRecord::Migration
  def self.up
    add_index :employees, :salary, :name=>:idx_emp_salary 

  def self.down
    remove_index :employees, :name=>:idx_emp_salary 
then we do a rake db:migrate from the command line. Now if we do:
Employee.find(:all, :conditions=>["salary = ?", 24000])
Employee.find(:all, :conditions=>["salary = ?", 3000])

You'll find two parsed entries in v$sql:

SELECT * FROM employees WHERE (salary = :"SYS_B_0")
SELECT * FROM employees WHERE (salary = :"SYS_B_0")

This is because the two queries generated would use two different plans depending on the salary value. (In the first case the index would be used while in the second case a full table scan would result being the best plan.) Using force cursor sharing would tell the database to use to the same parsed entry for both queries which is not what we want in this case, even if it meant having only one cache entry.

Having a configuration available for this from Rails, let's tune its value on a per-application case, depending also on which kind of data you have and on which patterns of use you are seeing for your application.

A second parameter added to Rails 1.2, which may help in tuning the application, is the prefetch_rows configuration. This allows setting the OCI_ATTR_PREFETCH_ROWS connection parameter, which specifies the amount of rows to be pre-fetched from the database on each round trip. It can give a great performance boost in all cases where you want to fetch many rows.

Typical patterns are iterations over a set of entities like in:

Employee.find(:all, :conditions=>"salary < 5000")

In these cases rows pre-fetching means fewer hits to the database, while the cost is more memory used by the adapter—which is not a problem anyway if you want to process all the rows returned by the query.

The default value is set to 100 which has been identified by the Oracle adapter contributors to be ideal in a range of test cases, but more than for cursor_sharing the prefetch value needs to be tweaked by benchmarking with the real data-sets and queries used in the application. For custom tuning this parameter is set in database.yml as usual:

  adapter: oracle
  host: DEBIAN
  username: rails
  password: rails
  prefetch_rows: 100

Part 2: Execution plans and Indexes

What Rails Cannot Do

While Rails goes great lengths toward reducing the burden of development through a clever use of conventions and design pattern implementation, there are situations where it cannot have the required knowledge to make the decisions for the developer. Thus, the Rails developer should not forget that he or she is building a database-backed application, nor which database he or she is developing for.

While most of the coding is doable in a fairly database independent way a good understanding of how a database works is required if you want to get the best performance out of your applications.

Knowing how your database works is useful while developing the data model and while building the queries, but it is much more important when deciding which indexes you need on your tables. This is something that the framework cannot do and does not even attempt to do, as it would depend on:

  • the specific database that is used (as it might have or not have some capability when it comes to creating and using indexes)
  • the data that is present in the tables (as they influence the query execution plan and which indexes are used)
  • the specific application (as the actual queries used may access the data in unpredictable ways based on usage patterns)
Moreover these factors influence each other and any optimization will depend on all of them collectively. In his blog post "Indexing for DB Performance", Jamis Buck gives some tips on the kind of indexes that might be needed:
  • foreign keys (has_one, belongs_to, has_many, ... relations)
  • non-index fields that are frequently queried upon (e.g. login names)
  • columns where you sort of do calculations frequently
  • sets of columns that are used as conditions together

Each index adds some overhead as it needs to be updated on data insert and update, so you should not add indexes for every column (or combination of them). Rather you should make informed choices starting with some reasonable set of indexes, modifying them as the application evolves over time, taking into account the actual data and usage patterns.

Explain Plan. In an Oracle context, this means using the explain plan command to understand how queries are executed. It gives the developer a way to optimize the query itself and to eventually add indexes.

You should always try to optimize the query first and only afterward go on to add the indexes.

Rails makes it rather easy to use custom queries instead of a regular one with the handy find_by_sql method of ActiveRecord. Note however that in the find_by_sql call you could simply pass the text of the SQL query, but you should really confirm that you use the parameter substitution form, passing an array with the query and the list of parameters. This way you can use the parameter quoting that is in the database adapter and you make sure that you will use real bind variables in future when ActiveRecord supports them without the need to modify your code.

Query_Analyzer Plugin. While looking for bottlenecks in your application it is important that you get to know which queries are executed. The Rails log files can be really useful here, as the application while running in development mode logs all the actual queries. They might be collected and examined one by one to find the queries and generate the execution plans. This can become tedious work. The query_analyzer plugin helps relieve the burden by dumping the explain plan output in the log file. You can get an Oracle compatible release here; it was originally developed by Bob Silva for MySQL. To install it just unzip the archive contents in the vendor/plugins directory on your Rails project.

To allow the plugin to work in Oracle the database user will need to be given enough privileges on the data dictionary tables that are queried on loading the model. The following statement will do it:


The plugin monkey patches the database adapter to add the explain plan for each query that is executed by the adapter while running at a logging level below INFO. The plan is formatted and printed at the debug level to the default logger. This default behavior makes sure that the plan is not executed while running on the production environment.

Two parameters are available to modify the default behavior; both can be modified within the environment setup files: The plan_table_name parameter can be used to specify the name of the plan table and its default is the standard PLAN_TABLE; the plan_details parameter can be used to specify which details should be printed by the explain plan statement. Possible values are:

  • BASIC: displays minimum information
  • TYPICAL: displays most relevant information (default value)
  • SERIAL: like TYPICAL but without parallel information
  • ALL: displays all information
If you execute a common query:
it will dump the following data to the development.log file:
Employee Load (0.000000)  explain plan for SELECT * FROM employees WHERE ( = 101) 
Analyzing Employee Load
   Plan hash value: 1171009080                                                                
   | Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
   |   0 | SELECT STATEMENT            |             |     1 |   719 |     2   (0)| 00:00:01 |
   |   1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES   |     1 |   719 |     2   (0)| 00:00:01 |
   |*  2 |   INDEX UNIQUE SCAN         | SYS_C004026 |     1 |       |     1   (0)| 00:00:01 |
   Predicate Information (identified by operation id):                                        
      2 - access("EMPLOYEES"."ID"=101)            


Note that the query uses a unique index related to the primary key for the table; in this case, nothing needs to be done.

Leaving the plugin active (setting a debug level of informations) allows you to collect the plan table outputs for all queries while running the application.

The plan table outputs a lot of information on the queries. You can quickly spot the tables that may need a new index by searching for 'TABLE ACCESS FULL' in the log file.

Employee Load (0.010000)  explain plan for SELECT * FROM employees WHERE (first_name = 'Stephen') 
Analyzing Employee Load
   Plan hash value: 1445457117                                                    
   | Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
   |   0 | SELECT STATEMENT  |           |     1 |   719 |     3   (0)| 00:00:01 |
   |*  1 |  TABLE ACCESS FULL| EMPLOYEES |     1 |   719 |     3   (0)| 00:00:01 |
   Predicate Information (identified by operation id):                            
      1 - filter("FIRST_NAME"='Stephen')                                          
      - dynamic sampling used for this statement  


This is the result of querying all the employees that have first name 'Stephen': Employee.find(:all, :conditions=>["first_name = ?", "Stephen"]).

Note that this query needs to scan the full EMPLOYEES table to get just one row when using a filter on the first_name column. If this kind of query is seen many times through the application, you should consider adding an index. The predicate information for the more costly operations gives you a good hint at what indexes may benefit the query execution—in this case, obviously an index on the first_name column.

Rails Migrations. You can also use Rails Migrations to manage indexes. (See the API docs for more information on what Migrations are and how to use them to manage a database schema.) To add an index use:
add_index table_name, column_names, options = {}

this adds an index on "table_name" for "column_names" where the latter can be a single column or a list of columns:

add_index :departments, :manager_id 
add_index :locations, [:city, :postal_code]

Within the options parameter you can specify if the index is to be created as unique:

add_index :locations, [:city, :postal_code, :street_address], :unique => true

or with a given name. (This is useful on Oracle since the default is to use a combination of table name and first column name, which can be too long.)

add_index :locations, [:city, :postal_code], :name => :idx_city_zip

To remove an index just use the remove_index method:

remove_index table_name, options = {} 
we can pass as options the column name(s) or the index name:
  remove_index :departments, :manager_id
  remove_index :departments, :column => :manager_id
  remove_index :locations, :column => [:city, :postal_code, :street_address]
  remove_index :locations, :name => :idx_city_zip
If you need to use more complex or database-specific SQL, use the execute command from the up or down methods to which you pass the SQL string to execute:
execute "CREATE INDEX idx_emp_first_name ON employees (first_name)"

In the example above:

C:\Progetti\ArticoliROR\Oracle\project>ruby script/generate migration AddFirstNameIndex
      exists  db/migrate
      create  db/migrate/008_add_first_name_index.rb
This is the migration file:
class AddFirstNameIndex < ActiveRecord::Migration
  def self.up
    add_index :employees, :first_name, :name=>:idx_emp_first_name 

  def self.down
    remove_index :employees, :name=>:idx_emp_first_names 

Run the migration using rake db:migrate.

C:\Progetti\ArticoliROR\Oracle\project>rake db:migrate
(in C:/Progetti/ArticoliROR/Oracle/project)
== AddFirstNameIndex: migrating ===============================================
-- add_index(:employees, :first_name, {:name=>:idx_emp_first_name})
   -> 0.0100s
== AddFirstNameIndex: migrated (0.0100s) ======================================

Executing the query again shows that the index is now used:

Employee Load (0.010000)  explain plan for SELECT * FROM employees WHERE (first_name = 'Stephen') 
Analyzing Employee Load
   Plan hash value: 2736374945                                                                       
   | Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
   |   0 | SELECT STATEMENT            |                    |     1 |   719 |     2   (0)| 00:00:01 |
   |   1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES          |     1 |   719 |     2   (0)| 00:00:01 |
   |*  2 |   INDEX RANGE SCAN          | IDX_EMP_FIRST_NAME |     1 |       |     1   (0)| 00:00:01 |
   Predicate Information (identified by operation id):                                               
      2 - access("FIRST_NAME"='Stephen')                                                             
      - dynamic sampling used for this statement         


As you have seen it is really easy to use the plugin to poke at the execution plans used by a running application and you may even leave it installed on the production server and disable/enable it when needed by changing the logging level.


In this article you dug a bit into how connections to an Oracle database are configured in a Rails application, and you have seen how the framework has been updated in the 1.2 release to have better performances thorough the use of the cursor_sharing and prefetch_rows parameters (while waiting for a real bind variable implementation).

You have also examined the Rails migration commands that are relevant to tuning the database (specially creating/removing indexes).

Finally, as you've learned, a good understanding of both how Rails builds the SQL from the code and how the database executes them is needed to get the best performance. The provided plugins should help here.
Luca Mearelli [] is a specialist in Oracle and Web technologies based in Città di Castello, Italy.
Send us your comments