Monday, October 5, 2009

Scaling Injection of Data via ActiveRecord in Initializers

ActiveRecord's find_or_create is not an atomic operation on the database (at least not by default in Rails 2.3.3). If four servers run a find_or_create for the same object at the same time, anywhere from one to four records can be created. This is a particular problem in initializers that create data, since every server runs them at boot.
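To make the race concrete, here is a minimal sketch in plain Ruby. It does not use ActiveRecord: ToyTable and simulate_interleaved_find_or_create are hypothetical names, and the table is just an in-memory array standing in for a database table with no unique index. The two "servers" are interleaved by hand so the outcome is deterministic.

```ruby
# Toy in-memory stand-in for a DB table WITHOUT a unique index (hypothetical).
class ToyTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Returns the first matching row, or nil if there is none.
  def find_by_color(color)
    @rows.find { |row| row[:color] == color }
  end

  # Appends the row unconditionally; nothing stops duplicates.
  def create(attrs)
    @rows << attrs
    attrs
  end
end

# Replays the worst-case interleaving of two servers doing find-then-create.
def simulate_interleaved_find_or_create
  table = ToyTable.new

  # Step 1: both servers run the find before either has inserted.
  server_a_found = table.find_by_color('green')
  server_b_found = table.find_by_color('green')

  # Step 2: both saw nothing, so both create.
  table.create(:color => 'green') unless server_a_found
  table.create(:color => 'green') unless server_b_found

  table.rows.size
end

puts "rows after both servers ran: #{simulate_interleaved_find_or_create}"
```

Because each find completes before the other server's insert, both servers conclude the row is missing and both insert it.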

It would be interesting if Rails created find_or_create functions in the database itself, but it doesn't. And while you could wrap the calls in client-side transactions, as a general rule you shouldn't.

So instead, an easy way to handle it is to enforce row uniqueness in the database with unique indexes. Here is a sample migration:

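class EnsureUniqueViaIndexes < ActiveRecord::Migration
  def self.up
    # Unique indexes make the database reject a second identical row.
    add_index :apples, [:color], :unique => true, :name => 'unique_color_on_apple'
    add_index :cars, [:type, :year], :unique => true, :name => 'unique_type_and_year_on_car'
  end

  def self.down
    # The indexes were created with custom names, so remove them by name;
    # passing only the columns would look for the default index name instead.
    remove_index :cars, :name => 'unique_type_and_year_on_car'
    remove_index :apples, :name => 'unique_color_on_apple'
  end
end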

Then, in config/initializers/create_apples_and_cars.rb:

unless Apple.find_by_color('green')
  puts "Creating green apple"
  begin
    # Create by the same column we looked up (and indexed): color.
    Apple.create!(:color => 'green')
  rescue ActiveRecord::StatementInvalid => e
    # In Rails 2.3 a unique-index violation surfaces as StatementInvalid.
    puts "Got exception during apple create. If it is a duplicate-row error, please ignore it; it is probably due to multiple servers starting at once: #{e}\n#{e.backtrace.join("\n")}"
  end
end

# Car happens to have a default_scope on year, so we only need to look up by type
unless Car.find_by_type('compact')
  puts "Creating compact car"
  begin
    Car.create!(:type => 'compact')
  rescue ActiveRecord::StatementInvalid => e
    puts "Got exception during car create. If it is a duplicate-row error, please ignore it; it is probably due to multiple servers starting at once: #{e}\n#{e.backtrace.join("\n")}"
  end
end

This way, you may see errors on the console when servers collide, but the data ends up correct with as little overhead as possible.
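The find, create, rescue pattern repeats for every record, so it could be pulled into a small helper. The sketch below is plain Ruby rather than ActiveRecord so that it runs on its own: UniqueToyTable, DuplicateRowError, ensure_row and run_concurrent_initializers are all hypothetical names, and the mutex-guarded array only imitates what a unique index does. Threads stand in for servers booting at the same time.

```ruby
# Hypothetical error standing in for the database's duplicate-key failure.
class DuplicateRowError < StandardError; end

# Toy in-memory table that enforces uniqueness on one column,
# imitating a unique index (hypothetical, not ActiveRecord).
class UniqueToyTable
  attr_reader :rows

  def initialize(unique_column)
    @unique_column = unique_column
    @rows = []
    @lock = Mutex.new
  end

  def find_by(column, value)
    @lock.synchronize { @rows.find { |row| row[column] == value } }
  end

  # Check and insert happen under one lock, so they are atomic here,
  # just as the index check and insert are atomic inside the database.
  def create!(attrs)
    @lock.synchronize do
      if @rows.any? { |row| row[@unique_column] == attrs[@unique_column] }
        raise DuplicateRowError, "duplicate #{@unique_column}: #{attrs[@unique_column]}"
      end
      @rows << attrs
      attrs
    end
  end
end

# Find the row; if it is missing, try to create it; if another writer
# won the race in between, fall back to reading the winner's row.
def ensure_row(table, column, value)
  table.find_by(column, value) || begin
    table.create!(column => value)
  rescue DuplicateRowError
    table.find_by(column, value)
  end
end

# Runs ensure_row from several threads at once and reports
# the final row count plus what each "server" got back.
def run_concurrent_initializers(server_count)
  table = UniqueToyTable.new(:color)
  threads = (1..server_count).map do
    Thread.new { ensure_row(table, :color, 'green') }
  end
  results = threads.map(&:value)
  [table.rows.size, results]
end

row_count, _results = run_concurrent_initializers(4)
puts "rows after 4 concurrent initializers: #{row_count}"
```

In the real initializer the same shape would presumably rescue ActiveRecord::StatementInvalid instead of the toy error, with the unique indexes from the migration doing the enforcement. Re-reading after the rescue means every caller ends up with the surviving row, whichever server inserted it.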
