How we sped up our model spec to run 12 times faster

We are using cancancan as an authorization gem for one of our applications. To make sure that our authorization rules are correct, we unit-tested the Ability object. In the beginning, the test was quite fast, but the more rules we added, the longer it took to run the whole model test.
When we analyzed what was slowing down our test, we saw that much of the time was spent persisting our models to the database with factory_girl as part of the test setup. It took a bit more than 60 seconds to run the whole ability spec, which is far too much for a model test.

Let’s look at an excerpt of our ability and its spec:


# ability.rb

def acceptance_modes
  can [:read], AcceptanceMode
  if @user.admin?
    can [:create, :update], AcceptanceMode
    can :destroy, AcceptanceMode do |acceptance_mode|
      acceptance_mode.policies.empty?
    end
  end
end


# ability_spec.rb

describe Ability do

  let!(:admin_user) { create(:admin_user) }
  subject!(:ability) { Ability.new(admin_user) }

  context 'acceptance mode' do

    let!(:acceptance_mode) { create(:acceptance_mode) }

    before(:each) do
      create(:policy, :acceptance_mode => acceptance_mode)
    end

    [:read, :create, :update].each do |action|
      it { should be_able_to(action, acceptance_mode) }
    end

    it { should_not be_able_to(:destroy, acceptance_mode) }

  end
end


# ability_matcher.rb

module AbilityHelper
  extend RSpec::Matchers::DSL

  matcher :be_able_to do |action, object|
    match do |ability|
      ability.can?(action, object)
    end

    description do
      "be able to #{action} -- #{object.class.name}"
    end

    failure_message do |ability|
      "expected #{ability.class.name} to be able to #{action} -- #{object.class.name}"
    end

    failure_message_when_negated do |ability|
      "expected #{ability.class.name} NOT to be able to #{action} -- #{object.class.name}"
    end
  end
end

RSpec.configure do |config|
  config.include AbilityHelper
end

We first set up a user — in this case it’s an admin user — and then initialize our ability object with this user. We further have a model called AcceptanceMode, which offers the usual CRUD operations. An acceptance mode has many policies. If any policy is attached to an acceptance mode, we don’t want to allow it to be deleted.

Note that a lot of models are created, meaning these are persisted to the database. In this excerpt, we have 4 test cases. Each of these test cases needs to create the admin user, acceptance mode and also create a policy. This is a lot of persisted models, even more so if you realize that this is not all the acceptance mode specs and acceptance mode specs are only a small fraction of the whole ability spec. Other models are even more complex and require more tests for other dependencies.

But is this really necessary? Do we really need to persist the models or could we work with in-memory versions of these?

Let’s take a look at this modified spec:


describe Ability do

  let(:stub_policy) { Policy.new }
  let!(:admin_user) { build(:admin_user) }
  subject!(:ability) { Ability.new(admin_user) }

  context 'acceptance mode' do

    let(:acceptance_mode) { build(:acceptance_mode, :policies => [stub_policy]) }

    [:read, :create, :update].each do |action|
      it { should be_able_to(action, acceptance_mode) }
    end

    it { should_not be_able_to(:destroy, acceptance_mode) }

  end
end

Note that all the create calls are replaced with build. We actually don’t need the models to be persisted to the database. The ability mainly checks if the user has admin rights (with admin?), which can be tested with an in-memory version of a user. Further, the acceptance mode can be built with an array that contains an in-memory stub policy. If you look closely at the Ability implementation, you will see that that’s not even necessary. Any object could reside in the array and the spec would still pass. But we decided to use an in-memory policy nonetheless.
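To illustrate why in-memory objects are enough here, the following is a minimal plain-Ruby sketch of the :destroy rule. FakeUser, FakeAcceptanceMode and can_destroy? are hypothetical stand-ins invented for this example; they are not part of cancancan, factory_girl or our application. The sketch only shows that the rule depends on admin? and policies.empty? and nothing else.

```ruby
# Plain-Ruby stand-ins (hypothetical, not cancancan or ActiveRecord).
FakeUser = Struct.new(:admin) do
  def admin?
    admin
  end
end

FakeAcceptanceMode = Struct.new(:policies)

# Mirrors the :destroy rule from ability.rb: admins may destroy an
# acceptance mode only when no policies are attached.
def can_destroy?(user, acceptance_mode)
  user.admin? && acceptance_mode.policies.empty?
end

admin = FakeUser.new(true)

# Any object in the policies array blocks deletion; it need not be a Policy.
puts can_destroy?(admin, FakeAcceptanceMode.new([Object.new]))
puts can_destroy?(admin, FakeAcceptanceMode.new([]))
```

Nothing in this check touches a database, which is why build is sufficient in the real spec.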

With this approach, no model is persisted to the database. All models are in-memory but still collaborate the same way as they would if they had been loaded from the database first. However, no time is wasted on the database. The run time of the whole ability spec dropped from 60 seconds to 5 seconds, simply by not persisting models to the database in the test setup.

As an aside: there’s a lot of discussion around the topic of factories and fixtures. Fixtures load a fixed set of data into the database once, at the start of the test suite, which can avoid much of this per-test setup cost.

That’s it. We hope you can re-visit some of your slow unit tests and try to use in-memory models, or avoid persisting your models for the next unit test you write!

Dare to question

We had a problem.

Let me call it Project X. We were six months behind. Requirements creep had resulted in enormous methods, bloated controllers, a test coverage below the belt and still no clear plan for finishing. We had worked on the thing for a year, and it had been close to finished for months, but it wasn’t coming together. We had a problem.

Wasn’t it cool in the old days, when we were the wizards, the magicians? When just being able to create a simple calculating form, or a script saving someone two days of busywork per week, was enough? They trusted us when we said it was going to take three weeks to implement. When you understood how to “fix a computer” by finding the loose cable connector on the keyboard. When running a defragmentation tool made your uncle feel like he had bought a brand new machine.

It’s no longer like that. Writing software is not so magical anymore, it’s a craft. We know what we do, and we’re appreciated for it. But things have to get done. The customer is king again. We’re constantly struggling in the space between what the customer wants and what we know is the right thing. We learned a while ago that wearing a hoodie and carrying a sticker-infested laptop to a board meeting doesn’t automatically raise their respect for us. We learned to listen. We learned to learn each customer’s language, to better understand, to better craft what’s needed.

On the other hand, we still feel like wizards. We know what works, and don’t want to waste our precious time with dull decoration. We want our effort limited to a minimum, working on the ambitious adventure, the principal puzzle, the real riddle. The cool stuff. Let’s write the simplest thing that can possibly work. You want more? You Ain’t Gonna Need It (YAGNI™). Because an apparently simple request might lead to days of unforeseen work, which might even go unpaid because its complexity never got onto any offer.

So we grew an instinct to say no, to approach a request with a certain defensive attitude. A feature has to pass a threshold first: Is it really needed?

But then, the customer actually pays for what we do, so saying no doesn’t fly well with them. We apparently need a different attitude.

We had a routine importing data which needed almost a day to run, and one wrongly formatted element in the source data would knock the process out. We added and tweaked, only to find the next edge case… we dearly wanted to exclude those edge cases, but many were still essential.

What was going wrong? What was the problem behind the problem?

Complexity is not value. But neither is simplicity as such. We are trained to write what’s needed, in budget, and on time. Those constraints are natural. We coders have experienced many situations where broken business models resulted in hopeless strategies, which turned into convoluted requirements. Sometimes we call it “design by committee”, where the results of a brainstorming session are translated into demands full of contradictions, wishful thinking and pies in the sky. After the session, several people “flesh out” the requirements, and the input of all participants is gathered, but never questioned.

Now try to write good code with that. We try to manage upwards, trying to filter what should never have made it into requirements.

Hence, the first draft of our company values had the line “dare to say no”.

“Dare to say no” at least tames the devil of blindly implementing what’s requested, only to find the contradictions at the very end where ideas meet reality, when bugs show up stemming from the bad design decisions above. Code is honest, code is pure. There is no handwaving, no “maybes” in code, no “mostly” or “generally” – come with unfinished ideas and you will be mercilessly punished. The wall of logic can’t be broken with sheer will, you’ll be crushed between requirements and feasibility.

But saying no doesn’t give you good code.

And Project X wasn’t finished. We saw it ourselves. We had something which worked, and somehow fulfilled requirements. But it didn’t feel right. It felt buggy and convoluted. It looked the part… We needed a reboot.

Reboot

“Dare to say no” apparently needed a reboot too. We worked on that line. And we found out what we meant by it. We wanted to be able to work on all levels of software to find the right solutions. We needed to be able to address the first decisions. Those which lead to the requirements causing trouble.

Mind you, this happens anyway, at the latest when broken code goes into production. At this point, even the people who brainstormed the ideas will see the contradictions, because they’re now glaringly obvious. Only now do the important questions get asked. Can’t we get to that knowledge earlier?

We can. It requires courage to show the contradictions, the unfinished thoughts. It requires tact and skill to identify the core requirements which clash, and talk about them. It requires a lot of guts to ask fundamental questions.

Invigorated, we addressed Project X with new energy. We started with tidying up the code. Where weird requirements held us up, we went back to the customer and asked why they wanted a certain feature, why it had to be like that. The pruning and culling resulted in a much more streamlined user experience, clean code, and somewhat to our surprise, a greatly improved relationship with the customer.

Our value became “dare to question.” Ask why, understand the answer – or ask why again. Get to the bottom of it. Find the need behind the need. Throw away what’s not necessary, make it clean – with the full understanding of the requirement.

The project is live now. We have more work coming.

Maybe we can still be wizards. We just have to learn the new magic.

Dare to question.

Filter Rails SQL log in production

In order to debug a problem that only occurred in production, we recently wanted to tweak our Rails SQL logs to only show access to a specific table.

Here’s what we did to accomplish this. We created a file initializers/filter_sql_log.rb with this content:

if Rails.env.production?

  module ActiveRecord
    class LogSubscriber
      alias :old_sql :sql

      def sql(event)
        if event.payload[:sql].include? 'users'
          old_sql(event)
        end
      end
    end
  end

end

This monkey-patches the ActiveRecord::LogSubscriber class and only delegates to the old logging method if the SQL statement includes the string "users".
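The same wrap-and-delegate idea can also be written with Module#prepend instead of alias. The sketch below shows the pattern in plain Ruby only: FakeLogSubscriber and TableFilter are hypothetical names made up for this example, and the event is a simple Hash rather than a real ActiveSupport notification event. We have not run this variant against ActiveRecord itself.

```ruby
# Stand-in for ActiveRecord::LogSubscriber (hypothetical, for illustration only).
class FakeLogSubscriber
  attr_reader :lines

  def initialize
    @lines = []
  end

  # The "original" logging method: records every statement it is given.
  def sql(event)
    @lines << event[:sql]
  end
end

# The filter lives in its own module; `super` calls the original #sql.
module TableFilter
  def sql(event)
    super if event[:sql].include?('users')
  end
end

FakeLogSubscriber.prepend(TableFilter)

subscriber = FakeLogSubscriber.new
subscriber.sql({ sql: 'SELECT * FROM users' })
subscriber.sql({ sql: 'SELECT * FROM products' })
puts subscriber.lines.inspect # prints ["SELECT * FROM users"]
```

With prepend there is no old_sql alias left behind on the class, and the filter can be read in one place.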

By default, SQL logging is deactivated in the Rails production environment. Therefore we needed to change config/environments/production.rb like this:

config.log_level = :debug

Simple Vagrant setup for Rails applications

There are many reasons why you should use Vagrant for your development, as described here and here.

In order to get your Rails application running in Vagrant, the VM needs to have several components installed, such as Ruby, Rails, a database, etc. One of the most common ways to provision your VM (that is, to install the necessary packages) is via Puppet or Chef. However, not everyone knows them well, and luckily there is an easy approach, namely to use shell scripts.

In a terminal window navigate to your existing Rails application and run the following command (don’t worry, Vagrant will not break your existing Rails project):

 $ vagrant init

A `Vagrantfile` has been placed in this directory. You are now
 ready to `vagrant up` your first virtual environment! Please read
 the comments in the Vagrantfile as well as documentation on
 `vagrantup.com` for more information on using Vagrant.

As the output mentions, the command creates a file called ‘Vagrantfile’ in the current directory. Open it and read through the comments in order to get familiar with the available options. You will notice that all configuration is done in Ruby.

The first thing we need to do is to tell Vagrant which OS to install. Edit the Vagrantfile and replace the line config.vm.box = “base” with

 config.vm.box = "ubuntu/trusty64"

You can also search for available Vagrant VMs.

Next, we need to forward port 3000, in order to be able to access the Rails server in a browser outside the VM. We also want to tell Vagrant how it should provision our VM. To do that, add the next lines to the Vagrantfile:

 config.vm.network 'forwarded_port', guest: 3000, host: 3000
 config.vm.provision 'shell', path: 'bootstrap/bootstrap_vagrant.sh'

Now, it’s time to create the file bootstrap/bootstrap_vagrant.sh inside the root folder of your Rails application. The commands we place in this file will be executed when the VM is provisioned.

An easy way to tell the provisioning script to only install packages it didn’t install already is to organize it in blocks. When a block completes it will track the progress by writing a tag to a temporary file, for instance the .provisioning-progress file.

Here is a basic example that installs Ruby (downloads the binary and compiles it):

# Install ruby
if grep -qsF '+ruby/2.1.5' /home/vagrant/.provisioning-progress; then
  echo "--> ruby-2.1.5 is installed, moving on."
else
  echo "--> Installing ruby-2.1.5 ..."
  su vagrant -c "mkdir -p /home/vagrant/downloads; cd /home/vagrant/downloads; \
                 wget --no-check-certificate https://ftp.ruby-lang.org/pub/ruby/2.1/ruby-2.1.5.tar.gz; \
                 tar -xvf ruby-2.1.5.tar.gz; cd ruby-2.1.5; \
                 mkdir -p /home/vagrant/ruby; \
                 ./configure --prefix=/home/vagrant/ruby --disable-install-doc; \
                 make; make install"
  sudo -u vagrant printf 'export PATH=/home/vagrant/ruby/bin:$PATH\n' >> /home/vagrant/.profile

  su vagrant -c "echo +ruby/2.1.5 >> /home/vagrant/.provisioning-progress"
  echo "--> ruby-2.1.5 is now installed."
fi

As you can see, the script first checks the .provisioning-progress file for the tag +ruby/2.1.5. If it finds the tag, it skips the install (the whole block). Otherwise it installs Ruby and appends +ruby/2.1.5 to the .provisioning-progress file after it finishes. This way, the next time you provision your VM, it will detect that Ruby is already installed and skip this block.
Similarly we can group our requirements and define setup blocks:

  • Set system locale
  • Install core libraries
  • Install a database
  • Install Ruby
  • Install Bundler and bundle the application
  • Run the migrations

Therefore our bootstrap_vagrant.sh script will have several blocks. At this gist: https://gist.github.com/luciancancescu/57025d19da727cfdc18f you will find an example that works for a new “blog” rails application. To get started copy the entire gist to your project and begin customising it.
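The tag-file technique itself is independent of the shell. As a rough illustration, here is the same idea as a small Ruby sketch. The helper name provision_block is hypothetical and not part of Vagrant or the gist, and unlike the grep version it matches whole lines rather than substrings.

```ruby
require 'tmpdir'

# Runs the block only if `tag` is not yet recorded in `progress_file`,
# then appends the tag. Returns true if the block ran, false if skipped.
def provision_block(tag, progress_file)
  done = File.exist?(progress_file) &&
         File.readlines(progress_file, chomp: true).include?(tag)
  return false if done

  yield
  File.open(progress_file, 'a') { |f| f.puts(tag) }
  true
end

# Demo in a throwaway directory: the second pass skips the block.
Dir.mktmpdir do |dir|
  progress = File.join(dir, '.provisioning-progress')
  2.times do
    ran = provision_block('+ruby/2.1.5', progress) { puts '--> Installing ...' }
    puts(ran ? '--> installed' : '--> already installed, moving on.')
  end
end
```

Because the tag is only written after the block has finished, a block that fails halfway will be retried on the next provisioning run.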

Important: by default the provisioning script is run as user root.

After you have the provisioning script in place you can run:

 vagrant up

This will create a VM and will start provisioning it. When it finishes you can start your Rails application like:

 vagrant ssh
 cd /vagrant
 bin/rails s

In a browser open lvh.me:3000 and you should see the homepage of your Rails application. (read more about Lvh.me here)

Note:
The first time you run vagrant up it performs the provisioning. If you want to run the provisioning script again simply run vagrant provision.

Bonus 1:
If you need to install something new in the VM, don’t do it by hand. Instead, add the install commands in a new block in the provisioning script file and from outside the VM run:

 vagrant provision

This will print a message for each of the existing blocks saying that it is installed and will only install the new package you added.

Bonus 2:
If for some reason you want to reinstall an already installed package, just delete the corresponding block tag from /home/vagrant/.provisioning-progress inside the VM and rerun vagrant provision.

Happy provisioning! If you have any suggestions or alternatives leave a reply in the comments box below.

increment/decrement counters in ActiveRecord

In lots of web apps you need to count something. Availability of products, number of login attempts, visitors on a page and so on.

I’ll show multiple ways to implement this, all of them are based on the following (somewhat fictional) requirements.

You have to implement a web-shop which lists products. Every product has an availability which must not go below 0 (we only sell goods if we have them in stock). There must be a method #take! which handles the decrement and returns the product itself. If anything goes wrong (e.g. out of stock) then an exception must be raised.

The samples only deal with decrementing a counter. But of course incrementing is the same as decrementing with a negative amount. All the code is available in a git repository and each way is implemented in its own XYZProduct class. I ran the samples on Postgres and one implementation is Postgres specific but should be easy to adapt for other RDBMS.

Change value of attribute and save

The first thing that might come to your mind could look like this:

class SimpleProduct < ActiveRecord::Base
 validates :available, numericality: {greater_than_or_equal_to: 0}
 def take!
   self.available -= 1
   save!
   self
 end
end

The #take! method just decrements the counter and calls save!. This might throw an ActiveRecord::RecordInvalid exception if the validation is violated (negative availability). Simple enough and it works as expected. But only as long as there are not multiple clients ordering the same product at the same time!

Consider the following example which explains what can go wrong:

p = SimpleProduct.create!(available: 1)

p1 = SimpleProduct.find(p.id)
p2 = SimpleProduct.find(p.id)

p1.take!
p2.take!

puts p.reload.available
# => 0

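p1 and p2 are two separate in-memory instances of the same record, and both takes succeed although only one item was available. The validation that available does not go below 0 is executed against the current state of the instance, and therefore p2 uses stale data for the validation.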

The same holds true for increment! and decrement!
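The stale-read problem can be reproduced without a database. In the following sketch a Hash plays the role of the products table; DB and StaleProduct are hypothetical names used only for this illustration and are not part of the sample repository.

```ruby
# A Hash stands in for the products table (hypothetical, no ActiveRecord).
DB = { 1 => { available: 1 } }

class StaleProduct
  attr_reader :available

  # Loads a private copy of the current value, like an AR instance does.
  def self.find(id)
    new(id, DB.fetch(id)[:available])
  end

  def initialize(id, available)
    @id = id
    @available = available
  end

  def take!
    @available -= 1
    # The validation sees only this instance's copy, not the current row.
    raise 'out of stock' if @available.negative?
    DB[@id][:available] = @available
    self
  end
end

p1 = StaleProduct.find(1)
p2 = StaleProduct.find(1)
p1.take!
p2.take! # does not raise: p2 still believes 1 item is available
puts DB[1][:available] # prints 0, although two items were "sold"
```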

So how do we deal with this problem? We somehow need to lock the record in order to prevent concurrent updates. One simple way to achieve this is by using optimistic locking.

Optimistic Locking

By adding a lock_version column, which gets incremented whenever the record is saved, we know if somebody else has changed the counter. In such cases ActiveRecord::StaleObjectError is thrown. Then we need to reload the record and try again. Rails allows you to specify the lock column name. See Locking::Optimistic for details.

The following snippet should explain how optimistic locking works:

p = OptimisticProduct.create(available: 10)

p1 = OptimisticProduct.find(p.id)
p2 = OptimisticProduct.find(p.id)

p1.take!
p2.take!
# => ActiveRecord::StaleObjectError: Attempted to update a stale object: OptimisticProduct

By reloading p2 before we call #take!, the code will work as expected.

p2.reload.take!

Of course we can not sprinkle reload calls throughout our code and hope that the instance is not stale anymore. One way to solve this is to use a begin/rescue block with retry.

begin
  product.take!
rescue ActiveRecord::StaleObjectError
  product.reload
  retry
end

When StaleObjectError is rescued then the whole block is retried. This only makes sense if the product is reloaded from the DB so we get the latest lock_version. I do not really like this way of retrying because it boils down to a loop without a defined exit condition. Also this might lead to many retries when a lot of people are buying the same product.
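One way to give the retry a defined exit condition is to count attempts and re-raise once a limit is reached. The sketch below is plain Ruby: StaleError is a hypothetical stand-in for ActiveRecord::StaleObjectError, and with_stale_retry, max_attempts and on_stale are names invented for this example, not Rails API.

```ruby
# Hypothetical stand-in for ActiveRecord::StaleObjectError.
class StaleError < StandardError; end

# Calls the block, retrying on StaleError until it has been attempted
# `max_attempts` times in total; after that the error is re-raised.
# The optional `on_stale` callable is where a real caller would reload.
def with_stale_retry(max_attempts: 3, on_stale: -> {})
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StaleError
    raise if attempts >= max_attempts
    on_stale.call
    retry
  end
end

# Demo: the block fails twice and succeeds on the third attempt.
calls = 0
result = with_stale_retry(max_attempts: 3) do
  calls += 1
  raise StaleError if calls < 3
  :taken
end
puts "#{result} after #{calls} attempts"
```

In a Rails app the call might look like with_stale_retry(on_stale: -> { product.reload }) { product.take! }, assuming the helper rescues ActiveRecord::StaleObjectError instead of the stand-in.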

Pessimistic Locking

ActiveRecord also supports pessimistic locking, which is implemented as row-level locking using a SELECT ... FOR UPDATE clause. Other, DB specific, lock clauses can be specified if required. An implementation could look as follows:

class PessimisticProduct < ActiveRecord::Base
  validates :available, numericality: {greater_than_or_equal_to: 0}
  def take!
    with_lock do
      self.available -= 1
      save!
    end
    self
  end
end

The #with_lock method accepts a block which is executed within a transaction and the instance is reloaded with lock: true. Since the instance is reloaded the validation also works as expected. Nice and clean.

You can check the behaviour of #with_lock by running the following code in two different Rails consoles (replace Thing with one of your AR classes):

thing = Thing.find(1)
thing.with_lock do
  puts "inside lock"
  sleep 10
end

You will notice that in the first console the “inside lock” output will appear right away whereas in the second console it only appears after the first call wakes up from sleep and exits the with_lock block.
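If you do not have two consoles at hand, the effect of holding a lock around check and decrement can be approximated inside a single Ruby process. This is only an analogy: a Mutex plays the role of the row lock, and LockedCounter, try_take and simulate_sales are hypothetical names, not ActiveRecord API.

```ruby
# In-process analogy only: a Mutex plays the role of the row lock.
class LockedCounter
  attr_reader :available

  def initialize(available)
    @available = available
    @lock = Mutex.new
  end

  # Check and decrement happen while holding the lock, so no other
  # thread can act on a stale value in between.
  def take!
    @lock.synchronize do
      raise 'out of stock' if @available.zero?
      @available -= 1
    end
    self
  end
end

# Returns true if an item was taken, false if the counter was out of stock.
def try_take(counter)
  counter.take!
  true
rescue RuntimeError
  false
end

# Lets `threads` threads each attempt `attempts` takes; returns [sold, left].
def simulate_sales(stock:, threads:, attempts:)
  counter = LockedCounter.new(stock)
  workers = Array.new(threads) do
    Thread.new { attempts.times.count { try_take(counter) } }
  end
  [workers.sum(&:value), counter.available]
end

sold, left = simulate_sales(stock: 1_000, threads: 10, attempts: 150)
puts "sold=#{sold} left=#{left}" # prints sold=1000 left=0
```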

DB specific, custom SQL

If you are ready to explore some more advanced features of your RDBMS you could write it with a check constraint for the validation and make sure that the decrement is executed on the DB itself. The constraint can be added in a migration like this:


class AddCheckToDbCheckProducts < ActiveRecord::Migration
  def up
    execute "alter table db_check_products add constraint check_available \
    check (available IS NULL OR available >= 0)"
  end

  def down
    execute 'alter table db_check_products drop constraint check_available'
  end
end

This makes sure that the counter can not go below zero. Nice. But we also need to decrement the counter on the DB:


class DbCheckProduct < ActiveRecord::Base
  def take!
    sql = "UPDATE #{self.class.table_name} SET available = available - 1 WHERE id = #{self.id} AND available IS NOT NULL RETURNING available"
    result_set = self.class.connection.execute(sql)
    if result_set.ntuples == 1
      self.available = result_set.getvalue(0, 0).to_i
    end
    self
  end
end

Should the check constraint be violated, then ActiveRecord::StatementInvalid is raised. I would have expected a somewhat more descriptive exception but it does the trick.

This again works as expected but compared to the with_lock version includes more code, DB specific SQL statements and could be vulnerable to SQL injection (through a modified value of id). It also bypasses validations, callbacks and does not modify the updated_at timestamp.
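One simple way to reduce the injection risk is to coerce the id to an integer before it is interpolated into the statement. The helper decrement_sql below is a hypothetical sketch in plain Ruby. It assumes the table name comes from trusted code such as the model class; passing the id as a bind parameter through the database driver would be the more thorough solution.

```ruby
# Builds the decrement statement. `table_name` is assumed to come from
# trusted code (e.g. the model), while `id` is coerced to a base-10 Integer.
def decrement_sql(table_name, id)
  safe_id = Integer(id.to_s, 10) # raises ArgumentError for non-integer input
  "UPDATE #{table_name} SET available = available - 1 " \
    "WHERE id = #{safe_id} AND available IS NOT NULL RETURNING available"
end

puts decrement_sql('db_check_products', '42')

begin
  decrement_sql('db_check_products', '42; DROP TABLE db_check_products')
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```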

Performance

Yes I know. Microbenchmark. Still I measured the time for each implementation in various configurations.

1 thread, 1’000 products available, take 1’000 products

Implementation      Duration [s]  Correct?
SimpleProduct       1.71          YES
OptimisticProduct   1.87          YES
PessimisticProduct  2.16          YES
DbCheckProduct      0.91          YES

1 thread, 1’000 products available, take 1’500 products

Implementation      Duration [s]  Correct?
SimpleProduct       2.52          YES
OptimisticProduct   2.81          YES
PessimisticProduct  3.25          YES
DbCheckProduct      1.42          YES

10 threads, 1’000 products available, take 1’000 products

Implementation      Duration [s]  Correct?
SimpleProduct       1.51          NO
OptimisticProduct   15.86         YES
PessimisticProduct  1.87          YES
DbCheckProduct      0.61          YES

10 threads, 1’000 products available, take 1’500 products

Implementation      Duration [s]  Correct?
SimpleProduct       2.19          NO
OptimisticProduct   18.94         YES
PessimisticProduct  2.74          YES
DbCheckProduct      1.23          YES

Some interesting things to learn from these results:

  • SimpleProduct gives wrong results for concurrent situations, as explained above.
  • OptimisticProduct has some problems to scale with multiple threads. This makes sense as there is retry involved when concurrent updates occur.
  • DbCheckProduct is the fastest implementation, which seems reasonable as there is no explicit locking involved and the decrement is a single statement.
  • DbCheckProduct and PessimisticProduct can both profit in a concurrent setup
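For readers who want to take similar measurements, a timing harness along these lines could be used. This is a hypothetical sketch built on Ruby's Benchmark module, not the code that produced the numbers above; the name measure and its parameters are assumptions, and the demo increments a Mutex-protected counter instead of calling a real #take!.

```ruby
require 'benchmark'

# Runs `total` calls of the block spread evenly over `threads` threads
# and returns the elapsed wall-clock time in seconds.
def measure(threads:, total:, &block)
  per_thread = total / threads
  Benchmark.realtime do
    Array.new(threads) do
      Thread.new { per_thread.times { block.call } }
    end.each(&:join)
  end
end

# Demo: 10 threads performing 1'000 "takes" on an in-memory counter.
lock = Mutex.new
taken = 0
seconds = measure(threads: 10, total: 1_000) { lock.synchronize { taken += 1 } }
puts format('%d takes in %.3fs', taken, seconds)
```

When timing ActiveRecord code this way, each thread needs its own database connection, so the connection pool has to be at least as large as the number of threads.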

Summary

Depending on your requirements, the simplest way could already work and be good enough. If you have more specific requirements (e.g. validations, concurrency) then I’d suggest going with pessimistic locking, as it is quite easy to implement and well tested (compared to my check constraint implementation of #take!). It is important to release a pessimistic lock as soon as possible, as it blocks other clients from accessing the data.