May your coming year be filled with magic and dreams and good madness. I hope you read some fine books and kiss someone who thinks you’re wonderful, and don’t forget to make some art — write or draw or build or sing or live as only you can. And I hope, somewhere in the next year, you surprise yourself.
Neil Gaiman

Everything In Its Right Place

Heather Armstrong always manages to write candidly about depression, medication and life, and about how sometimes you just have days where your emotions are on a rollercoaster.

So I offer up my humanness if [...] you need to hear that even with everything in its right place it’s okay if you still don’t know why it doesn’t feel that way.

Depression doesn’t have to be the debilitating condition where you can’t function or spend days in bed. I’ve not been this happy for a long time, and yet, there are still days that are hard. Even with everything in its right place.

From a London Evening Standard article on Mark Lewis, the solicitor at the centre of the phone hacking scandal:

“You’ve got News Corp, you’ve got News International, you’ve got News Group Newspapers, News of the World, you’ve got Farrer & Co, you’ve got Linklaters, you’ve got Olswang’s, you’ve got Clifford Chance, you’ve got so many of the big law firms on this and then on the other side you’ve got me,” Lewis says. “I haven’t even got a f**king secretary, I’ve got one hand and, you know, if I had two hands I’d tie one behind my back because they need a head start.”

BusinessWeek: Apple’s Supply-Chain Secret:

About five years ago, Apple design guru Jony Ive decided he wanted a new feature for the next MacBook: a small dot of green light above the screen, shining through the computer’s aluminum casing to indicate when its camera was on. The problem? It’s physically impossible to shine light through metal.

Ive called in a team of manufacturing and materials experts to figure out how to make the impossible possible, according to a former employee familiar with the development who requested anonymity to avoid irking Apple. The team discovered it could use a customized laser to poke holes in the aluminum small enough to be nearly invisible to the human eye but big enough to let light through.

I’d not heard this story before. Crazy stuff.

For what it’s worth: it’s never too late or, in my case, too early to be whoever you want to be. There’s no time limit, stop whenever you want. You can change or stay the same, there are no rules to this thing. We can make the best or the worst of it. I hope you make the best of it. And I hope you see things that startle you. I hope you feel things you never felt before. I hope you meet people with a different point of view. I hope you live a life you’re proud of. If you find that you’re not, I hope you have the strength to start all over again.
F. Scott Fitzgerald

Serving One Rails Application With Multiple Databases

First steps: reducing memory usage of multiple applications

For the last 12 months or so, we’ve been deploying each new account as a separate application instance, virtual hosted via a subdomain per application. This was a great (and easy) way of sandboxing data, but our memory usage has been increasing dramatically recently as we increased the number of instances. I’ve spent the last three weeks looking in detail at what we can do to reduce memory usage.

We’re actually running fairly low-load applications: our big drivers are security, uptime, and speed – in that order. Our server stack hasn’t changed and won’t be changing in the next few months – our cloud network is currently a few networked application servers, a database server, and an API server. Just for sheer sanity purposes, we want to keep the number of application servers down – and that’s not even counting the running cost. Our goal has been about 15-20 applications per 2-3GB of RAM.

We started out on an Apache/Passenger stack, which seemed to average out at about 150MB per instance – meaning we could fit perhaps 15 applications on a server. The bigger issue, though, was that at times of high load, it’d burst up to about 600MB, taking the server down.

The first real shot at fixing this was to move to an nginx stack running Unicorn to serve the Rails applications. There are plenty of examples of this setup – GitHub, for one – so I thought it would be worth a shot. We ended up seeing usage of around 95MB for the master process, with the same again for each worker process. But to make this actually usable, we needed 3 or 4 worker processes per application – even if I’d settled on 3 worker processes, that’s pretty much 400MB gone per instance. It just wasn’t going to hold up.
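
As a rough check on those numbers, here is a minimal sketch of the arithmetic. The 95MB per-process figure is the one observed above; the 3GB server size is an assumption taken from the top of our target range, and both method names are made up for illustration.

```ruby
# Rough memory arithmetic for the Unicorn setup described above.
# 95MB is the per-process usage we observed; everything else is illustrative.
MB_PER_PROCESS = 95

# One master process plus N workers, each roughly the same size.
def unicorn_footprint_mb(workers, mb_per_process = MB_PER_PROCESS)
  (workers + 1) * mb_per_process
end

# How many whole application instances fit in a given amount of RAM.
def instances_per_server(server_mb, instance_mb)
  server_mb / instance_mb
end

footprint = unicorn_footprint_mb(3)
puts footprint                                  # 380
puts instances_per_server(3 * 1024, footprint)  # 8
```

Eight instances on a 3GB server is about half of the 15-20 we were aiming for, and that is before leaving any headroom for the operating system.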

A question: are we tackling the wrong problem?

So we went back to the drawing board, and worked out what separate application instances were actually giving us: ease of data sandboxing, but that was entirely due to the separate databases – which is the only thing we’re not willing to concede. But as for the application code – well, it’s all shared, surely? There’s just no reason to load up the entire Rails environment separately.

We needed to hang onto the same access method as we have at the moment – a separate subdomain per school – and I wondered if it would be possible to switch the database being loaded by Rails based on the hostname being accessed.

Connecting to the database based on the hostname

It turns out this is actually really easy, but there are a few… caveats… to forcing Rails to work like this.

It starts out with a fairly simple call to ActiveRecord::Base.establish_connection(). This has to be made in ApplicationController because it’s the only place we can access the request object.

class ApplicationController < ActionController::Base
 
  protect_from_forgery
 
  before_filter :override_db
 
  def override_db
    begin
      if Rails.env.production?
        @application_name = request.env['HTTP_HOST'][/^[\w]+/]
      else
        @application_name = ""
        return
      end
 
      ActiveRecord::Base.establish_connection(
        :adapter  => "mysql2",
        :host     => "",
        :username => "",
        :password => "",
        :database => "myapp_#{@application_name}"
      )
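    # Rescue StandardError rather than Exception, so that signals and
    # interpreter-level errors aren't swallowed along with connection failures.
    rescue StandardError => e
        # Database can't be found/connected to
    end
  end
end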

The first issue with this is that Rails goes through its standard initialisation routine – so it will expect to find a database it can connect to for your default development or production environment in database.yml. This means you have to have a dummy database – with the same structure as your actual application databases – made available to Rails.
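
The hostname-to-database mapping inside override_db can be tried out without Rails. This is a minimal sketch: database_for_host is a hypothetical helper (it isn’t in our codebase), the example.com hostnames are made up, and the myapp_ prefix is the placeholder used above.

```ruby
# Hypothetical helper mirroring the lookup in override_db:
# take the leading run of word characters from the host header
# and append it to the database prefix.
def database_for_host(host, prefix = "myapp_")
  subdomain = host.to_s[/^\w+/]
  return nil if subdomain.nil?

  "#{prefix}#{subdomain}"
end

puts database_for_host("school1.example.com")       # myapp_school1
puts database_for_host("school1.example.com:3000")  # myapp_school1
puts database_for_host("st-marys.example.com")      # myapp_st
```

The last line is worth noting: \w doesn’t match a hyphen, so a hyphenated subdomain gets cut short. That’s fine if subdomains only ever use letters, digits and underscores, but it’s an assumption worth checking before relying on this pattern.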

The caching problem

This works without a hitch in the development environment, but in production, when Rails is set to config.cache_classes = true, things can get a little weirder. I’ve been seeing a very intermittent bug when trying to save a model – any model:

undefined method `name' for nil:NilClass
arel (2.0.10) lib/arel/visitors/to_sql.rb:56:in `visit_Arel_Nodes_InsertStatement'

This is actually being thrown by Arel. My biggest fear – that caching classes would mean data is available to other applications – just hasn’t materialised in many, many tests. However, Arel caches table definitions – field names, for example – and sometimes seems to get confused (returning a nil column name), even if the “default” and “active” database schemas are identical.

I did some pair debugging on this with Caius, and there are two methods available to clear this cache before the connection is established. In Rails 3.1.x, ActiveRecord::Base.clear_cache! has been added. I’m still running a 3.0.10 codebase for this application, and so we ended up going with ActiveRecord::Base.connection_pool.clear_reloadable_connections!. This makes the full code as below:

class ApplicationController < ActionController::Base
 
  protect_from_forgery
 
  before_filter :override_db
 
  def override_db
    begin
      if Rails.env.production?
        @application_name = request.env['HTTP_HOST'][/^[\w]+/]
      else
        @application_name = ""
        return
      end
 
      ActiveRecord::Base.connection_pool.clear_reloadable_connections!
 
      ActiveRecord::Base.establish_connection(
        :adapter  => "mysql2",
        :host     => "",
        :username => "",
        :password => "",
        :database => "myapp_#{@application_name}"
      )
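    # Rescue StandardError rather than Exception, so that signals and
    # interpreter-level errors aren't swallowed along with connection failures.
    rescue StandardError => e
        # Database can't be found/connected to
    end
  end
end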

I haven’t managed to throw the load at it to see if this holds up yet, because it does seem to be a very intermittent bug. I’ll update this post over the next few days when I find out if this has fixed it.

UPDATE 19/10/11: This doesn’t fix it – I haven’t had a chance to check the Rails 3.1 method yet, but so far on 3.0.10 the only fix I’ve found is to set config.cache_classes to false in production. Also interesting – I’m unable to reproduce it in the development environment, even with a production configuration. More soon.

Deploying and migrations

This is another one of those gotchas – how do you run migrations against an environment that doesn’t exist? I first looked at hijacking Rails’ migrations to loop over a list of databases every time it runs a migration, but because Rails is connected to the default database, if that’s fully migrated, it won’t even load the migration file.

The solution I found is by no means pretty. It requires creating a separate environment in your database.yml for each database. Migrations can then be run using rake db:migrate RAILS_ENV=some_database_connection. This also creates a new log file for each “environment”, but this seems like a small price to pay.

This is the faux-code I’m using in my Capistrano deploy.rb at the moment. This just overrides the migrations task to connect to each database and run the migrations. It’ll be a short edit to read in every environment from database.yml instead, ignore development, production, and test, and loop over that instead of an array – but this works for now.

  desc "Running migrations on all databases"
  task :migrations, :roles => :app do
 
    installations = %w{
    database1
    database2
    database3
    }
 
    installations.each do |db|
      run "cd #{current_path} && bundle exec rake db:migrate RAILS_ENV=#{db}"
    end
  end
 
  before "deploy:migrations", "deploy"
  after "deploy:migrations", "deploy:restart"
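
The “short edit” mentioned above might look something like this. It’s a sketch only, not what’s in our deploy.rb: the method names and the sample YAML are made up for illustration, and it assumes database.yml is plain YAML with no ERB or aliases in it.

```ruby
require "yaml"

# The stock Rails environments, which don't map to an installation.
STANDARD_ENVIRONMENTS = %w[development test production]

# Every top-level key in database.yml that isn't a stock environment
# is treated as an installation to migrate.
def installation_environments(database_yml)
  YAML.load(database_yml).keys - STANDARD_ENVIRONMENTS
end

# Build the same command the Capistrano task runs, once per installation.
def migration_commands(database_yml, current_path)
  installation_environments(database_yml).map do |env|
    "cd #{current_path} && bundle exec rake db:migrate RAILS_ENV=#{env}"
  end
end

# Made-up database.yml contents, for illustration only.
SAMPLE_DATABASE_YML = <<YAML
development:
  adapter: mysql2
  database: myapp_development
production:
  adapter: mysql2
  database: myapp_dummy
database1:
  adapter: mysql2
  database: myapp_database1
database2:
  adapter: mysql2
  database: myapp_database2
YAML

puts migration_commands(SAMPLE_DATABASE_YML, "/var/www/myapp/current")
```

Inside the Capistrano task, each of those strings would be passed to run in place of looping over the hard-coded array.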

So where from here?

This solution seems to work, assuming the table structure caching issue doesn’t rear its head again. It’s certainly serving our purposes very well. Memory usage (now running nginx and Passenger) is down to about 150MB: a single application that is happily serving 15 installations’ worth of concurrent users. It’s a serious saving on hosting resources compared to what we were seeing.

But the biggest issue for me is just how much this feels like going against how Rails works. It’s obviously a specialised use-case, but from speaking to people at MagRails yesterday, I’m not the only person who’s come across this. I had a good discussion with Aaron yesterday about seeing if there’s anything that can be done at a Rails core level to support this, too.

If anyone else is doing this, or has any feedback, I’d really appreciate hearing thoughts!

I consider myself to be an inept pianist, a bad singer, and a merely competent songwriter. What I do, in my opinion, is by no means extraordinary.
Billy Joel