March 12, 2008

Scaling Rails with Apache 2, mod_proxy_balancer and Thin Clusters

Recently I have been working on a Rails application that is now about to be deployed. One of my concerns was keeping it fast and responsive for users while still loading a huge number of database records and serving a relatively large number of pages.

As far as making database calls faster is concerned, there are two things that work here:


  • Whenever possible, use Model.find(:all/:first, :select => "model_table.column_1, model_table.column_2") to fetch only the columns you need instead of everything ActiveRecord would otherwise load for each model instance.

  • Second, you might want to have a look at memcached and how to use it with Rails. To make memcached easier to use, look at either the acts_as_cached plugin or the cached_model gem. A good tutorial for using cached_model can be found here.
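
To make the caching idea concrete, here is a minimal sketch of the read-through pattern that such plugins automate. It is only an illustration: the ReadThroughCache class is hypothetical, the block passed to fetch stands in for a database call, and a plain Hash stands in for a real memcached client.

```ruby
# Hypothetical read-through cache; a Hash stands in for memcached.
class ReadThroughCache
  attr_reader :misses

  def initialize
    @store = {}
    @misses = 0
  end

  # Return the cached value for key. On a miss, run the block (the
  # "database call"), remember its result and return it.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @misses += 1
    @store[key] = yield
  end
end

cache = ReadThroughCache.new
# The second call is answered from the cache, so the block runs only once.
2.times { cache.fetch("user:1") { { :id => 1, :name => "Alice" } } }
puts "database hits: #{cache.misses}"
```

A real setup would also need expiry and invalidation when a record changes, which is exactly the part the plugins take care of.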



However, the purpose of this post is not to discuss database performance, so I will reserve that for another article and focus on the matter at hand: scaling Rails with Apache, load balancing and the Thin Ruby server.

Thin


Thin is a fast Ruby HTTP server that glues together 3 excellent libraries and thus brings the best of all worlds: the Mongrel parser (the HTTP parser at the heart of Mongrel), EventMachine and Rack. When I first read about Thin, I thought: yet another RoR server. But I got curious, downloaded it, set it up on my development machine and ran the application I was working on with it. I was amazed at how much faster everything felt: one click and the page would load in an instant, a noticeable improvement over plain Mongrel. So the first thing to do is get up and running with Thin.

Installing and running Thin



$ sudo gem install thin


All the dependencies will be downloaded along with it. The next step is to run it and appreciate the difference in speed.

  • cd into your app's directory

  • Once in there, type "thin start".



By default, you will be able to access your application at http://localhost:3000, so nothing changes compared to what you are used to. Take it for a ride.

In case you want to run a cluster of say 5 servers, you would do so with the following command from within the directory of your rails app:

$ thin start -s5
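
With -s5 and the default port, Thin starts one process per port, counting up from 3000. The helper below just spells out that arithmetic so the port list can be reused later when configuring the balancer; cluster_ports is a name made up for this post, not part of Thin.

```ruby
# Ports used by a cluster of `servers` processes, assuming they are
# numbered consecutively from `base_port` (3000 is the default used above).
def cluster_ports(servers, base_port = 3000)
  (0...servers).map { |i| base_port + i }
end

puts cluster_ports(5).join(", ")
```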



To look at all available options and the features Thin offers, try:

$ thin --help


You can stop the cluster started above by running:

$ thin stop -s5


So, we have managed to get Thin up and running and we love the speed gains as well as the simplicity. Thin can be downloaded here: http://code.macournoyer.com/thin/.

We'll now move on to the next step: getting Apache running with mod_proxy, mod_proxy_balancer and mod_proxy_http.

Apache 2.2 and mod_proxy_balancer


If you do not have Apache or are willing to recompile


If you do not have Apache on your machine, or would prefer to recompile it with mod_proxy, download the source tarball, unpack it with tar -xvzf (for a .tar.gz file) or tar -xjf (for a .tar.bz2 file) and run commands similar to the following:

(assuming you've untar-red httpd-2.2.4)

$ cd httpd-2.2.4

$ ./configure --enable-modules=most --enable-shared=max --enable-ssl --enable-deflate --enable-headers --enable-proxy --disable-dav --prefix=/usr/local/apache2 --with-included-apr --with-apxs2

$ make
$ sudo make install


I have to say, there are a million ways of configuring Apache, all depending on what your needs are and there are a million tutorials on the web for the same. So I wouldn't waste too much ink on this. Find what works best for your environment.

If you already have Apache and do not want to recompile


If you already have Apache 2 installed and do not want to recompile it from scratch, you can build the required modules separately and load them dynamically. You still need the source code, though. Assuming you have downloaded and untarred it as above, the following commands will do just that:

(note that the APXS path that I am using is based on the configuration of my system. You will have to adapt it to your own)

$ cd httpd-2.2.4

$ cd modules/proxy

$ /usr/local/apache2/bin/apxs -i -a -c mod_proxy.c proxy_util.c
$ /usr/local/apache2/bin/apxs -i -a -c mod_proxy_balancer.c
$ /usr/local/apache2/bin/apxs -i -a -c mod_proxy_http.c


If everything worked out fine, the modules should now be installed in the modules directory of your Apache 2 installation. On my system, that means "/usr/local/apache2/modules". Check that yours are there. The next step is to tell Apache to load those modules on startup. First, make sure Apache isn't running by shutting it down:

$ /usr/local/apache2/bin/apachectl stop


Next, open "httpd.conf", which you will find at "/usr/local/apache2/conf/httpd.conf", and make sure the following lines are present (the -a flag passed to apxs above normally adds them for you):


LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so


So, at this point, we've got Apache 2 installed and configured with all the nice ingredients for load balancing.
Let's move on to step 3!

Configuring Apache for Thin clusters


I assume we're still in the "httpd.conf" file we were editing earlier. Let's scroll to the bottom of it.

Let's configure the load balancer



<Proxy balancer://super_production_balancer>
BalancerMember http://127.0.0.1:3000
BalancerMember http://127.0.0.1:3001
BalancerMember http://127.0.0.1:3002
</Proxy>


Note that the balancer can be called anything. It's just a reference name that can be used later when we're configuring our virtual host or something similar.
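
If the list of ports changes often, the block above can be generated rather than typed by hand. This is only a sketch: balancer_config is a hypothetical helper written for this post, and it simply prints text that you would paste into httpd.conf yourself.

```ruby
# Build a <Proxy balancer://...> block for a list of backend ports.
# `name` is the balancer's reference name; `host` defaults to localhost.
def balancer_config(name, ports, host = "127.0.0.1")
  members = ports.map { |port| "BalancerMember http://#{host}:#{port}" }
  (["<Proxy balancer://#{name}>"] + members + ["</Proxy>"]).join("\n")
end

puts balancer_config("super_production_balancer", [3000, 3001, 3002])
```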

I have 3 BalancerMember entries. This assumes I know the IP addresses and ports my cluster will be listening on. Although this example is focused on localhost, you could just as well point to other servers within your network. There is even an option, loadfactor, for sending more of the traffic to a server that has the hardware to handle it. In that case we would have something like:


<Proxy balancer://super_production_balancer>
BalancerMember http://127.0.0.1:3000
BalancerMember http://127.0.0.1:3001
BalancerMember http://127.0.0.1:3002
# this one is the super machine, so it gets four times the load
BalancerMember http://192.168.0.176:9001 loadfactor=4
</Proxy>
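
To get a feel for what a loadfactor of 4 means, the sketch below works out the rough share of requests each member would receive, assuming the balancer's default request-counting method, where traffic is split in proportion to the load factors (members without an explicit factor count as 1). request_shares is a hypothetical helper, not something Apache provides.

```ruby
# Expected share of requests per member, proportional to its load factor.
def request_shares(loadfactors)
  total = loadfactors.values.inject(0) { |sum, factor| sum + factor }
  shares = {}
  loadfactors.each { |member, factor| shares[member] = factor.to_f / total }
  shares
end

members = {
  "127.0.0.1:3000"     => 1,
  "127.0.0.1:3001"     => 1,
  "127.0.0.1:3002"     => 1,
  "192.168.0.176:9001" => 4
}

request_shares(members).each do |member, share|
  puts format("%-20s %.1f%%", member, share * 100)
end
```

In other words, the big machine ends up with 4 out of every 7 requests and each local Thin server with 1 out of 7.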


Let's configure the virtual host



<VirtualHost *:80>
ServerAdmin info@superapp.com
ServerName www.production.superapp.com
ServerAlias production.superapp.com
ProxyPass / balancer://super_production_balancer/
ProxyPassReverse / balancer://super_production_balancer/
</VirtualHost>


All we did above is make our app reachable through a normal HTTP request on either of the following domain names: "www.production.superapp.com" or "production.superapp.com". From here, assuming you are working on a "localhost" system, you will need to edit "/etc/hosts" to tell your computer where to find production.superapp.com. It goes as follows:

127.0.0.1 production.superapp.com


Finally...


We're almost done. Start Apache again (/usr/local/apache2/bin/apachectl start on my system). Then, assuming we want a cluster of 3 Thin servers running in production mode, go back into the Rails app directory and type the following:

$ thin start -s3 -e production
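
Before pointing users at the site, it can be worth checking that each backend is actually listening. The snippet below is a rough sketch that only tests whether a TCP connection can be opened; backend_up? is a hypothetical helper, and the ports are the three assumed in the balancer configuration above.

```ruby
require "socket"

# True if something accepts TCP connections on host:port within `timeout`
# seconds; false if the connection is refused, times out or cannot resolve.
def backend_up?(host, port, timeout = 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, IOError, SocketError
  false
end

[3000, 3001, 3002].each do |port|
  status = backend_up?("127.0.0.1", port) ? "up" : "DOWN"
  puts "127.0.0.1:#{port} #{status}"
end
```

An open port does not prove that the Rails app behind it is healthy, so treat this as a first sanity check only.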


It's now time to tell your users to go to http://production.superapp.com and have fun using your brand new super-scalable application!

Have fun!

Note: This is just an overview of what can be done in a very easy way. There is a lot more that you can do with the tools described above and I would recommend you research more on it.
