
Migrate your Linux application to the Amazon cloud, Part 3: Building scalability

Serve more traffic with ease

Sean Walberg, P. Eng, Network Engineer, Independent author
Sean Walberg is a network engineer and the author of two books on networking. He has worked in several different verticals from healthcare to media.


Date:  26 Mar 2010
Level:  Intermediate

This series started off by looking at a Software as a Service (SaaS) offering. The application was picked up and moved to the Amazon Elastic Compute Cloud (EC2) in the first article, and then made more robust in the second article by adding redundancy, backups, and more reliable disks.

It's time to look beyond stability and into scalability. Because servers are rented by the hour, would it make more sense to run extra servers only when needed? What about asynchronously processed jobs?

Content Distribution Networks (CDNs), which cache static content on the edge of the network, used to be costly. Cloud computing has made CDNs accessible to even the smallest site. There's a big performance boost to be had by using a CDN, so we'll set one of those up too.

Automatic deployment

A key element of growing your infrastructure dynamically is that a new instance must attach itself to the production environment without outside intervention. It would be possible to rebundle an AMI after each change, such as a code deploy or database change. However, with a bit of scripting a server can deploy itself without too much work. The application has a deployment process like most other applications:

  1. Pull the code from the repository, such as Subversion or git.
  2. Start the application servers.
  3. Verify the application started correctly.
  4. Add the new server to the load balancer pool.

The instance will also have to configure a database server in /etc/hosts.

Pulling the code from the repository

The code base is stored in a Subversion repository. Subversion is a source code control system that tracks changes to the code base and allows developers to branch and merge code to work on features in a separate environment.

At the simplest level, a Rails application can run right out of a checked out copy of the source code. When changes are made, the production server performs an update and is restarted. A Ruby Gem called Capistrano manages these deployments in a way that allows for centralized control of multiple servers, and easy rollback if problems are encountered.

Rather than checking out the code by hand, a new server will bootstrap the Capistrano process. This is more work up front, but it means that the server can later be managed by Capistrano quite easily. Listing 1 shows the contents of the initial Capfile which will be placed in the payroll user's home directory.

Listing 1. A Capfile to bootstrap the Capistrano deployment
load 'deploy' 

# Where is the code? This will need to be customized!
set :repository,  ""
set :scm_username, "DEPLOY_USERNAME"
set :scm_password, "DEPLOY_PASSWORD"

# Deploy to the local server
server "localhost", :app, :web, :db, :primary => true

# By default Capistrano tries to do some things as root through sudo - disable this
set :use_sudo, false

# Define where everything goes
set :home, '/home/payroll'
set :deploy_to, "#{home}"
set :rails_env, "production"

The Capfile is executed by the cap command. Capistrano expects that it will be SSH'ing into a server even if it's the local host. Listing 2, which is to be run as the application user, prepares the server so that the application user can ssh in to the local server as itself, so that the Capistrano tasks may be run.

Listing 2. Preparing the environment
# Create the SSH directory and generate a secure keypair
mkdir .ssh
chmod 700 .ssh
ssh-keygen -b 1024 -t dsa -q -f .ssh/id_dsa -N ""

# Allow the user to SSH in to the server without a password
cat .ssh/id_dsa.pub >> .ssh/authorized_keys
# SSH is very fussy about permissions! Nothing should be readable
# by other users
chmod 600 .ssh/*

# SSH in once, ignoring host keys, so that the server will save the host key
ssh -o StrictHostKeyChecking=no localhost /bin/false

Listing 2 is divided into three sections:

  1. The first two commands create an SSH directory that's only readable by the application user. The third command creates a 1024 bit DSA key (-b 1024 -t dsa), turns off output (-q), and specifies the name of the key and that there shall be no password.
  2. Copy the user's public key to the authorized_keys file which allows the user holding the corresponding private key to log in without a password. Make sure that no one else can read files in the directory.
  3. Without a saved host key, ssh and Capistrano will prompt to verify the key which breaks the automated deployment. The last command logs in to the local server, ignoring host keys, and executes /bin/false. This is enough to save the host key.

The final step in the process creates the environment and deploys the current version of the application. Listing 3 shows how to use Capistrano to take care of these tasks.

Listing 3. Deploying the application with Capistrano
cap deploy:setup
chmod 700 /home/payroll
cap deploy

Listing 3 runs the deploy:setup task which creates the directory structure underneath the application home directory consisting of directories named shared and releases. Each deployment goes in its own directory in the releases folder. The shared directory is used for logs and other elements that can be shared between separate deployments.

Before deploying the application, the second command changes the permissions on the home directory to be 700. This is an artifact of deploying to the home directory instead of a sub directory, as the setup task opens up the permissions of the deployment directory to allow group writes. The SSH daemon will not allow key based authentication if the home directory is writable by people other than the user.
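
The permission requirement can be checked before relying on key-based logins. The following is a minimal sketch and not part of the original scripts: the function name dir_is_private is made up for illustration, and it simply reads the mode string printed by ls -ld.

```shell
#!/bin/sh
# Hypothetical helper (not from the article's scripts): succeed only if the
# directory is not writable by group or others, which is the condition the
# SSH daemon expects of a home directory before honoring authorized_keys.
dir_is_private() {
  # ls -ld prints a mode string such as drwx------; character 6 is the
  # group write bit and character 9 is the write bit for everyone else.
  mode=$(ls -ld "$1" | cut -c1-10)
  case "$mode" in
    ?????w*|????????w*) return 1 ;;
    *) return 0 ;;
  esac
}

# Demonstrate with a scratch directory
dir=$(mktemp -d)
chmod 775 "$dir"
dir_is_private "$dir" || echo "group writable: key logins would be refused"
chmod 700 "$dir"
dir_is_private "$dir" && echo "private: safe for key logins"
rmdir "$dir"
```

Running a check like this after cap deploy:setup would catch the case where the chmod 700 step was skipped.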

The deploy task is finally run, which checks the code out of Subversion, puts it in a release directory, and makes a symbolic link called current. With this symlink in place and each deployment in a separate directory, it is easy to roll back to previous versions of the application.
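
To see why the symlink makes rollback cheap, here is a small sketch of the layout built in a scratch directory. It imitates the idea only: the release names are made-up timestamps, point_current is a hypothetical helper, and Capistrano's own rollback task does more housekeeping than this.

```shell
#!/bin/sh
# Sketch of the releases/current layout in a scratch directory.
# The release names are made-up timestamps, not real deployments.
app=$(mktemp -d)
mkdir -p "$app/releases/20100301120000" "$app/releases/20100326090000" "$app/shared"

# point_current RELEASE: repoint the "current" symlink at a release
point_current() {
  ln -sfn "$app/releases/$1" "$app/current"
}

# A deploy links current to the newest release...
point_current 20100326090000

# ...and a rollback links it back to the one before it.
previous=$(ls "$app/releases" | sort | tail -n 2 | head -n 1)
point_current "$previous"

basename "$(readlink "$app/current")"   # prints 20100301120000
rm -rf "$app"
```

Because the application servers always start from the current path, switching versions is a matter of repointing one link and restarting.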

Launching application servers automatically

Even though it's possible to build a system that automatically adds more web servers when the load gets high, it's not always a great idea. Web traffic fluctuates considerably, so you would want to make sure the traffic is going to stay around before launching another server. You will need about 10 minutes to determine that the load you are seeing merits a new server, then another 5 - 10 minutes to launch an instance and get a new web server in your pool. This may not be effective for a web server, but it can work well for job servers.

You can still grow and shrink your application servers on a schedule. For example, most users of the application do so from about 8am to 9pm on weekdays. To improve response time for the registered users and reduce server costs, 3 servers will be run on weekdays from 8am to 9pm, and 2 servers during other times. The deployment scripts are already written; all that remains is to start the server and add it to the load balancer.
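
The schedule itself is simple enough to express as a function. This is only an illustrative sketch: desired_app_servers is a hypothetical name, and the day numbering follows date +%u, where Monday is 1.

```shell
#!/bin/sh
# desired_app_servers DAY HOUR
#   DAY is 1-7 with Monday as 1 (as printed by `date +%u`), HOUR is 0-23.
# Prints how many application servers the schedule in the text calls for:
# 3 on weekdays from 8am up to 9pm, 2 at all other times.
desired_app_servers() {
  day=$1
  hour=$2
  if [ "$day" -ge 1 ] && [ "$day" -le 5 ] && [ "$hour" -ge 8 ] && [ "$hour" -lt 21 ]; then
    echo 3
  else
    echo 2
  fi
}

desired_app_servers 3 10   # Wednesday 10am prints 3
desired_app_servers 6 10   # Saturday 10am prints 2
desired_app_servers 1 21   # Monday 9pm prints 2
```

A function like this is handy for checking that the number of running servers matches the plan, for instance after a cron job has been missed.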

Launching servers from cron

The cron facility seems the natural place to turn the extra instance on and off on a preset schedule. The ideal crontab would look like Listing 4.

Listing 4. A crontab that starts and stops servers automatically
0  8 * * 1-5 $BASEDIR/servercontrol launch app
0 21 * * 1-5 $BASEDIR/servercontrol terminate app

The code in Listing 4 is entered in a user's crontab on a machine outside the virtual environment. At 8 in the morning on days 1-5 (Monday through Friday), a script called servercontrol is run with the parameters "launch app". At 9pm the same command is run with parameters of "terminate app". These parameters will be understood by the script to be an operation, and a server role. For now, the roles will always be app.

Writing the control script

The next step, then, is to write the servercontrol script, which is run from cron to start up servers on demand.

Listing 5. The servercontrol script
#!/bin/bash

# The AMI to launch. This will need to be customized!
AMI="YOUR_AMI_ID"
OP=$1
ROLE=$2

if [ "$OP" == "" -o "$ROLE" == "" ]; then
  echo You\'re doing it wrong.
  echo $0 operation role
  exit 1
fi

case "$OP" in
  "launch")

    # Launch the instance and parse the data to get the instance ID.
    # DB= carries the database server's address for the instance to read.
    DATA=`ec2-run-instances -k main -d "DB=;ROLE=$ROLE" $AMI`
    INSTANCE=`echo $DATA | grep INSTANCE  | cut -f 6 -d ' '`
    echo $INSTANCE is your instance

    # Keep on checking until the state is "running"
    STATE=""
    while [ "$STATE" != "running" ]; do
      STATE=`ec2-describe-instances  $INSTANCE |awk '/^INSTANCE/ {print $6}'`
      echo the state is $STATE
      sleep 10
    done

    # Keep track of which instances were started by this method
    echo $INSTANCE >> $HOME/.ec2-$ROLE

    # Now that the instance is running, grab the IP address
    IP=`ec2-describe-instances $INSTANCE |awk '/^INSTANCE/ {print $13}'`

    # If this is to be an app server...
    if [ "$ROLE" == "app" ]; then
      # Check the web server to make sure it returns our content
      UP=0
      while [ $UP -eq 0 ]; do
        OUTPUT=`curl -s -m 5 http://$IP`
        if [ $? -eq 0 ]; then # curl was successful
          echo $OUTPUT | grep -qi 'welcome'
          if [ $? -eq 0 ]; then
            UP=1
          else
            sleep 5
          fi
        else
          sleep 5 # curl failed; the web server may not be listening yet
        fi
      done

      # Register with the load balancer
      elb-register-instances-with-lb smallpayroll-http --instances $INSTANCE
    fi
  ;;

  "terminate")

     # Grab the instance ID. It's the last line of the file
     FILE=$HOME/.ec2-$ROLE

     # Assuming the file exists, of course
     if [ ! -f $FILE ]; then
        echo No dynamic instances have been started for the $ROLE role
        exit 1
     fi

     # The last instance started is the last line of the file
     INSTANCE=`tail -1 $HOME/.ec2-$ROLE`

     # Assuming there's something in that file, that is
     if [ "$INSTANCE" == "" ]; then
         echo No dynamic instances have been started for the $ROLE role
         exit 1
     fi

     # Terminate the instance
     ec2-terminate-instances $INSTANCE

     # Take the instance out of the load balancer
     elb-deregister-instances-from-lb smallpayroll-http --instances $INSTANCE

     # Delete the last line of the file
     sed -i '$d' $FILE
  ;;

  *)
      echo "You may only launch or terminate"
      exit 1
  ;;
esac

Listing 5 may look long, but a good part of the script is error checking. The script starts by parsing parameters, and breaking the launch and terminate functions into a case statement. For the launch, the script starts the instance and waits until the instance has entered the running state. The script then gets the IP address of the instance and waits until a web request to the server successfully returns and contains the word "welcome".
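
Both waiting loops in the script poll forever if the instance never comes up. One possible refinement, offered as a sketch rather than as part of the original script, is to cap the number of attempts. The helper name wait_for and its argument order are assumptions made for this example.

```shell
#!/bin/sh
# wait_for TRIES DELAY COMMAND...
# Hypothetical helper: run COMMAND until it succeeds, sleeping DELAY seconds
# between attempts, and give up after TRIES attempts.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Stand-in for a health check: succeeds once a marker file exists.
dir=$(mktemp -d)
( sleep 1; : > "$dir/up" ) &
if wait_for 10 1 test -f "$dir/up"; then
  echo "ready"
else
  echo "gave up"
fi
wait
rm -rf "$dir"
```

In a control script, the command passed to wait_for would be a small function that wraps the curl and grep check, and a failure could trigger termination of the half-started instance instead of an endless wait.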

Stopping the instance is much easier. The instance IDs are written to a dotfile in the user's home directory with newer instances being appended to the file. To stop the instance, the script reads the last line of the file to get the instance ID of the most recently started instance, issues a termination command, removes the instance from the load balancer, and then deletes the last line of the file containing the instances.

Note that each role has its own file. Right now there is only the app role, but this will be expanded later.
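
The dotfile behaves like a last-in, first-out list. The sketch below shows the three operations on a scratch file; the instance IDs are made up, the function names are hypothetical, and sed -i is used in its GNU form, as in the script.

```shell
#!/bin/sh
# Sketch of the per-role instance file used as a last-in, first-out list.
# The instance IDs are made up and the file lives in a scratch location.
file=$(mktemp)

remember_instance() { echo "$1" >> "$file"; }     # append on launch
last_instance()     { tail -n 1 "$file"; }        # newest entry
forget_last()       { sed -i '$d' "$file"; }      # drop newest entry (GNU sed)

remember_instance i-11111111
remember_instance i-22222222
last_instance      # prints i-22222222
forget_last
last_instance      # prints i-11111111
rm -f "$file"
```

Terminating the newest instance first keeps the oldest dynamic servers running longest, which suits the hourly billing described later.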

One curious item from Listing 5 is the -d parameter that was passed to ec2-run-instances. This parameter contains information that can be read by the instance by visiting a special URI that's only accessible to the instance. This information is in the form of a string. In Listing 5, the server's role and database server are passed to the instance.

Writing the init script

The init script is run during system boot, and will configure the instance with a current version of the application, along with starting the appropriate application configuration. Listing 6 makes use of the information passed from the control script to make those configuration decisions. The code below can be part of a SYSV startup script, or it can go in the rc.local file to be run once on boot.

Listing 6. The application startup script
#!/bin/bash

USER=payroll
HOME=/home/payroll
# Where the Capfile and setup script are kept. This will need to be customized!
SRC=/path/to/setup/files

# If this is a fresh AMI, set up the application directories
if [ ! -d $HOME/releases ]; then
  echo "Setting up environment"
  cp $SRC/Capfile $HOME
  # Listing 2 (setup_environment is assumed to finish with the
  # deploy:setup and chmod steps from Listing 3)
  su - $USER -c "cd $HOME && sh $SRC/setup_environment"
fi

echo "Deploying the application"
su - $USER -c "cd $HOME && /opt/ree/bin/cap deploy"

# Grab the user supplied data. The EC2 user-data URL returns data unique
# to the instance.
USER_DATA=`/usr/bin/curl -s http://169.254.169.254/latest/user-data`
DBHOST=`echo $USER_DATA | sed 's/.*DB=\([0-9\.]*\).*/\1/'`
ROLE=`echo $USER_DATA | sed 's/.*ROLE=\([a-zA-Z]*\).*/\1/'`
logger "Starting application with DBHOST=$DBHOST and ROLE=$ROLE"

# If available, put the dbhost in /etc/hosts
if [ "$DBHOST" != "" ]; then
  sed -i '/dbhost/d' /etc/hosts
  echo "$DBHOST dbhost" >> /etc/hosts
fi

# Depending on the role...
case "$ROLE" in
  'app')
    # Web server... start up mongrel
    su - $USER -c "mongrel_rails cluster::start \
      -C $HOME/current/config/mongrel_cluster.yml"
    ;;
  *)
    logger "$ROLE doesn't make sense to me"
    ;;
esac

Listing 6 starts by initializing some variables. Next, the application user's home directory is checked to see if the cap deploy:setup task has been run before, and if not, the task is run. Next, a deployment is run so that the latest code is available.

Now that the code is available, the script checks the metadata that was passed to the instance, and with some sed magic, extracts the components to variables. If the DBHOST variable is set, this value is put into /etc/hosts so that the application can know where to find the database. The role is checked, and if the server is destined to be an application server, then the mongrel servers are started.
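
The sed expressions can be tried on a sample string without launching an instance. The sketch below uses a made-up address and two hypothetical helper functions. It also adds the -n option and the p flag, a small variation on the script's commands, so that a missing key yields an empty value instead of the unmodified input.

```shell
#!/bin/sh
# Each sed command matches the whole string and keeps only the captured
# group: digits and dots after DB=, letters after ROLE=.
# With -n and the p flag, nothing is printed when the key is absent.
extract_db()   { echo "$1" | sed -n 's/.*DB=\([0-9.]*\).*/\1/p'; }
extract_role() { echo "$1" | sed -n 's/.*ROLE=\([a-zA-Z]*\).*/\1/p'; }

# Sample user-data string in the form built by servercontrol.
# The address is a made-up example.
USER_DATA="DB=10.1.2.3;ROLE=app"

DBHOST=$(extract_db "$USER_DATA")
ROLE=$(extract_role "$USER_DATA")
echo "DBHOST=$DBHOST ROLE=$ROLE"   # prints DBHOST=10.1.2.3 ROLE=app
```

The stricter form matters for the /etc/hosts step: without it, user data that lacks a DB= entry would be passed through whole and treated as a host address.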

Together, Listings 5 and 6 are a fair bit of code, but they set the groundwork for automated starting and stopping of any kind of servers. With the crontab from Listing 4 in place, the extra server will come online during peak periods and turn off when the site is less busy. Next you will extend this framework to launch different kinds of servers.

Asynchronous job processing

One common technique for making dynamic web sites more efficient is to move long running requests to a background process. These long running jobs are usually not as time sensitive as a real request. An application might send a request to run a report to a job processing system, and using Asynchronous JavaScript and XML (AJAX), poll for completion in the background. The user sees some sort of spinner that indicates the application is working, but the user is not tying up a mongrel process that could serve more interactive requests.

This approach does not remove the need to have ample resources available to process the jobs. If the job queue gets too long, users will get tired of waiting for their reports. This seems to be an ideal use case for dynamic server launching. Something will monitor the backlog of jobs, and if the backlog crosses a certain threshold, a new server will be launched to help out. After some time, the server will be torn down.

No free lunches

Running new servers isn't free, so there are some economic considerations to take into account when you're deciding when to launch new servers.

The m1.small instance that this series has been using costs about 8.5 cents (USD) per hour, or part of an hour. So when you launch an instance, even for a minute, you're buying an hour. If you decide to use bigger instances to do more work, the hourly costs go up.

Make sure you understand the costs of launching a new server and the costs of not doing so. Slower service means unhappy customers. Faster service means happier customers, but more costs.
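
The effect of billing by the started hour is easy to work out. The sketch below uses the 8.5 cent figure quoted above; the function names are made up, and prices are kept in tenths of a cent so that the shell's integer arithmetic is enough.

```shell
#!/bin/sh
# billed_hours MINUTES: hours charged when every started hour is billed in full
billed_hours() {
  echo $(( ($1 + 59) / 60 ))
}

# cost_tenths_of_cent MINUTES [RATE]
# RATE is the hourly price in tenths of a cent; the default of 85 is the
# 8.5 cents quoted for an m1.small.
cost_tenths_of_cent() {
  rate=${2:-85}
  echo $(( $(billed_hours "$1") * rate ))
}

billed_hours 1            # prints 1
billed_hours 61           # prints 2
cost_tenths_of_cent 61    # prints 170, that is, 17 cents
```

A server that runs for 61 minutes costs the same as one that runs for two full hours, which is why it rarely pays to stop an instance shortly after starting it.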

Background processing is provided by the excellent delayed_job gem, specifically the collectiveidea fork (see Resources). This gem lets you fire off jobs in one line of code and implements a priority queue so that your important jobs don't wait behind routine jobs. The job processing daemon runs out of the Rails application directory and uses the database to request work. This means that you can extend the current scripts to handle delayed_job daemons.

Updating the init script to support job processing servers

Recall from Listing 6, the script checks its instance metadata to see what was passed from the servercontrol script. The ROLE parameter dictates the server's job, with app meaning an application server. The instructions for each server type are wrapped in a case statement. Listing 7 extends this case statement to handle delayed_job roles.

Listing 7. Handling the startup of a delayed_job server
case "$ROLE" in
  'app')
    # For an application server, start the mongrels
    su - payroll -c "/opt/ree/bin/mongrel_rails cluster::start \
      -C /home/payroll/current/config/mongrel_cluster.yml"
    ;;
  'job')
    # For a job server, figure out what kind of machine this is, and run
    # an appropriate number of job processing daemons
    TYPE=`curl -s http://169.254.169.254/latest/meta-data/instance-type`
    case "$TYPE" in
      'm1.small') NUM=2 ;; # 2 per ECU * 1 ECU
      'm1.large') NUM=8 ;; # 2 per ECU * 4 ECUs
      *) NUM=2 ;;
    esac
    su - payroll -c "RAILS_ENV=production $HOME/current/script/delayed_job -n $NUM start"
    ;;
  *)
    logger "$ROLE doesn't make sense to me"
    ;;
esac

The script checks the role of the server. If the server is an application server, then mongrels are started. If the server is a job server, the script checks to find out what kind of instance it is running on. This information is available from another URL on the virtual host. As a rough estimate, the script launches two delayed_job workers per Elastic Compute Unit (ECU). Your workload may differ.

At this point, the servercontrol script can launch a new job server by passing launch job on the command line.

Monitoring the queue

You have several ways to monitor the queue backlog. You could add a job and time how long it takes to be processed, starting a new server if the time is outside of a threshold. The downside to this method is that if the queue really gets backed up, it will take a long time to determine that you're backed up. The simplest solution is to query the database for the number of outstanding requests, and to serve this in a controller. Listing 8 shows such a controller.

Listing 8. A controller that shows the length of the queue
class QueueController < ApplicationController
  def length
    render :text => Delayed::Job.count(
      :conditions => "priority > 0 AND failed_at IS NULL").to_s
  end
end

Listing 8 simply shows the length of the queue, specifically jobs with a priority greater than 0 that have not failed, and renders this number directly rather than passing it to a template. Browsing to /queue/length will give you the current queue backlog.
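
A script that consumes this number should be prepared for a response that is not a number, such as an error page or an empty reply when the application is down. The following sketch shows one way to guard against that; parse_queue_length is a hypothetical helper, not something the application provides.

```shell
#!/bin/sh
# parse_queue_length TEXT
# Hypothetical helper: print TEXT if it is a non-negative integer, which is
# what the /queue/length action renders; fail otherwise.
parse_queue_length() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) echo "$1" ;;
  esac
}

parse_queue_length "17"                                     # prints 17
parse_queue_length "<html>error</html>" || echo "not a number"
```

Treating a bad response as "do nothing" is the conservative choice, since it avoids launching or terminating servers on the strength of garbage input.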

Launching new job servers in response to demand

Now that the length of the queue can be easily determined, you need a script to act on this data. Listing 9 shows such a script.

Listing 9. Launching more job servers if needed
#!/bin/bash

# What's the length of the queue? Supply the URL of your application's
# /queue/length page here.
QUEUE=`curl -s`
# How many servers are running now? (zero out if file doesn't exist)
SERVERS=`cat $HOME/.ec2-job 2>/dev/null | wc -l`
if [ "$SERVERS" == "" ]; then SERVERS=0; fi

# launch up to two servers while the queue is over 20
if [ $SERVERS -lt 2 -a $QUEUE -gt 20 ]; then
  servercontrol launch job
fi

# Terminate one instance if the queue is under 5
if [ $SERVERS -gt 0 -a $QUEUE -lt 5 ]; then
  export TZ=/usr/share/zoneinfo/UTC
  LAST=`tail -1 $HOME/.ec2-job`
  # But only if the server has run for at least 45 minutes
  UPTIME=`ec2-describe-instances $LAST | \
    awk '/INSTANCE/ {t=$10; gsub(/[\-:T\+]/, " ", t); print systime() - mktime(t) }'`
  if [ $UPTIME -gt 2700 ]; then
    servercontrol terminate job
  fi
fi

The code in Listing 9 should be run from cron every 5 minutes. The code first gets the length of the queue and the number of job servers currently running. The number of job servers is gleaned from the length of the .ec2-job file that contains the instance IDs of the dynamically run servers. If the length of the queue is more than 20, and there are fewer than 2 extra job servers running, then a server is launched. If at least one extra server is running, and the queue is less than 5, the script does some more checking to see if it should terminate an instance.

The script first sets the timezone to UTC by setting the TZ environment variable. It then gets the instance id of the last run job server and queries to get the startup time. This is fed into some awk substitution to arrive at the time, in seconds, that the server has been alive. If the server has been alive for more than 45 minutes, then the instance can be turned off. If not, the server stays alive.
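
The decision itself can be separated from the EC2 calls, which makes it easy to try out different thresholds. The sketch below restates the rules from the text as one function; the name decide and its argument order are assumptions made for this example.

```shell
#!/bin/sh
# Thresholds from the text: launch above 20 queued jobs with fewer than 2
# extra servers; terminate below 5 queued jobs once the newest server has
# been up for more than 45 minutes (2700 seconds).
# decide SERVERS QUEUE UPTIME prints launch, terminate, or none.
decide() {
  servers=$1
  queue=$2
  uptime=$3
  if [ "$servers" -lt 2 ] && [ "$queue" -gt 20 ]; then
    echo launch
  elif [ "$servers" -gt 0 ] && [ "$queue" -lt 5 ] && [ "$uptime" -gt 2700 ]; then
    echo terminate
  else
    echo none
  fi
}

decide 0 35 0       # prints launch
decide 2 35 600     # prints none (already at the limit)
decide 1 3 1200     # prints none (server is too young)
decide 1 3 3000     # prints terminate
```

Keeping the rules in a pure function like this also means they can be exercised with made-up numbers before being trusted with real instances.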

Business logic

Listing 9 implements a very simple algorithm. It suffices to demonstrate the principles behind dynamically launching servers, but could stand to have more intelligence. For example, the script could watch the length of the queue over time and turn off servers after a period of stability, rather than the simple version that exists now.

The 45 minute hurdle is there to guard against turning off a server prematurely. Because an instance is billed by the hour, little is saved by stopping it earlier, and if the queue subsides and then backs up again, the server will still be there.
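
As an illustration of the "period of stability" idea, the sketch below only reports the queue as quiet when every recent sample is under a limit. It is one possible approach, not the article's algorithm: the function name is hypothetical, and how the samples are collected and stored is left open.

```shell
#!/bin/sh
# queue_is_quiet LIMIT SAMPLE...
# Succeeds only if every recent queue-length sample is below LIMIT.
queue_is_quiet() {
  limit=$1
  shift
  [ "$#" -gt 0 ] || return 1      # no samples yet: not enough evidence
  for sample in "$@"; do
    [ "$sample" -lt "$limit" ] || return 1
  done
  return 0
}

# Six samples taken five minutes apart cover the last half hour.
queue_is_quiet 5 2 0 1 3 0 4 && echo "quiet: safe to terminate"
queue_is_quiet 5 2 0 9 3 0 4 || echo "busy recently: keep the server"
```

A single spike in the window is enough to keep the server, which trades a little extra cost for fewer launch and terminate cycles.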

Using a Content Distribution Network

When someone goes to your website they are also loading images, cascading style sheets, and JavaScript. Services called Content Distribution Networks (CDNs) cache the static assets and distribute them across many servers across the Internet. As a result, you can provide faster access to these assets for your users, and allow them to download multiple files in parallel. Amazon has provided a CDN service called CloudFront which is a pay-as-you-go offering like their other services.

Assets that are to be served from a CDN are requested from a different server, so the URL will change. For example, a stylesheet URL that uses the application's own host name loads from the application server, while the same path requested from a CDN host name loads from the CDN. If the CDN does not have the asset, the CDN will pull the asset from the origin before caching and passing the asset along to the user.

CloudFront is slightly different than other CDNs in that the origin is an S3 bucket. To use CloudFront, you must first populate an S3 bucket with your static assets, and then rewrite your URLs to use CloudFront hosts.

Setting up CloudFront

From the CloudFront main page (see resources), click the button to sign up. You will have to wait for an activation email before continuing.

Once you have received confirmation of your activation, go to your AWS console page and click the Amazon CloudFront tab. You will see the page shown in Figure 1.

Figure 1. The CloudFront Distributions dashboard

Note that no distributions have been created. A distribution is merely a bucket for files that will be tied back to an S3 bucket. Click the Create Distribution button to get to the screen shown in Figure 2.

Figure 2. Creating a distribution

You will need to enter the following information:

  • Delivery Method: Download
  • Origin: Select the name of an S3 bucket, or make up one now that you will create with your S3 tool of choice.
  • Logging: Choose off unless you plan on using these logs.
  • CNAMEs: Enter 4 numbered names underneath your domain.

You can enter comments if you wish. Make sure the distribution status is enabled, and click Create. You will be taken back to the CloudFront tab where you will see the distribution you just created, which is shown in Figure 3.

Figure 3. A configured distribution

On this page is the domain name of your distribution. You must now go to your DNS server and configure the 4 CNAMEs and have them point to the domain name of your distribution. This is just like the way the Elastic Load Balancer (ELB) works, except you are creating 4 numbered names.

Place a test file in the S3 bucket you associated with the distribution. You should be able to see this document by browsing to it under your distribution's domain name; for example, request /test.htm to view a file called test.htm inside your bucket. If you get an access denied error, make sure you have public access enabled on the files (instructions vary depending on which tool you use to manage your S3 bucket).

If the previous test worked, then you can try the same request using the CNAMEs you created earlier.

Synchronizing the files

You must copy the contents of your public directory to the S3 origin bucket. The simplest way is to run s3sync.rb -r -p public/ smallpayroll-cdn: from the root directory of your Rails application. The -r option means the copy is to be recursive, and the -p option makes all files publicly readable.

If you wish to automate this procedure, look in the resources for a gem that handles the task.

Updating your application

At a very simple level you can change all your image, JavaScript, and CSS links to point to one of your CDN links instead of your web server. If you used the Rails URL helpers such as image_tag you can have Rails do the work for you. Add the following line to config/environments/production.rb:

Listing 10. Configuring Rails for the CDN
ActionController::Base.asset_host = ""

Listing 10 adds a single line to the production configuration, namely that static assets are to be served from the host defined as asset_host. The %d in the host name is, by default, expanded to the numbers 0 through 3, so you are telling Rails to rotate between four numbered host names. These are the same hosts that you configured CloudFront to respond to. With 4 hosts, you can expect browsers to download up to 8 assets at a time, as browsers are generally limited to two connections per host.
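
Rails chooses the number from a hash of the asset path, so a given file always maps to the same host and stays cacheable. The sketch below imitates that behavior with cksum standing in for the Rails hash. The host pattern assets%d.example.com is a placeholder, and the numbers produced will not match what Rails picks.

```shell
#!/bin/sh
# asset_host_index PATH: map an asset path to a number from 0 to 3.
# cksum stands in for the hash Rails computes; the exact numbers will
# differ from Rails, but the property is the same: one path, one host.
asset_host_index() {
  sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $(( sum % 4 ))
}

# assets%d.example.com is a placeholder host pattern, not the real one.
for path in /images/logo.png /stylesheets/main.css /javascripts/app.js; do
  printf 'assets%d.example.com%s\n' "$(asset_host_index "$path")" "$path"
done

# 4 hosts at 2 connections per host
echo $(( 4 * 2 ))   # prints 8
```

Spreading assets by hash rather than at random is what lets the browser open connections to several hosts in parallel without ever fetching the same file from two different names.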

Now, your application will make use of the CDN where possible. Enjoy the improved speed!


Summary

Your application now makes use of a Content Distribution Network called Amazon CloudFront. This will speed up page loads by making downloads faster and allowing for parallel downloading by the client.

You have also updated your application to dynamically grow and shrink computing resources. Some of the launching and terminating is done on a schedule, and some is done in response to load. You also have some scripts that can be extended to handle many more cases.

One item lacking is management. At any given point you probably don't know exactly how many computers are running. Deployments aren't automated yet because the server list is always in flux. You really don't know how well the servers themselves are performing. Look for the final article in this series to address these problems.



Resources

  • AssetTagHelper - The Ruby documentation for the AssetTagHelper module. If you use these functions to generate links and images, you'll find moving to a CDN is a breeze.

  • Preparing a Rails Application for SVN - shows how to take any Rails application and manage it through Subversion. It's trickier than you think because of various log files.

  • Performance tuning considerations in your application server environment (developerWorks) - discusses ways to make your application faster, including using job servers.

  • An EC2 instance has several pieces of instance metadata that can help the instance learn about the environment. Browse through this chapter of the EC2 documentation, along with the list of metadata categories, and get some ideas of what can be done.

  • Alfa Jango's blog: Take CloudFront one step further and learn about how to handle compressed assets and how to selectively serve assets from either your application or the CDN.

Get products and technologies

  • Now that you've got multiple AMIs inside S3, you might want to prune some old ones. S3 File Manager is a web based file manager that rivals the features of many standalone applications or browser plugins. If you delete an AMI, don't forget to ec2-deregister it.

  • S3Sync is a helpful tool for copying files to and from S3, and manipulating your buckets.

  • The S3 File Manager is better than anything else out there for navigating your S3 buckets, and it doesn't even involve installing software. With a good browser, you can even drag and drop files from your desktop to S3.

  • Capistrano is a popular deployment package that acts in a similar manner to Rake.

  • delayed_job - A background job server that integrates well with Rails and ActiveRecord. This link is to the collectiveidea fork of the project, which seems to be the currently maintained stream.

  • synch_s3_asset_host - A gem that makes synchronizing your S3 origin bucket and your application's static files a breeze.

