Amazon Web Services




Using Parameterized Launches to Customize Your AMIs


PJ Cabrera explains how to use parameterized launches and a simple Ruby script to easily configure Amazon Machine Image (AMI) instances.

AWS Products Used: Amazon EC2
Language(s): Ruby
Date Published: 2007-12-07

By PJ Cabrera, freelance software developer

Amazon Elastic Compute Cloud (Beta) (Amazon EC2™), an Amazon Web Services (AWS) tool, lets developers create several instances of an AMI. Although you can use Amazon EC2 to create multiple instances almost at will, it rarely makes sense for each of these instances to be configured identically. Parameterized launches let you pass custom configuration parameters to the instances, which they can retrieve and act on at startup or at any other time during the life of the instance. In this article, I give you several examples of how to use parameterized launches and create a simple generic AMI that you can extend and use along with your own startup scripts and other software.

User Data Defined

You'll use the Amazon EC2 command-line utility ec2-run-instances to launch instances of an AMI, booting Amazon EC2 virtual machines with your choice of Linux distro and installed software. The command takes several parameters, among them the so-called user data parameters, -d and -f. You can use either of these parameters to send specific data to your AMI instances at launch time.

Following are several examples that show how to use the ec2-run-instances command and the -d and -f parameters to send user data at launch time. I will be using a fake AMI ID, ami-1234567. To run these examples, replace it with the ID of an AMI you can experiment with.

The simplest example is a string of delimited-value assignments, in which you use commas as delimiters. You can parse this on your instance by using a script and configure your software according to the values you pass at launch time. I will discuss how to do this shortly.

ec2-run-instances ami-1234567 -d service=httpd,cache=320K
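On the instance, a string like the one above has to be split back into names and values. The following is a minimal Ruby sketch of that step, not the method used later in this article. The helper name parse_user_data is hypothetical, and the sketch assumes that values never contain commas.

```ruby
# Parse a comma-delimited user data string such as
# "service=httpd,cache=320K" into a Hash of names to values.
# parse_user_data is a hypothetical helper, not part of any EC2 tool.
# Assumption: values never contain a comma.
def parse_user_data(user_data)
  user_data.strip.split(',').each_with_object({}) do |pair, params|
    name, value = pair.split('=', 2)   # split on the first '=' only
    next if name.nil? || name.empty?   # skip empty or malformed entries
    params[name.strip] = value.to_s.strip
  end
end

params = parse_user_data("service=httpd,cache=320K")
puts params["service"]   # prints "httpd"
puts params["cache"]     # prints "320K"
```

Splitting on the first equals sign only means a value may itself contain an equals sign.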

Sometimes you need to include white space and other special characters in your user data. Most command shells can treat long strings of text surrounded by double quotes as a single parameter, as I demonstrate in the following example.

ec2-run-instances ami-1234567 -d "services='httpd mysql'"

Unix shells can pass a parameter that spans multiple lines by using double quotes to mark the start and the end of the data.

ec2-run-instances ami-1234567 -n 2 -d \
"service=httpd,domainname=www.mydomain.com
service=mongrel_cluster,domainname=mongrel1.mydomain.com"

In the previous example, the -n parameter tells Amazon EC2 to start more than one instance of the AMI at the same time--in this case, two instances. Imagine that the first line in the user data example is meant for the first instance, and the second line is meant for the second instance. I will explain how each instance can retrieve its specific user data later in this tutorial.

That covers the -d parameter. The -f parameter can be quite handy, too: it sends a file as user data instead of a string of text typed at the command prompt. It is especially useful for Windows users who need to send data on multiple lines, because the Windows command shell cannot process multi-line command lines. To demonstrate, let's reproduce the previous example using the -f parameter.

--- user-data.ini file contents ---
service=httpd,domainname=www.mydomain.com
service=mongrel_cluster,domainname=mongrel1.mydomain.com

--- instance launching command line ---
ec2-run-instances ami-1234567 -n 2 -f user-data.ini

According to the Amazon EC2 Developer Guide, the -f parameter can even handle binary files! The data is encoded to base64 in transit and arrives verbatim on the other side. The Amazon EC2 Developer Guide also points out that the user data has a 16 KB limit. That's quite generous if you restrict your user data to heavily compressed text configuration files.

ec2-run-instances ami-1234567 -f bunchOfiles.zip
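Before launching, it can be worth checking that a payload fits within the limit. The Ruby sketch below does that and also shows why base64 encoding preserves binary data. The helper names are hypothetical, and the sketch assumes that the 16 KB limit applies to the raw bytes rather than to the base64-encoded form.

```ruby
# Assumption: the 16 KB limit applies to the raw bytes, before the
# base64 encoding that happens in transit.
USER_DATA_LIMIT = 16 * 1024

# Hypothetical helper: true if the given bytes fit within the limit.
def fits_user_data_limit?(bytes)
  bytes.bytesize <= USER_DATA_LIMIT
end

# Base64 round trip using core Ruby (pack/unpack with the "m"
# directive), to show that arbitrary binary data survives unchanged.
def base64_round_trip(bytes)
  encoded = [bytes].pack('m0')   # 'm0' encodes without line breaks
  encoded.unpack1('m')           # decode back to the original bytes
end

payload = (0..255).to_a.pack('C*')   # every possible byte value once
puts fits_user_data_limit?(payload)          # prints "true"
puts base64_round_trip(payload) == payload   # prints "true"
```

To check a real file, pass the result of File.binread("bunchOfiles.zip") to fits_user_data_limit? before running ec2-run-instances.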

To retrieve this user data (or user-data, as the Amazon EC2 Developer Guide calls it) from a running instance, send an HTTP GET request to the following URL: http://169.254.169.254/1.0/user-data. From your Amazon EC2 instance's command prompt, you can check this data by using cURL, like this:

curl http://169.254.169.254/1.0/user-data

When using the -f parameter to send binary files at launch time, you can use wget to download the binary file, like this:

wget http://169.254.169.254/1.0/user-data -O bunchOfiles.zip
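The same request can be made from a script. The following Ruby sketch uses Net::HTTP with explicit timeouts and returns nil on any failure. The names metadata_uri and fetch_metadata are hypothetical, and the five-second timeout is an arbitrary assumption.

```ruby
require 'net/http'
require 'uri'

# Base address of the instance metadata service, as used in this article.
METADATA_BASE = "http://169.254.169.254/1.0/"

# Build the URI for a metadata path such as "user-data".
def metadata_uri(path)
  URI.parse(METADATA_BASE + path)
end

# Fetch a metadata path and return the body, or nil on any failure.
# The timeout is an arbitrary assumption; without one, a call made
# from a machine outside Amazon EC2 could hang for a long time.
def fetch_metadata(path, timeout = 5)
  uri = metadata_uri(path)
  http = Net::HTTP.new(uri.host, uri.port)
  http.open_timeout = timeout
  http.read_timeout = timeout
  response = http.get(uri.request_uri)
  response.is_a?(Net::HTTPSuccess) ? response.body : nil
rescue StandardError
  nil
end
```

On an instance, calling fetch_metadata("user-data") should return the user data as a string. For a binary payload, File.binwrite("bunchOfiles.zip", data) saves it without altering any bytes.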

A Practical Example

We've looked at ways to send user data at launch time and seen how to retrieve this user data at the command line from our instance. But the real power of this feature is in automating the retrieval of the user data from the instance at launch and applying the user data to configure our instances.

For example, imagine that you want to use one AMI to run your web site, employing several instances for performance, reliability, and fault tolerance. You want to have one instance of your AMI run Apache as a load balancer, and two other instances of the same AMI run mongrel_cluster. You also need the dynamic DNS utility ddclient to assign your instances' IP addresses to the domains used by your web site. You need dynamic DNS because the IP address of an instance is likely to change every time you relaunch it.

I'm going to gloss over the installation and specific configuration steps of this software, and instead discuss how to automate which services each of your instances will run and how to set configuration parameters from user data sent at instance launch time. One specific configuration change I made in this case was to prevent Apache, mongrel_cluster, and ddclient from starting by default. You want them to be started only by the automated configuration scripts created for this example, and only if the configuration parameters in a specific instance's user data tell the scripts to start one of the services.

For this example, I am going to send a .zip file to three AMI instances by using the following command:

ec2-run-instances ami-1234567 -n 3 -f payload.zip

The payload.zip archive contains a shell script called autorun.sh, a Ruby script called get-launch-params.rb, and a plain-text configuration file called launch-params. I will show the contents of each of these files shortly.

The key to automating the configuration of services in an AMI at launch time is to modify the init scripts on your AMI to retrieve the user data and act on the configuration parameters sent at launch time. You need to modify the last script to run at startup. In a Red Hat-based distribution such as Fedora Core or Red Hat Enterprise Linux, this script is located at /etc/rc.local. For this example, you will modify /etc/rc.local by adding the following code at the end:

####### These lines go at the end of /etc/rc.local #######

# retrieve the user data and save it as a .zip archive
wget http://169.254.169.254/1.0/user-data \
  -O /tmp/payload.zip

# if wget error code is 0, there was no error
if [ "$?" -eq 0 ]; then

  mkdir -p /tmp/payload
  # -o overwrites existing files without prompting
  unzip -o /tmp/payload.zip -d /tmp/payload/

  # if unzip error code is 0, there was no error
  if [ "$?" -eq 0 ]; then

    # if the autorun.sh script exists, run it
    if [ -f /tmp/payload/autorun.sh ]; then

      sh /tmp/payload/autorun.sh

    else
      echo rc.local : No autorun script to run
    fi

  else
    echo rc.local : payload.zip is corrupted
  fi

else
  echo rc.local : error retrieving user data
fi

This code retrieves the user data, which I will call "the payload" for the remainder of this article, and assumes it is a .zip archive. It unpacks the payload to the folder /tmp/payload and runs a script called autorun.sh, which, as you might remember, we packed into the payload along with the other files mentioned earlier. The autorun.sh file contains the following code:
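Because the startup code assumes the payload is a .zip archive, a quick check of the first bytes can tell an archive apart from plain-text user data before anything is unpacked. The Ruby sketch below illustrates one way to do that. The helper zip_payload? is hypothetical, and the check only covers archives that begin with a standard zip signature (it would reject a self-extracting archive, for example).

```ruby
# Signatures that can start a zip archive: a local file header for a
# normal archive, or an end-of-central-directory record for an empty one.
ZIP_SIGNATURES = ["PK\x03\x04".b, "PK\x05\x06".b].freeze

# Hypothetical helper: true if the bytes start like a zip archive.
def zip_payload?(bytes)
  head = bytes.to_s.b[0, 4]   # compare raw bytes, ignoring encoding
  ZIP_SIGNATURES.include?(head)
end

puts zip_payload?("PK\x03\x04rest of archive".b)   # prints "true"
puts zip_payload?("service=httpd,cache=320K")      # prints "false"
```

A check like this lets one AMI accept both plain-text parameters and archive payloads, treating each appropriately.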

#!/bin/bash

### autorun.sh

# get the launch parameters for this Amazon EC2 instance and
# evaluate the "export name=value" lines the script prints,
# so they become shell variables in this script
eval "$(/usr/bin/ruby /tmp/payload/get-launch-params.rb -e)"

# if the 'service' environment variable is set, then start
# the service specified at launch for this instance
if [ "$service" != "" ]; then
  "/etc/init.d/$service" start
fi

# if the 'domainname' environment variable is set, then
# append the domain name at the end of the ddclient config
# and run the ddclient service to update the DNS records
if [ "$domainname" != "" ]; then

  # don't forget to update /etc/ddclient/ddclient.conf
  # with the credentials for your dynamic DNS provider,
  # and to create your own AMI so your changes will "take"

  echo "$domainname" >> /etc/ddclient/ddclient.conf
  /etc/init.d/ddclient start
fi

The autorun.sh script runs another script packed in the payload--the Ruby script called get-launch-params.rb--and then examines two specific environment variables, service and domainname, and takes action depending on whether the variables are set. Here are the contents of get-launch-params.rb:

#! /usr/bin/ruby

### get-launch-params.rb

# The following script obtains the launch parameters from 
# the file /tmp/payload/launch-params, then parses out the 
# parameters for this instance by using the launch index
# of this particular EC2 instance.
#
# Pass the command the -e flag to output the instance 
# parameters as exports of shell variables. Any other 
# arguments are ignored.

require 'net/http'
require 'uri'

def get_http_request(uri_suffix)
  uri_str = "http://169.254.169.254/1.0/" + uri_suffix
  response = Net::HTTP.get_response(URI.parse(uri_str))
  case response
  when Net::HTTPSuccess then 
    response.body
  else
    nil
  end
end

def get_launch_params(launch_params_file)
  IO.readlines launch_params_file
end

def get_launch_instance
  get_http_request "meta-data/ami-launch-index"
end

export_stmt = ""

launch_params = get_launch_params(
  "/tmp/payload/launch-params")

if launch_params.length > 0
  launch_instance = get_launch_instance

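  # pick the line for this instance; an index past the end of
  # the file yields no parameters instead of raising an error
  instance_params_str = launch_params[launch_instance.to_i].to_s

  instance_params = instance_params_str.strip.split(',')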

  export_stmt = "export " if ARGV.length > 0 && 
    ARGV.include?("-e")

  instance_params.each { |param| 
    puts export_stmt + param 
  }

end

The get-launch-params.rb script parses out the contents of the file called launch-params, which was also packed in the payload and which contains one line of configuration parameters per instance to be launched. The script obtains the AMI instance's launch index from the Amazon EC2 metadata web service, then outputs each environment variable that corresponds to this particular instance. Here are the contents of the launch-params file:

service=httpd,domainname=www.mydomain.com
service=mongrel_cluster,domainname=mongrel1.mydomain.com
service=mongrel_cluster,domainname=mongrel2.mydomain.com
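The output of get-launch-params.rb is evaluated by a shell, so a value that contains spaces or other special characters would break the export. The following Ruby sketch is a variation on the script above, not a replacement for it. It selects the line for a given launch index and quotes each value with Shellwords.escape from the standard library. The helper export_statements and the VALID_NAME pattern are hypothetical.

```ruby
require 'shellwords'

# Names are restricted to valid shell identifiers, because the output
# is meant to be evaluated by a shell.
VALID_NAME = /\A[A-Za-z_][A-Za-z0-9_]*\z/

# Hypothetical helper: given the lines of launch-params and a launch
# index, return "export name=value" statements for that instance.
def export_statements(lines, launch_index)
  line = lines[launch_index].to_s.strip
  statements = line.split(',').map do |pair|
    name, value = pair.split('=', 2)
    next unless VALID_NAME.match?(name.to_s)   # drop malformed names
    "export #{name}=#{Shellwords.escape(value.to_s)}"
  end
  statements.compact
end

lines = [
  "service=httpd,domainname=www.mydomain.com\n",
  "service=mongrel_cluster,domainname=mongrel1.mydomain.com\n"
]
puts export_statements(lines, 1)
# prints:
#   export service=mongrel_cluster
#   export domainname=mongrel1.mydomain.com
```

Rejecting names that are not plain identifiers is a cautious choice here, since anything printed ends up being executed by the shell on the instance.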

Putting It All Together

Combining the modified /etc/rc.local with a payload archive that contains autorun.sh, get-launch-params.rb, and launch-params makes for a very flexible and potent setup. The AMI has been kept as generic and simple as possible, except for the modified /etc/rc.local file. The payload archive contains all the configuration files and scripts we need to tailor each instance as we like.

Keep in mind that starting up a few services and setting a domain is only the tip of the iceberg. The autorun.sh script could have downloaded and installed packages, overwritten configuration files with other files from the payload, and turned a generic instance into anything I wanted. I chose to preinstall the Apache, mongrel_cluster, and ddclient packages for faster startup times, but I could have used an even more generic AMI and made autorun.sh install services according to the settings of various different launch parameters in the payload.

The possibilities of what you can do with parameterized launches are indeed endless. I hope this article whetted your appetite and started you thinking of ways to make your AMIs more flexible and reusable for many tasks and over many instances.

PJ Cabrera is a freelance software developer specializing in Ruby on Rails e-commerce and content management systems development. PJ's interests include Ruby on Rails and open-source scripting languages and frameworks, agile development practices, mesh networks, compute clouds, XML parsing and processing technologies, microformats for more semantic web content, and research into innovative uses of Bayesian filtering and symbolic processing for improved information retrieval, question answering, text categorization, and extraction. You can reach him at pjcabrera at pobox dot com, and read his weblog at pjtrix.com/blawg/





Reviews

Very interesting..., Jan 23, 2008 1:41 PM
Reviewer: Gabriel Kent
Article does a good job @ showing how generic AMIs can be and how much can be done @ runtime...exciting really.

doesn't work under rightscale images?, Feb 21, 2008 2:53 PM
Reviewer: Shiv Ramamurthi
hmm - this used to work until recently:
% ec2-run-instances -f payload.zip
but now when I log into my (rightscale provided OS) instance I see:
!!! Your EC2 Instance has failed installation. !!!
!!! Please check /var/log/install for details. !!!
Running boot RightScripts Thu Feb 21 14:47:38 PST 2008
/opt/rightscale/bin/runrightscripts.rb:7:in `require': /var/spool/ec2/user-data.rb:9: Invalid char `\257' in expression (SyntaxError)
/var/spool/ec2/user-data.rb:9: Invalid char `\265' in expression
/var/spool/ec2/user-data.rb:9: Invalid char `\177' in expression
% file /var/spool/ec2/user-data.rb
/var/spool/ec2/user-data.rb: data

Very helpful overview, Sep 10, 2008 6:28 AM
Reviewer: "sublime1"
This is a nice overview of how to handle instance-specific configuration automatically. Thanks!

Well written!, Oct 13, 2008 4:22 PM
Reviewer: cpstaff
Thanks for all the notes and example.

variable scoping problem..., Oct 30, 2008 11:36 AM
Reviewer: dd132
Good article. In the rc.local code sample I had to use the following in order to get the exported variables to be seen by my calling script:
source /tmp/payload/autorun.sh
instead of:
sh /tmp/payload/autorun.sh
Thanks for your good work! Daren