Secure salted passwords with Ruby

Step one, Understand

So I've been playing with Sinatra and Redis over the past few months, and as part of my more professional side I'm building a blog platform for my other website. I wanted user authentication so that I, and only I, could update it. There's a chance that at some point I may want to allow others to sign up and log in, but quite frankly not yet, and this is overkill; nonetheless we learn and develop, so here's the first step in building it from scratch.

Understanding some key concepts around authentication is key, so first some history lessons and why those approaches are a bad idea today. Back in the day, way back when, people were trusting and assumed no evil of this world; we call these people naive. They relied predominantly on servers being nice and snug behind firewalls and locked cabinets, and so passwords were saved in plain text in a database, or a file, who cares; it's human readable, so the attack on this is trivial. Let's assume you are not silly enough to run the database over a network with no encryption and are instead running it locally: if I have access to your machine I will probably find it in a matter of minutes, and with physical access, within an hour they're mine, all of them. On a side note, if you're using plain-text passwords anywhere you're either an idiot or in the early stages of testing…

After realising this approach was bad, people thought: I can protect this, I will hash the password! Wonderful. So you use MD5 or SHA, it doesn't matter which, but for this example let's say you chose MD5. What you have done here is very cunningly created a password that is not human readable. However, be ashamed of yourself if you still think this is secure, and here's why. There are a lot of combinations (2^128 = 340 trillion, trillion, trillion), which to be honest is a lot! The chances of two clashes are so slim, why worry about it? Wrong, just plain wrong, and here's why. One, people are idiots: for some strange reason we typically use words to make up passwords, so your password might be "fridge". What this means is that if I, as Mr Hacker, get hold of your DB, I sit there running a dictionary-style attack, hashing thousands of common words into MD5 sums, and before long I have a word that produces the same MD5 sum as your password. What's better is, because you're human, it'll probably work on all the sites you use. Nice. The second problem is that I don't actually have to do that grunt work: people have already done it for me, and the results are called rainbow tables, so I can trivially download one and run a simple query to find a phrase whose hash matches yours. Don't think it's an issue? LinkedIn did; they fixed it perfectly.
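To see how cheap this attack is, here's a rough sketch in Ruby; the "leaked" hash and the tiny wordlist are made up for illustration, and a real attack would run millions of candidates:

```ruby
require 'digest'

# Hypothetical leaked value: the unsalted MD5 of a common word ("fridge").
leaked_hash = Digest::MD5.hexdigest('fridge')

# A tiny "dictionary" - real attacks use wordlists of millions of entries.
dictionary = %w[password letmein monkey dragon fridge sunshine]

# Hash each candidate and compare against the leaked value.
match = dictionary.find { |word| Digest::MD5.hexdigest(word) == leaked_hash }
puts "Password recovered: #{match}"
```

The loop body is just one hash per guess, which is exactly why unsalted fast hashes fall over: a modern machine can test many millions of candidates per second.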

Excellent. So now we're getting to a point where we need something better, and this is where salts come in. The basic concept is that you type in a password, and I as the server concatenate a random string to it and generate a hash. This overcomes the rainbow table, because the likelihood of someone having generated a rainbow table from my random salt is very low indeed. However, let's assume I'm a big web provider that got hacked and lost everyone's passwords. "Ha! I used a salt, good luck cracking that!" they say. Sure, but a few points: one, the salt is stored in the DB; two, computers are quicker than they used to be. With the advances in GPU cracking and Amazon boxes, password cracking is more or less a matter of time and money; how much is the key.

Using a single salt for all passwords is still bad; the best approach I know of today is to use a unique salt for every password. Even if all users have the same password, they all have unique hashes, and this is the cornerstone. For every user that signs up, generate a secure random salt, add it to the password and generate a hash. That way, a massive amount of time would be needed to crack just one password. Unique hashes: very good.

The code

So, history lesson over, now some code. As identified above, the best way is to generate a unique salt for every user and then hash their password. This isn't hard, but you do need to ensure the salt is securely random, else you may just be generating something less secure than you thought.

Have a look at this gist, Auth.rb. One useful point: the salt should be large, so if you produce a 32-byte hash, your salt should be at least 32 bytes as well. It's a straightforward lib that will generate hashes and let you check passwords against them.
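The gist isn't reproduced here, but based on the methods used in the examples below (gen_salt, gen_hash, hash_ok?, get_salt), a minimal version might look something like this; the internals are my own sketch, not the gist's actual code:

```ruby
require 'securerandom'
require 'digest'

# Sketch of an Auth lib: per-user random salt, stored as a prefix of the
# hash string so it can be recovered when checking a password later.
module Auth
  SALT_BYTES = 32 # at least as large as the 32-byte SHA-256 output

  # Securely random salt, hex encoded (64 chars for 32 bytes).
  def self.gen_salt
    SecureRandom.hex(SALT_BYTES)
  end

  # Prepend the salt to the digest so the stored value is self-contained.
  def self.gen_hash(salt, password)
    salt + Digest::SHA256.hexdigest(salt + password)
  end

  # The salt is the first SALT_BYTES * 2 hex characters of the stored hash.
  def self.get_salt(hash)
    hash[0, SALT_BYTES * 2]
  end

  # Re-hash the candidate password with the stored salt and compare.
  def self.hash_ok?(hash, password)
    gen_hash(get_salt(hash), password) == hash
  end
end

h = Auth.gen_hash(Auth.gen_salt, 's3cret')
puts Auth.hash_ok?(h, 's3cret') # true
puts Auth.hash_ok?(h, 'wrong')  # false
```

A plain salted SHA-256 like this beats rainbow tables, though a deliberately slow scheme (bcrypt, PBKDF2) would slow brute force down further still.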

Here’s an example of it in use.

require_relative 'auth'
#This is from the web front end so you can see how to use the Lib to check a hash
def user_auth_ok?(user,pass)
  #Get user's hash from DB
  user_hash = $db.get_user_hash(user)
  #Validate hash: true if the user authenticated, false otherwise
  Auth.hash_ok?(user_hash,pass)
end

#As a side point, this bit is from the backend that writes the user to the DB.
#If it were the chunk the website used, it would simply call a method taking username and password,
#so probably not that useful...
def add_user(username,password)
  #Poke the appropriate keys, create default users, example post etc.
  uid = @redis.incr('users')
  salt = Auth.gen_salt()
  hash = Auth.gen_hash(salt,password)
  $logr.debug("Salt = #{salt}, hash = #{hash}")
  @redis.set("users:#{uid}:#{username}",hash)
end

def edit_user_password(username, password, new_password)
  user = @redis.keys("users:*:#{username}")
  if user.size < 1
    $logr.debug("No user #{username} found")
  else
    #Validate old password
    hash = @redis.get(user[0])
    if Auth.hash_ok?(hash, password)
      $logr.info("Setting a new password for #{username}")
      new_hash = Auth.gen_hash(Auth.get_salt(hash), new_password)
      @redis.set(user[0], new_hash)
    else
      $logr.info("Password incorrect")
      #TODO - Need exception classes to raise Auth failure
    end
  end
end

Deploying Sinatra based apps to Heroku, a beginners guide

A bit of background

So last week I was triumphant: I conquered an almighty task and managed to migrate my company's website from a static site to a Sinatra-backed site using partial templates. I migrated because I was getting fed up with modifying all of the pages, all two of them, whenever I wanted to update the footers or headers; this is where the partial templates came in. Sinatra came in because it had decent documentation and seemed good…

So, feeling rather pleased with myself, I set about working out how to put this online with my current hosting provider, with whom I have a few domain names, though only through an acquisition they made. I thought I'd give them a shot when setting up my company site, better the devil you know etc., and they supported PHP, Ruby and Python, which was fantastic as I knew I would be using Python or Ruby at some point to manage the site. After a frustrating hour of reading, trying to work out how to deploy the Ruby app and finding no docs with the hosting provider, I logged a support ticket asking for help, to which the reply was along the lines of "I'm afraid our support for Ruby is very limited". I chased them on Friday to try and get a response on when it would be available: no progress, some excuses because of the platform. So I asked, "Do you currently have any servers that do run Ruby?", to which the reply was, "I'm afraid we have no servers that run Ruby; it shouldn't be listed on our site, I didn't know it was there."

By this point alarm bells were ringing and I thought I best think about alternatives.

Getting started


Before even signing up to Heroku it's worth getting a few things sorted in your build environment; I had to implement a lot of this to get it working, and it makes sense to have it beforehand. For starters, you need to be able to get your application running using rackup, and I came across this guide (I suggest reading it all). In short, you use Bundler to manage which gems you need installed, and you do this by creating a Gemfile listing the gems and specifying the Ruby version (2.0.0 for Heroku).

My Gemfile looks like this:

source 'https://rubygems.org'
ruby '2.0.0'

# Gems
gem 'erubis'
gem 'log4r'
gem 'sinatra'
gem 'sinatra-partial'
gem 'sinatra-static-assets'
gem 'split'

It simply tells Rack / Bundler what is needed to make your environment work, and with this you can do something I wish I'd found sooner: execute your project in a container so you can test that you have the dependencies correct before you push the site, by running a command like this:

bundle exec rackup -p 9292 config.ru &

NB You will need to run

bundle install

first.

By now you should have a directory with a Gemfile, Gemfile.lock, app.rb, config.ru and various directories for your app. The only other thing you need before deploying to Heroku is a Procfile with something like the following in it:

web: bundle exec rackup config.ru -p $PORT

This tells Heroku how to run your app, which combined with the Gemfile, Bundler and the config.ru means you have a nicely contained app.
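For completeness, a minimal config.ru for a modular Sinatra app might look like the sketch below; the class name Sinatra::App and the app.rb filename are assumptions from my setup, so adjust them to match your project:

```ruby
# config.ru - the rackup entry point Heroku's Procfile invokes.
# Assumes app.rb defines a modular app class called Sinatra::App.
require './app'

run Sinatra::App
```

With this in place, `bundle exec rackup` locally and the Procfile on Heroku both boot the app the same way, which is exactly the "nicely contained app" property mentioned above.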

Signing up

Application hosting

Now, why would I look at Heroku when I've already spent money on hosting? Well, for one, it will run Ruby; two, it's free for the same level of service I have with my current provider; three, it's seven times quicker serving the Ruby app on Heroku than the static files with my current host. So step one, sign up: it's free, no credit card, one dyno (think of it as a fraction of a CPU; I'm not convinced you get a whole one).

Create a new app. A good tip here: if you don't already have a GitHub account, Heroku is going to give you a Git repo for free. Granted, no fancy graphs, but it's a nice place to store a website without forking out for private repos on GitHub. Once your site is in the Heroku Git repo you just need to push it up and watch it deploy; at this point you may need to fix a few things, but… it'll be worth it.

Performance

I don't want to say it's the best, so I'm going to balance up the awesomeness of what follows with this; I suggest you read it so you can form your own opinions.

So, using Pingdom's tool for web performance, I tested my site hosted in the UK against Heroku in AWS's European (Ireland) region, and here are the results:

The current site is behind a CDN provided by Cloudflare and has already had a few tweaks made to make it quicker, so this is as good as it gets for the static site: results

Now the new site, unpublished due to the aforementioned hosting challenge, does not have a CDN and is not using any compression yet (unless Heroku is doing it), but its performance is significantly quicker, as seen in the results

For those of you who can't be bothered to click the links: the current site loads in 3.43 seconds, which is slow but still faster than most sites; the Heroku-based site loads in 459 ms, so seven times quicker, and it's not behind a CDN yet or whitespace-optimised. That's pretty darn quick.

Sinatra – partial templates

Singing a different song

Firstly, apologies: it's been over a month since my last blog post, but unfortunately with holidays, illness and a change of jobs I've been struggling to find any time to write about what I've been doing.

A few months back I did a little research into micro web frameworks and did quite a bit of reading around Sinatra, Bottle and Flask. To be honest they all seem good, and I want to play with Flask or Bottle at some point too, but Sinatra is the one I've gone for so far, as the documentation was the best and it seemed the easiest to use and the easiest to extend, not that I'll ever be doing that!

Either way, I thought I'd have a bit of a play with it and see if I could get something up and working, locally for now due to the lack of documentation from my hosting provider… I have re-created my website, Practical DevOps, within Sinatra using rackup, Bundler and ERB templates.

There are a few reasons I did this. One, I wanted to stop updating every page whenever I needed to update the header or footer. Two, I want to implement split testing (also known as A/B testing) using something like Split. With all of this in mind, and the necessity of having a bit more programmatic control over the website, it seemed like a good idea to go with Sinatra.

The Basics of Sinatra

By default Sinatra looks for static content in a directory called "public" and for templates in a folder called "views", both of which can be configured within the app if needed. For basic sites this works fine and would have worked fine for me, but I really wanted partial templates to save entering the same details on multiple pages, and this can be done with sinatra-partial.

#!/usr/bin/ruby

require 'sinatra'
require 'sinatra/partial'
require 'erb'


module Sinatra
  class App < Sinatra::Base

    register Sinatra::Partial
    set :partial_template_engine, :erb

    #Index page
    ['/?', '/index.html'].each do |path|
      get path do
        erb :index, :locals => {:js_plugins => ["assets/plugins/parallax-slider/js/modernizr.js", "assets/plugins/parallax-slider/js/jquery.cslider.js", "assets/js/pages/index.js"], :js_init => '<script type="text/javascript">
        jQuery(document).ready(function() {
            App.init();
            App.initSliders();
            Index.initParallaxSlider();
        });
        </script>', :css_plugins => ['assets/plugins/parallax-slider/css/parallax-slider.css'], :home_active => true}
      end
    end
  end
end

So let's look at the above, which simply serves the index of the site.

    register Sinatra::Partial
    set :partial_template_engine, :erb

The register command is how you extend Sinatra: when the sinatra-partial gem is installed it simply drops its code in the Sinatra area, and when you call register all of its public methods are registered. This allows you to do stuff like this with magic, or you can use it in the ERB template like this. The next line simply tells Sinatra to use ERB rather than Haml; I chose this because Puppet and Chef both use ERB, so I'm a lot more familiar with it.
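Under the hood, register essentially extends the app class with the extension module, so the module's public methods become class-level DSL methods (real Sinatra also fires a registered callback, skipped here). A plain-Ruby sketch of the idea, with invented module and method names:

```ruby
# Toy version of Sinatra's extension mechanism. The Partial module and its
# methods are invented for illustration, not the real sinatra-partial API.
module Partial
  def partial_template_engine(engine)
    @engine = engine
  end

  def engine
    @engine
  end
end

class App
  # Roughly what Sinatra::Base.register does with each module it is given
  def self.register(mod)
    extend mod
  end

  register Partial
  partial_template_engine :erb # now available as a class-level DSL method
end

puts App.engine # => erb
```

So `register Sinatra::Partial` is just Ruby's `extend` dressed up: afterwards, `partial` and friends are callable in the app's class body and routes.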

#Index page
['/?', '/index.html'].each do |path|
  get path do
    erb :index, :locals => {:js_plugins => ["assets/plugins/parallax-slider/js/modernizr.js", "assets/plugins/parallax-slider/js/jquery.cslider.js", "assets/js/pages/index.js"], :js_init => '<script type="text/javascript">
    jQuery(document).ready(function() {
        App.init();
        App.initSliders();
        Index.initParallaxSlider();
    });
    </script>', :css_plugins => ['assets/plugins/parallax-slider/css/parallax-slider.css'], :home_active => true}
  end
end

One of the nice things with Sinatra is that it's simple to use the same provider for multiple routes, and the easiest way of doing this is to define an array and iterate over it for each path. Sinatra uses the HTTP methods to define its own functions for what should happen, so an HTTP GET requires a route defined using the "get [path] block" style syntax, and likewise for POST, DELETE etc.; see the Routes section of the Sinatra docs for more info.

The last section calls the template. Typically the syntax could just be "erb :page_minus_extension", which would load the ERB template from the "views" directory created earlier. If you want to pass variables in, you define a symbol, ':locals', which takes a hash of variables. All of these variables are only available to the template that was called at the beginning, so getting the variables through to a partial requires some work within the template.

Now, within the views/index.erb file I have the following:

<%= #include header
partial :"partials/header", :locals => {:css_plugins => css_plugins, :home_active => home_active}
%>

Partial calls another template within the views directory; as I have a partial called header.erb in views/partials/, it loads that, and by defining the locals again from within the template I am able to pass the variables from index into the header or any other partial as needed.

Okay, that's all folks. Hopefully that's enough to get people up and running. Have a good look at the examples in the Git projects, they're useful, and be sure to read the entire intro to Sinatra, very useful!

To IDE or not

What’s in an IDE!

Over the last 15 years or so I have used a few IDEs for programming "proper" applications in Pascal, Delphi, C/C++ and Java, but since moving to the darker side of computing, the Linux world, I never really needed an IDE; they were just bloated text editors. I don't think much has changed, but there are some good points, and I'm at a crossroads as to what to do.

Should I spend some time making Vim, the best text editor the world will ever see, even better, or give up and use an IDE? It's not as straightforward as I would like to think, or as anyone else would probably think, so I wonder how it will turn out.

The basics

I'm currently, and probably most likely to be, doing web-based development in Ruby / Python. Up until recently I had just used Vim for any Ruby programming I needed to do, and to be honest it worked okay. I typically just opened a new terminal tab, changed to the repo directory and started opening files; so I had to cd to a directory before opening a file, it wasn't killing me. I was able to just open up and get on with making changes with no real hassle, and with syntax highlighting it worked out quite well. Recently I have started to do more web-based things and have started using an IDE for that, and that's what's led to this blog: there are a few features of an IDE that I think I'm missing from Vim.

1, Being able to see the entire project directory easily and visually navigate to supporting files for a quick view / edit is nice.
2, Refactoring: simple things like changing a variable name can be a real bugger in Vim. Yes, you can find and replace, but that is not the same as refactoring; to refactor like an IDE does, Vim would need to understand the code's structure to some degree and not treat it all like text.
3, Hiding chunks of code: being able to roll up / down sections of code can make it a lot easier to focus on what's actually going on.
4, Word completion: IDEs know about the libs being used and the other variables, so they can offer hints to complete the words being typed, which is very handy.
5, Shorthand: being able to type shorthand to do lots of things, like starting an HTML tag and having it auto-completed, speeds up development, which must be good.

I think if I could get the above features in Vim I would struggle to justify not using Vim; let's see what happens.

Vim IDE

1, This is seemingly an easy one to address with something like NERDTree. Initially this looks good, so I'm saying point 1 is solved; I just need some better aliases (:help 40.2).

2, In short, no. I've only seen normal find / replace, or basically using grep. There do seem to be language-specific plugins like this, but who wants to find a new refactoring plugin (if someone even wrote one) for every new language…

3, This is again easy if you know what to look for: folding is not new, and from my quick play it seems good.

4, Again it seems possible per language, like this one for Ruby, but what about Python? JavaScript? etc. etc.

5, This one is a simple one that Vim does do: here. But it's almost like you have to hand-crank it all!

Summary

So from what I can see, I could spend some time making Vim do what I need, and no doubt there's some stuff out there for Ruby and Python that would make it a lot more useful as an IDE, but it still doesn't make it easy, and it will be rather language-specific in terms of making it work well out of the box. It does seem that IDEs are better at those specific tasks above, not to mention debugging and breakpoints etc.; maybe Vim can do more of that too and I just don't know how.

Either way, Vim coped better with the five original points than I thought it would, and I have found some things that suggest it may be worth tweaking my current setup to make it better. In the meantime I think I will continue to persevere with learning the IDE and its shortcuts, as it appears they could dramatically speed up my development by simply using the tools already there.

I wasn't expecting to find Vim to be as good as it was, but even though I prefer Vim at the moment, I think the IDE will cope better in the long run; time will tell.

I did it, Plugins

I said I couldn’t do it

It was not long ago, in the Sentinel update, that I said I didn't know how to do plugins. Well, less than a week after writing it I was reading a few articles by Gregory Brown on modules and mixins. These are the first things I've read that explain them in a way I actually understand.

I was doing research into modules and mixins as they had seemed a bit pointless, but thanks to the articles by Gregory I was able to understand them, and right in the middle of reading some of the examples and having a play, a lightning bolt struck: it all became clear how to implement a plugin manager.

Some Bad Code

Based on some of the stuff I saw, I came up with the following. Ignore most of it; I was just hacking around to see if I could get it to work, and the names meant more in a previous iteration.

module PluginManager
  #Just seeing if this works like magic...
  class LoadPlugin
    def initialize
      #The key is a plugin name, the &block is the code, so once added it can be run
      @plugins = {}
    end

    def add_plugin(key, &block)
      @plugins.merge!({key => block})
    end

    def run_plugin(key)
      puts "Plugin to Run = #{key}"
      puts "Plugin does:\n"
      @plugins[key].call
    end

    def list_plugins
      @plugins.each_key { |key| puts key }
    end
  end
end

plugins = PluginManager::LoadPlugin.new

plugins.add_plugin(:say_hello) do
  puts "Hello"
end

plugins.add_plugin(:count_to_10) do
  for i in 0..10
    puts "#{i}"
  end
end

plugins.add_plugin(:woop) do
  puts "Wooop!"
end

plugins.add_plugin(:say_goodbye) do
  puts "Good Bye :D"
end

puts "in theory... Multiple plugins have been loaded"
puts "listing plugins:"
plugins.list_plugins
puts "running plugins:"
plugins.run_plugin(:say_hello)
plugins.run_plugin(:woop)
plugins.run_plugin(:count_to_10)
plugins.run_plugin(:say_goodbye)

And when it runs:

in theory... Multiple plugins have been loaded
listing plugins:
say_hello
count_to_10
woop
say_goodbye
running plugins:
Plugin to Run = say_hello
Plugin does:
Hello
Plugin to Run = woop
Plugin does:
Wooop!
Plugin to Run = count_to_10
Plugin does:
0
1
2
3
4
5
6
7
8
9
10
Plugin to Run = say_goodbye
Plugin does:
Good Bye :D

This is good news, and I really like the site; I will be using it a lot more as I learn more about Ruby, as it explains things really well. It also looks like, if you can afford the $8/month, you can get a lot more articles by the same guy at practicingruby.com.

Summary

So in short… Sentinel will have plugins, I like the blogs at Ruby Best Practices, and this blog will also be short :D

AWS CopySnapshot – A regional DR backup

Finally!

After many months of talking with Amazon about better ways of getting backups from one region to another, they sneaked a little update onto their blog. I will say it here: world-changing update! The ability to easily and readily sync your EBS data between regions is game-changing, I kid you not. In my tests I synced 100GB from us-east-1 to us-west-1 so quickly that it was done before I'd switched to that region to see it! However… sometimes it is a little slower… Thinking about it, it could have been a blank volume, I don't really know :/

At Alfresco we do not heavily use EBS, as explained here, where we survived a major Amazon issue that affected many websites larger than our own. We do still have EBS volumes, as it is almost impossible to get rid of them, and by their very nature the data on these volumes is precious, so obviously we want it backed up. A few weeks ago I started writing a backup script for EBS volumes; the script wasn't well tested, it was quickly written, but it worked. I decided that I would very quickly, well, to be fair I spent ages on it, update the script with the new CopySnapshot feature.

At the time of writing, CopySnapshot exists in one place, the deepest, darkest place known to man: the Amazon Query API interface. This basically means that rather than simply making a method call, you have to throw all the data at it and back again to make it hang together. For the real programmers out there this is just an inconvenience; for me it was a nightmare, an epic battle between my motivation, my understanding and my Google prowess. In short, I won.

It was never going to be easy…

In the past I have done some basic stuff with REST-type APIs: set some headers, put some variables in the params of the URL and just let it go, all very simple. Amazon's was slightly more advanced, to say the least.

So I had to use this complicated encoded, parsed, encrypted, back-and-forth handshake as described here; with that and the CopySnapshot docs I was ready to rock!

After failing for an hour to even get my head around the authentication, I decided to cheat and use Google. The biggest breakthrough was thanks to James Murty: the AWS class he has written is perfect. The only issue was my understanding of how to use modules in Ruby, which were very new to me. On a side note, I thought modules were meant to fix issues with namespaces, but for some reason, even though I included the class in the script, it seemed to conflict with the Ruby aws-sdk I already had, so I just had to rename the class / file from AWS to AWSAPI and all was then fine. I also had to add a parameter to pass in the AWS_ACCESS_KEY, which was a little annoying as I thought the class would have taken care of that, but to be fair it wasn't hard to work out in the end.

So first things first, have a look at the AWS.rb file on the site; it does the whole signature-signing bit well and saves me the hassle of doing or thinking about it. On a side note, this all uses version 2 of the signing, which I imagine will be deprecated at some point, as version 4 is out and about here.
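To give a feel for what AWS.rb is saving me from, here's a rough sketch of the version 2 signing scheme as I understand it: sort and percent-encode the params, build a string to sign from the verb, host, path and canonical query string, then HMAC-SHA256 it with the secret key and base64-encode. The key and params below are made up, and the real implementation has edge cases this glosses over:

```ruby
require 'openssl'
require 'base64'
require 'cgi'

# Sketch of AWS Signature Version 2; values here are fake, for illustration.
def sign_v2(secret_key, verb, host, path, params)
  # Canonicalise: sort params by key, percent-encode keys and values
  # (spaces must be %20, but CGI.escape uses '+', hence the gsub).
  canonical = params.sort.map { |k, v|
    "#{CGI.escape(k)}=#{CGI.escape(v)}".gsub('+', '%20')
  }.join('&')

  string_to_sign = [verb, host, path, canonical].join("\n")

  # HMAC-SHA256 with the secret key, then base64-encode the raw digest.
  digest = OpenSSL::HMAC.digest('sha256', secret_key, string_to_sign)
  Base64.strict_encode64(digest)
end

sig = sign_v2('fake-secret', 'POST', 'ec2.us-west-1.amazonaws.com', '/',
              'Action' => 'CopySnapshot', 'SignatureMethod' => 'HmacSHA256')
puts sig.length # 44 chars: base64 of a 32-byte HMAC-SHA256 digest
```

The resulting signature goes into the request as the Signature parameter; getting the canonicalisation byte-for-byte right is exactly the fiddly part James Murty's class handles.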

If you were proactive you've already read the CopySnapshot docs and noticed that, in plain English or complicated, that page does not tell you how to copy between regions. I imagine it's because I don't know how to really use the tools, but it's not clear to me… I had noticed that the wording they used was identical to the params being passed in the example, so I tried using Region, DestinationRegion and DestRegion, all failing, kind of as expected seeing as I was left to guess my way through. I was there, at that point where you've had enough and it doesn't look like it is ever going to work, so I started writing a support ticket to Amazon so they could point out whatever it was I was missing, and at the moment of just about hitting submit, I had a brainwave. If the only option is to specify the source, then how do you know the destination? Well, I realised that each region has its own API URL, so would that work as the destination? YES!

The journey was challenging, epic even, for this sysadmin to tackle, and yet here we are: a blog post about regional DR backups of EBS snapshots. So without further ado, and no more gilding the lily, I present some install notes and code…

Make it work

The first thing you will need to do is get the appropriate file: the AWS.rb from James Murty. Once you have this, you will need to make the following change:

21c21
< module AWS
---
> module AWSAPI

Next you will need to steal the code for the backup script:

#!/usr/bin/ruby

require 'rubygems'
require 'aws-sdk'
require 'uri'
require 'net/http'
require 'time'
require 'crack'

#Get options
ENV['AWS_ACCESS_KEY']=ARGV[0]
ENV['AWS_SECRET_KEY']=ARGV[1]
volumes_file=ARGV[2]
source_region=ARGV[3]
source_region ||= "us-east-1"

#Create a class for the aws module
class CopySnapshot
  #This allows me to initalize the module with out re-writing it
  require 'awsapi'
  include AWSAPI

end

def get_dest_url (region)
  case region
  when "us-east-1"
    url = "ec2.us-east-1.amazonaws.com"
  when "us-west-2"
    url = "ec2.us-west-2.amazonaws.com"
  when "us-west-1"
    url = "ec2.us-west-1.amazonaws.com"
  when "eu-west-1"
    url = "ec2.eu-west-1.amazonaws.com"
  when "ap-southeast-1"
    url = "ec2.ap-southeast-1.amazonaws.com"
  when "ap-southeast-2"
    url = "ec2.ap-southeast-2.amazonaws.com"
  when "ap-northeast-1"
    url = "ec2.ap-northeast-1.amazonaws.com"
  when "sa-east-1"
    url = "ec2.sa-east-1.amazonaws.com"
  end
  return url
end

def copy_to_region(description,dest_region,snapshotid, src_region)

  cs = CopySnapshot.new

  #Gen URL
  
  url= get_dest_url(dest_region)
  uri="https://#{url}"

  #Set up Params
  params = Hash.new
  params["Action"] = "CopySnapshot"
  params["Version"] = "2012-12-01"
  params["SignatureVersion"] = "2"
  params["Description"] = description
  params["SourceRegion"] = src_region
  params["SourceSnapshotId"] = snapshotid
  params["Timestamp"] = Time.now.iso8601(10)
  params["AWSAccessKeyId"] = ENV['AWS_ACCESS_KEY']

  resp = begin
    cs.do_query("POST",URI(uri),params)
  rescue StandardError => e
    puts e.message
  end

  if resp.is_a?(Net::HTTPSuccess)
    response = Crack::XML.parse(resp.body)
    if response["CopySnapshotResponse"].has_key?('snapshotId')
      puts "Snapshot ID in #{dest_region} is #{response["CopySnapshotResponse"]["snapshotId"]}" 
    end
  else
    puts "Something went wrong: #{resp.class}"
  end
  
end

if File.exist?(volumes_file)
  puts "File found, loading content"
  #Fix contributed by Justin Smith: https://soimasysadmin.com/2013/01/09/aws-copysnapshot-a-regional-dr-backup/#comment-379
  ec2 = AWS::EC2.new(:access_key_id => ENV['AWS_ACCESS_KEY'], :secret_access_key=> ENV['AWS_SECRET_KEY']).regions[source_region]
  File.open(volumes_file, "r") do |fh|
    fh.each do |line|
      volume_id=line.split(',')[0].chomp
      volume_desc=line.split(',')[1].chomp
      #Reset so a region from a previous line doesn't carry over
      volume_dest_region=nil
      if line.split(',').size >2
        volume_dest_region=line.split(',')[2].to_s.chomp
      end
      puts "Volume ID = #{volume_id} Volume Description = #{volume_desc}"
      v = ec2.volumes["#{volume_id}"]
      if v.exists? 
        puts "creating snapshot"
        date = Time.now
        backup_string="Backup of #{volume_id} - #{date.day}-#{date.month}-#{date.year}"
        puts "#{backup_string}" 
        snapshot = v.create_snapshot(backup_string)
        sleep 1 until [:completed, :error].include?(snapshot.status)
        snapshot.tag("Name", :value =>"#{volume_desc} #{volume_id}")
        # if it should be backed up to another region do so now
        if !volume_dest_region.nil? 
          if !volume_dest_region.match(/\s/)
            puts "Backing up to #{volume_dest_region}"
            puts "Snapshot ID = #{snapshot.id}"
            copy_to_region(volume_desc,volume_dest_region,snapshot.id,source_region)
          end
        end
      else
        puts "Volume #{volume_id} no longer exists"
      end
    end
  end
else
  puts "no file #{volumes_file}"
end

Once you have that, you will need to create a file listing the volumes to back up, in the following format:

vol-127facd,My vol,us-west-1
vol-1ac123d,My vol2
vol-cd1245f,My vol3,us-west-2

The format is "volume id,description,region", where the region is the one you want to back up to. Once you have these details you just call the script as follows:

ruby ebs_snapshot.rb <Access key> <secret key> <volumes file>

I don't recommend putting your keys on the CLI or even in a cron job, but it wouldn't take much to refactor this into a class if needed and if that bothered you.
It should work quite well; if anyone has any problems let me know and I'll see what I can do :)

EBS Snapshot script

Like it says

Over the last 18 months, the one key thing I have learnt about Amazon is don’t use EBS, in any way, shape or form. In most cases it will be okay, but if you start relying on it, it can ruin even the best architected service and reduce it to rubble. So you can imagine how pleased I was to find I’d need to write something to make EBS snapshots.

For those of you that don’t know, Alfresco Enterprise has an AMP to connect to S3, which is fantastic and makes use of a local cache while it’s waiting for the S3 buckets to actually write the data; if you’re hosting in Amazon this is the way to go. It means you can separate the application from the OS & data, which is important for the following reasons:

1, EBS volumes suck, so where possible don’t use them for storing data, or for your OS
2, Having data elsewhere means you can, without prejudice, delete nodes and your data is safe
3, It forces you to build an environment that can be rapidly re-built

So in short, data off of the server means you can scale up and down easily, and you can rebuild easily; the key is always to keep the distinctly different areas separate and not merge them together.

So, facing this need to back up EBS volumes, I thought I’d start with snapshots. I did a bit of googling and came across a few EBS snapshot programs that seem to do the job, but I wanted one in Ruby and I’ve used Amazon’s SDKs before, so why not write my own.

The script

#!/usr/bin/ruby

require 'rubygems'
require 'aws-sdk'

#Get options
access_key_id=ARGV[0]
secret_access_key=ARGV[1]



if File.exist?("/usr/local/bin/backup_volumes.txt")
  puts "File found, loading content"
  ec2 = AWS::EC2.new(:access_key_id => access_key_id, :secret_access_key=> secret_access_key)
  File.open("/usr/local/bin/backup_volumes.txt", "r") do |fh|
    fh.each do |line|
      volume_id=line.split(',')[0].chomp
      volume_desc=line.split(',')[1].chomp
      puts "Volume ID = #{volume_id} Volume Description = #{volume_desc}"
      v = ec2.volumes["#{volume_id}"]
      if v.exists? 
        puts "creating snapshot"
        date = Time.now
        backup_string="Backup of #{volume_id} - #{date.day}-#{date.month}-#{date.year}"
        puts "#{backup_string}" 
        snapshot = v.create_snapshot(backup_string)
        sleep 1 until [:completed, :error].include?(snapshot.status)
        snapshot.tag("Name", :value =>"#{volume_desc} #{volume_id}")
      else
        puts "Volume #{volume_id} no longer exists"
      end
    end
  end
else
  puts "no file backup_volumes.txt"
end

I started writing it with the idea of having it just back up all EBS volumes that ever existed, but I thought better of it. So I added a file, “backup_volumes.txt”; instead it will read this and look for a volume id and a name for it, i.e.

vol-1264asde,Data Volume

If you wanted to back up everything it wouldn’t take much to extend this, i.e. change the following:

v = ec2.volumes["#{volume_id}"]

To

ec2.volumes.each do |v|

or at least something like that…
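Fleshed out a little, that “backup everything” variant might look like the sketch below. `snapshot_all_volumes` is a hypothetical helper; `ec2` is assumed to behave like the `AWS::EC2` handle created earlier (`ec2.volumes` is enumerable, and each volume responds to `#id` and `#create_snapshot`). I haven’t run this against live AWS, so treat it as a starting point rather than a drop-in replacement:

```ruby
# Sketch: snapshot every volume the account can see, instead of
# reading IDs from backup_volumes.txt. Returns the created snapshots.
def snapshot_all_volumes(ec2)
  ec2.volumes.map do |v|
    date = Time.now
    backup_string = "Backup of #{v.id} - #{date.day}-#{date.month}-#{date.year}"
    v.create_snapshot(backup_string)
  end
end
```

You lose the per-volume description from the file, so you’d want to tag each snapshot from the volume’s own metadata instead.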

Anyway, the script takes the keys via the CLI as params, so it makes it quite easy to run the script on one server in several cron jobs with different keys if needed.
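As a rough sketch, two such cron entries might look like this (the script path, log paths and key placeholders are all hypothetical, and as I said, putting real keys on the CLI is not something I’d recommend):

```shell
# /etc/crontab sketch -- same server, two sets of keys, an hour apart
0 1 * * * root ruby /usr/local/bin/ebs_snapshot.rb ACCESS_KEY_A SECRET_KEY_A >> /var/log/ebs_snapshot_a.log 2>&1
0 2 * * * root ruby /usr/local/bin/ebs_snapshot.rb ACCESS_KEY_B SECRET_KEY_B >> /var/log/ebs_snapshot_b.log 2>&1
```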

It’s worth mentioning at this point that within AWS you should be using IAM to restrict the EBS policy down to the bare minimum; something like this is a good start:

{
  "Statement": [
    {
      "Sid": "Stmt1353967821525",
      "Action": [
        "ec2:CreateSnapshot",
        "ec2:CreateTags",
        "ec2:DescribeSnapshots",
        "ec2:DescribeTags",
        "ec2:DescribeVolumeAttribute",
        "ec2:DescribeVolumeStatus",
        "ec2:DescribeVolumes"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

For what it’s worth, you can spend a while reading loads of stuff to work out how to set up the policy, or just use the policy generator.

Right, slightly back on topic: it also tags the name of the volume for you, because we all know the description field isn’t good enough.

Well, that is it. Short and something, something… Now the disclaimer: I only ran the script a handful of times and it seemed good, so please test it :)