Flexible monitoring, going up and down

The other day…

I wrote a post the other week about how much monitoring sucks, and a number of people on the internet (hello, people) just didn’t get it, so I thought some more detail would be useful. One point that was raised was about scaling servers up and down and how that affects the monitoring platform. I want to cover this specifically, as it is important to understanding why I said I think Dataloop.IO is the answer.

Nagios + Puppet

Let’s look at a typical Puppet / Nagios approach. Puppet has the concept of exported resources: an exported resource can be collected by another server and then actioned, so a cool thing to do is to have a manifest that describes a web server, like this:

# /etc/puppetlabs/puppet/modules/nagios/manifests/target/apache.pp
class nagios::target::apache {
   @@nagios_host { $fqdn:
        ensure => present,
        alias => $hostname,
        address => $ipaddress,
        use => "generic-host",
   }
   @@nagios_service { "check_ping_${hostname}":
        check_command => "check_ping!100.0,20%!500.0,60%",
        use => "generic-service",
        host_name => "$fqdn",
        notification_period => "24x7",
        service_description => "${hostname}_check_ping"
   }
}

The double @ tells Puppet to send this resource to the Puppet database, where something looking for it can pick it up later; the above is the configuration needed to define a host and add a ping check. Once the resource is exported it waits on the server until it is collected, and the collection looks like this:

# /etc/puppetlabs/puppet/modules/nagios/manifests/monitor.pp
class nagios::monitor {
    package { [ nagios, nagios-plugins ]: ensure => installed, }
    service { nagios:
        ensure => running,
        enable => true,
        #subscribe => File[$nagios_cfgdir],
        require => Package[nagios],
    }
    # collect resources and populate /etc/nagios/nagios_*.cfg
    Nagios_host <<||>>
    Nagios_service <<||>>
}

The spaceship (<<||>>) tells Puppet to collect that defined resource from the exported resources in the Puppet database, in this case any Nagios_host or Nagios_service resources. This is cool: it means a server that previously had no information about another can now do something useful with the specific information that server provides. This is a good fit for adding new hosts or service checks to Nagios, so let’s look at how you remove them next:

N/A

Seriously… if you want to remove one you have to do the following: reconfigure the host in Puppet so it no longer exports, then purge the database of the previous exports, then re-run Puppet on the Nagios server to re-add all of the resources again except the one you removed… sounds fun. You could probably make it work if you knew the server was going to be shut down. If you don’t believe me, see this. That’s as good as it gets, sorry.

The real problem

With the uptake of utility-based computing, servers come and go and we should no longer be precious about them. I always give the same answer when someone in the team asks what we should call the new server.

These are farm animals, not pets

What do I mean by that? Well, I don’t care what it’s called or even if it exists; if it causes me any problems I will shoot it in the head and get a new one. Let’s look at web servers in an auto scaling group: sometimes I have 3, sometimes 3,000. Trying to manage that flexibility in Puppet will work for scaling up, and I’m sure there’s a way to manage the scale down (if anyone has one I’d be interested in hearing it).

So why is Dataloop.IO better? Well, I think it’s better because I can draw a simple hierarchy in the web UI, take a tag, say ‘web’, and add it to the ‘web servers’ service. When I install Dataloop.IO using Puppet or Chef or the setup.sh method I only have to provide a few details: an API key and an optional tag or list of tags. So, assuming the configuration is done correctly, there will be a ‘web server’ role that all web servers collect from; I just put the tag in there and hey presto, the server(s) connect to Dataloop.IO in the right container and then download all of their checks. Let’s cover a few examples:

name "web"
description "Web server Role for configuring servers"
run_list(
  'recipe[apache]',
  'recipe[dataloop]'
)
default_attributes(  { "dataloop" =>
                          { "agent" => {
                              "api_key" => "someapikey",
                              "tags" => "web"
                            }
                          }
                      }
                    )

I made this more verbose on purpose; in reality Dataloop.IO should be included in a base role, with a simple override of the tags attribute here. The above is the entire configuration needed to have servers dynamically add all of their checks, spin up and down, and de-register themselves from the central service as needed, so the only servers in Dataloop.IO are the ones that are turned on. “So what happens when the power is yanked?” I hear you cry. Well, you get an alert, as you’d expect; it is only when the server is shut down cleanly, rather than having its power cord yanked, that it de-registers.

Let’s look at the bash equivalent; let’s say you need a server to have monitoring on it in the next 5 seconds!

sudo curl -s https://download.dataloop.io/setup.sh | bash -s <API_KEY> web

That achieves the same as the Chef example above. Because the configuration of the monitoring is done in Dataloop, the agents are all simple: they just need some auth to connect back in (the API key), and from there you can drag them into service groups, add tags, or add whatever plugins you need. If you tag the group and apply the plugins to the tag, then as long as that tag is specified the server will get all the relevant plugins. You can also layer as many of these tags on top of each other as you like; the agent will just work it out in real time.

Summary

Yes, you can scale dynamically up and down with Nagios and Puppet or Chef, but most of these tools rely on servers being on all the time, i.e. they are not cloud centric, more enterprise, where they still name their pets… Dataloop.IO doesn’t come with that sort of baggage: no firewall rules, and it’s quick and easy to set up and use, as it should be. If you’re still not convinced, I understand; watch this video first:

Monitoring sucks, really

Have you noticed…

In short, all monitoring out there sucks. I promised a few months back to do a review; I was wrong, it is not possible. Consider reviewing an industry standard tool like Nagios: after several hours of installing, I may have a server installed, not in config management, with no users or servers to monitor… This is why these types of on-premise apps will die out.

Who wants to spend weeks working out the configuration and management of a system that is meant to make your life easier? Monitoring tools are, very simply put, meant to let you know if server X is on… or off. More advanced details, like whether service X is up or service Y is down, come later.

The basic monitoring life cycle should go like this:

Day 1: is the server on or off?
Day 2: are the services I care about running?
Day 3: in X days, Y may happen.

These three things are important to monitoring; they allow you some predictability in your service, so the sooner you have them the better. A good monitoring tool is one that lets you answer these questions as quickly as possible, from the time you purchase / download it to the moment it’s on your server; quicker is better!

Bang for buck

I am acutely aware that monitoring tools that promise the world cost arms, legs, souls and pride, and worse yet fail to deliver anything of value that you need. In the past I have seen a £100k HP OpenView system replaced in a couple of weeks by Nagios, and I’ve seen Nagios + Munin replaced by Opsview because it is easier to manage and configure than both individual tools. For those that don’t know, Opsview is a nice front end and configuration layer for Nagios.

I have even, unfortunately, seen £2k a month wasted on 10 servers with New Relic. I guess the point is… monitoring costs anything from free to ridiculous; the key is always what it does for you.
Does it make your life easier?
Can you work quicker with your monitoring tool than without it?

On a side note… New Relic’s product is awesome, but if you are not using Java, why bother? If you are, you may find, like me, that your engineers find it useful but not irreplaceable… All I can say is that it wasn’t as good as Nagios for alerting on and monitoring the hosts, but it was definitely better at the application level.

Where is the happy middle ground? You need something as configurable as Nagios, as cheap as Nagios, but most importantly not Nagios, and this leaves you in an awkward position.

Nagios is awesome and has some cool features, good support, many plugins, etc. However, the server doesn’t scale easily, configuration is not as simple as it should be, and quite frankly the web UI looks like a child vomited hatred on it: just plain ugly. So you naturally lean towards Opsview, which takes away the config hassle of Nagios by providing Puppet modules and decent web UI configuration, but now you have to pay. Is it worthwhile? It’s definitely better than Nagios, but that isn’t good enough, is it? Certainly it’s a step in the right direction, but it’s not the killer tool.

Likewise, New Relic was meant to be that killer tool, designed for devs by devs. So, in short: complicated, non-standards-compliant and lacking in OS monitoring. So what is a sysadmin to do? Give up? I think not.

It comes down to this: you install tools like Opsview or CheckMK as they at least give you a better interface, but they don’t solve the issues of NRPE or of firewall rules having to be opened in all directions. It’s for this reason I think there has to be a better way; I don’t want to spend my time opening up rules, I want something simple and powerful.

There are new tools coming onto the market that sound better to me. Imagine being able to leverage the Nagios community while having an easy-to-drive UI on a monitoring tool that gives you the same power as Chef’s knife or Puppet’s MCollective (the Marionette Collective), while being able to update all of this through simple git commits or the web UI as you see fit. Writing a new monitoring check is done during the analysis process rather than going into the backlog, or you can simply utilise the RPC nature of the tool to debug issues in production and write checks on the fly. Did I mention that, while doing all this, it is also able to act like Pingdom and provide dashboards to management?

So where does this leave us? Well, looking to tools like Dataloop.IO for solutions. I have had the privilege of using it while it is in closed beta, and they’ve been really good at taking on feedback to make it the monitoring platform I need it to be. It is getting close to being ready, and I’m genuinely excited about what is going to happen to this platform over the next year or two.

Foundation building is important

The man who built his house on sand

You are probably familiar with the parable of the man who built his house on sand; if not, read this. It’s important to have a solid foundation to work from when you want to start considering Continuous Delivery (CD) or Continuous Integration (CI).

From an IT perspective this would be like a CTO dictating that CD is the only way to do things, which, when poorly managed, leads to something that is poorly tested, poorly structured and hard to innovate on. By the time the pessimistic IT bod has mentioned it to his boss and it has been turned into management speak, then translated into senior management speak, it ends up mistranslated into something completely different.

IT bod “It’s taking ages because the puppet manifests are a complete mess where we had to keep rushing stuff”
IT Bods’ Manager “It’s taking longer than expected as the work is more complicated but it will be done soon”
IT Director “We are spending our time making sure we do this right, we don’t cut corners”
CTO “We have a really stable well produced system”

Yay. I’m 90% sure this is how it works… People become afraid to say how bad it is, but from experience I can honestly say that when you start telling people bluntly, they stop hassling you; they also stop talking to you, so it is a hard thing to make better, and it’s harder still when the whole chain of people desperately want to come across as having done an awesome job.

Imagine that situation, and add in people who are brought in to deliver just that while being asked to do lots of other stuff that isn’t in scope, and you can end up with something that, with lots of careful hand-holding, produces a build. Maybe it even builds an environment with only 2 or 3 hours of hand-holding; maybe it’s good enough for production using VirtualBox. Who knows.

Typically these nightmarish situations exist only because someone wasn’t clear in defining what the problem was, or, when they did, they allowed themselves to be pushed over. Well, I’m saying that’s not good enough: everyone in the chain has a responsibility to communicate the problem in clear, certain terms so there is no ambiguity about how bad the situation is.

Foundations

The latest trend is all towards Continuous Delivery (CD), Continuous Integration (CI) and all these other wonderful DevOps words. Although it is possible for you to take code and deploy it automatically, it is stupid to do so without a sufficient understanding of what the consequences could be. As such, it is important to identify what you need to be able to deliver effectively before working out what you need to do to achieve CI or CD.

So before considering CD or CI you need to be able to do the following things, as a minimum:

  • Easily differentiate between each configuration release
  • Easily differentiate between each infrastructure release
  • Easily differentiate between each application release
  • Be able to build each application server from scratch
  • Be able to build the infrastructure from scratch
  • Be able to track work through a process, i.e. from request to release, for new infrastructure, application code or configuration
  • Have an agreed process for peer review of changes
  • Have an agreed release process
  • Be able to manually follow the processes that are in place
  • Adequate test coverage of infrastructure
  • Adequate test coverage of configuration
  • Adequate test coverage of the application

Once you have those basics in place you can start to look at automating each step; skip the list at your peril. Let’s touch on a few for clarity’s sake. “Easily differentiate between each … release”: the reason for these is that at some point someone will say “it’s not working and you broke it”, and you want to turn that from an opinion-based argument into a factual one. The easiest way to do that is a simple diff between the previous and the current release: no ambiguity, only facts.

Let’s look at “Be able to build … from scratch”. This is really important: the only way to guarantee that your box is in the state you know it to be in is to build it from scratch. Use a golden disk, an AMI or a plain OS, it doesn’t matter, as long as you bring that box up from scratch and build it through to a working state, hands off. I’ve had conversations with people that don’t get it; sometimes the arguments go like this… “We don’t need to, because everything is in Puppet.” Well, lies… no one puts everything in Puppet, and even if you did, I logged on and stopped the process, or I installed a package that wasn’t in Puppet, or I started a service, or I changed a file that wasn’t managed, etc. etc. No excuses: build from scratch. It’s really important for the message this sends to the rest of the business, which is consistency through process.

Processes are important: they describe the things you will and won’t do, they need to be public, they need to be really simple, and then they can be automated. Starting without a process is just going to mean re-working steps as others in the business have different opinions about how it should be done, so it’s good practice to sort that out as soon as possible.

The last set, “Adequate test coverage of …”, needs to be in place beforehand. These tests will become your computerised approver, so at the very least they should do everything their human counterpart does to check the system, and they need to evolve as time goes on to include more and more tests. When the confidence is in the testing, it shouldn’t matter when you release or how often, as you have a set of tests that you and the business trust.

Summary

It’s important to try not to rush into the final solution; everybody wants it, and it’s everyone’s responsibility to check and cross-check that the process is being followed sensibly and to call foul if anyone tries to change the process or the requirements. The only way to do this is with some sort of consistency, and that should be the driving force: the business needs to accept that if the pipeline is broken the releases don’t happen, but when the pipeline is fixed they should all go fine. This turns the whole release cycle into a maintenance process rather than active involvement in each release, and over time that will become more and more stable and beneficial to the business as a whole. So before trying to do CD or CI, make sure you can put ticks next to the bulleted list above, else you’re just wasting time.

AWOL – Sorry!

An Apology to you all

I thought it was time I apologised for not being around much for the last few months. The new job I took on in September has had some challenges, and by that I mean problems, and by problems I mean evolutionary screw-ups. For the battle-hardened sysadmin this is nothing out of the ordinary, but this is the first time I had started somewhere that wanted to do “DevOps” from scratch and it was all about continuous delivery; unfortunately they made a few mistakes, which I want to cover so that not only can you as fellow readers look to avoid them, but you can also understand the steps we have taken to start on the path to fixing it. And do not be deluded into thinking we are near the fix; we have simply turned the boat to face the right direction, and how we keep it on course is a conversation for next month.

I’m sure this is going to be a hard battle to win, but I’m certain that in three months’ time we will have some of the basics under control while stretching for nirvana, as all good teams should. Now, on with the fundamentals that no one should fail; really, well, at least try not to.

Now do pay attention, 007

  1. Magic cannot happen without hard work – Buying a book like The Lean Startup and preaching it to the masses as the right thing to do is fine; doing that and then failing to follow what you preached is bad. Large organisations looking to do the lean startup should not simply spin up a department, send it a huge budget and then expect magic. That is only part of the journey; trying to do the lean startup without measuring and learning is asking for trouble, and for you to lose focus on why it was a good idea in the first place.
  2. Don’t implement the end goal first – I mean, seriously! I’m sure there was a famous saying, “Run before you can walk”? Okay, I’m obviously being facetious, and you should know it’s “We must learn to walk before we can run”, and I’m not trying to quote Tony Stark. With IT operations and DevOps there is a cost paid for every sticking plaster and every ‘good enough’ solution, and that toll is really easy to fix early on. People understand this with software development but not with operations: a silly two-minute decision about going live before the system is ready can take years of a lot of people’s time to fix, versus delaying by a week or two; the cost-benefit analysis would look hysterical. Continuous integration and delivery are built on good foundations; if you can’t build the system or manage a release manually and successfully, you are not ready, try harder. I’m not saying don’t push yourself, but you all need to understand where the line is and not compromise on it unless there’s a good cause to do so.
  3. Accountability and structure are key – Someone within the business needs to own the whole operational lifecycle before the system is released; in fact, while you are still having the idea for running a service, employ someone to work out what they need to make it a success three to six months from now. The operational involvement in releasing a service should be iterative and inclusive from the outset, else you’ll end up with code that is not deployable.
  4. Third parties don’t care about your problems – I’m not saying this is true in all cases, as they should care, but they are driven by different things, different requirements, which makes it easy for them to move on. Just because a third party can run the software they made doesn’t mean you can. And to said third parties: release well-tested, versioned code or run a service, don’t do both badly, which you are, sorry. At least by running a service you can hide the fact your code is bad, but giving it to people when it’s clearly a version 1 and not suitable to be run anywhere is bad.

We can fix it

One of the first challenges I faced was not having the right structure for any real push-back on ‘craziness’, i.e. no management-level buy-in, no operational seat at the table. This is quite important, as it allows the team to push back on work without being distracted by other tasks; someone needs to make the hard calls about the live site being down versus releasing code.

Stabilisation of the core fundamentals is critical: get to a stage where reproducibly building the system from scratch is possible. Ensure that updates and releases can be performed reliably by hand. Make sure that the system is supportable end to end.

Have an end goal: work out what utopia is and narrow in on it as time goes on, and start executing to a plan. Think of this more as playing Civilization V or something; if you have a strategy you will be fine, whereas shooting in the dark will end badly for everyone.

Summary

There is always a way out of the most horrid situations, though it does require some compromising on the solutions. The goal is to do whatever it takes to build up slack: enough slack that some of the bits that were done badly can at least be done properly, building up yet more slack. Hopefully, by the time a few months of slack has been built up, some sort of system can be implemented to ensure ongoing operations remain focused and within plan. Just remember the end goal and aim for it (roughly).

Secure salted passwords with ruby

Step one, Understand

So I’ve been playing with Sinatra and Redis over the past few months and, as part of my more professional side, I am creating a blog platform for my other website. As a result I wanted user authentication to ensure that I, and only I, could update it. There’s a chance that at some point I may want to allow others to sign up and log in, but quite frankly not yet, and this is overkill; nonetheless we learn and develop, so here’s the first step in building it from scratch.

Understanding some key concepts when it comes to authentication is, well, key, so first some history lessons and why it is a bad idea to use those approaches today. Back in the day, way back when, people were trusting and assumed no evil of this world; we call these people naive. Predominantly they relied on servers being nice and snug behind firewalls and locked cabinets, and as such passwords were saved in plain text in a database, or a file, who cares: it’s human readable, so the attack on this is trivial. Let’s assume you are not silly enough to run the database over a network with no encryption and are instead running it locally; well, if I have access to your machine I will probably find it in a matter of minutes, and if I have physical access, within an hour it’s mine, all of them. On a side note, if you’re using plain text passwords anywhere you’re either an idiot or in the early stages of testing…

After realising this approach was bad, people thought: I can protect this, I will hash the password! Wonderful. So you use MD5 or SHA, it doesn’t matter which, but for this example let’s say you chose MD5. What you have done here is very cunningly created a password that is not human readable. However, be ashamed of yourself if you still think this is secure, and here’s why. There are a lot of combinations (2^128 = 340 trillion, trillion, trillion), which to be honest is a lot, and the chances of two clashes are so slim, why worry about it? Wrong, just plain wrong. Here’s why it’s bad. One: people are idiots, and for some strange reason we typically use words to make up passwords, so your password was probably something like “fridge”. What this means is that if I, as Mr Hacker, get hold of your DB, I sit there and run a dictionary-style attack, turning thousands of common words into MD5 sums, and before long I have a word that produces the same MD5 sum as your password (see this). What’s better is that, because you’re a human, it’ll probably work on all the sites you use. Nice. The second problem is that I don’t have to actually do that grunt work; people have already done it for me, and the results are called rainbow tables, so I can trivially download one and do a simple query to find a phrase whose hash matches yours. Don’t think it’s an issue? LinkedIn did, and they fixed it perfectly.
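
To make the dictionary attack concrete, here is a toy illustration in Ruby; the stolen hash and the word list are made up, and a real attack would iterate over millions of common words and leaked passwords.

require 'digest'

# The unsalted MD5 hash lifted from the compromised database
stolen_hash = Digest::MD5.hexdigest('fridge')

# A tiny "dictionary"; real attacks iterate over millions of candidate words
%w[password letmein kettle fridge].each do |word|
  if Digest::MD5.hexdigest(word) == stolen_hash
    puts "Cracked it: the password is '#{word}'"
    break
  end
end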

Excellent. So now we’re getting to the point where we need something better, and this is where salts come in. The basic concept is that you type in a password and I, as the server, concatenate a random string to it and generate a hash of the result. This overcomes the rainbow table, because the likelihood of someone having generated a rainbow table for my random salt is extremely low. However, let’s assume I’m a big web provider that got hacked and lost everyone’s passwords. “Ha! I used a salt, good luck cracking that!” they say. Sure, but a few points: one, the salt is stored in the DB; two, computers are quicker than they used to be. With the advances in GPU cracking and Amazon boxes, password cracking is more or less a matter of time and money; how much of each is the key.

Using a single salt for all passwords is still bad; the best approach that I know of today is to use a unique salt for every password. Even if all users have the same password they all end up with unique hashes, and this is the cornerstone. For every user that signs up, generate a secure random salt, add it to the password and generate a hash. That way a massive amount of time would be needed to crack each individual password, so unique hashes: very good.
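
Here is a quick sketch of the idea; SHA-256 is used purely as an example digest and the salt size is an arbitrary choice.

require 'securerandom'
require 'digest'

# Two users pick exactly the same password...
password = 'fridge'

# ...but each gets their own random salt when they sign up
salt_a = SecureRandom.hex(32)
salt_b = SecureRandom.hex(32)

# ...so the stored hashes are completely different, and a rainbow table
# built against one salt is useless against the other
puts Digest::SHA256.hexdigest(salt_a + password)
puts Digest::SHA256.hexdigest(salt_b + password)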

The code

So, history lesson over, now some code. As identified above, the best approach is to generate a unique salt for every user and then hash their password. This isn’t hard, but you do need to ensure the salt is securely random, else you may be generating something not as secure as you thought.

Have a look at this gist, Auth.rb. A useful point: the salt should be large, so if you produce a 32-byte hash your salt should be at least 32 bytes as well. It’s a straightforward lib that will generate salted hashes and let you validate passwords against them later.
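
The gist itself isn’t reproduced here, but to make the examples below easier to follow, here is a minimal sketch of what such a lib could look like; the salt length, digest algorithm and storage format are my assumptions rather than necessarily what Auth.rb does.

# auth.rb - a minimal sketch of a salted-hash helper.
# Assumptions: 32-byte salts, SHA-256, and the salt stored as a hex prefix of the hash.
require 'securerandom'
require 'digest'

module Auth
  SALT_BYTES = 32

  # Generate a cryptographically secure random salt (hex encoded, 64 characters)
  def self.gen_salt
    SecureRandom.hex(SALT_BYTES)
  end

  # Store the salt in front of the digest so it can be recovered for verification
  def self.gen_hash(salt, password)
    salt + Digest::SHA256.hexdigest(salt + password)
  end

  # Pull the salt back out of a stored hash
  def self.get_salt(hash)
    hash[0, SALT_BYTES * 2]
  end

  # Re-hash the supplied password with the stored salt and compare
  def self.hash_ok?(hash, password)
    gen_hash(get_salt(hash), password) == hash
  end
end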

Here’s an example of it in use.

require_relative 'auth'
#This is from the web front end so you can see how to use the Lib to check a hash
def user_auth_ok?(user,pass)
  #Get user's Hash from DB
  user_hash = $db.get_user_hash(user)
  #Validate hash
  if Auth.hash_ok?(user_hash,pass)
    #User authenticated
    return true
  else
    #user is not authenticated
    return false
  end
end

#As a side point, this bit is from the backend that writes users to the DB.
#If it were the chunk the website used it would simply call a method that took a username and password,
#so it's probably not that useful...
def add_user(username,password)
  #Poke the appropriate keys, create default users, example post etc.
  uid = @redis.incr('users')
  salt = Auth.gen_salt()
  hash = Auth.gen_hash(salt,password)
  $logr.debug("Salt = #{salt}, hash = #{hash}")
  @redis.set("users:#{uid}:#{username}",hash)
end

def edit_user_password(username, password, newPassword)
  user = @redis.keys("users:*:#{username}")
  if user.size < 1
    $logr.debug("No user #{username} found")
  else
    #Validate old password
    hash = @redis.get(user[0])
    if Auth.hash_ok?(hash, password)
      $logr.info("Setting a new password for #{username}")
      newHash = Auth.gen_hash(Auth.get_salt(hash),newPassword)
      @redis.set(user[0],newHash)
    else
      $logr.info("password incorrect")
      #TODO - Need exception classes to raise Auth failure
    end
  end
end

Releasing your first Devops Application

First the worry

When it comes to releasing the first version of an application, it’s always worth weighing up the constraints of your environment and the time frame in which the task must be delivered versus the skill set available. Inevitably, as a skilled DevOps professional you want to do a good job, well done you; however, you have to be strong and realise it is not about delivering perfection from day one, but about the journey you must take to get there.

I recall the first deployment I did for a version 1, and every time I have done one since I have got better, be it by being a bit more focused or by having a better starting point. The very first one I did was all over the place: no real configuration management, quite a few manual steps, but a well-written process. Unfortunately that project remained in the depths of secrecy and I ended up moving on.

I constantly see over-engineering and complication added to projects, and the root cause of this is worry. I know, I used to be there doing it; it is difficult to step back and be objective about what the business needs, but as a DevOps professional that is your job. When delivering a solution, try to remember these things to help you worry less and focus more:

  1. Before being perfect you must first just “be”
  2. When in doubt, do less
  3. If you do not know when the site is down you will not have a job
  4. Always have a backup

Then the delivery

The above list is rather useful; use it as a bit of guidance. Starting with point 1, some elaboration: when delivering a solution the most important thing is to deliver the solution. So many people forget this part and focus on the technicalities, or on whether or not it is the “best” way to deliver the solution. In reality, who cares? No one will care when you are in that meeting explaining why you’re late and have not got a working solution.

Getting stuck in the detail is a horrible place to be; sometimes things get too involved or too complicated, leading to much discussion, and inevitably the solution comes out complicated and takes a while to deliver. In these situations point 2 comes in: just do less. It sounds silly, but if you’re rushing around struggling to meet a deadline then you need to take things out of scope and focus on what the solution actually needs to be. Maybe you have to have a manual step; at a later point you can automate it.

The last two points are along the same lines, and those lines are things that get you fired. If your site is down and you don’t know that it is down, that’s a bad thing; likewise losing data is considered pretty poor. However, do not get stuck in the trap of assuming you must have full monitoring of every server, or that the backup needs to be anything more than a cron job for now.

The “trick” is always in identifying what needs to be done versus what could be done; by focusing on what needs to be done first, you can then come back and improve the rest.

Build, improve, rinse, repeat

As touched on earlier, you are allowed to cut corners and focus on what is necessary; failure to do this will just lead to delays and a business that rapidly gets turned off DevOps. The first release you do can be complete and utter crap; it can be all manual, with nothing more than a simple web check on port 80 (see the sketch below), and that is okay. The important thing is that you deliver to the deadline and that you have mitigated the main risks of not knowing when the site is down or of potential data loss. Heck, even single points of failure are allowed, as long as you can clearly identify what the risk is and what the solution would be if it happened. In fact, I’d almost go as far as to say this is expected.
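
To put that in perspective, a simple web check really can be as small as the sketch below; the URL is a placeholder and the exit codes just follow the usual Nagios plugin convention of 0 for OK and 2 for critical.

#!/usr/bin/env ruby
# Minimal "is the site up?" check, suitable for cron or a Nagios-style scheduler
require 'net/http'

response = Net::HTTP.get_response(URI('http://www.example.com/'))

if response.is_a?(Net::HTTPSuccess)
  puts "OK - site responded with #{response.code}"
  exit 0
else
  puts "CRITICAL - site responded with #{response.code}"
  exit 2
end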

The key, as always, is to improve, little and often. Step 1: manual. Step 2: automate what is easy. Step 3: automate the rest. It has never been and will never be about perfection from version 0.1 onwards; you just need to improve a little each time, in line with that golden view of what perfection is. As long as you know what the end goal is you can work towards it; just don’t get carried away trying to deliver it all in the first version.

Deploying Sinatra based apps to Heroku, a beginners guide

A bit of background

So last week I was triumphant: I conquered an almighty task and managed to migrate my company’s website from a static site to a Sinatra-backed site using partial templates. I migrated because I was getting fed up with modifying all of the pages, all two of them, whenever I wanted to update the footers or headers; this is where the partial templates came in. Sinatra came in because it had decent documentation and seemed good…
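
For anyone curious what that looks like in practice, here is a minimal sketch of a Sinatra app using partial templates; the class and view names are illustrative rather than my actual site, and it assumes the sinatra-partial gem that appears in the Gemfile further down.

# app.rb - sketch of a modular Sinatra app using partial templates
require 'sinatra/base'
require 'sinatra/partial'

class Website < Sinatra::Base
  register Sinatra::Partial
  set :partial_template_engine, :erb

  get '/' do
    # views/index.erb can pull shared chunks in with <%= partial :header %>
    # and <%= partial :footer %>, so headers and footers live in one place only
    erb :index
  end
end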

So, feeling rather pleased with myself, I set about working out how to put this online with my current hosting provider, with whom I have a few domain names, but only through an acquisition they made. I thought I’d give them a shot when setting up my company site, better the devil you know etc., and they supported PHP, Ruby and Python, which was fantastic as I knew I would be using Python or Ruby at some point to manage the site. After a frustrating hour of reading, trying to work out how to deploy the Ruby app and finding no docs with the hosting provider, I logged a support ticket asking for help, to which the reply was along the lines of “I’m afraid our support for Ruby is very limited”. I chased them on Friday to try to get a response on when it would be available; no progress, some excuses because of the platform, so I asked “Do you currently have any servers that do run Ruby?”, to which the reply was “I’m afraid we have no servers that run Ruby, it shouldn’t be listed on our site, I didn’t know it was there.”

By this point alarm bells were ringing and I thought I had best think about alternatives.

Getting started

Before even signing up to Heroku it’s worth getting a few things sorted in your build environment; I had to implement a lot of this to get it working, and it makes sense to have it beforehand. For starters, you need to be able to get your application running using rackup, and I came across this guide (I suggest reading it all). In short, you use Bundler to manage which gems you need installed, and you do this by creating a Gemfile listing the gems and specifying the Ruby version (2.0.0 for Heroku).

My Gemfile looks like this:

source 'https://rubygems.org'
ruby '2.0.0'

# Gems
gem 'erubis'
gem 'log4r'
gem 'sinatra'
gem 'sinatra-partial'
gem 'sinatra-static-assets'
gem 'split'

It simply tells Rack / Bundler what is needed to make your environment work, and with this you can do something I wish I had found sooner: you can execute your project in a contained environment, so you can test that you have the dependencies correct before you push the site, by running a command like this:

bundle exec rackup -p 9292 config.ru &

NB You will need to run

bundle install

first.
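
If you don’t already have a config.ru, a minimal one for a Sinatra app can be as small as this; it assumes app.rb defines your application, and the class name matches the illustrative sketch from earlier.

# config.ru - the Rack entry point used by both rackup and Heroku.
# For a classic-style app you would run Sinatra::Application instead.
require './app'

run Website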

By now you should have a directory with a Gemfile, a Gemfile.lock, app.rb, config.ru and various directories for your app. The only other thing you need before deploying to Heroku is a Procfile, with something like the following in it:

web: bundle exec rackup config.ru -p $PORT

This tells Heroku how to run your app, which, combined with the Gemfile, Bundler and the config.ru, means you have a nicely contained app.

Signing up

Now, why would I look at Heroku when I’ve already spent money on hosting? Well, for one, it will run Ruby; two, it’s free for the same level of service I have with my current provider; three, it’s seven times quicker serving the Ruby app on Heroku than serving the static files with my current host. So, step one: sign up. It’s free, no credit card, and you get one dyno (think of it as a fraction of a CPU; I’m not convinced you get a whole one).

Create a new app. Now, a good tip here: if you don’t already have a GitHub account, Heroku is going to give you a git repo for free; granted, no fancy graphs, but a nice place to store a website without forking out for private repos on GitHub. Once your site is in the Heroku git repo you just need to push it up and watch it deploy; at this point you may need to fix a few things, but… it’ll be worth it.

Performance

I don’t want to say it’s the best, so I’m going to balance up the awesomeness of what follows with this; I suggest you read it so you can form your own opinions.

So, using Pingdom’s web performance tool, I tested the performance of my site hosted in the UK versus Heroku in AWS’s European (Ireland) region, and here are the results:

The current site is behind a CDN provided by Cloudflare and has already had a few tweaks made to make it quicker, so this is as good as it gets for the static site: results

Now the new site, unpublished due to the aforementioned hosting challenge, does not have a CDN and is not using any compression yet (unless Heroku is doing it), but its performance is significantly quicker, as seen in the results.

For those of you who can’t be bothered to click the links: the current site loads in 3.43 seconds, which is slow but still faster than most sites, while the Heroku-based site loads in 459 ms, so roughly seven times quicker, and it’s not behind a CDN yet or whitespace-optimised. That’s pretty darn quick.
