DNS results in AWS aren’t always right

A bit of background

For reasons that aren't particularly interesting, we have a requirement to run our own local DNS servers, which simply hold the forward and reverse DNS zones for a number of instances. I should point out that the nature of AWS means this approach is not really ideal, especially if you are not using EIPs, and there are better ways. Thanks to various technologies it is possible to make this solution work, but don't overlook the elephant in the room.

What elephant?

A few months ago, while doing some proof-of-concept work, I hit a specific issue relating to RDS security groups: I had granted my instance access to the DB by adding the security group the instance was in to the RDS security group. One day, after the proof of concept had been running for a number of weeks, access to the DB suddenly disappeared for no good reason, and we noticed that adding the instance's public IP to the RDS security group restored access. Odd. The issue happened once and was not seen again for several months, then it came back. Odd again. Luckily the original ticket was still there, and another ticket was raised with AWS, to no avail.

So, a bit of a diversion here: if you are using Multi-AZ RDS instances you can't afford to cache the DNS record; at some random moment it may flip over to a new instance (I have no evidence to support this, but also can't find any to disprove it), so the safest way to get the correct IP address for the DB instance is to ask Amazon for it every time. You can't simply take whatever IP was returned last and set up a local hosts file or a private DNS record for it; that's asking for trouble.
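To illustrate the "ask every time" approach, here is a minimal Ruby sketch; the class name, hostname and injectable resolver are mine for illustration, not from any real deployment:

```ruby
require 'resolv'

# Illustrative sketch: resolve the endpoint on every use, so a
# Multi-AZ failover that changes the record is picked up immediately
# rather than being served from a stale cache or hosts file.
class DbEndpoint
  def initialize(hostname, resolver: Resolv)
    @hostname = hostname
    @resolver = resolver   # injectable, so it can be tested offline
  end

  # deliberately no memoisation: ask DNS every single time
  def ip
    @resolver.getaddress(@hostname)
  end
end
```

The point is simply what the method does not do: there is no `@ip ||=` caching, so a failover shows up on the very next call.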

So we had a DNS configuration that worked flawlessly 99.995% of the time, and at some random, unpredictable moment it would flake out; it was just a matter of time. As everyone should, we run multiple DNS servers, which made tracking down the issue a little harder, but eventually I did. The results we got back depended on which of our name servers the instance queried, and on how busy AWS's name server was when ours queried it. Occasionally one of our name servers would return the public IP address for the RDS instance, causing the instance to hit the DB on the wrong interface, so the mechanism that does the security group lookup within the RDS security group was failing; it was expecting the private IP address.

The fix

It took a few minutes of looking at the DNS server configuration, and all looked fine; if it were running in a corporate network it would have been fine, but it is not. It is effectively running inside a private network which already has a DNS server running split views. The very simple mistake was the way the forwarders had been set up in the config.

See the following excerpt from the BIND documentation:

forward
This option is only meaningful if the forwarders list is not empty. A value of first, the default, causes the server to query the forwarders first, and if that doesn’t answer the question the server will then look for the answer itself. If only is specified, the server will only query the forwarders.

The forward option had been set to first, which for a DNS server in an enterprise is fine: it will use its forwarders first, and if they don't respond quickly enough it will look up the record via the root name servers. When you're looking up a public IP address that doesn't matter, but when you're looking up a private IP address against a name server that uses split views, it makes a big difference in terms of routing.

What we were seeing was that when AWS's name servers were under load and unable to respond quickly enough, our name server got a reply via the root name servers, which could only provide the public IP address. Our instance therefore routed out to the internet, hit Amazon's internet router, turned around and hit the public interface of the RDS instance from the instance's NATed public IP, and so the traffic was not seen as coming from within the security group. Doh!

Luckily the fix is easy: set it to "forward only" and ta-da. It may mean you have to wait a few milliseconds longer now and then, but you will get the right result 100% of the time. I think this is a relatively easy mistake to make, but it can be annoying to track down if you don't have an understanding of the wider environment.
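For illustration, a minimal named.conf excerpt with the corrected setting; the forwarder address is a placeholder (in a VPC the Amazon-provided resolver normally sits at the base of the VPC CIDR plus two):

```
options {
    // never fall back to the root servers, which can only
    // return the public (internet-facing) view of the record
    forward only;
    forwarders {
        10.0.0.2;   // placeholder: your VPC's resolver
    };
};
```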

Summary

Be careful: if you're running a DNS server in AWS right now, I suggest you double-check your config.

It is probably also worth learning to use "nslookup <domain> <name server ip>" to help debug any potential issues with your name servers. Be aware that, because of the nature of the problem, you are not likely to see it for a long, long time; seriously, we went months without noticing any issue, and then it just happened. If you're not monitoring the solution it could go unnoticed for a very long time.

Sentinel – An open source start

An open source start

Last week I introduced the concept of Self Healing Systems, which then led me on to have a tiny bit of a think, and I decided that I would write one. The decision took all of five minutes, but it gives me an excuse to do something a bit more complex than your everyday script.

I created a very simple website here which outlines my goals. As of writing I have most of the features coded up for the MVP; I do need to finish it off, which will hopefully be by the time this is published, but let's see.

I decided to take this on as a project for a number of reasons:

  1. More Ruby programming experience
  2. Other than Monit there don't seem to be any other tools, and I had to be told about that one…
  3. It’s a project with just the right amount of programming challenge for me
  4. I like making things work
  5. It may stop me getting called out as often at work if it does what it’s meant to

So, a number of reasons. I've already come across a number of things that I don't know how to solve, or don't know the right way of doing, which is good: I get to do a bit of googling and work out a reasonable approach. To be honest, though, that is not going to be enough in the long run; hopefully as time goes on my programming experience will increase sufficiently that I can keep improving the code.

Why continue if there’s products out there that do the same thing?

Why not? Quite often there's someone doing the same thing even if you can't find evidence of it; competition should not be a barrier to entry, especially as people like choice.

I guess the important thing is that it becomes usefully different. Take a look at systems management tools, a personal favourite of mine: you have things like RHN Satellite, Puppet and Chef; three tools, one very different from the other two, and another only slightly different. People like choice; different tools work differently for different people.

I guess what I mean by that is that some people strike a chord with one application or another and as a result become fanboys, normally for no good reason.

There's also the other side of it: I've not used Monit. I probably should; I probably won't, but it doesn't sound like where I want to go with Sentinel. Quite simply, I want to replace junior systems administrators. I don't want yet another tool that has to be driven; I want a tool that provides real benefit by doing the checks on the system and making deterministic decisions based on logic and raw data, not just by looking at the system it's on but by considering the whole environment it is part of. I think that is a relatively ambitious goal, but also a useful one, and hopefully it will get to a point where the application is more useful than the MVP and can do more than just look after one system.

Like any good open source product it will probably stay at version 0.X for a long time, until it has a reasonable set of features in it that make it more than just a simple Ruby programme.

A call for help

So I've started on this path, and I intend to continue regardless. One thing that will help keep me focused is user participation, whether through using the script or through logging bugs at the GitHub site it's hosted on.

I think what I need at the moment is some guidance on the architecture of the project. It's clear to me that in a matter of months, if not weeks, this single-file application will become overly complicated to maintain and would benefit from being split out into classes. Although I know that, I don't know the right way of doing it; I don't have any experience of larger applications, so if anyone does, that would be good to hear!

In addition to the architecture of the application, there are some programming issues which I'm sure I can overcome at some point, though I will probably arrive at the solution by having a punt and seeing what sticks. There's a wonderful switch in the code for processor states which needs to change: I need to iterate through each character of the state and report back on its status, whereas at the moment it is just looking for a combination. To start with I took the pragmatic option: add all of the processor states my system has to the switch and hope that's enough.
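For what it's worth, a rough sketch of that per-character approach (this is not the Sentinel code itself); the state letters and meanings follow Linux ps(1), and the map here is deliberately partial:

```ruby
# Illustrative sketch: map each character of a ps state string
# (e.g. "Ssl") to its meaning, instead of switching on whole
# combinations and hoping you listed them all.
STATE_MEANINGS = {
  'R' => 'running',
  'S' => 'interruptible sleep',
  'D' => 'uninterruptible sleep',
  'Z' => 'zombie',
  'T' => 'stopped',
  's' => 'session leader',
  'l' => 'multi-threaded',
  '+' => 'foreground process group',
}.freeze

def describe_state(state)
  # unknown letters are reported rather than silently dropped
  state.chars.map { |c| STATE_MEANINGS.fetch(c, "unknown (#{c})") }
end
```

This way a new combination like "D+" needs no new switch arm; only genuinely new letters need adding to the map.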

So if anyone feels like contributing, or can even see a simple way of fixing some dodgy coding, I'd appreciate it. The only thing I ask is that if you are making changes, you read the README, log a ticket in GitHub, and commit the changes with reference to the ticket, so I know what's happened and why.

So please, please, please get involved with Sentinel

Self healing systems

An odd beginning

So I'm writing this having just spent the last 10 days on my back in pain, finally starting to feel better. It's not come at a good time, as another member of my team has decided they had a "better opportunity". This is the second person to have left this organisation without so much as a passing comment to me that they were even leaving; how strange. But I digress.

Either way it opens up a void: a team of two and a manager, now down to a team of one, with the one having back pain that could take me out of action at any moment. Unfortunately, right up to the day I couldn't make it in, the system we look after had been surprisingly stable, rock-like in fact; as soon as I said "I'm not going to make it in", the system started having application issues (JVM crashes).

Obviously the cause needs a bit of looking into and a proper fix, etc., but in the meantime what do we do? I had an idea; a crazy idea which I don't think is a fix to any problem, but it is at least a starting point.

Sentinel

I spent a bit of time exploring Ruby a few weeks back, so I started to look at ways of writing something that would do a simple check: is process X running? The simple version I wrote just checked that Tomcat was running the right number of instances (our application always runs two). If it was two, do nothing; if it was more than two, also do nothing (something else crazy has happened, so it just logs to that effect); but if it was less than two it would try a graceful-ish restart of the service.
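As a rough illustration (not the actual script), the decision logic described above reduces to something like this; the expected count and the process-matching pattern are assumptions:

```ruby
# Illustrative only: the check reduced to a pure function.
# EXPECTED is an assumption (the application always runs 2 Tomcats).
EXPECTED = 2

def decide(running)
  if running == EXPECTED
    :ok           # all as it should be, do nothing
  elsif running > EXPECTED
    :log_anomaly  # something else crazy has happened, just log it
  else
    :restart      # fewer than expected, attempt a graceful-ish restart
  end
end

# Counting the processes might look like this; the ps invocation and
# the match pattern are placeholders, not from the original script.
def running_count(pattern = /catalina/)
  `ps -eo args`.lines.count { |line| line =~ pattern }
end
```

Keeping the decision separate from the counting makes the logic trivially testable without any real processes involved.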

So this obviously works in the one specific case that we have, but it isn't extensible and it doesn't do any better checks, which all got me thinking: why isn't there something to do this for us? I don't know of anything that does, and if anyone does I'd appreciate knowing; there are a number of tools that could be muddled together to perform the same sort of function.

Nagios could monitor the system, Cucumber could monitor the application interactions, and Swatch could monitor the logs, but in most cases these are just monitoring. I'm sure there are ways to get them to carry out actions based on certain outcomes, but why use so many tools?

Yes, the individual tools probably do the job better than a single tool could, but as a sysadmin I'd rather have one tool to do everything, though that isn't practical either. So can we somehow get the benefits of monitoring with Nagios, but have a tool that specifically watches the application performance Nagios is gathering information about, and then makes decisions based on it?

The big idea

So I wonder if it'd be possible to write a simple Ruby application that, every now and then, carries out a number of actions:

  1. Check the service on the box: right number of processes, not zombied, etc.
  2. Check the disk capacities
  3. Check the CPU utilisation
  4. Check the memory utilisation
  5. Probe the application from the local box, a loopback test of sorts
  6. Integrate with Nagios or another monitoring tool to validate the state it thinks the box is in, compared with the locally gathered stats
  7. Depending on the outcome of all the checks, carry out a number of actions
  8. Hooks into ticketing systems
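The steps above could hang together as a simple check-and-decide loop; this skeleton is purely illustrative (the check names and structure are mine, not Sentinel's):

```ruby
# Purely illustrative skeleton: run a set of named checks and report
# which ones fail, so a later step can decide what action to take.
class Sentinel
  def initialize(checks)
    @checks = checks   # { name => lambda returning true (ok) / false }
  end

  # run every check once and collect the results by name
  def run_once
    @checks.transform_values(&:call)
  end

  # names of the checks that came back false
  def failing
    run_once.reject { |_name, ok| ok }.keys
  end
end
```

Each numbered item above would become one entry in the checks hash, and the "carry out a number of actions" step would consume `failing`.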

When I was thinking this through the other day it seemed like a good idea. The biggest issue I have is not being a programmer, so I have a steep learning curve; it's a complicated application, so it requires some thought. I would also probably have to ignore everyone who thinks it is a waste of time, which isn't too hard to do.

I guess what I'm thinking of is something like FBAR. As a system scales up to hundreds of servers, uptime and reliability become more important, and it is sometimes necessary to take a short-term view to keep a system working. The most important thing is that those short-term fixes are totalled up and then logged as tickets: 1% of your servers crashing and needing a restart isn't an issue, but if that 1% becomes 5% and then 10%, it's panic stations!

Summary

I think my mind is made up: a sentinel is needed to keep watch over a solution. What's crazy is that the more I think of it, the more useful it seems, and the more complicated it seems to become. As such, I think I'm going to need help!

A bold new Ruby world

It’s something different

For a long time now I've been put off by Ruby; my interactions have been limited and most of my understanding of Ruby comes from Puppet. I've found it a bit of a pain, but the truth is that has nothing to do with Ruby as a language; it was more the packaging of gems and so forth. I really like the idea of yum repos and packaging systems, but Ruby uses gems, and I still have no idea what they really are other than libraries to be used by your Ruby program. Either way, because of the quirks of yum repos and the lack of maintenance, you aren't always able to get the right version of the rubygems you need for the application you are running. This alone was enough to put me off Ruby as a go-to programming language of choice.

In the past I've traditionally done my scripting in Bash; if things got difficult in Bash, or it wasn't quite suitable, I'd fall back to Perl or PHP (PHP is definitely my go-to language). With that said, I can count the number of scripts I've had to write in Perl on my hands, and where possible I always go with Bash. Why? Well, why not? It's easier for most sysadmins without programming backgrounds to follow, as in most cases you are using system commands combined within a framework of programming.

Which leads on to an interesting side note: why are there sysadmins that can't program? I guess it happens, and to be honest I was once described as having "no natural programming ability" by one of my college tutors, so I'm not saying I'm good. I do think that every sysadmin needs a fundamental understanding of conditionals, operators, looping and scoping. Again, not saying I'm brilliant, but I've had to learn, and I force myself to keep learning by writing scripts for things. A sysadmin who can't write a script is as good as a lithium-coated paper umbrella in a thunderstorm.

Moving along

So what was my first venture into Ruby? A simple monitoring script for Solr indexing. I thought about doing it in Bash, then quickly changed my mind; in short, I was dealing with JSON and needed a slightly better way of handling the output, and an easy way to fetch the JSON in the first place.

This is something I've done in the past in PHP, so I thought it'd be a good comparison. I can honestly say I was rather surprised at how easy it was to get working; I managed to Google for some code that fetched the JSON data and understood its use really easily. It wasn't all obfuscated like some Perl code can be.

From what I can see thus far it is quite a reasonable language. It's got some useful features and some flexibility, but rather than being like Perl, with hundreds of ways to do the same thing, it has a small selection of ways to do each thing, so you can choose an appropriate approach or just one that suits your coding style.

I'm tempted to start writing something a little more complicated to see how it copes with that. I have no doubt it'll be okay, but until I try I won't know.

So, what did my first adventure into Ruby look like?

#!/usr/bin/ruby
require 'rubygems'
require 'json'
require 'net/http'

def get_metric(query, base_url)
	url = base_url + "?" + query
	resp = Net::HTTP.get_response(URI.parse(url))
	data = resp.body
   
	# we convert the returned JSON data to native Ruby
	# data structure - a hash
	result = JSON.parse(data)

	# if the hash has 'Error' as a key, we raise an error
	if result.has_key? 'Error'
		raise "web service error"
	end
	return result
end

#
#	Get Arguments or default
#
url=nil
index=nil
query="action=SUMMARY&wt=json"

if ARGV.length == 0
	print "You must specify one of the following options\n\n-u\thttp://example.com/path\tREQUIRED\n\n-q\taction=REPORT&wt=xml\n\n-i\tindexname\tREQUIRED\n"
	exit
else
	# a while loop rather than `for count in 0..ARGV.length`:
	# incrementing the counter inside a `for` loop has no effect in
	# Ruby (it is reassigned each iteration), and the range also ran
	# one step past the end of ARGV
	count = 0
	while count < ARGV.length
		case ARGV[count]
		when "-i"
			if ARGV[count+1] != nil
				index=ARGV[count+1]
				count += 1
			else
				print "No argument for option -i\n"
				exit
			end
		when "-q"
			if ARGV[count+1] != nil
				query=ARGV[count+1]
				count += 1
			end
		when "-u"
			if ARGV[count+1] != nil
				url=ARGV[count+1]
				count += 1
			else
				print "No argument for option -u\n"
				exit
			end
		end
		count += 1
	end
	if (url == nil || index == nil)
		print "You must specify a URL with -u <url> and an index with -i <index>\n"
		exit
	end
end

rs = get_metric(query, url)
lag=nil
begin
	lag = rs["Summary"][index]["Lag"]
rescue
	print "Invalid Index\n"
	exit
end
# \d+ rather than \d*, which would happily match an empty string
regex = Regexp.new(/\d+/)
lag_number = regex.match(lag)
print lag_number, "\n" if lag_number != nil

Something like that. I know it's not brilliant, but it's a starting point; the next thing I write will probably make this look rather small by comparison.

Anywho, that’s all on Ruby, wonder if it’ll catch on.

Puppet inheritance, revisited

A calmer approach

A few weeks ago I wrote an article which was more of a rant than necessary. I was trying to drastically alter the way we write one of our Puppet modules; not being a simple module like ntp, it required a bit of flexibility. To give you an understanding of what our Puppet module does: we deploy Alfresco, including automating upgrades and configuring various aspects of the application, with bespoke overrides here and there and all sorts of wizardry in the middle. The original Puppet module was written by Ken Barber and then re-written by Adrian and myself, so needless to say some big Puppet-style brains had worked on it. The time had come to make some fundamental changes, mainly to make it easier to deploy directly from built code, so a lot of the work I've been doing meant it was more important that the module deploy code from the build servers and automatically update itself.

Well, I achieved that within the old module, but quite frankly the module was about ten times larger than it needed to be for an operational deployment into the cloud. With this in mind I thought it would be a good use of time to re-write it, making it easier to maintain and hopefully easier to extend; hence the rant around inheritance. In my head I had the perfect solution, which due to some bugs didn't work. Well, time has moved on, and as always progress must continue.

Puppet modules, the easy way.

Okay, so what did I do to overcome the lack of inheritance without all of the duplication that was in the old module? Simples! Combine the two. I gave it some thought and realised that the best way out of the situation was to make it so the variables were set from one place, params; even though there is some duplication, you still only have to set the variables in one place. As a result I wrote a simple Puppet module, which has not been tested... as a demonstration. It has the same structure as a live module, so in theory it will work the same way.

Manifests

[matthew@rincewind manifests]$ ll
total 20
-rw-r--r--. 1 root root 1981 Feb 17 21:26 config.pp
-rw-r--r--. 1 root root 1757 Feb 17 21:06 init.pp
-rw-r--r--. 1 root root 1578 Feb 17 21:12 install.pp
-rw-r--r--. 1 root root  921 Feb 17 21:51 params.pp
-rw-r--r--. 1 root root 2480 Feb 17 21:57 soimaapp.pp

Let’s start at the top,

init.pp

# = Class: soimasysadmin
#
#   This is the init class for the soimasysadmin module, it loads other required clases
#
# == Parameters:
#
#   *application_container_class*             = Puppet application container class deploying soimasysadmin i.e. tomcat
#   *application_container_cache*             = Location of the tomcat cache directory
#   *application_container_home*              = Application container home directory i.e. /var/lib/tomcat6/
#   *application_container_user*              = User for application container i.e. tomcat
#   *application_container_group*             = Group for application container i.e. tomcat
#
# == Actions:
#   Include any classes that are needed i.e. params, install etc
#
# == Requires:
#
# == Sample Usage:
#
#   class	{ 
#			"soimasysadmin":
#				application_container_class		=> 	"tomcat6",
#				application_container_home		=>	"/srv/tomcat"
#		}
#
class soimasysadmin  ( 	$application_container_class     =	$soimasysadmin::params::application_container_class,
			                  $application_container_cache     =	$soimasysadmin::params::application_container_cache,
                  			$application_container_home      =	$soimasysadmin::params::application_container_home,
                  			$application_container_user      =	$soimasysadmin::params::application_container_user,
                  			$application_container_group     =	$soimasysadmin::params::application_container_group,
		                ) 	inherits soimasysadmin::params {

include soimasysadmin::install

# The application_container variables are set here, but the defaults are all in params, which is available thanks to inheritance, Because this is the top of the tree, these variables can be accessed by other classes as needed

}

install.pp

# = Class: soimasysadmin::install
#
#   This class will install and configure directories and system wide settings that are relavent to soimasysadmin
#
# == Parameters:
#
#   All Parameters should have defaults set in params
#
#   *logrotate_days*      = Number of days to keep soimasysadmin logs for
#
# == Actions:
#   - Set up the application_container
#   - Set up soimasysadmin directories as needed
#
# == Requires:
#   - ${::soimasysadmin::application_container_class},
#
# == Sample Usage:
#
#   class	{ 
#			"soimasysadmin":
#				application_container_class		=> 	"tomcat6",
#				application_container_home		=>	"/srv/tomcat"
#		}
#
#

class soimasysadmin::install 	(	$logrotate_days = $soimasysadmin::params::logrotate_days,
                        			) inherits soimasysadmin::params	{

  File  {
    require =>  Class["${::soimasysadmin::application_container_class}"],
    owner   =>  "${::soimasysadmin::application_container_user}",
    group   =>  "${::soimasysadmin::application_container_group}"
  }

  #
  # Create folders
  #
  define shared_classpath ( $application_container_conf_dir = "$::soimasysadmin::application_container_conf_dir") {

    file {
      "${::soimasysadmin::application_container_home}/${name}/shared":
      ensure  =>  directory,
      notify  =>  undef,
    }
    file {
      "${::soimasysadmin::application_container_home}/${name}/shared/classes":
      ensure  =>  directory,
      notify  =>  undef,
    }
	}

	file {
		"/etc/logrotate.d/soimasysadmin":
		source		=>	"puppet:///modules/soimasysadmin/soimasysadmin.logrotate",
	}

}

params.pp

# = Class: soimasysadmin::params
#
# This class sets the defaults for soimasysadmin parameters
#
# == Parameters:
#
#
# == Actions:
#   Set parameters that are needed globally across the soimasysadmin class
#
# == Requires:
#
# == Sample Usage:
#
#   class soimasysadmin::foo inherits soimasysadmin:params ($foo  = $soimasysadmin::params::foo,
#                                                 					$bar  = $soimasysadmin::params::bar) {
#   }
#
#
class soimasysadmin::params  ( ) {

#
# Application Container details
#

  $application_container_class            = "tomcat"
  $application_container_cache            = "/var/cache/tomcat6"
  $application_container_home             = "/srv/tomcat"
  $application_container_user             = "tomcat"
  $application_container_group            = "tomcat"


#
#	soimapp config
#

	$dev_override_enabled   = "false"
  $custom_dev_signup_url	= "http://$hostname/soimaapp" 

}

Okay, that's all that's needed to do the install. Notice that each class inherits params. Interestingly enough, Adrian informs me this probably shouldn't work and is possibly a bug, as params is being inherited multiple times. I think the only reason it works is that we have no resources in params.pp; if we add a file resource to it, it will fail. But nonetheless, this works (for now).

config.pp

# = Class: soimasysadmin::config
#
#		This class configures the soimasysadmin repository
#
# == Parameters:
#
#		All Parameters should have defaults set in params
#
#		*db_name*												= The Database name 
#		*db_username*										= User to connect to the DB with
#		*db_password*										= Password for the user
#		*db_server*											= The DB server host 
#		*db_port*												= DB port i.e. 3306
#		*db_pool_min*										= Min DB connection pool
#		*db_pool_max*										= Max DB connection pool
#		*db_pool_initial*								= Initial DB connection pool size
#
# == Actions:
#   Install and configure soimasysadmin repository component into appropriate
#   application container
#
# == Requires:
#   - Class["${soimasysadmin::params::application_container_class}"],
#
# == Sample Usage:
# 
#   class	{ 
#			"soimasysadmin::config":
#				db_password 	=>	"Hdy^D7D6fvndsakj(*80",
#		}
#
#

class soimasysadmin::config	(	$db_name					=	$soimasysadmin::params::db_name,
															$db_username			=	$soimasysadmin::params::db_username,
															$db_password,
															$db_server				=	$soimasysadmin::params::db_server,
															$db_port					=	$soimasysadmin::params::db_port,
															$db_pool_min			=	$soimasysadmin::params::db_pool_min,
															$db_pool_max			=	$soimasysadmin::params::db_pool_max,
															$db_pool_initial	=	$soimasysadmin::params::db_pool_initial,
														)	inherits soimasysadmin::params	{
	File {
    owner   =>  "${::soimasysadmin::application_container_user}",
		group		=>	"${::soimasysadmin::application_container_group}",
		notify	=>	Service["${::soimasysadmin::application_container_service}"],
	}

	#
	# properties file
	#

	# If extra config set do concatinate
  file {
    "${::soimasysadmin::application_container_home}/${::soimasysadmin::application_container_instance}/properties":
    content	=> template("soimasysadmin/properties"),
    mode		=> "0640",
  }
}

NB: in this instance I use config.pp as a way of setting up default config for the application, things that are common, for example the DB, whereas application-specific config is done separately; this obviously depends on your application…

soimaapp.pp

# = Class: soimasysadmin::soimaapp
#
# This class installs the soimaapp front end for the soimasysadmin repository
#
# == Parameters:
#
#		All Parameters should have defaults set in params
#
#		*dev_override_enabled*						= Enable dev configuration override
#		*custom_dev_signup_url*						=	Custom dev signup url
#		*application_container_instance*	= Application container instance i.e. tomcat1, tomcat2, tomcatN
#		*application_container_conf_dir*	= Application container conf directory i.e. /etc/tomcat6/conf/
#		*application_container_service*		= Service name of application container i.e. tomcat6
#
# == Actions:
#   Install and configure soimaapp into appropriate application container
#
# == Requires:
#		- Class["${soimasysadmin::params::soimasysadmin_application_container_class}",
#						"soimasysadmin::install"],
#
# == Sample Usage:
# 
#		include soimasysadmin::soimaapp
#
class soimasysadmin::soimaapp	(	$dev_override_enabled								=	$soimasysadmin::params::dev_override_enabled,
															$custom_dev_signup_url							=	$soimasysadmin::params::custom_dev_signup_url,
															$application_container_instance			= $soimasysadmin::params::application_container_instance,
															$application_container_conf_dir			= $soimasysadmin::params::application_container_conf_dir,
															$application_container_service			= $soimasysadmin::params::application_container_service
															) inherits  soimasysadmin::params  {

	# Set Default actions for soimasysadmin files
	File	{
		notify	=>	Class["${::soimasysadmin::application_container_class}"],
		owner		=>	"${::soimasysadmin::application_container_user}",
    group   =>  "${::soimasysadmin::application_container_group}",
	}


	#
	#	Install / Upgrade War
	#
	
	# Push war file to application container
  file {
    "${::soimasysadmin::application_container_home}/${application_container_instance}/webapps/soimaapp.war":
    source	=> "puppet:///modules/soimasysadmin/soimaapp.war",
    mode		=> "0644",
  }

	#
	#	Shared class created
	#
	soimasysadmin::install::shared_classpath {
		"${application_container_instance}":
			application_container_conf_dir	=>	"${application_container_conf_dir}"
	}
	
	if	( $dev_override_enabled == "true" ) {	
		file {
			"${::soimasysadmin::application_container_home}/${application_container_instance}/shared/classes/soimasysadmin/web-extension/soimaapp-config-custom.xml":
			content => template("soimasysadmin/soimaapp-config-custom.xml"),
		}
	}	
}

Okay, that's the basics. In the examples above, soimaapp.pp does application-specific config, any generic settings are in params.pp, and any global settings are in init.pp; hopefully the rest is self-explanatory.

I haven’t included any templates or files, you can work that out, you are smart after all.

Now the bad news

I will say the above does work; I've even tested it with inheriting soimaapp.pp and applying even more specific config over the top, and all seems well. So what exactly is the bad news?

I tested all of this on the latest version of Puppet, 2.7.10, and then came across this; luckily 2.7.11 is out and available, but I just haven't tried it yet. Let's hope it works!

RHN Satellite vs Puppet, A clear victory?

Is It Clear?

No! It never is; it is all about what your environment looks like and what you are trying to achieve. Both solutions provide configuration management, and both have a paid-for and a free version, although you'd be forgiven for not realising you can get RHN Satellite (or close to it) for free. At the end of the day they are different tools aimed at different audiences, and it is important that you, yes you the sysadmin, work out what your environment is; not what it might be, not what you think it is, but what it actually is.

Step 1

Before we even consider the features and the pros and cons of each solution, we (and I mean you) have to work out what the solution needs to provide and, more importantly, the skill set of those managing it. There's a quick way to do this and a slightly more laborious one; let's look at what technology is used to run each solution.

It’s worth noting that the following skills are what I would say is needed to produce an effective system; that is not to say you couldn’t do it with less, or that you shouldn’t have more.

RHN Satellite required skills

  • Understanding of Python – especially if you want to interact with the APIs or template configuration files; if you are not going to use it for templating configuration files you will need no Python skills at all.
  • Basic sysadmin skills, so RHCT or higher
  • Understanding of networking
  • That’s it; it is designed to be easy to use. The installation and configuration is not overly easy, though: if you know what you are doing (i.e. you’re as good as the Red Hat consultants) you’re looking at 3 days; if you have no idea what you’re doing, allow 2 weeks. Personally I would get Red Hat in to set it up and train you at the same time; you’ll be in a better position for it.

Puppet required skills

  • Good Ruby skills, or very good general programming skills
  • Understanding of object-oriented programming languages
  • Good sysadmin skills – RHCE level; if you can’t set up DHCP, DNS, yum repos, Apache, NFS, SVN or basic web applications from the CLI then this is not for you.
  • Understanding of networking
  • A slightly higher set of skills, but achievable by most. It is not going to be difficult to install and get working; I’d think if you had an RHCE you’d have a working puppet server in a matter of minutes (you would need to add a yum repo and install a package, done).

Who’s running the service?

Okay, so you, the super-duper sysadmin, manage to convince yourself that you are the right person to run puppet. You get it installed, set up and configured, write a few basic modules, and then pass it off to some other team to “run with it”, whose skill set is closer in line to that needed for RHN Satellite. In short: bad sysadmin!

You have to consider who will be using the system, their skill set and their aptitude to learn and progress; just because you want to do puppet doesn’t make it the right solution. You could always go with the simpler RHN Satellite server to start with and, as skills develop, look back at something like puppet in a couple of years.

Step 2

What features do you need? Not what features you want… So what does that mean? Do you need dynamic configuration files – files that change their content depending on the state of the node, or the configuration around it, using if statements, loops, case statements and so on?
Do you want to easily be able to build and deploy servers from “bare metal”?

Hopefully by this point you will have a good understanding of the skill set needed to support it and of what the business actually needs. Now that you’ve done this, I can happily tell you that either solution is able to do whatever you are thinking of (within reason), but it was important to get a fuller understanding of what was needed first.

Puppet Features, a totally non-exhaustive list

  • Dynamic configuration of files through very powerful Ruby (ERB) templating – if you can think of it, it can do it
  • Powerful node inheritance for ease of managing sets of servers
  • Classes to manage applications, with parameters and basic inheritance (see last week’s post)
  • I did say it was non-exhaustive; for a full list look here, but be warned: just because it says it can do something doesn’t mean it can do it as well as you might think. Doubly true of RHN Satellite server!
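To give a taste of the templating in the first bullet: in Puppet 2.7 an ERB template can branch and loop on facts, which are exposed to templates as variables. A minimal, purely illustrative template might look like this:

```erb
Welcome to <%= fqdn %>
<% if operatingsystem == "RedHat" -%>
This host is patched from the local yum mirror.
<% end -%>
```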

The important thing for puppet is the community behind it and the fact that it is extensible: you can do anything with it. You can create your own providers, resource types, facter variables and so on; there are always new developments, so you really do need to subscribe to their blogs.

You can get an Enterprise version, which comes with support, a fancy GUI and all that warm fuzzy stuff, and you can even get the “bare metal” build by using something like Foreman.

Enough of Puppet, what about the RHN Satellite!

RHN Satellite Features

  • Repository management through Channels
  • Organisational management – You can create sub-organisations that have access to specific nodes or profiles to apply builds, so they appear to have their own RHN
  • Security control – You can easily manage access to the web interface, nodes, access keys
  • Easy to use – Really it is, anyone with a little tech savvy and some time on their hands could work it out
  • A few more features can be found here, but the real benefit of the RHN Satellite system is its ease of use; if the people running the service are more RHCT than RHCE then it’s worth considering.

I will say it makes your patch release cycle feel easier to manage, although in reality it isn’t any simpler; what the RHN Satellite does do is allow anyone to manage the flow, moving systems between different channels and so on.

One of the features I liked the most was the ability to group servers together, apply patches to those groups, manage a group of servers as a single server, and migrate them from dev to staging and so on.

The ups and the downs

So we’ve looked at what you need, what you want, who should be looking after it, and we’ve even touched on a few features. With all this in mind, I never once mentioned the downsides of either.

The biggest downside with puppet is its flexibility and its pure focus on configuration management. As a result it doesn’t fire up a PXE boot service or easily integrate OS install with configuration without additional tools; it just does configuration. You therefore have to provide all of these ancillary services in addition to the configuration management to achieve the same completeness of service that you get from the RHN Satellite. It is for this reason that you need the additional skills and experience to cope with it, or Foreman.

So what about the RHN Satellite server? The biggest let-down here is the configuration management. If you want to push a static file out it’s really straightforward, and when I was looking at this a few years back you could put variables into the files, but from memory the set of options was limited: you could add the local IP address or the hostname of the server, but you couldn’t pass in custom settings.
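From memory the interpolation looked something like this sketch of a Satellite configuration-channel file; the macro names are the built-in ones as best I recall them, and the exact set available depends on your Satellite version:

```text
# sketch of a config-channel file using Satellite's built-in macros
HOSTNAME={| rhn.system.hostname |}
IPADDR={| rhn.system.net_interface.ip_address(eth0) |}
```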

The biggest benefit of puppet is the combination of generic modules and well-written template files. The principle behind it is that you may very well have a complicated module, but you should be able to switch the configuration at a moment’s notice; this provides a very flexible approach to delivering the configuration. For example, you can have a simple apache module to which you add additional complexity through inheritance, parameters and defines. With RHN Satellite you just won’t get that unless you re-package your apache into its own RPM for each type of web service.
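As a sketch of that apache idea (the module layout, names and template path are illustrative, not a real module): a bare class installs and runs Apache, and a define layers per-site complexity on top without touching the base class:

```puppet
class apache {
  package { "httpd": ensure => installed }

  service { "httpd":
    ensure  => running,
    require => Package["httpd"],
  }
}

# each vhost adds configuration on top of the simple base class;
# the template "apache/vhost.conf.erb" is hypothetical
define apache::vhost ( $docroot, $port = 80 ) {
  file { "/etc/httpd/conf.d/${name}.conf":
    content => template("apache/vhost.conf.erb"),
    notify  => Service["httpd"],
  }
}
```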

With the RHN Satellite the biggest advantage is purely the ease of use. It is a jack of all trades and a master of none, but if your aims are simple and your staff not quite up there on the pillars of excellence, it is a good solution with which you will be able to do most of what you want.

Summary

For me, I’d boil it down to this simple way of deciding.

If your company is predominately Microsoft Windows, or the sysadmins are not dedicated (and yes, that’s plural, sysadmins…) to Linux, then I would recommend RHN Satellite; unless you have a very specific use case that cannot be solved by the RHN Satellite, it is worth giving up some flexibility. For example, if you need to manage Red Hat and Debian, rule out the RHN Satellite; likewise if you know (know, not think…) that you are going to be growing a team, in both skills and numbers, of dedicated Linux sysadmins.

If you are an open source company, or have dedicated Linux sysadmins who have been there, done that, bought the T-shirt, ruined the T-shirt redecorating the house, and know the difference between nc -z -t and nc -z -w 10, then I would consider puppet your first choice. It is young and up-and-coming; it’s easy to forget that it has not even been around 5 years. It has some rough edges, but they are getting better, and with support from Puppet it makes total sense.

It’s worth touching on training and the availability of skills. Good luck: RHN Satellite skills are not well-known and are mainly retained within Red Hat. Puppet skills are in very high demand, and people claiming the experience and understanding may be pulling your leg (by a lot, in some cases). However, training is available for both the RHN Satellite and Puppet.

These are not two products to compare evenly. Both can be run for free, and both can be very complicated; the only recommendation is to not choose either one on technical merit alone, but rather on which one fits best with the aims of the project, the hearts and minds of those using the system, and good ol’ fashioned gut feel.

For me, having used both, I would lean more towards Puppet, but I’m lucky: where I am we have a lot of very technical people who are able to understand and work with puppet.

Puppet inheritance, the insane minefield

I love Puppet

I’m what you would call a newbie when it comes to puppet. I’ve been fortunate enough to work alongside it for the last 2 years, but I only started using it 15 months back. I’ve used other configuration management tools in the past, and one day maybe I will bore you with the inner workings of RHN Satellite Server, but puppet was different.

For me it had great power over the RHN Satellite Server I had used: it allowed me to have dynamic configuration that was structured and sensible. Although the Satellite server lets you replace certain variables, it was nowhere near as powerful.

Luckily for me, I had my own personal puppet guru in the form of Adrian Bridgett, who has been working with puppet since the early days. That enabled me to progress quite quickly through some of the less interesting features and straight into the world of classes and parameters.

As time progressed I learnt all about the different resources and the various structures I could utilise, and the various quirks of working with puppet, such as loops (if you ever need to do a loop, good luck). In short, I have had the privilege of working with some very interesting puppet code created by people who knew puppet a lot better than I did, all the time with a resource on hand to help me learn.

I hate duplication!

One of my pet peeves with puppet at the moment is the way you have to pass variables around to make things work, and that proper inheritance of classes is just so poor.

For example, the following is a simple set of classes that works with no issues.

init.pp

class frank ( $foo = "bar" ) {

  include frank::bob

  class { "frank::joe":
    foo => $foo,
  }
}

bob.pp

class frank::bob {

  file { "/var/lib/bob":
    ensure => "directory",
  }
}

joe.pp

class frank::joe ( $foo = "bob" ) {

  file { "/var/lib/bob/${foo}.txt":
    content => template("module/file.txt"),
    owner   => "frank",
  }
}

Okay, so we have a class called frank which takes a param $foo, includes a class bob, and calls another class called joe, passing the param foo to it. This is a pretty standard way to pass variables around, and it works flawlessly. The downside? It’s rubbish.

On the simple classes Puppetlabs always shows you it’s fine: you have a handful of parameters and it’s not too much hassle to maintain. Or they show you a very simple ntp class that doesn’t give you a feel for the more complicated modules you may be writing yourself. You could of course just reference the variable directly and not pass it into the child classes… Wrong. If you are using these in templates, you will end up with variables that are not in the scoped path for the template. You could reference the fully scoped variable, i.e. $frank::joe::foo; that works as long as you want the default. Otherwise you can spend ages working out how to get the variables around: if you actually want to set the variable you need to call the class directly, so you end up with a more complicated nodes manifest. That isn’t the end of the world, but in the world of keep-it-simple, less is more.
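To make “call the class directly” concrete, the node manifest ends up looking something like this sketch (the node name and value are hypothetical):

```puppet
node "app01.example.com" {
  # a plain "include frank::joe" would only ever give you the default;
  # to set $foo you must declare the class with its parameter
  class { "frank::joe":
    foo => "custom",
  }
}
```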

So imagine the above but where you want to pass in 25 or 40 parameters. Believe it or not, we have some complicated configuration files, and we want to keep as much flexibility in the module as possible. You basically end up duplicating a lot of variables, and your init.pp will be huge as a result.
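A sketch of what that duplication looks like with just two parameters (names invented for illustration); now imagine 25 to 40 of them:

```puppet
class frank (
  $foo = "bar",
  $baz = "qux"
) {
  # every parameter declared above has to be repeated here
  # just to hand it down to the child class
  class { "frank::joe":
    foo => $foo,
    baz => $baz,
  }
}
```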

It’s probably worth saying that we try to keep node manifests as simple as possible, so typically the largest node manifest would be maybe 10 lines.

So here’s a better way to achieve the same thing with less configuration.

init.pp

class frank ( ) inherits frank::joe {

  include frank::bob
}

bob.pp

class frank::bob {

  file { "/var/lib/bob":
    ensure => "directory",
  }
}

joe.pp

class frank::joe ( $foo = "bob" ) {

  file { "/var/lib/bob/${foo}.txt":
    content => template("module/file.txt"),
    owner   => "frank",
  }
}

One simple change: inherit! It makes so much sense; the init.pp has everything that is in frank::joe and some more. Wouldn’t it be nice if that worked? Well, it doesn’t, thanks to this and a number of other bugs. The inheritance within puppet just isn’t powerful enough and, more importantly, doesn’t work as expected.

It seems the only way to get around this is the first method, which means that even if you are splitting your classes out, you end up with duplication and more confusing modules.

I encourage everyone to give their own personal viewpoint on this, as it’s an area I’d like to learn more about.

Summary

I want an easy life. I want puppet modules that allow for proper inheritance and dependency management; I don’t want to be working around the quirks of an application, I want to work with the application, and I want it to be simple.

I am not a puppet expert, but I do think that you should be able to pass parameters through to the inherited class without having the variables re-defined at every level. What do you think?