Distributed Puppet

Some might say…

Some might say that running puppet as a server is the right way to go. It certainly provides some advantages, like PuppetDB, stored configs and exported resources, but is that really what you want?

If you have a centralised puppet server with 200 or so clients, there are some fancy things that can be done to ensure that not all nodes hit the server at the same time, but that requires setting up and configuring additional tools.

What if you just didn’t need any of that? What if you just needed a git repo with your manifests and modules in it, and puppet installed?
Have a script download and install puppet, pull down the git repo and then run puppet locally. This method puts a little more overhead on each node, but not much (it had to run puppet anyway), and in all cases it can still provide the same level of configuration as the server/client method. You just need to think outside of the server.

Don’t do it it’s stupid!

That was my response to my boss some 10 months ago when he said we should ditch puppet servers, drop manifests per server and outlaw all variables. Our mission was to be able to scale our service to over 1 million users, and we realised that manually adding extra node manifests to puppet was not sustainable, so we started on a journey to get rid of the puppet server and redo our entire puppet infrastructure.

Step 1 Set up a git repo. You should already be using one; if you aren’t, shame on you! We chose GitHub. Why do something yourself when there are better people out there doing a better job, dedicated to doing just one thing? Spend your time looking after your service, not your infrastructure!

Step 2 Remove all manifests based on nodes, replace with a manifest per tier / role. For us this meant consolidation of our prod web role with our qa, test and dev roles so it was just one role file regardless of environment. This forces the management of the environment specific bits into variables.

Step 3 Implement hiera – Hiera gives puppet the ability to externalise variables into configuration files, so we now end up with a configuration file per environment and only one role manifest. This, as my boss would say, “removes the madness”. Now if someone asks “what’s the difference between prod and test?” you diff two files, regardless of how complicated you want to make your manifests, inherited or not. It’s probably worth noting you can set default values for hiera lookups: hiera("my_var", "default value")
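To make the “diff two files” point concrete, here is a minimal sketch. The file names, keys and values are all invented for illustration and are not our real data:

```shell
#!/bin/bash
# Sketch: compare two per-environment hiera files.
# Every key and value below is made up for illustration.
workdir=$(mktemp -d)

cat > "$workdir/prod.yaml" <<'EOF'
---
db_server: db.prod.domain.com
db_pool_max: 50
log_level: warn
EOF

cat > "$workdir/test.yaml" <<'EOF'
---
db_server: db.test.domain.com
db_pool_max: 10
log_level: warn
EOF

# diff exits non-zero when the files differ, so capture the output
# without letting that abort a script running under `set -e`.
env_diff=$(diff "$workdir/prod.yaml" "$workdir/test.yaml" || true)
echo "$env_diff"

rm -rf "$workdir"
```

Only the lines that differ between the two environments show up, which is the whole point of pushing environment specifics out of the manifests.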

Step 4 Parameterise everything – We had lengthy talks about parameterising modules vs just using hiera, but to keep the modules transparent about whatever is coming into them (and because I was writing them) we kept parameters. I did however move all parameters for all manifests in a module into a “params.pp” file and inherit that everywhere to re-use the variables. Within each manifest a parameter either defaults to the params.pp value or has no default (to make it mandatory). This means that if you have sensible defaults you can set them here and reduce the size of your hiera files, which in turn makes it easier to see what is happening. Remember, most people don’t care about the underlying technology, just the top level settings, and trust that the rest is magic… For the lower level bits see these: Puppet with out barriers part one for a generic overview, Puppet with out barriers part two for manifest consolidation and Puppet with out barriers part three for params & hiera.

This is all good, but what if you were in Amazon and you don’t know what your box is? Well, it’s in a security group, but that is not enough information, especially if your security groups are dynamic. You can also tag your boxes, and you should make use of the AWS CLI tools to do this where possible. We decided a long time ago to set a few details on a per node basis: Env, Role & Name. From these we know what to set the hostname to, what puppet manifests to apply and what set of hiera variables to apply, as follows…

Step 5 Facts are cool – Write your own custom facts for facter. We did this in two ways. The first was to just pull down the tags from Amazon (where we host) and return them as ec2_<tag>; this works, but AWS has issues so it fails occasionally. Version 2 was to get the tags, cache them locally in files and have facter read them from those files, something like this…

#!/bin/bash
# Load the AWS config
source /tmp/awsconfig.inc

# Make sure the cache directory exists
mkdir -p /opt/facts/tags

# Grab all tags locally (one "key<TAB>value" pair per line)
IFS=$'\n'
for i in $("$EC2_HOME/bin/ec2-describe-tags" --filter "resource-type=instance" --filter "resource-id=$(facter ec2_instance_id)" | grep -v cloudformation | cut -f 4-)
do
        key=$(echo "$i" | cut -f1)
        value=$(echo "$i" | cut -f2-)

        # Only cache tags that have a value (the quotes matter when it is empty)
        if [ -n "$value" ]
        then
                echo "$value" > "/opt/facts/tags/$key"
                /usr/bin/logger "set fact $key to $value"
        fi
done

The AWS config file just contains the same info you would use to set up any of the CLI tools on Linux, and you can turn the cached files into facts with this:

# Turn each cached tag file into an ec2_<tag> fact
Dir.glob('/opt/facts/tags/*').each do |path|
        value = File.read(path).chomp
        fact = "ec2_#{File.basename(path)}"
        Facter.add(fact) { setcode { value } }
end

Also see: Simple facts with puppet

Step 6 Write your own boot scripts – This is a good one, scripts make the world run. Make a script that installs puppet, make a script that pulls down your git repo, then run puppet at the end.

The node_name_fact setting is awesome: it tells puppet to take the node name from a fact, which kicks everything into gear and turns your deployed boxes in a security group with the appropriate tags into fully built servers.
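A minimal sketch of such a boot script might look like the following. It is not our production script: the repo URL, the /opt/puppet checkout path, the manifests/site.pp entry point and the ec2_role fact name are all assumptions for illustration, and it assumes a Red Hat style box where yum can provide puppet and git.

```shell
#!/bin/bash
# Boot script sketch: install puppet, fetch the repo, run puppet locally.
# REPO_URL, REPO_DIR, NODE_FACT and site.pp are illustrative assumptions.
REPO_URL="${REPO_URL:-git@github.com:example/puppet.git}"
REPO_DIR="${REPO_DIR:-/opt/puppet}"
NODE_FACT="${NODE_FACT:-ec2_role}"

install_tools() {
    # Assumes a yum repo that provides both packages.
    command -v puppet >/dev/null 2>&1 || yum install -y puppet
    command -v git >/dev/null 2>&1 || yum install -y git
}

fetch_repo() {
    # Clone on first boot, fast-forward on later runs.
    if [ -d "$REPO_DIR/.git" ]; then
        (cd "$REPO_DIR" && git pull --ff-only)
    else
        git clone "$REPO_URL" "$REPO_DIR"
    fi
}

puppet_command() {
    # Print the puppet invocation. node_name_fact makes puppet use the
    # value of a fact (here the role tag) as the node name.
    echo "puppet apply --node_name_fact $NODE_FACT" \
         "--modulepath $REPO_DIR/modules $REPO_DIR/manifests/site.pp"
}

bootstrap() {
    local cmd
    install_tools && fetch_repo || return 1
    cmd=$(puppet_command)
    /usr/bin/logger "bootstrap: running $cmd"
    # Unquoted on purpose so the string splits into command and arguments;
    # this relies on the paths containing no spaces.
    $cmd
}

# Only do the real work when explicitly asked to, e.g. `boot.sh run`.
if [ "${1:-}" = "run" ]; then
    bootstrap
fi
```

Keeping the puppet invocation in its own function makes it easy to print and check what would be run before letting the script loose on a box.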

Summary

So now puppet is on each box, and every box knows what it is from the time it’s built (thanks to tags) and bootstraps itself into a fully working box thanks to your own boot script and puppet. With some well written scripts you can cron the pulling of git and a re-run of puppet if so desired. The main advantage of this method is the distribution: as long as a box manages to pull that git repo, it will build. And if something changes on the box, puppet will put it back, because it has everything locally, so there are no network issues to worry about.
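If you do want the cron re-run, a sketch along these lines would do it. The script path, log path and 30 minute interval are assumptions, and the per-host offset is just one cheap way to stop every box pulling from git at the same minute:

```shell
#!/bin/bash
# Sketch: emit a cron.d entry that re-runs a boot script twice an hour.
# BOOT_SCRIPT, the log path and the interval are illustrative assumptions.
BOOT_SCRIPT="${BOOT_SCRIPT:-/usr/local/bin/puppet-boot.sh}"

cron_entry() {
    # Derive a stable 0-29 minute offset from the hostname so that
    # boxes do not all pull the repo at the same time.
    local sum offset
    sum=$(printf '%s' "${HOSTNAME:-localhost}" | cksum | cut -d' ' -f1)
    offset=$(( sum % 30 ))
    echo "$offset,$(( offset + 30 )) * * * * root $BOOT_SCRIPT run >> /var/log/puppet-boot.log 2>&1"
}

# Print the entry; redirect it to a file under /etc/cron.d/ to install it.
cron_entry
```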

Puppet with out barriers -part three

The examples

Over the last two weeks (part one & part two) I have slowly been going into detail about module set up and some architecture. Well, now’s the time for the real world.

To save me writing loads of puppet code I am going to abbreviate and leave some bits out. First things first, a simple module.

init.pp

class javaapp ( $conf_dir = $javaapp::params::conf_dir) inherits javaapp::params {

  class {
    "javaapp::install":
    conf_dir => $conf_dir
  }

}

install.pp

class javaapp::install ($conf_dir = $javaapp::params::conf_dir ) inherits javaapp::params {

 package {
    "javaapp":
    name => "$javaapp::params::package",
    ensure => installed,
    before => Class["javaapp::config"],
  }

  file {
    "/var/lib/tomcat6/shared":
    ensure => directory,
  }

}

config.pp

class javaapp::config ($app_var1 = $javaapp::params::app_var1,
                       $app_var2 = $javaapp::params::app_var2) inherits javaapp::params {

  file {
    "/etc/javaapp/javaapp.conf":
    content => template("javaapp/javaapp.conf"),
    owner   => 'tomcat',
    group   => 'tomcat'
  }
}

params.pp

class javaapp::params ( ) {

$conf_dir = "/etc/javaapp"
$app_var1 = "1.2.3.4/32"
$app_var2 = "host.domain.com"

}

One simple module in the module directory. As you can see I have put all parameters into one file. It used to be that you’d specify the same defaults in every file, so in init and config you would duplicate the same variables. Well, that is just insane, and if you have complicated modules with several manifests in each one it gets difficult to maintain all the defaults. This way they are all in one file and are easy to identify and maintain. It is by far not perfect and I’m not even sure puppet officially supports it (if it doesn’t, that is a failing of puppet), but it does work with the latest 2.7.18 and I’m sure I’ve had it working on all 2.7 variants at some point.

You should aim to set a sensible default for every parameter. If you want to enforce that a variable is set, you can still leave it out of params and declare the parameter without a default.

Now the /etc/puppet directory

Matthew-Smiths-MacBook-Pro-2:puppet soimafreak$ ls
auth.conf	extdata		hiera.yaml	modules
autosign.conf	fileserver.conf	manifests	puppet.conf

The auth, autosign and fileserver configs will depend on your infrastructure, but the two important configurations here are puppet.conf and hiera.yaml

puppet.conf

[master]
certname=server.domain.com
modulepath = /etc/puppet/modules 
[main]
    # The Puppet log directory.
    # The default value is '$vardir/log'.
    logdir = /var/log/puppet

    # Where Puppet PID files are kept.
    # The default value is '$vardir/run'.
    rundir = /var/run/puppet

    # Where SSL certificates are kept.
    # The default value is '$confdir/ssl'.
    ssldir = $vardir/ssl

    autosign = true
    autosign = /etc/puppet/autosign.conf

[agent]
    # The file in which puppetd stores a list of the classes
    # associated with the retrieved configuratiion.  Can be loaded in
    # the separate ``puppet`` executable using the ``--loadclasses``
    # option.
    # The default value is '$confdir/classes.txt'.
    classfile = $vardir/classes.txt

    # Where puppetd caches the local configuration.  An
    # extension indicating the cache format is added automatically.
    # The default value is '$confdir/localconfig'.
    localconfig = $vardir/localconfig
    pluginsync = true

The only real change worth making to this is in the agent section: pluginsync ensures that any plugins you install in puppet, like Firewall, VCSRepo, hiera etc., are loaded by the agent. Obviously on the agent you do not want all of the master config at the top.

Now the hiera.yaml file

hiera.yaml

---
:hierarchy:
      - "%{env}"
:backends:
    - yaml
:yaml:
    :datadir: '/etc/puppet/extdata'

Okay, to the trained eye this is sort of pointless: it tells puppet that it should look in a directory called /etc/puppet/extdata for a file called %{env}.yaml, so in this case if env were to equal bob it would look for a file /etc/puppet/extdata/bob.yaml. The advantage of this is that at some point, if needed, that file could be changed to, for example:

hiera.yaml

---
:hierarchy:
      - common
      - "%{location}"
      - "%{env}"
      - "%{hostname}"
:backends:
    - yaml
:yaml:
    :datadir: '/etc/puppet/extdata'

This basically provides a location for all variables that you are not able to tie down to a role which will be defined by the manifests.
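As far as I understand hiera’s default (priority) lookup, it checks the hierarchy in the order listed and returns the first value it finds, so the order of those entries matters. The sketch below is not hiera itself, just a shell illustration of the order in which the yaml files would be consulted; the fact values passed in (london, prod, web01) are made up.

```shell
#!/bin/bash
# Sketch: list the files a yaml backend would consult, in order, for the
# four-level hierarchy above. Hiera does the real interpolation from
# facter; here the fact values are passed in by hand.
hiera_candidates() {
    local datadir=$1 location=$2 env=$3 hostname=$4
    local level
    for level in common "$location" "$env" "$hostname"; do
        echo "$datadir/$level.yaml"
    done
}

hiera_candidates /etc/puppet/extdata london prod web01
```

With the hierarchy written this way common.yaml is checked first, so put the most specific level at the top if you want it to win.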

Matthew-Smiths-MacBook-Pro-2:puppet soimafreak$ ls manifests/roles/
tomcat.pp	default.pp	app.pp	apache.pp

So we’ll look at the default node and tomcat to get a full picture of the variables being passed around.

default.pp

node default {
	
	#
	# Default node - base packages for all systems
	#

  # Define stages
  class {
    "sshd":     stage =>  first;
    "ntp":      stage =>  first;
  }
	
  # Needed for Facter to generate OS related information
  package {
    "redhat-lsb":
    ensure => "installed"
  }

  # mcollective
  class {
    "mcollective":
    mc_password   => "bobbypassword6",
    puppet_server => "puppet.domain.com",
    activemq_host => "mq.domain.com",
  }

  # Manage puppet
  include puppet
}

As you can see, this default node sets up some classes that must be on every box and ensures that packages that are vital are also installed. If you so feel the need to extrapolate this further you could have the default node inherit another node, for example you may have a company manifest as follows:

company.pp

node company {
$mc_password = "bobbypassword6"
$activemq_host = "mq.domain.com"

$puppet_server = $env ? {
    "bob" => 'bobpuppet.domain.com',
    default => 'puppet.domain.com',
  }
}

This company node manifest could be inherited by the default node, and then instead of having puppet_server => "puppet.domain.com" you could have puppet_server => $puppet_server, which I think is nice and clear. The only recommendation is to keep your default and your role manifests as simple as possible. Try to keep if statements out of them. Can you push the decision into hiera? Do you have a company.pp where it would be sensible to have some env logic? Are you able to take some existing logic and turn it into a fact?

Be ruthless and push as much logic out as possible, use the tools to do the leg work and keep puppet manifests and modules simple to maintain.

Now finally the role,

tomcat.pp

node /.*tomcat.*/ inherits default {

  include tomcat6

  # Installs java app using the init/install classes and default params, 
  include javaapp

  class {
    "javaapp::config":
    app_var1 => hiera('app_var1'),
    app_var2 => $fqdn,
  }
}

The role should be “simple”, but it also needs to make it clear what it’s setting. If you notice that several roles use the same class and in most cases the params are the same, change the params file and remove the options from the roles. Try to keep what is in the roles to overrides only, and as minimal as possible. The hiera vars and any variables set in the default node or other inherited nodes can all be referenced here.

Summary

Hopefully that helps some of you understand the options for placing variables in different locations within puppet. As I mentioned in the other posts, this method has 3 files: the params, the role and the hiera file. That’s it. All variables are in one of those three, so there’s no need to hunt through all of the manifests in a module to identify where a variable may or may not be set. It is either defaulted or overridden; if it’s overridden it will be in the role manifest, and from there you can work out if it comes from your default node or hiera, and so on.

Puppet inheritance, revisited

A calmer approach

A few weeks ago I wrote an article which was more of a rant than necessary. I was trying to drastically alter the way we write one of our puppet modules, and as it isn’t a simple module like ntp it required a bit of flexibility. To give you an understanding of what our puppet module does: we deploy Alfresco, including the automation of upgrades and the configuration of various aspects of the application, with bespoke overrides here and there and all sorts of wizardry in the middle. The original puppet module was written by Ken Barber and then re-written by Adrian and myself. Needless to say there were some big puppet-style brains working on it, and the time had come to make some fundamental changes, mainly to make it easier to deploy directly from built code. So a lot of the work I’ve been doing on the module was about having it deploy code from the build servers as well as automatically update itself.

Well, I achieved that within the old module, but quite frankly the module was about 10 times larger than it needed to be, specifically for an operational deployment into the cloud. With this in mind I thought it would be a good use of time to re-write it and make it easier to maintain and hopefully easier to extend, hence the rant around inheritance; in my head I had the perfect solution, which due to some bugs didn’t work. Well, time has moved on and as always progress must continue.

Puppet modules, the easy way.

Okay, so what did I do to overcome the lack of inheritance yet not have all of the duplication that was in the old modules? Simples! Combine the two. I gave it some thought and realised that the best way out of the situation was to make it so that the variables were set from one place, params, so even though there is some duplication you still only have to set the variables in one place. As a result I wrote a simple puppet module as a demonstration. It has not been tested, but it has the same structure as a live module, so in theory it will work the same way.

Manifests

[matthew@rincewind manifests]$ ll
total 20
-rw-r--r--. 1 root root 1981 Feb 17 21:26 config.pp
-rw-r--r--. 1 root root 1757 Feb 17 21:06 init.pp
-rw-r--r--. 1 root root 1578 Feb 17 21:12 install.pp
-rw-r--r--. 1 root root  921 Feb 17 21:51 params.pp
-rw-r--r--. 1 root root 2480 Feb 17 21:57 soimaapp.pp

Let’s start at the top,

init.pp

# = Class: soimasysadmin
#
#   This is the init class for the soimasysadmin module, it loads other required classes
#
# == Parameters:
#
#   *application_container_class*             = Puppet application container class deploying soimasysadmin i.e. tomcat
#   *application_container_cache*             = Location of the tomcat cache directory
#   *application_container_home*              = Application container home directory i.e. /var/lib/tomcat6/
#   *application_container_user*              = User for application container i.e. tomcat
#   *application_container_group*             = Group for application container i.e. tomcat
#
# == Actions:
#   Include any classes that are needed i.e. params, install etc
#
# == Requires:
#
# == Sample Usage:
#
#   class	{ 
#			"soimasysadmin":
#				application_container_class		=> 	"tomcat6",
#				application_container_home		=>	"/srv/tomcat"
#		}
#
class soimasysadmin  ( 	$application_container_class     =	$soimasysadmin::params::application_container_class,
			                  $application_container_cache     =	$soimasysadmin::params::application_container_cache,
                  			$application_container_home      =	$soimasysadmin::params::application_container_home,
                  			$application_container_user      =	$soimasysadmin::params::application_container_user,
                  			$application_container_group     =	$soimasysadmin::params::application_container_group,
		                ) 	inherits soimasysadmin::params {

include soimasysadmin::install

# The application_container variables are set here, but the defaults are all in params, which is available thanks to inheritance, Because this is the top of the tree, these variables can be accessed by other classes as needed

}

install.pp

# = Class: soimasysadmin::install
#
#   This class will install and configure directories and system wide settings that are relevant to soimasysadmin
#
# == Parameters:
#
#   All Parameters should have defaults set in params
#
#   *logrotate_days*      = Number of days to keep soimasysadmin logs for
#
# == Actions:
#   - Set up the application_container
#   - Set up soimasysadmin directories as needed
#
# == Requires:
#   - ${::soimasysadmin::application_container_class},
#
# == Sample Usage:
#
#   class	{ 
#			"soimasysadmin":
#				application_container_class		=> 	"tomcat6",
#				application_container_home		=>	"/srv/tomcat"
#		}
#
#

class soimasysadmin::install 	(	$logrotate_days = $soimasysadmin::params::logrotate_days,
                        			) inherits soimasysadmin::params	{

  File  {
    require =>  Class["${::soimasysadmin::application_container_class}"],
    owner   =>  "${::soimasysadmin::application_container_user}",
    group   =>  "${::soimasysadmin::application_container_group}"
  }

  #
  # Create folders
  #
  define shared_classpath ( $application_container_conf_dir = "$::soimasysadmin::application_container_conf_dir") {

    file {
      "${::soimasysadmin::application_container_home}/${name}/shared":
      ensure  =>  directory,
      notify  =>  undef,
    }
    file {
      "${::soimasysadmin::application_container_home}/${name}/shared/classes":
      ensure  =>  directory,
      notify  =>  undef,
    }
	}

	file {
		"/etc/logrotate.d/soimasysadmin":
		source		=>	"puppet:///modules/soimasysadmin/soimasysadmin.logrotate",
	}

}

params.pp

# = Class: soimasysadmin::params
#
# This class sets the defaults for soimasysadmin parameters
#
# == Parameters:
#
#
# == Actions:
#   Set parameters that are needed globally across the soimasysadmin class
#
# == Requires:
#
# == Sample Usage:
#
#   class soimasysadmin::foo ($foo  = $soimasysadmin::params::foo,
#                             $bar  = $soimasysadmin::params::bar) inherits soimasysadmin::params {
#   }
#
#
class soimasysadmin::params  ( ) {

#
# Application Container details
#

  $application_container_class            = "tomcat"
  $application_container_cache            = "/var/cache/tomcat6"
  $application_container_home             = "/srv/tomcat"
  $application_container_user             = "tomcat"
  $application_container_group            = "tomcat"


#
#	soimapp config
#

  $dev_override_enabled   = "false"
  $custom_dev_signup_url  = "http://${hostname}/soimaapp"

}

Okay, that’s all that’s needed to do the install. Notice that each class inherits params. Interestingly enough, Adrian informs me this probably shouldn’t work and is possibly a bug, as params is being inherited multiple times. I think the only reason it works is that we have no resources in params.pp; if we add a file resource in it, it will fail. Nonetheless, this works (for now).

config.pp

# = Class: soimasysadmin::config
#
#		This class configures the soimasysadmin repository
#
# == Parameters:
#
#		All Parameters should have defaults set in params
#
#		*db_name*												= The Database name 
#		*db_username*										= User to connect to the DB with
#		*db_password*										= Password for the user
#		*db_server*											= The DB server host 
#		*db_port*												= DB port i.e. 3306
#		*db_pool_min*										= Min DB connection pool
#		*db_pool_max*										= Max DB connection pool
#		*db_pool_initial*								= Initial DB connection pool size
#
# == Actions:
#   Install and configure soimasysadmin repository component into appropriate
#   application container
#
# == Requires:
#   - Class["${soimasysadmin::params::application_container_class}"],
#
# == Sample Usage:
# 
#   class	{ 
#			"soimasysadmin::config":
#				db_password 	=>	"Hdy^D7D6fvndsakj(*80",
#		}
#
#

class soimasysadmin::config	(	$db_name					=	$soimasysadmin::params::db_name,
															$db_username			=	$soimasysadmin::params::db_username,
															$db_password,
															$db_server				=	$soimasysadmin::params::db_server,
															$db_port					=	$soimasysadmin::params::db_port,
															$db_pool_min			=	$soimasysadmin::params::db_pool_min,
															$db_pool_max			=	$soimasysadmin::params::db_pool_max,
															$db_pool_initial	=	$soimasysadmin::params::db_pool_initial,
														)	inherits soimasysadmin::params	{
  File {
    owner   =>  "${::soimasysadmin::application_container_user}",
    group   =>  "${::soimasysadmin::application_container_group}",
    notify  =>  Service["${::soimasysadmin::application_container_service}"],
  }

  #
  # properties file
  #

  # If extra config is set, concatenate
  file {
    "${::soimasysadmin::application_container_home}/${::soimasysadmin::application_container_instance}/properties":
    content => template("soimasysadmin/properties"),
    mode    => "0640",
  }
}

NB: In this instance I use config.pp as a way of setting up default config for the application, i.e. the common things such as the DB, whereas application specific config is done separately. This obviously depends on your application…

soimaapp.pp

# = Class: soimasysadmin::soimaapp
#
# This class installs the soimaapp front end for the soimasysadmin repository
#
# == Parameters:
#
#		All Parameters should have defaults set in params
#
#		*dev_override_enabled*						= Enable dev configuration override
#		*custom_dev_signup_url*						=	Custom dev signup url
#		*application_container_instance*	= Application container instance i.e. tomcat1, tomcat2, tomcatN
#		*application_container_conf_dir*	= Application container conf directory i.e. /etc/tomcat6/conf/
#		*application_container_service*		= Service name of application container i.e. tomcat6
#
# == Actions:
#   Install and configure soimaapp into appropriate application container
#
# == Requires:
#		- Class["${soimasysadmin::params::soimasysadmin_application_container_class}",
#						"soimasysadmin::install"],
#
# == Sample Usage:
# 
#		include soimasysadmin::soimaapp
#
class soimasysadmin::soimaapp	(	$dev_override_enabled								=	$soimasysadmin::params::dev_override_enabled,
															$custom_dev_signup_url							=	$soimasysadmin::params::custom_dev_signup_url,
															$application_container_instance			= $soimasysadmin::params::application_container_instance,
															$application_container_conf_dir			= $soimasysadmin::params::application_container_conf_dir,
															$application_container_service			= $soimasysadmin::params::application_container_service
															) inherits  soimasysadmin::params  {

	# Set Default actions for soimasysadmin files
	File	{
		notify	=>	Class["${::soimasysadmin::application_container_class}"],
		owner		=>	"${::soimasysadmin::application_container_user}",
    group   =>  "${::soimasysadmin::application_container_group}",
	}


	#
	#	Install / Upgrade War
	#
	
	# Push war file to application container
  file {
    "${::soimasysadmin::application_container_home}/${application_container_instance}/webapps/soimaapp.war":
    source	=> "puppet:///modules/soimasysadmin/soimaapp.war",
    mode		=> "0644",
  }

	#
	#	Shared class created
	#
	soimasysadmin::install::shared_classpath {
		"${application_container_instance}":
			application_container_conf_dir	=>	"${application_container_conf_dir}"
	}
	
	if	( $dev_override_enabled == "true" ) {	
		file {
			"${::soimasysadmin::application_container_home}/${application_container_instance}/shared/classes/soimasysadmin/web-extension/soimaapp-config-custom.xml":
			content => template("soimasysadmin/soimaapp-config-custom.xml"),
		}
	}	
}

Okay that’s the basics, in the examples above, soimaapp does application specific config, any generic settings are in params.pp, any global settings are in init.pp; hopefully the rest is self explanatory.

I haven’t included any templates or files, you can work that out, you are smart after all.

Now the bad news

I will say the above does work. I’ve even tested it with inheriting soimaapp.pp and applying even more specific config over the top and all seems well, so what exactly is the bad news…

I tested all of this on what was then the latest version of puppet, 2.7.10, and then came across this. Luckily 2.7.11 is out and available, but I just haven’t tried it yet. Let’s hope it works!