Everything in one place
Normally when businesses start out building s product, especially those that don’t have the pre-existing knowledge of configuration management, tend to just throw the config on the server and then forget what it is. This is all fine, it’s a way of life and progression and sometime just bashing it out could prove very valuable indeed, but typically this becomes a nightmare to manage. Very quickly when there is then 100 servers all manually built it’s a pain in the arse so then everyone jumps into configuration management.
This is sort of phase 1, everything has become too complicated to manage, no one knows what settings are on what boxes and more time is spent working out if box 1 is the same as box 2. This leads to the need to have some consistency which leads to configuration management, the sensible approach is to move an application at a time into configuration management fully, not just the configuration files.
During this phase of execution it is critical to be pedantic and get as much as possible into configuration management, if you only do certain components there will always be the question of does X affect Y which isn’t in configuration management? and quite frankly, every time you have that conversation a sysadmin dies due to embarrassment.
Reduce & Reuse
After getting to Phase 1, probably in a hack and slash way, the same problems that caused the need for Phase 1 happen. 100 servers in configuration management lots of environments with variables set in them, and servers, and in the manifests themselves and the question starts to be come well is that variable overriding that one, why is there settings for var X in 5 places, which one wins? Granted in configuration management systems there are hierarchies that determine what takes precedence but that requires someone to always look through multiple definitions. On top of having the variables set in multiple locations, it is probably becoming clear that more variables are needed, more logic is needed, what was once a sensible default is now crazy.
This is where phase 2 comes in, aim to move 80%+ of each configuration into variables, have chunks of configuration turned on or off through key variables being set and set sensible defaults inside a module/cookbook. This is half of phase 2, the second half and probably the more important side is to reduce the definitions of the systems down to as few as possible. Back in the day, we use to have a server manifest, an environment manifest and a role manifest each of these set different variables in different places, how do you make sure that your 5 web servers in prod have the same config as the 5 in staging? that’s 14 manifests! why not have 1? just define a role and set the variables appropriately, this can then contain the sensible defaults for that role, all other variables would need to be externalised in something like hiera, or you would need to push them into Facter / ohai.
By taking this approach to minimising the definitions of what a server should be and reducing it down to one you are able to reuse the same configuration so all of your roleX servers are now identical except what ever variables are set in your external data store which can now easily be diff’d.
build, don’t configure
By this point, phase 1 & 2 are done, all is well with the world but still there’s some oddities Box X has a patch level y and box A has a patch level z, or there’s some left over hack to solve a prod issue which causes a problem on one of the servers. Well treat your servers as configurable and throw-away-able, There’s many technologies to help with this be it cloud based with Amazon and OpenStack or maybe VMWare, even physical servers with cobbler. This is Phase 3, build everything from scratch every time, at this point the consistency of the environment is pretty good leaving only the data in each environment to contend with.
Try and treat configuration management as something more than just config files on servers and be persistent about making everything as simple as possible while trying to get everything into it. If you’re only going to manage the files you might as well use tar’s and if that sounds crazy it’s the same level as phase 1 which is why you have to get everything in and I realise it can seem a massive task but start with the application stack you’re running and then cherry pick the modules/cookbooks that already exist for the main OS components like ntp, ssh etc
While it is fresh in my mind, you have missed Phase 0 because it is ingrained in your nature.
Phase 0: Version Control
Before you can even consider configuration management, you need to stop splattering dot_back or dot_DATE files all over your servers.
When you have gotten used to putting files in Git (svn.,cvs etc) it becomes second nature, but it a fundamental step to tracking changes across environments.