We’ve all been on a archeological dig from time to time, faced with an environment that no one knows or an installation no one understands and all we’re left to do is dig our way through the evidence to uncover the truth; and much like real archeological digs we only ever come to a truth that we understand not the actual truth.
I don’t think there’s really anything that can be done to mitigate these situations, sure, you could make sure you document the system fully, write every little detail down and have a structured document a child could follow; no one will read it until after the catastrophe it could have saved happens; so you can’t document your self out of the problem.
Well there’s always hand overs and training, and for that to work you just need to get someone to have the same experiences and history as yourself and then two or three months of going over the same stuff and then there’s a small chance that they may remember some of that knowledge when they need to.
So you can’t write it down, you can’t hand it over and you can’t be around to deal with the problem yourself, so what can you do?
Not much, it depends on how much time you have, but the only real way to hand things over is when they fail let someone else deal with it.
Planning for digs
So how can this be mitigated, well I believe standardisation, problem solving and starting over are the best things to do. Lets look at them in a little more detail.
Everyone has standards, some more than others, what ever the standard is, stick to it, whether you agree or not. By having a standard way of implementing cron jobs, i.e. Always putting them in /etc/cron.d/ rather than crontab or a mixture of cron.d and cron.daily, putting all of your shell scripts into a common directory. Always writing scripts in a certain language, even if it means it will take longer and you have to learn something new…
This doesn’t sound like much, but it means everyone knows where to look to start dealing with problems, there’s no special hidden away super squirrel scripts somewhere that people won’t find.
Quite simply, Get better. Being able to problem solve quickly is a difficult skill to do accurately but practice makes perfect and the more you do it the easier it gets. This is useful for dealing with problems that are unexpected and the best way to get better is to make sure the person that set it up isn’t the person debugging it all the time, share the workload around and let people be in the deep end struggling while you’re around to help rather than struggling on their own.
The biggest barrier to “adopting” a legacy system and owning it for problem analysis is that the people that are trying to support it no longer care, they didn’t put it in, it was put in by a bunch of cowboys badly, with no documentation or poorly written docs and no hand over! It doesn’t matter how good the docs are, you can’t document everything so you always assume a basic understanding, and you can’t tell people that don’t want to listen. So the easiest way to get new people to want to look after systems they didn’t set up or had no part in is to let them do it their way.
So, during the first few months introduce them to the system explain it all and accept the little chirps about there being a better way; at this point they should have a good understanding of the current system and maybe the chirps will disappear, maybe they won’t. If there’s still a lack of adoption it may be worth pretending there are problems with the system and that it’s worth looking at a “better” way of doing it, as such, let them lead the technology charge and just make sure that the system provides the same functionality as before, now all you need to do is learn their system!
I’m not saying don’t document… I am saying don’t spend a long time on it, bullet points, basic pointers and directions, backup / restore enough that if someone skim reads it looking for information it at least takes them to the next step. I’m also not saying don’t do handovers, do them! just accept that it’s in oen ear and out the other, but at least you tried.
Bear in mind that people that are new to the group have big ideas (like you did) and want the best solution, the thing they’re missing normally is the history, “why was it done that way?”, just remember that they need the history so they can understand why something may be bad and why it was not made better at the time.
Hopefully these little things will help the technological archeological dig not be as deep or take as long.