Let’s Chat

It’s good to talk

Over the last few weeks I’ve been playing with some forum-based technology, trying to find an ideal platform that we can take and heavily modify so it integrates fully with another app. The idea is that in said other app you click “help” and it takes your code and posts it directly to a forum, where people can offer up code solutions. You, as the original poster, can then click a “try it” button that puts a solution back into the other app and lets you test it against your test systems.

There’s a whole other bunch of features and gamification that needs to happen to build up a desire to help people, with rewards for those that help the most. Needless to say there will be a lot of customisations, so I spent a little bit of time playing around with a few technologies and the one we’re going to use is NodeBB. It’s also worth mentioning they have a nice Indiegogo campaign going and more support is required!

Everyone we’ve shown the forum to likes it. It is still quite immature, as they only started in mid May (2013), but the code has come a long way and it is continuously improving. With help from me ;)

Building a stable platform for the future

I have some constraints around what I’m doing. One is that I’m still new to JavaScript; learning, but new. Luckily I have people I can bug if I get stuck, which is really helpful. I also have to make the discussion boards easy to support for the people that will be looking after them, and easy to integrate with our existing app as and when needed.

The next biggest issue with building a solid platform is a good choice of frameworks: using a framework like PassportJS rather than writing individual login methods, implementing configurable logging with something like Winston, and using templating solutions like Dust (Dust seems really good but is not maintained any more, I guess it’s perfect?) or Jade (Jade is only server side but ClientJade fills that niche).

All of these frameworks will just make my life easier in the long run. I’ll hopefully be able to work quicker, and it should give others one thing less to learn, assuming I choose libraries they are familiar with anyway.

So there’s a few things that need to be looked at from a supportability point of view and to enable easier development by us, and then there’s all the extras we need to work on. Because of the level of integration, and needing to poke almost all elements, it will be interesting to see how the plugins will work for NodeBB. I’m desperately trying to work out how I can maintain heavy customisation of the code and still contribute back towards shared goals. I’d prefer not to fork and become so separated that we can’t push back contributions; hopefully we’ll be able to work out how we do that!

What challenges you?

Over the last few weeks

I have been wondering what most people find challenging in the “modern” IT world. There’s been a recent upsurge in tools and technology that address most problems, which leaves me wondering what gap is left to fill. What is the current big annoying problem? Maybe it’s not being able to push your architecture into multiple clouds, or having to live with the constraints of small root disk volumes; who knows? Hence the poll :)

A week in the Valley

While out and about…

Over the last week I’ve been out in the Bay Area meeting with an important client, talking about their needs and how we’re going to make things better for them and for us; all in all a good trip (apart from the plane crash). This was my first time in the Bay Area and it seems like a nice enough place. It lives up to expectations in some areas and not others, and I’m sure with more local knowledge it’s possible to overcome some of the issues I had with the area. The main issue I could see (granted it was only a week) is that it’s not as nice to live in as the UK, or even as nice to be in and around as London.

In London, everything is a walk away or a short tube journey, and better yet, if you’re willing to travel more than an hour each way each day, you can live in the countryside and just commute in. In the valley everything is a short car journey away, the public transport seems a bit hit and miss, and taxis aren’t cheap!
I think it’s things like this which will be the end of the Bay Area over the next 10 years unless it changes, and I’m not the only one to think this. As time moves on I think we’ll see a shift in tech start-ups away from Silicon Valley into areas that are nicer to live in.

Which is what brings me back to London. There’s a good start-up culture, there’s more investment going on and there’s some good companies starting to appear. Unfortunately for tech start-ups the UK still isn’t brilliant, but it will get there in time; it probably just needs a few more years and some brave people to blaze a trail.

I think London has the makings of a nice tech hub for Europe and will, over the next few years, start exceeding the Bay Area. The only thing it’s really missing at the moment is the massive success stories that appear in the Bay Area every few years; sure, there’s some good companies, but none are an Apple, Google or Facebook.

I think I could survive living out there for a while, but not forever. It’s nice being able to go to the beach, forest, mountains, whatever you want, all within a reasonable drive, but there’s too much convenience stuff like fast food, corner shops and drive-throughs. Take Walmart: it’s got a purpose, but it’s not for me. Trader Joe’s seemed better, but no soft drinks, just fresh goods and booze… Maybe in time I would have found stuff that felt a little more “me” and a little less American, but I’d have to go and give it a go to find out!

Personally I don’t really want to live in London, it’s just too busy, but living out in Hampshire makes London an awkward commute; doable, but not every day. As time goes on I’m still hopeful that more start-ups will start offering flexible working like we have at Alfresco, where going in for 2-3 days a week is the norm and everyone is trusted to do the work. Who knows, over the next 10 years maybe more will start filling the M3/M4 corridor, which would make living in a nice place and commuting to a nice tech company entirely possible.

It will certainly be interesting to see how the tech industry in the UK changes over the next few years, but I’m certain it’s picking up speed.

Doing it the hard way

Let’s make it really complicated

Over the last two weeks I’ve been playing with an open source project that has a bit of a kickstarter going to fund their idea. I came across it through my boss, who probably found it on Reddit. In essence it’s a forum using Node.js as the backend and some jQuery at the front end, and all in all it looks pretty awesome; sure it has a few flaws, but it’s less than 6 months old.

In my first venture into using it on my laptop I came across a bug with the title of a new post. When you reply it locks the title to stop you editing it; unfortunately it didn’t unlock it afterwards, so when you went to post a new topic you were unable to set a title. I decided that I could raise a ticket or I could just have a look at it, so I had a look and submitted a fix to them; to my amazement they accepted my fix! We decided that this is a good platform to use for what we’re trying to do at work, but it needs a few core fixes followed by quite a bit of integration and customisation. The core things we need to fix in the product (in order of importance) are:

  1. Deployment to bespoke path
  2. Increased logging for debugging
  3. Additional authentication routes

So I decided that, not knowing the code, I should start by working on point one; it makes the product useful to us and gets me involved in a lot of the code, so hopefully I’ll learn something. It seemed a sensible place to start, and it was a sensible place to start; unfortunately, being a new project, there isn’t a lot of documentation or sites to google for this stuff, so I’ve been learning the hard way.

The main challenge is learning the code; the other is working out why things were done in a certain way. One of the issues I have is that currently there’s something like 6 config.json files, all used for different things with different config, so I need to reverse engineer all of it. It’s a little annoying, seeing as with some better technology choices it could just be one config file, but then I also don’t know why things have been done that way.

Challenging me more!

Up until recently my experience of Node.js had been rather limited. I had used some cool things like Express, Winston and Jade, but I had always had the luxury of talking to the developer that wrote the code to help me understand why it was done that way and how I should use it. This time I’m on hard mode: I have to understand it from reading, and I have questions! Hopefully the people running the project will have some time to spend helping me get up to speed and answering my stupid questions. I read through the code and I just don’t know what’s going on. I think this is partly down to not really being a programmer and partly down to not knowing the language very well, so everything is a little odd; at least I hope that’s the case. If it is just bat shit crazy then at least I’m confused for a reason :)

I’m definitely going to persevere as I’m sure it will be useful, and it is in the project team’s best interest to help me understand what it does so I can start submitting “awesome” changes back to the project, even if they don’t want them :)

Either way I’m looking forward to diving into the code a bit more and trying to guess what it’s doing; hopefully I can make it work for our purposes while providing useful (although not needed) features back to the project. I’m also looking forward to taking on more complicated work, so I can start on the more bespoke integrations we need and maybe add an achievements framework or some sort of gamification to the forum tool.

Configuration management alone is not the answer

Everything in one place

Normally when businesses start out building a product, especially those that don’t have pre-existing knowledge of configuration management, they tend to just throw the config on the server and then forget what it is. This is all fine, it’s a way of life and progression, and sometimes just bashing it out can prove very valuable indeed, but typically this becomes a nightmare to manage. Very quickly there are 100 servers, all manually built, and it’s a pain in the arse, so then everyone jumps into configuration management.

This is sort of phase 1. Everything has become too complicated to manage, no one knows what settings are on what boxes, and more time is spent working out if box 1 is the same as box 2. This leads to the need for some consistency, which leads to configuration management. The sensible approach is to move one application at a time fully into configuration management, not just the configuration files.

During this phase of execution it is critical to be pedantic and get as much as possible into configuration management. If you only do certain components there will always be the question of “does X affect Y, which isn’t in configuration management?” and quite frankly, every time you have that conversation a sysadmin dies of embarrassment.

Reduce & Reuse

After getting to Phase 1, probably in a hack and slash way, the same problems that caused the need for Phase 1 happen again. There are 100 servers in configuration management and lots of environments, with variables set in the environments, on the servers and in the manifests themselves, and the questions start to become: is that variable overriding that one? Why are there settings for var X in 5 places, and which one wins? Granted, configuration management systems have hierarchies that determine what takes precedence, but that requires someone to always look through multiple definitions. On top of having the variables set in multiple locations, it is probably becoming clear that more variables are needed, more logic is needed, and what was once a sensible default is now crazy.

This is where phase 2 comes in. Aim to move 80%+ of each configuration into variables, have chunks of configuration turned on or off through key variables being set, and set sensible defaults inside a module/cookbook. This is half of phase 2; the second half, and probably the more important one, is to reduce the definitions of the systems down to as few as possible. Back in the day we used to have a server manifest, an environment manifest and a role manifest, and each of these set different variables in different places. How do you make sure that your 5 web servers in prod have the same config as the 5 in staging? That’s 14 manifests! Why not have 1? Just define a role and set the variables appropriately; this can then contain the sensible defaults for that role. All other variables would need to be externalised in something like Hiera, or you would need to push them into Facter / Ohai.

By taking this approach of minimising the definitions of what a server should be, and reducing them down to one, you are able to reuse the same configuration. All of your roleX servers are now identical except for whatever variables are set in your external data store, which can now easily be diff’d.
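To make the “one role plus external data” idea a bit more concrete, here is a minimal sketch in Python rather than in any real configuration management tool. Everything in it is made up for illustration: the `web` role, the variable names and the `ROLE_DEFAULTS` / `EXTERNAL_DATA` dictionaries are hypothetical stand-ins for a module’s defaults and for something like a Hiera data store.

```python
# Hypothetical role defaults, as they might live inside a module/cookbook.
ROLE_DEFAULTS = {
    "web": {"port": 80, "workers": 4, "caching": False},
}

# Hypothetical external data store, keyed by environment and then by role.
# Only values that differ from the role defaults are recorded here.
EXTERNAL_DATA = {
    "prod": {"web": {"workers": 16, "caching": True}},
    "staging": {"web": {"workers": 16}},
}


def resolve(role, environment):
    """Return the effective config: role defaults overlaid with external data."""
    config = dict(ROLE_DEFAULTS[role])
    config.update(EXTERNAL_DATA.get(environment, {}).get(role, {}))
    return config


def diff(left, right):
    """Return {key: (left_value, right_value)} for every key that differs."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


if __name__ == "__main__":
    prod = resolve("web", "prod")
    staging = resolve("web", "staging")
    # Shows only the variables that make prod and staging differ.
    print(diff(prod, staging))
```

Because there is only one definition of the role, “is prod the same as staging?” becomes a question about two small dictionaries rather than a hunt through several manifests.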

Build, don’t configure

By this point phases 1 & 2 are done and all is well with the world, but there are still some oddities: box X has patch level y and box A has patch level z, or there’s some leftover hack to solve a prod issue which causes a problem on one of the servers. Well, treat your servers as configurable and throw-away-able. There are many technologies to help with this, be it cloud based with Amazon and OpenStack, or maybe VMware, or even physical servers with Cobbler. This is Phase 3: build everything from scratch every time. At this point the consistency of the environment is pretty good, leaving only the data in each environment to contend with.

Summary

Try and treat configuration management as something more than just config files on servers, and be persistent about making everything as simple as possible while trying to get everything into it. If you’re only going to manage the files you might as well use tars, and if that sounds crazy, it’s the same level as phase 1, which is why you have to get everything in. I realise it can seem a massive task, but start with the application stack you’re running and then cherry pick the modules/cookbooks that already exist for the main OS components like ntp, ssh etc.

Cut down, deliver early and often

Deliver all the things

There’s idealistic people in the world and that’s fine (thanks for reading by the way :) ), and there’s pragmatic people. I want to go through how you can take a pragmatic approach to delivering value, one that gets it done on time but still helps you get to your idealistic goals.

Often when sitting down and planning with senior management bods, a feature list as long as your arm comes out. Even if this is thought to be the bare minimum list, in reality it probably isn’t, and is instead a bloated minimum. There’s always room to cut out features, so agree some prioritisation on the features and operate a bucket approach: one in, one out.

After the features are agreed you can set about delivering them, cutting where necessary.

Deadlines for deadlines’ sake

When it comes to deadlines, people take them a bit like Marmite: you either love them or you hate them. Some people feel that having a deadline is a sure-fire way of creating a bad product as corners are cut; others think that if you have a deadline you can at least work towards something with an end in sight, rather than running off into the wild forever and ever.

To deliver the features needed it’s better to have a deadline, and one that stretches you and forces you to make some cuts. It’s not about not delivering or delivering badly, it’s just about delivering what is needed. Worst case scenario, you have to deliver everything, but this way at least you do it in stages.

Certainly with a deadline you can help focus people on delivering what is important; I think sometimes people get caught up in trying to deliver everything perfectly for the deadline rather than delivering the value they already have. Some of my colleagues and I are currently working on a monitoring and metrics platform that integrates fully with Nagios-style checks but also allows you to write them in the web browser and test them on a server of your choice before distributing them. The idea is that you can take the monitoring up to a real-time level while also providing business-level reporting and everything in between, so you have one place to go to find out why something isn’t working, how well it has been doing and how many people have signed up; it’s a devops dashboard really.

Anyway, for a couple of months we had been identifying the core technologies and implementing various bits of key functionality for the product, but at no point was any of it “working”; some bits sort of worked but not quite, some bits just weren’t there. There was no real end date to this project, as it’s something that will keep evolving until it works and is useful. However, we needed something to work towards, so after a couple of months of sorting out the technology a deadline was set to do a demo, and within a week we had the product up and working, with the pages we needed, the correct functionality and everything working fine. Writing a Nagios check on the fly, pushing it to the server, distributing it to all the others and then reporting, all in less than 30 seconds; wonderful.

I’m not saying what we did in that week was “production ready”, but if our livelihood depended on it, it was good enough, and that’s what being agile and lean is about: what is the least amount of work I can do to get to the minimum product I need? The key is obviously to not get stuck delivering the bare minimum all the time; with every sprint you need to improve upon what was there as well as add the new stuff. I think it is necessary to always fix something up when adding new features to make the product better, and it certainly works for us.

Alternatively, of course, we could have not had a deadline and kept drifting aimlessly into the distance ensuring that the technology was “just right” all the time but the reality is we have to deliver something somewhere.

Iterate

Anyone that is familiar with Agile, Scrum, Extreme Programming etc. knows it’s better to deliver in small bite-size pieces than in large chunks; you can provide value back to the business quicker and you focus on getting the task done rather than on doing it perfectly. Not all tasks can be done by cutting a few corners, but there’s normally a quick way, a good way and the right way of doing it, so choose one and go for it. If it is a bit of Angular that pulls down a list of plugins, go the quick way; if it’s a graphing engine that needs to draw lots of graphs and is used everywhere, do it the right way. You’re sensible people, find a balance.

I’ve been talking all about software development, which is where most of these methodologies come from, but they can be applied to systems administration as well. I think the same goes for sysadmins as it does for programmers: they tend to get stuck on doing the best solution rather than the solution the business needs. Just to dispel any hopes and dreams, maybe save some time by realising that the business cares that it works and is stable, not how elegant or easy to maintain it is. So when coming up with a load balancing solution, maybe version 1 is haproxy with basic config, version 2 is a bit more in depth, version 3 is F5 & haproxy, version 4 is F5, haproxy and caching… By all means have the hopes and dreams of the gold solution, but deliver the bronze one, okay? If people really use the system and it provides more value, iteratively make it better; maybe bronze plus a bit of silver. Little chunks, often.

Summary

Don’t get stuck on the end goal. Think about what the client or the business really needs, the bare minimum; deliver that, measure usage, iterate and improve.

Bash-off – a way to relieve yourself

Bash off!

In our team at work we have this concept which we call a “Bash off”, and the idea is simple. Take something that you could do very simply in bash in several steps, with manual fudging in the middle, to end up with a result; the sort of thing that may take you an hour to brute force your way through. Now, in your programming language of choice, automate the entire process.

Sounds simple, right? I remember the first task we had, which was to grab two unique columns of data and stream them onto the screen. I think it was originally done with watch, tail and tee, and took about 20 mins of playing with bash. As a team we chose our own languages and went for it; an hour later we all had a working prototype, and another hour later we all had our programs running really efficiently.

Obviously these are really pointless, but they do have a couple of benefits that help a DevOps person stay good at what they do. It gets you to do more with your language of choice than you normally would, it causes you to think about how you’re structuring code to make it more efficient, and it also helps bond the team together with a bit of healthy competition.

To clarify, just because it is called a bash off that doesn’t mean that a solution can’t be in bash. So if it was a very manual human process that you believe you can fully automate in bash, go for it :)

The challenge

This week’s challenge came from our friends in finance. They have a spreadsheet with some 14k rows in it; two of the columns have many duplicated fields in them, and they have a one to many mapping between them. So the challenge is to get from 14k rows down to just the unique entries in each of these two columns, and to then make sure the mapping in the spreadsheet can be looked up from a database (where the spreadsheet results must end up).
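As a rough illustration of the shape of the problem (not the code either of us actually wrote), the sketch below reduces a list of row pairs to the unique values in each column plus the one to many mapping between them. The sample rows and the “group” / “item” names are invented; the real spreadsheet columns were different.

```python
from collections import defaultdict


def summarise(rows):
    """Reduce (group, item) pairs to unique groups, unique items and a mapping.

    rows is any iterable of 2-tuples; duplicates are expected and collapsed.
    """
    groups = set()
    items = set()
    mapping = defaultdict(set)  # group -> set of items seen with that group
    for group, item in rows:
        groups.add(group)
        items.add(item)
        mapping[group].add(item)
    # Sort everything so the result is stable and easy to compare.
    return sorted(groups), sorted(items), {g: sorted(i) for g, i in mapping.items()}


if __name__ == "__main__":
    # Invented sample data standing in for the two spreadsheet columns.
    sample = [("A", "x"), ("A", "y"), ("B", "z"), ("A", "x")]
    groups, items, mapping = summarise(sample)
    print(groups)
    print(items)
    print(mapping)
```

Using sets means each value is only kept once no matter how many of the 14k rows repeat it.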

Due to some meetings I was late to the game on this task; pesky meetings. We had a bash-driven prototype, which we excluded because it required the Excel spreadsheet to be turned into a CSV, and a Python one that sort of got some fields from the spreadsheet and pushed them into the DB.

This is where I picked it up, sort of working. I’ve not done a lot of Python, so I forked the code and my boss and I went our separate ways to achieve the task. In the end it took us a couple of hours to get this working and for us to have the same results. All in all we had achieved the processing of 14k rows of data and the manipulation into a DB with the correct data, but was that enough? No. We decided that it taking about 1 min was not good enough so we started focusing on making it better; I think my run time was 58 seconds and my boss’s 40.

We had both chosen different ways of doing it. I had chosen to use the DB to ensure the fields were unique, by checking if that field existed in the DB and, if not, creating it and returning the id; if it was already in the DB it would just return the id. My boss’s approach was better: he created lists with the data in and then made the lists unique. I decided I had to get mine down to a similar speed so I started hacking it around. I decided I would store the unique entries in a list for each table and then, before calling the method that puts the data in the database, check if that unique value was already there. The first issue with this was that I lost the ID number, which I needed to add to the lookup table, so I had to change the list to a dictionary.
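The list-to-dictionary change is easier to see in code. This is only a sketch of the idea, not my original program: `IdCache` and `make_fake_insert` are names I’ve made up here, and the fake insert function simply stands in for the real database call so the example runs on its own.

```python
import itertools


class IdCache:
    """Remember the ID already handed out for each unique value."""

    def __init__(self, insert):
        self._insert = insert  # callable that stores a value and returns its new ID
        self._ids = {}         # value -> ID; a plain list of values would lose the ID

    def get_or_create(self, value):
        # Only go to the (slow) insert when the value has not been seen before.
        if value not in self._ids:
            self._ids[value] = self._insert(value)
        return self._ids[value]


def make_fake_insert():
    """Stand-in for the real database insert: records calls, returns 1, 2, 3..."""
    counter = itertools.count(1)
    calls = []

    def insert(value):
        calls.append(value)
        return next(counter)

    return insert, calls


if __name__ == "__main__":
    insert, calls = make_fake_insert()
    cache = IdCache(insert)
    ids = [cache.get_or_create(v) for v in ["sales", "ops", "sales", "sales"]]
    print(ids)         # repeated values reuse the ID from their first insert
    print(len(calls))  # only the unique values reached the "database"
```

The dictionary does two jobs at once: it answers “have I seen this value?” and it keeps hold of the ID needed for the lookup table.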

I also found that moving the database commits outside of the methods and just doing one at the end saved a few seconds. I was also able to remove a couple of additional DB queries, all of which helped bring the time down. One of the best changes I made was on the lookup table: when doing a query, only pulling back one ID rather than two. I didn’t even need it, but I couldn’t see a way of querying SQLAlchemy without having a field to bring back that would be quick.
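Here is a small sketch of the commit change. To keep it self-contained it uses Python’s built-in sqlite3 module with an in-memory database rather than SQLAlchemy, and the `entry` table and `insert_names` function are invented for the example, so treat it as the general pattern rather than what my code looked like.

```python
import sqlite3


def insert_names(connection, names, commit_every_row=False):
    """Insert names into a hypothetical 'entry' table and return the row count."""
    cursor = connection.cursor()
    cursor.execute(
        "CREATE TABLE IF NOT EXISTS entry (id INTEGER PRIMARY KEY, name TEXT)"
    )
    for name in names:
        cursor.execute("INSERT INTO entry (name) VALUES (?)", (name,))
        if commit_every_row:
            # The pattern I started with: one transaction per row.
            connection.commit()
    # The pattern I ended up with: a single commit once all rows are in.
    connection.commit()
    return cursor.execute("SELECT COUNT(*) FROM entry").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(insert_names(connection, ["a", "b", "c"]))
    connection.close()
```

Both settings leave the same data in the table; the difference is how many transactions the database has to finish along the way, which tends to matter far more on a real on-disk database than on this in-memory one.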

As the night progressed we both made good progress: my boss got his down to 14.1 seconds and I managed to get mine down to 21 seconds, so we had both made massive improvements in our code’s efficiency. My boss was making use of gevent, but when I tried this it slowed my program down, so I left it out, not understanding it anyway. I kept plugging away and I made it down to 13.5 seconds.

Summary

I urge you to take an afternoon out in your team to push your skills forward with a programming challenge and see what happens. It will make you better, you will learn stuff, it is fun, it does bond the team and you will enjoy it. It is also a waste of time, but what else were you going to do?