GNU Parallel – When bash just isn’t quick enough

Bash not cutting it for you?

We were looking at using nrpe to smoke test our product after a deployment, to give us a feel for how well the system fared. We wanted it to run quicker, and anyone who has performance-tuned bash scripts knows the drill: you strip out as much as possible and use the built-in functions, as they are quicker.

Well, we had done that and still it wasn’t quick enough. Then my boss came across GNU Parallel. Luckily the documentation was pretty good, but we still had fun playing with it.

Imagine a situation where you want to run the following commands:

echo 1 1
echo 1 2
echo 1 3
echo 2 1
echo 2 2
echo 2 3

Typically you might write a couple of bash loops to achieve this, as follows:

for i in {1..2}
do
    for x in {1..3}
    do
        echo "echo $i $x"
    done
done
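
As written, that loop only prints each command, which mirrors the list above; if you actually wanted to execute them, one way (just a sketch) is to pipe the output into a shell:

# Generate the commands and hand them to bash to execute
for i in {1..2}
do
    for x in {1..3}
    do
        echo "echo $i $x"
    done
done | bash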

That is quite a simple example, so let’s look at something more useful: how would you run the following commands?

service tomcat6 stop
service tomcat6 start
service httpd stop
service httpd start
service nscd stop
service nscd start

Well, in case you didn’t realise, that’s the same loop as above, just with slightly different values… but what if I said you could do all of that on one line with no complicated syntax? This is where parallel fits in, and it’s quite useful:

parallel service {1} {2} ::: tomcat6 httpd nscd ::: stop start

Now it’s worth mentioning: never actually run the above command, and read on to see why…
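
If you want to see exactly what a parallel invocation would run without executing anything, the --dry-run option just prints the generated command lines, which is a handy way to sanity check the argument combinations first:

# Prints the six service commands (stop and start for each service) without running them
parallel --dry-run service {1} {2} ::: tomcat6 httpd nscd ::: stop start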

So, having done Parallel a great disservice by reducing it to argument replacement, it is probably worth mentioning that it can do some other funky stuff, which you can find on the documentation page and which may be more applicable to your needs.
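
For example (just a sketch of the sort of thing the docs cover, using a made-up log directory), parallel will happily read its arguments from stdin like xargs, and -j lets you cap how many jobs run at once:

# Compress any .log files under /var/tmp/myapp, four at a time
find /var/tmp/myapp -name '*.log' | parallel -j 4 gzip {}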

The other useful feature of parallel, the one I sort of started this blog on, is the fact that it runs jobs in parallel… so if you consider the 6 service commands above, executing all 6 one after another could take 45 seconds, but by running them through parallel you can significantly reduce the total time.
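
You can see the effect with a trivial sleep test: run one after another the three sleeps below take about 9 seconds, but under parallel (assuming at least three free job slots, as it defaults to one job per core) the whole thing finishes in roughly the time of the longest job:

# Roughly 9 seconds
time (sleep 5; sleep 3; sleep 1)

# Roughly 5 seconds, as all three sleeps run at once
time parallel sleep {} ::: 5 3 1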

A real world example

Here is a sample of the smoke test script we’re running, which makes use of MCollective with the nrpe plugin. It’s a quick script my boss knocked up, and I decided to steal some of it as an example. Thanks Steve.

#!/bin/bash

env=$1

# Run the given nrpe check ($1) via mco against hosts whose names start with $2
function check
{
  echo -e "\nRunning $1 nagios check\n"
  mco nrpe $1 -I /^$2/
}

# Arrays!

Common[0]='check_disk_all'
Common[1]='check_uptime'
Common[2]='check_yum'
Common[3]='check_ssh_processes'
Common[4]='check_load_avg'
Common[5]='check_unix_memory'
Common[6]='check_unix_swap'
Common[7]='check_zombie_processes'

Web[0]='check_apache_processes'

for i in "${Common[@]}"
do
  check $i $1server1
  check $i $1server2
  check $i $1server3
  check $i $1server4
done

# Check the Web nodes

for i in "${Web[@]}"
do
  check $i $1server1
done

Now, the script above is abridged, but the timing is from the full script: it takes 2 min 36 seconds. The parallel script is a little faster at 1 min 25 seconds, not bad.

Here is a sample of the same code as above, but using parallel:

#!/bin/bash

env=$1

# Common checks
parallel mco nrpe {1} --np -q -I /^${env}{2}/ ::: check_disk_all check_uptime check_yum check_ssh_processes check_load_avg check_unix_memory check_unix_swap check_zombie_processes ::: server1 server2 server3 server4

parallel mco nrpe {1} --np -q -I /^${env}{2}/ ::: check_apache_processes ::: server1

It is not really pretty, but it is quick.
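
If the long ::: lists bother you, one option (a sketch along the same lines, keeping the mco command from above) is to keep the checks in a bash array as before and expand it into parallel’s input:

#!/bin/bash

env=$1

common=(check_disk_all check_uptime check_yum check_ssh_processes
        check_load_avg check_unix_memory check_unix_swap check_zombie_processes)

# Common checks, expanded from the array
parallel mco nrpe {1} --np -q -I /^${env}{2}/ ::: "${common[@]}" ::: server1 server2 server3 server4

# Check the Web nodes
parallel mco nrpe {1} --np -q -I /^${env}{2}/ ::: check_apache_processes ::: server1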

parallel has all of the features of xargs too, but with the added bonus of running jobs in parallel, so you can get these massive time savings if needed. The only thing to bear in mind is that it is parallel; that sounds like a silly thing to mention, but it does mean the service example above would have to be done in two parts to ensure the stop was done before the start.
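
So for the service example, something like this keeps the stops and starts separate while still running each batch in parallel:

# Stop all three services in parallel, and only then start them again
parallel service {} stop ::: tomcat6 httpd nscd
parallel service {} start ::: tomcat6 httpd nscd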

Hopefully that’s proved a little interesting, and given you enough to play with to make something truly amazing happen. Enjoy.