Bash not cutting it for you?
We were looking at using nrpe to run smoke tests of our product after a deployment, to give us a feel for how well the system fared. We wanted it to run quicker, and anyone who has performance-tuned bash scripts knows that you strip out as much as possible and use the built-in functions, as they are quicker.
Well, we had done that and still it wasn't quick enough. Then my boss came across GNU Parallel. Luckily the documentation was pretty good, but we still had fun playing with it.
Imagine a situation where you want to run the following commands:
```shell
echo 1 1
echo 1 2
echo 1 3
echo 2 1
echo 2 2
echo 2 3
```
Typically you might write a couple of bash loops to achieve this, as follows:
```shell
for i in {1..2}
do
  for x in {1..3}
  do
    echo "echo $i $x"
  done
done
```
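As a taste of what's coming, assuming GNU Parallel is installed, the same six commands can be expressed as a single invocation. `{1}` and `{2}` are parallel's replacement strings, filled in with every combination of the two `:::` input lists:

```shell
# every value from the first ::: list is combined with every value from
# the second, giving the six echo commands above (completion order may vary)
parallel echo {1} {2} ::: 1 2 ::: 1 2 3
```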
This is quite a simple example, so let's look at something more useful. How would you run the following commands?
```shell
service tomcat6 stop
service tomcat6 start
service httpd stop
service httpd start
service nscd stop
service nscd start
```
Well, if you didn't realise, it's the same loop as above, just with slightly different values… but what if I said you could do all of that on one line with no complicated syntax? This is where parallel fits in; it's quite useful.
```shell
parallel service {1} {2} ::: tomcat6 httpd nscd ::: stop start
```
Now it's worth mentioning: never actually do the above. Because the jobs all run in parallel, there is no guarantee that each stop finishes before its matching start. In fact, read on…
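A safe version (a sketch using the same services) splits the work into two invocations, since parallel only exits once all of its jobs have finished:

```shell
# phase 1: stop everything in parallel; this returns only when all stops are done
parallel service {1} stop ::: tomcat6 httpd nscd

# phase 2: now it is safe to start everything in parallel
parallel service {1} start ::: tomcat6 httpd nscd
```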
So, having done parallel a great disservice by reducing it to argument replacement, it is probably worth mentioning that it can do some other funky stuff, which you can find on the documentation page; some of it may be more applicable to your needs.
The other useful feature of parallel, and the one I really started this blog for, is the fact that the jobs run in parallel. Consider the six service commands above: executed one after another they could take 45 seconds, but by using parallel you can significantly reduce the time it takes to execute them all.
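You can see the effect with a toy example. Here sleep stands in for a slow command, and the timings are rough, assuming enough job slots for all six jobs:

```shell
# six one-second jobs run back to back: roughly 6 seconds
time (for i in 1 2 3 4 5 6; do sleep 1; done)

# the same six jobs through parallel with six job slots: roughly 1 second
time parallel -j6 sleep ::: 1 1 1 1 1 1
```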
A real world example
Here is a sample of the smoke test script we're running, which makes use of MCollective with the nrpe plugin. It is a quick script my boss knocked up, and I decided to steal some of it as an example. Thanks Steve.
```shell
#!/bin/bash
env=$1

function check
{
  echo -e "\nRunning $1 nagios check\n"
  mco nrpe $1 -I /^$2/
}

# Arrays!
Common[0]='check_disk_all'
Common[1]='check_uptime'
Common[2]='check_yum'
Common[3]='check_ssh_processes'
Common[4]='check_load_avg'
Common[5]='check_unix_memory'
Common[6]='check_unix_swap'
Common[7]='check_zombie_processes'

Web[0]='check_apache_processes'

for i in "${Common[@]}"
do
  check $i $1server1
  check $i $1server2
  check $i $1server3
  check $i $1server4
done

# Check the Web nodes
for i in "${Web[@]}"
do
  check $i $1server1
done
```
Now, the script above is abridged, but the timing is from the full script, which takes 2 min 36 sec. The parallel version is a little faster at 1 min 25 sec. Not bad.
Here is the same code as above, rewritten using parallel:
```shell
#!/bin/bash
env=$1

# Common checks
parallel mco nrpe {1} --np -q -I /^${env}{2}/ ::: check_disk_all check_uptime check_yum check_ssh_processes check_load_avg check_unix_memory check_unix_swap check_zombie_processes ::: server1 server2 server3 server4

parallel mco nrpe {1} --np -q -I /^${env}{2}/ ::: check_apache_processes ::: server1
```
It is not really pretty but it is quick.
parallel has all of the features of xargs too, but with the added bonus of running jobs in parallel, so you can get these massive time savings if needed. The only thing to bear in mind is that it really is parallel; that sounds like a silly thing to mention, but it means the service example above would have to be done in two parts, to ensure the stop was done before the start.
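Like xargs, parallel will happily read its arguments from stdin instead of `:::` lists. A small sketch, with made-up server names:

```shell
# one argument per input line, one job per argument, just like xargs -n1
printf '%s\n' server1 server2 server3 | parallel echo checking {}
```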
Hopefully that's proved a little interesting, and enough to get you playing with it and making something truly amazing happen. Enjoy.
When running jobs that are not limited by CPU, RAM or disk, -j0 can often speed things up. Without -j0, GNU Parallel will only run one job per CPU core.
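A quick way to see the difference, using sleep as a stand-in for an I/O-bound job (on a 4-core box the default would cap this at 4 jobs at a time):

```shell
# default: at most one job per CPU core, so eight sleeps may run in batches
parallel sleep ::: 1 1 1 1 1 1 1 1

# -j0: run as many jobs as possible at once, finishing in roughly one second
parallel -j0 sleep ::: 1 1 1 1 1 1 1 1
```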
To make the long lines prettier in BASH, consider using a line continuation (\) after each input source:
```shell
parallel -j0 mco nrpe {1} --np -q -I /^${env}{2}/ \
  ::: check_disk_all check_uptime check_yum check_ssh_processes check_load_avg check_unix_memory check_unix_swap check_zombie_processes \
  ::: server1 server2 server3 server4
```
You can also use BASH arrays to make the code easier to read.
```shell
checks=(check_disk_all check_uptime check_yum check_ssh_processes check_load_avg check_unix_memory check_unix_swap check_zombie_processes)
servers=(server1 server2 server3 server4)

parallel -j0 echo mco nrpe {1} --np -q -I /^${env}{2}/ ::: "${checks[@]}" ::: "${servers[@]}"
```
Combined with the function it looks like this (remember bash functions need to be export -f'ed):
```shell
#!/bin/bash
env=$1

function check
{
  echo -e "\nRunning $1 nagios check\n"
  mco nrpe $1 -I /^$2/
}
export -f check

checks=(check_disk_all check_uptime check_yum check_ssh_processes check_load_avg check_unix_memory check_unix_swap check_zombie_processes)
servers=(server1 server2 server3 server4)

parallel -j0 check {1} ${env}{2} ::: "${checks[@]}" ::: "${servers[@]}"
```
Thanks for the info! I didn't even think of creating a function and calling that with parallel; that would have made life a bit simpler, especially in the new version, which changes the colour of the output based on the return code of the check.
Another thing I thought was useful is the use of variables within parallel (from the man page):
```shell
find . | parallel 'a={}; name=${a##*/}; upper=$(echo "$name" | tr "[:lower:]" "[:upper:]"); echo "$name - $upper"'
```
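Worth noting: for the basename part, GNU Parallel also has a built-in replacement string, {/}, which strips the directory from the input line, so some of the shell-variable gymnastics above are optional:

```shell
# {/} is the input line with the directory path removed (i.e. the basename)
echo ./some/dir/file.txt | parallel echo {/}
# prints: file.txt
```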