Back to Basics with Bash

July 17th, 2012 by Allister Banks

The default shell for Macs has been bash for as long as some of us can remember(as long as we forget it was tcsh through 10.2.8… and before that… there was no shell, it was OS 9!) Bash as a scripting language doesn’t get the best reputation as it is certainly suboptimal and generally unoptimized for modern workflows. To get common things done you need to care about procedural tasks and things can become very ‘heavy’ very quickly. With more modern programming languages that have niceties like API’s and libraries, the catchphrase you’ll hear is you get loads of functionality ‘for free,’ but it’s good to know how far we can get, and why those object-oriented folks keep telling us we’re missing out. And, although most of us are using bash every time we open a shell(zsh users probably know all this stuff anyway) there are things a lot of us aren’t doing in scripts that could be better. Bash is not going away, and is plenty serviceable for ‘lighter’, one-off tasks, so over the course of a few posts we’ll touch on bash-related topics.

Some things even a long-time scripter may easily overlook is how we might set variables more smartly and _often_, making good decisions and being specific about what we choose to variable-ize. If the purpose of a script is to customize things in a way that’s reusable, making a variable out of that customization (say, for example, a hostname or notification email address) allows us to easily re-set that variable in the future. And in our line of work, if you do something once, it is highly probable you’ll do it again.

Something else you may have seen in certain scripts is the PATH variable being explicitly set or overridden, under the assumption that may not be set in the environment the script runs in, or the droids binaries we’re looking for will definitely be found once we set the path directories specifically. This is well-intentioned, but imprecise to put it one way, clunky to put it another. Setting a custom path, or having binaries customized that could end up interacting with our script may cause unintended issues, so some paranoia should be exhibited. As scientists and troubleshooters, being as specific as possible always pays returns, so a guiding principle we should consider adopting is to, instead of setting the path and assuming, make a variable for each binary called as part of a script.

Now would probably be a good time to mention a few things that assist us when setting variables for binaries. Oh, and as conventions go, it helps to leave variable names that are set for binaries as lowercase, and all caps for the customizations we’re shoving in, which helps us visually see only our customized info in all caps as we debug/inspect the script and when we go in to update those variables for a new environment. /usr/bin/which tells us what is the path to the binary which is currently the first discovered in our path, for example ‘which which’ tells us we first found a version of ‘which’ in /usr/bin. Similarly, you may realize from its name what /usr/bin/whereis does. Man pages as a mini-topic is also discussed here. However, a more useful way to tell if you’re using the most efficient version of a binary is to check it with /usr/bin/type. If it’s a shell builtin, like echo, it may be faster than alternatives found at other paths, and you may not even find it necessary to make a variable for it, since there is little chance someone has decided to replace bash’s builtin ‘cd’…

The last practice we’ll try to spread the adoption of is using declare when setting variables. Again, while a lazy sysadmin is a good sysadmin, a precise one doesn’t have to worry about as many failures. A lack of portability across shells helped folks overlook it, but this is useful even if it is bash-specific. When you use declare and -r for read-only, you’re ensuring your variable doesn’t accidentally get overwritten later in the script. Just like the tool ‘set’ for shell settings, which is used to debug scripts when using the xtrace option for tracing how variables are expanded and executed, you can remove the type designation from variables with a +, e.g. set +x. Integers can be ensured by using -i (which frees us from using ‘let’ when we are just simply setting a number), arrays with -a, and when you need a variable to stick around for longer than the individual script it’s set in or the current environment, you can export the variable with -x. Alternately, if you must use the same exact variable name with a different value inside a nested script, you can set the variable as local so you don’t ‘cross the streams’. We hope this starts a conversation on proper bash-ing, look forward to more ‘back to basics’ posts like this one.

Tags: , ,

Comments are closed.