27 May 2018
Most Unix and Linux commands take options preceded by the "minus" symbol, so to list files in long format, ordered (in reverse) by their timestamp, you use: ls -l -r -t, which can also be expressed as ls -lrt.
Some options also take arguments, so you can create a tar archive of the "myfiles" directory, with the name "mytarfile.tar" taken from the -f option, using the command: tar -c -f mytarfile.tar myfiles.
In the case of tar -c, any words after the options are taken to be the list of file(s) to put in the archive.
Okay, here's the "too long - didn't read" quick synopsis. Here's how you use getopts:

    while getopts 'srd:f:' c
    do
      case $c in
        s) ACTION=SAVE ;;
        r) ACTION=RESTORE ;;
        d) DB_DUMP=$OPTARG ;;
        f) TARBALL=$OPTARG ;;
      esac
    done

There is also the external utility getopt, which parses long-form arguments, like "--filename" instead of the briefer "-f" form. You might want to read that post, too.
Right, now that's got the busy people satisfied, we can start to explore what getopts is, how it works, and how it can be useful to your scripts.
There is a convenient utility which parses these options for you; it is called getopts, and whilst its usage can feel a little strange, using this technique will allow your scripts to process options in a standardised and familiar-feeling way.
The first argument you pass to getopts is a list of which letters (or numbers, or any other single character) it will accept. Each character, if followed by a colon, is expected to be followed by an argument, like the tar -f mytarfile.tar example above. tar -f always has to be followed by the name of a tar file. This option argument is passed to your script in the $OPTARG variable.
getopts will also set the $OPTIND variable for you; we will deal with that later.
The second argument that you pass to getopts is the name of a variable which will be populated with the character of the current switch. Often, this is called opt or just c, although it can have any name you choose.
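As a minimal sketch of those two arguments: the option string 'ab:' below says that -a is a plain switch and -b expects an argument, and opt is the variable that receives each switch in turn. The function name parse_demo is made up for illustration and is not part of this tutorial's scripts.

```shell
# parse_demo is a hypothetical helper, used only to illustrate getopts.
parse_demo() {
  OPTIND=1                      # restart parsing each time the function is called
  while getopts 'ab:' opt; do   # 'a' is a plain switch, 'b:' takes an argument
    case $opt in
      a) echo "flag a" ;;
      b) echo "option b with argument: $OPTARG" ;;
    esac
  done
}

parse_demo -a -b hello
parse_demo -ab hello            # clustered switches are parsed the same way
```

Both calls should report the same two lines, because getopts treats -ab hello like -a -b hello, much as ls treats -lrt like -l -r -t.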
This example script can save/restore files (as a tarball) and a database. You must pass it either -s (Save) or -r (Restore). If you pass -d databasefile then it will use that name to dump (or restore) the database; if you pass -f tarball, it will use that name for the tarball to create (or extract) the files.
There are a few things this first draft of the script doesn't deal with; passing both -s and -r is invalid. If you do, this script will take the last thing you said, so dbdump.sh -s -r -d dbdump.bin -s -r -s will Save (not Restore) since the last thing it processed was the Save command.
Similarly, if you don't pass at least one of -d and -f, then nothing will happen at all.
The reason for the unset DB_DUMP TARBALL ACTION is that the script does not want to be influenced by any environment variables which may be already set. Note that this will only affect the scope of the running script; the calling shell won't have its variables changed.
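A quick way to convince yourself of that scoping is to export a variable, unset it in a child shell, and look at it again in the parent. This is only a sketch; the value /tmp/from_environment is an arbitrary placeholder.

```shell
# Pretend the caller's environment already contains DB_DUMP.
DB_DUMP=/tmp/from_environment
export DB_DUMP

# The child shell unsets its own copy and reports what it sees.
child_view=$(sh -c 'unset DB_DUMP; echo "${DB_DUMP:-unset}"')

echo "child sees: $child_view"
echo "parent still has: $DB_DUMP"
```

The child reports the variable as unset, while the parent's value is untouched.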
For brevity, I have not defined the save_database(), save_files(), restore_database() and restore_files() functions here; the downloadable scripts do have dummy functions so that the scripts will actually run for you. They just display what would be done, but don't actually do anything to your files.
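If you are typing the examples in rather than downloading them, stand-in functions along these lines would be enough to make them run. These are my own guess at suitable placeholders, not the exact dummy functions from the downloadable scripts; the message wording is an assumption.

```shell
# Hypothetical placeholders: each one only reports what it would do.
save_database()    { echo "Would save the database to $1"; }
restore_database() { echo "Would restore the database from $1"; }
save_files()       { echo "Would save files to tarball $1"; }
restore_files()    { echo "Would restore files from tarball $1"; }

save_database /tmp/dbdump.bin
restore_files /tmp/files.tar
```

Define them near the top of the script, before the main body calls them.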

    #!/bin/bash

    # Ignore any values inherited from the environment
    unset DB_DUMP TARBALL ACTION

    while getopts 'srd:f:' c
    do
      case $c in
        s) ACTION=SAVE ;;
        r) ACTION=RESTORE ;;
        d) DB_DUMP=$OPTARG ;;
        f) TARBALL=$OPTARG ;;
      esac
    done

    # Quoting the variables keeps filenames with spaces in one piece
    if [ -n "$DB_DUMP" ]; then
      case $ACTION in
        SAVE)    save_database "$DB_DUMP" ;;
        RESTORE) restore_database "$DB_DUMP" ;;
      esac
    fi

    if [ -n "$TARBALL" ]; then
      case $ACTION in
        SAVE)    save_files "$TARBALL" ;;
        RESTORE) restore_files "$TARBALL" ;;
      esac
    fi

The getopts command is an argument to a while loop - each time through the loop, it processes the switch, and sets the $c variable to the character of the switch. You can read more about loops and case in the main tutorial.
If we call this script as: dbdump.sh -s -r -d /tmp/dbdump.bin -f /tmp/files.tar -s, it will process the -s, set $c=s, and we run into the case statement for the first time. This sees that $c=s, sets $ACTION=SAVE, and the ;; at the end of that line tells it to stop processing, and it goes back to getopts for the next run around the while loop. This reads -r, which logically doesn't make sense (we can't have it both save the backup and restore the backup at the same time), but the script doesn't know that, so it sets $c=r, the case statement sets $ACTION=RESTORE, and we go back to getopts to process the next argument.
Now, getopts sets $c=d and also sets $OPTARG=/tmp/dbdump.bin, because the 'd:' in the getopts invocation tells it that -d is followed by an argument (the name of the database dump file). Execution goes on to the case statement, which sets $DB_DUMP=/tmp/dbdump.bin. When we get in to the main body of the script, if the $DB_DUMP variable has a value, then it will either save the database to that file, or restore it from that file.
The next option is -f /tmp/files.tar, and the same process is followed; getopts sets $c=f and also sets $OPTARG=/tmp/files.tar. The case statement reads these, and sets $TARBALL=/tmp/files.tar.
Finally, we passed it yet another -s switch, so it will change the $ACTION variable back to SAVE.
When the main script starts, it checks if $DB_DUMP is set, then checks the value of $ACTION, and either saves or restores the database, using $DB_DUMP, according to the value of $ACTION.
Similarly, it checks if $TARBALL has been set, and either saves or restores the files with $TARBALL as the argument.
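The walk-through above can be replayed with a small tracing wrapper around the same loop. The function name trace_options and the format of its output lines are my own additions for illustration; the getopts loop itself is the one from the script.

```shell
# trace_options is a hypothetical helper that prints the state after each switch.
trace_options() {
  OPTIND=1
  unset ACTION DB_DUMP TARBALL
  while getopts 'srd:f:' c; do
    case $c in
      s) ACTION=SAVE ;;
      r) ACTION=RESTORE ;;
      d) DB_DUMP=$OPTARG ;;
      f) TARBALL=$OPTARG ;;
    esac
    echo "saw -$c: ACTION=$ACTION DB_DUMP=$DB_DUMP TARBALL=$TARBALL"
  done
}

trace_options -s -r -d /tmp/dbdump.bin -f /tmp/files.tar -s
```

The last line printed should show ACTION=SAVE again, because the trailing -s is the final switch processed.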
This second version of the script uses a couple of useful functions. It is often convenient to have a usage() function, which tells the user the correct way to call the script, and exits with a non-zero error code, to express that it has failed.
It also has a set_variable() function, which can be useful. This script doesn't really need it - the example above just set the variables as it needed to. But since this function does various bits of error checking, it's worth putting all of that into a function and calling it multiple times, rather than repeating the "is this variable already set?" code for every command-line option.
It uses eval and variable indirection. When called as set_variable ACTION SAVE, it sets varname=ACTION, then checks 'if [ -z "${!varname}" ]'. The exclamation mark before varname tells the shell to expand the variable whose name is stored in $varname, so the -z test will check whether the $ACTION variable is of zero length. If so, it evaluates $varname=\"$@\", which works out as ACTION="SAVE" in this case. It could just use: eval "$varname=$1", but that wouldn't allow for spaces in the filenames, when we come to use this function to set those, too.
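Here is a cut-down sketch of that indirection, small enough to experiment with at the prompt. The function set_if_empty is a hypothetical cousin of set_variable(): it returns an error instead of calling usage(), so that it can be run on its own. It needs bash, because ${!varname} is a bash feature.

```shell
#!/bin/bash
# set_if_empty is a hypothetical, simplified version of set_variable().
set_if_empty() {
  local varname=$1
  shift
  if [ -z "${!varname}" ]; then     # indirect lookup: the variable named by $varname
    eval "$varname=\"$@\""          # the same eval form as the tutorial's function
  else
    echo "Error: $varname already set" >&2
    return 1
  fi
}

unset ACTION TARBALL
set_if_empty ACTION SAVE
set_if_empty TARBALL "my files.tar"   # the space survives the eval
echo "ACTION=$ACTION TARBALL=$TARBALL"
set_if_empty ACTION RESTORE || echo "second assignment refused"
```

One design note: eval re-reads its input, so a value containing double quotes or dollar signs would be interpreted by the shell. In bash, printf -v "$varname" '%s' "$*" is one alternative that assigns by name without eval.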
The real reason that set_variable() is useful is that if ${!varname} is not zero length, the function will spit out an error message, that $varname is already set, and call the usage() function, which reminds the user of the correct syntax, and exits the whole script.
The next change to this script from the first example, is that it adds -? and -h switches. These are common ways to query a program to find out what the correct usage is. So if the user runs it as: dbdump.sh -? or dbdump.sh -h, it will show them the usage() message, and exit.
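One detail worth knowing when matching these switches in a case statement: an unquoted ? is a pattern that matches any single character, and getopts also sets its variable to ? when it meets a switch it does not recognise. The sketch below contrasts the escaped and unescaped forms; the function names strict_match and loose_match are invented for this illustration.

```shell
# strict_match escapes the question mark, so only a literal ? (or h) matches.
strict_match() {
  case $1 in
    h|\?) echo "show usage" ;;
    *)    echo "other: $1" ;;
  esac
}

# loose_match leaves ? unescaped, so any single character matches.
loose_match() {
  case $1 in
    h|?) echo "show usage" ;;
    *)   echo "other: $1" ;;
  esac
}

strict_match '?'
strict_match x
loose_match x
```

An unescaped ? is harmless when it is the last pattern in the case statement and showing the usage message is the desired fallback, but it will swallow other switches if it is listed before them.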
There are two final checks before the main script gets underway. The first ensures that $ACTION has been set (it will be empty if the script was called without either -s or -r). If $ACTION is zero length, it calls the usage function.
Then, it checks that at least one of $DB_DUMP and $TARBALL has been set. The logic of this is a little unintuitive: If the -z "$DB_DUMP" test passes, then the && passes execution to the next test, -z "$TARBALL". If that also passes the test, it continues execution via the second && to the usage() function, which will display the message and terminate the script.
If either of those tests failed, then at least one of them is set, the script does have something to do, and the usage() function does not get called, which means that the script is allowed to continue.
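If the chained && form feels hard to read, the same decision can be written as an ordinary if statement. This is just an equivalent sketch for comparison; check_targets is a hypothetical function that prints what would happen rather than calling usage().

```shell
# check_targets is a hypothetical helper: $1 stands in for DB_DUMP, $2 for TARBALL.
check_targets() {
  DB_DUMP=$1
  TARBALL=$2
  if [ -z "$DB_DUMP" ] && [ -z "$TARBALL" ]; then
    echo "nothing to do: usage would be called"
  else
    echo "continue"
  fi
}

check_targets "" ""                 # neither set
check_targets "/tmp/dbdump.bin" ""  # only the database dump set
check_targets "" "/tmp/files.tar"   # only the tarball set
```

Only the first call reports that usage would be called; the other two continue.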

    #!/bin/bash

    usage()
    {
      echo "Usage: $0 [-s|-r] [ -d DB_DUMP ] [ -f TARBALL ]"
      exit 2
    }

    set_variable()
    {
      local varname=$1
      shift
      if [ -z "${!varname}" ]; then
        eval "$varname=\"$@\""
      else
        echo "Error: $varname already set"
        usage
      fi
    }

    #########################
    # Main script starts here

    unset DB_DUMP TARBALL ACTION

    while getopts 'srd:f:?h' c
    do
      case $c in
        s) set_variable ACTION SAVE ;;
        r) set_variable ACTION RESTORE ;;
        d) set_variable DB_DUMP "$OPTARG" ;;
        f) set_variable TARBALL "$OPTARG" ;;
        h|?) usage ;;
      esac
    done

    [ -z "$ACTION" ] && usage
    [ -z "$DB_DUMP" ] && [ -z "$TARBALL" ] && usage

    if [ -n "$DB_DUMP" ]; then
      case $ACTION in
        SAVE)    save_database "$DB_DUMP" ;;
        RESTORE) restore_database "$DB_DUMP" ;;
      esac
    fi

    if [ -n "$TARBALL" ]; then
      case $ACTION in
        SAVE)    save_files "$TARBALL" ;;
        RESTORE) restore_files "$TARBALL" ;;
      esac
    fi

We mentioned above that the other variable that getopts will set for you is the index of where you are up to in processing the options; this is the $OPTIND variable. This is a bit of an odd one; it's the index of the next argument to be processed, so if your script is called as "dbdump.sh -s -d foo -f bar", then as it's processing -s, $OPTIND is 2. When it's processing -d foo, $OPTIND is 4, because the next thing it will process will be the 4th argument ("-f").
The $OPTIND variable is useful when your script processes some switches, followed by further arguments. For example, ls -ltr /tmp/*.txt /tmp/*.png stops processing switches when it finds the first non-option argument (/tmp/*.txt).
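To watch $OPTIND move for the "-s -d foo -f bar" example, a loop like this one prints its value after each switch. The function name show_optind is made up for the demonstration.

```shell
# show_optind is a hypothetical helper that reports OPTIND after each switch.
show_optind() {
  OPTIND=1
  while getopts 'sd:f:' c; do
    echo "-$c processed, OPTIND=$OPTIND"
  done
}

show_optind -s -d foo -f bar
# -s processed, OPTIND=2
# -d processed, OPTIND=4
# -f processed, OPTIND=6
```

The value jumps by two for -d and -f because each of those consumes its argument as well as the switch itself.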
This third example processes its command line arguments, and then operates on any files it has been given.

    #!/bin/bash

    unset VERBOSE

    while getopts 'smv' c
    do
      echo "Processing $c : OPTIND is $OPTIND"
      case $c in
        s) ACTION=sha1sum ;;
        m) ACTION=md5sum ;;
        v) VERBOSE=true ;;
      esac
    done

    echo "Out of the getopts loop. OPTIND is now $OPTIND"

    # Discard the switches, leaving only the filenames in "$@"
    shift $((OPTIND-1))

    if [ -n "$VERBOSE" ]; then
      set -x
    fi

    $ACTION "$@"

If you pass it -s, it will check the sha1sum checksum of the files; -m tells it to do a md5sum checksum instead. To help us see how $OPTIND changes, a -v switch enables verbose execution.
After getopts has finished parsing the switches, it exits the loop, with $OPTIND set to 2 (if only one switch was used) or 3 (if two switches were used). Therefore, we need to call shift with $OPTIND-1 to get rid of the first 1 (or 2) arguments from the command line. This leaves $@ with the rest of the command line, which should be a list of files.
Here, I ran the script from this directory, which contains getopts1.sh, getopts2.sh and getopts3.sh, so the *sh on the command line expands to these files.

    $ ./getopts3.sh -v -m *sh
    Processing v : OPTIND is 2
    Processing m : OPTIND is 3
    Out of the getopts loop. OPTIND is now 3
    + md5sum getopts1.sh getopts2.sh getopts3.sh
    e0f015e6f47709c9639589c98886c429  getopts1.sh
    ac16051be13728378f8827c0db84c6ae  getopts2.sh
    41521e2253685ab252f87ee820b466b7  getopts3.sh
    $

So when the script processes the -v switch, $OPTIND=2. It then moves on to -m and $OPTIND=3.
So the script calls shift 2 by working out the value of $OPTIND-1. It can now call $ACTION with the remaining arguments, which will evaluate as md5sum getopts1.sh getopts2.sh getopts3.sh (or sha1sum getopts1.sh getopts2.sh getopts3.sh, if you ran it with the -s switch).