29 Jan 2020

Parsing long command-line arguments with getopt

We have already covered the getopts command, which is built in to the Bash shell (and most other shells, too). getopts can deal with single-character option letters, such as the simple flags -a and -b, as well as options like -c foo and -d bar which take an additional parameter; that is useful for "-f filename", for example.

However, it is sometimes useful, convenient, or even necessary to use longer option names, such as --filename or --target. The shell builtin getopts does not deal with those, but the GNU getopt command, which is an external command (not built in to the shell), can handle these longer styles as well as the short form. The resulting code is arguably a little less easy to read, but it achieves the desired result in the tidiest way, so this technique is well worth knowing.

There is a second style of switch which getopt also deals with: you can use whitespace between the switch and its value, like "--charlie Charles", or you can use an equals sign, like "--delta=River". Both are handled automatically by getopt, which converts them into the whitespace-separated style.
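As a small sketch of that conversion, the snippet below feeds both spellings to getopt and prints the result. The function name normalise and the program name demo are made up for this illustration, and it assumes the enhanced (GNU-style) getopt is the one on your PATH.

```shell
#!/bin/bash
# Hypothetical helper: run getopt with two value-taking options.
normalise() {
  getopt -n demo -o c:d: --long charlie:,delta: -- "$@"
}

# Both calls print the same line:  --charlie 'Charles' --delta 'River' --
normalise --charlie Charles --delta=River
normalise --charlie=Charles --delta River
```

Whichever form the user typed, the script only ever has to handle the option followed by its value as a separate word.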

You will have noticed that "-a" has a single hyphen, while "--alpha" has two. A third style that getopt optionally supports is to use a single "-" even for the long form, as "-alpha" instead of "--alpha", or "-charlie Charles" instead of "--charlie Charles". This will be handled if you use the "-a" switch to getopt itself; see below for an explanation of the arguments used in this simple (but complete) example script.
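To see what the "-a" switch changes, here is a rough comparison. The function names strict and lenient are invented for this sketch, and it again assumes the enhanced getopt is installed.

```shell
#!/bin/bash
# Hypothetical helpers: the same option spec, without and with getopt's -a.
strict()  { getopt    -n demo -o ab --long alpha,beta -- "$@"; }
lenient() { getopt -a -n demo -o ab --long alpha,beta -- "$@"; }

# With -a, "-alpha" is recognised as the long option.
lenient -alpha          # prints:  --alpha --

# Without -a, "-alpha" is read as the short options -a -l -p -h -a,
# and "l" is not a valid short option, so getopt reports failure.
strict -alpha >/dev/null 2>&1 || echo "rejected without -a"
```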

Basic Usage

If you just want the simple crib of how to use it, let's cut to the chase. We will then go deeper into what is going on under the covers, so that you can actually understand properly how this useful tool works.

Let's say we want to create a simple shell script called alphabet.sh which has the following usage pattern:

alphabet.sh \
    -a, --alpha \
    -b, --beta \
    -c, --charlie=Charles \
    -d, --delta=River \
    filename(s)

The following script processes these four parameters (followed by a list of filenames), handles invalid input, and shows the status of its variables after it has been run.

#!/bin/bash
# Set some default values:
ALPHA=unset
BETA=unset
CHARLIE=unset
DELTA=unset

usage()
{
  echo "Usage: alphabet [ -a | --alpha ] [ -b | --beta ]
                        [ -c | --charlie CHARLIE ] 
                        [ -d | --delta   DELTA   ] filename(s)"
  exit 2
}

PARSED_ARGUMENTS=$(getopt -a -n alphabet -o abc:d: --long alpha,beta,charlie:,delta: -- "$@")
VALID_ARGUMENTS=$?
if [ "$VALID_ARGUMENTS" != "0" ]; then
  usage
fi

echo "PARSED_ARGUMENTS is $PARSED_ARGUMENTS"
eval set -- "$PARSED_ARGUMENTS"
while :
do
  case "$1" in
    -a | --alpha)   ALPHA=1      ; shift   ;;
    -b | --beta)    BETA=1       ; shift   ;;
    -c | --charlie) CHARLIE="$2" ; shift 2 ;;
    -d | --delta)   DELTA="$2"   ; shift 2 ;;
    # -- means the end of the arguments; drop this, and break out of the while loop
    --) shift; break ;;
    # If invalid options were passed, then getopt should have reported an error,
    # which we checked as VALID_ARGUMENTS when getopt was called...
    *) echo "Unexpected option: $1 - this should not happen."
       usage ;;
  esac
done

echo "ALPHA   : $ALPHA"
echo "BETA    : $BETA "
echo "CHARLIE : $CHARLIE"
echo "DELTA   : $DELTA"
echo "Parameters remaining are: $@"

Sample Runs

Here are some examples of executing the script above, and the variety of inputs it deals with.

1. Short Options

The getopt method still deals with short options just as getopts does.

$ ./alphabet.sh -a -b -c charlie -d river lorem ipsum
PARSED_ARGUMENTS is  -a -b -c 'charlie' -d 'river' -- 'lorem' 'ipsum'
ALPHA   : 1
BETA    : 1 
CHARLIE : charlie
DELTA   : river
Parameters remaining are: lorem ipsum

2. Long Options

All-long options work too, and are often more readable for the user.

$ ./alphabet.sh --alpha --beta --charlie charlie --delta river lorem ipsum
PARSED_ARGUMENTS is  --alpha --beta --charlie 'charlie' --delta 'river' -- 'lorem' 'ipsum'
ALPHA   : 1
BETA    : 1 
CHARLIE : charlie
DELTA   : river
Parameters remaining are: lorem ipsum
$ 

3. Combination of Long, Short, Equal and Space Delimiters

Here we mix long and short options, use a single or a double dash before the long options, and use either a space or an equals sign between the -c and -d switches and their parameters.

$ ./alphabet.sh -a --beta -c=Charlie -delta River lorem ipsum
PARSED_ARGUMENTS is  -a --beta --charlie 'Charlie' --delta 'River' -- 'lorem' 'ipsum'
ALPHA   : 1
BETA    : 1 
CHARLIE : Charlie
DELTA   : River
Parameters remaining are: lorem ipsum
$ 

Analysis of the above script

If you want to use getopt to its fullest, then just copying this simple example will not be enough for you; you'll want to investigate how this script was put together.

I'll try to define a couple of terms first - in this example script, "-a" and "-b" are often referred to as "switches" since they don't take any parameters, whereas "-c" and "-d" do take values, or parameters, which are also known as arguments, and in this context, sometimes referred to as "OPTARG". I'll try to be consistent with these terms!

 

Okay, let's get in to the script...

# Set some default values:
ALPHA=unset
BETA=unset
CHARLIE=unset
DELTA=unset

First of all, default values were set; this is generally a useful thing to do, though of course your situation will determine what you choose to do - it may be that you just want an existing (exported) environment variable to take precedence if no arguments are passed.
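If you do want an exported environment variable to win when no option is given, one possible approach is the shell's ${VAR:-default} expansion. This is only a sketch; the function name set_defaults is made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: keep a value that is already set (for example, exported
# by the caller), and fall back to "unset" only when it is empty or missing.
set_defaults() {
  CHARLIE="${CHARLIE:-unset}"
  DELTA="${DELTA:-unset}"
}

set_defaults
echo "CHARLIE : $CHARLIE"
echo "DELTA   : $DELTA"
```

Run as "DELTA=River ./script.sh", this would report DELTA as River; options parsed later can still overwrite either value.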

 

usage()
{
  echo "Usage: alphabet [ -a | --alpha ] [ -b | --beta ]
  ... etc ...
}

Then, a usage() function is defined. This can be a useful thing to put at the top of a script, even if the function is only called once; it tells anybody reading the script what the usage is!

 

PARSED_ARGUMENTS=$(getopt -a -n alphabet -o abc:d: --long alpha,beta,charlie:,delta: -- "$@")

Now we actually call the getopt program.

We save its output into a variable, called $PARSED_ARGUMENTS - the output from getopt is a more standardised version of whatever the user gave us. We save this, then regurgitate it later, for easier consumption.

Like getopts, getopt uses the colon (:) to indicate that an option expects a value, like "-d river" rather than a simple "-d". So here we can see that "-o abc:d:" means "short options a, b, c and d, of which c and d require a value."

Similarly, "--long alpha,beta,charlie:,delta:" means that "alpha" and "beta" take no value, but "charlie:" and "delta:" do, because they have the colon after their names.
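The effect of the colon can be tried out in isolation. In this sketch the function name takes_value and the program name demo are invented, and the option list is cut down to one plain switch and one option that needs a value.

```shell
#!/bin/bash
# Hypothetical helper: "a" is a plain switch, "c:" must be followed by a value.
takes_value() {
  getopt -n demo -o ac: --long alpha,charlie: -- "$@"
}

takes_value -a -c Charles      # prints:  -a -c 'Charles' --

# Leaving the value off an option declared with a colon makes getopt fail.
takes_value --charlie >/dev/null 2>&1 || echo "value missing"
```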

Here, we have passed the -a option to getopt to enable the "alternative" mode, whereby a single hyphen is also accepted (so "-alpha" is the same as "--alpha"). Then the "-n alphabet" option tells getopt that the program is actually known to the user as "alphabet". Without this, getopt would spit out error messages with "getopt" as the program name, like this:

$ ./alphabet.sh -x
getopt: unrecognized option '-x'
...

It is much friendlier for your user to get the message apparently coming from the "alphabet" script they have called, rather than the mysterious "getopt" utility which they may never have heard of!

$ ./alphabet.sh -x
alphabet: unrecognized option '-x'
...

 

If getopt accepted all of the input, it will return a status code of zero (0) to indicate success. Otherwise, it will return a non-zero status code. This is passed to the calling shell in the special $? variable. We save this as VALID_ARGUMENTS - you could just check $? directly, but it's nice to save it in a variable in case some extra command gets inserted between the getopt and the test.

If VALID_ARGUMENTS does not indicate success, we call the usage() function, which will also exit the script.
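Another way to avoid losing the status is to test the assignment itself. The sketch below wraps that in a made-up function called parse_ok so that it can be tried with different inputs; it is an alternative to the saved-variable approach, not a replacement for it.

```shell
#!/bin/bash
# Hypothetical wrapper: test getopt's exit status as part of the assignment,
# so no other command can slip in between the call and the check.
parse_ok() {
  # Declare first, assign second: "local parsed=$(...)" would report the
  # status of "local" itself, hiding any failure from getopt.
  local parsed
  if ! parsed=$(getopt -a -n alphabet -o abc:d: --long alpha,beta,charlie:,delta: -- "$@"); then
    return 2
  fi
  printf '%s\n' "$parsed"
}

parse_ok -a --delta River file.txt   # prints the normalised arguments
parse_ok -x 2>/dev/null || echo "invalid arguments"
```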

 

echo "PARSED_ARGUMENTS is $PARSED_ARGUMENTS"

For informational purposes, we then display the status of the PARSED_ARGUMENTS variable. This is useful for understanding how getopt has mangled the user's input into a standardised form.

 

eval set -- "$PARSED_ARGUMENTS"

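By reading that set of standardised arguments into the shell's positional parameters, the shell script now behaves as if it had been called with this simpler, standardised set of arguments. The eval is needed because getopt wraps each value in quotes, and the shell has to re-read those quotes so that a value containing spaces stays in one piece.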
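A value containing spaces shows why the eval matters. This is a sketch only: the function name reparse is invented, and the option list is reduced to a single option.

```shell
#!/bin/bash
# Hypothetical helper: parse, re-read the quoted output with eval, then report.
reparse() {
  local parsed
  parsed=$(getopt -n demo -o c: --long charlie: -- "$@") || return 2
  # eval makes the shell honour the quotes that getopt added.
  # A plain, unquoted "set -- $parsed" would split the value on its spaces
  # and keep the quote characters as literal text.
  eval set -- "$parsed"
  echo "$#"     # number of arguments after re-parsing
  echo "$2"     # the value belonging to --charlie
}

reparse --charlie "Charles the First" file.txt
```

This prints 4 (the option, its value, the "--" marker and the filename) followed by "Charles the First" as a single value.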

 

while :
do
  case "$1" in
    -a | --alpha)   ALPHA=1      ; shift   ;;
    ... etc ...
  esac
done

Now we can loop through the arguments by looking at $1 each time - every time we call shift, it drops one (or more) arguments off the stack, so the next one becomes $1. See shift for more information on how that works.

This is now a lot like the getopts structure - loop through the variable and parse each as it comes along. The structure is subtly different; in getopts we define our own variable; with getopt we are working through the $PARSED_ARGUMENTS variable which the calling script has now consumed as its own command-line arguments. So here, we have $1 as the argument being processed, and $2 as its parameter (which would have been $OPTARG under getopts).

Each time around the loop, we check $1 for its long or short form.
For the simple switches "-a" and "-b", we set a variable to note that the switch was used, then shift to move that off the stack, and move on.

 

    -c | --charlie) CHARLIE="$2" ; shift 2 ;;
For the arguments which take parameters, we take the parameter's value from $2, and call "shift 2" to shift both $1 and $2 off the stack in one go.
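The effect of "shift 2" can be seen with some stand-in arguments. The function name demo_shift is made up for this sketch; the arguments are what the script would see after the eval for "--charlie Charles lorem ipsum".

```shell
#!/bin/bash
# Hypothetical demonstration of shift 2 on a stand-in argument list.
demo_shift() {
  set -- --charlie Charles -- lorem ipsum
  echo "before: $1 $2 ($# words)"
  shift 2                 # drop the option and its value together
  echo "after: $1 ($# words)"
}

demo_shift
```

After the shift, the "--" marker has moved up to $1, ready for the next pass around the loop.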

 

    # -- means the end of the arguments; drop this, and break out of the while loop
    --) shift; break ;;

The special case -- is passed to us after all the parameters have been parsed; here we shift to get the "--" off the stack, then break out of the while loop.

 

echo "ALPHA   : $ALPHA"
echo "BETA    : $BETA "
echo "CHARLIE : $CHARLIE"
echo "DELTA   : $DELTA"
echo "Parameters remaining are: $@"

We can now display the variables; in a real-world script, you would be using these values passed in from the user to control what your script does.

Anything left on the stack is in the $@ variable (which again can be broken down into $1, $2 and so on, if necessary), as shown by the final line of this example script.
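In a real script you would typically loop over what is left rather than just print it. A minimal sketch, using an invented function name list_files:

```shell
#!/bin/bash
# Hypothetical helper: handle each remaining argument in turn.
# Quoting "$@" keeps a filename containing spaces as one item.
list_files() {
  local file
  for file in "$@"; do
    printf 'file: %s\n' "$file"
  done
}

list_files lorem "ipsum dolor"
```

In the alphabet.sh script, the equivalent call after the while loop would be list_files "$@".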

 

Hopefully you now have the information you need to write your own script which can accept long and short options, with or without parameters, and process them all correctly.

Do check "man getopt" for even more information about this utility!

 
