The Shell Scripting Tutorial


Sorting on Fields

Understanding the 'sort' utility

3 June 2015

The "sort" utility seems pretty obvious, but it can catch you out in odd little ways. Here's a simple example. Given the machine sizings below, how does a script display them in a sensible order?

Size     CPU  Memory
Tiny       1    2048
Small      1    4096
Medium     2    4096
Big        2    8192
Large      4    8192
XL         4   16384
XXL        8   32768

Start with a template file which declares two associative arrays (these need Bash 4 or later) - call it size.tmpl. This can be read in by the main script via "source ./size.tmpl" (or just ". ./size.tmpl"):

declare -A CPU RAM

# CPU Count
CPU['Tiny']=1
CPU['Small']=1
CPU['Medium']=2
CPU['Big']=2
CPU['Large']=4
CPU['XL']=4
CPU['XXL']=8

# RAM in MB
RAM['Tiny']=2048     # 2GB
RAM['Small']=4096    # 4GB
RAM['Medium']=4096   # 4GB
RAM['Big']=8192      # 8GB
RAM['Large']=8192    # 8GB
RAM['XL']=16384      # 16GB
RAM['XXL']=32768     # 32GB

Then create a script, called sort.sh, which reads those arrays and displays their contents:

#!/bin/bash
. ./size.tmpl

printf "%-10s%4s%8s\n" Size CPU RAM   # show header

for SIZE in "${!CPU[@]}"
do
  printf "%-10s%4d%8d\n" "${SIZE}" "${CPU[$SIZE]}" "${RAM[$SIZE]}"
done

Unfortunately, Bash does not return the keys of an associative array in any particular order:

$ ./sort.sh
Size       CPU     RAM
XL           4   16384
Medium       2    4096
Tiny         1    2048
Small        1    4096
Large        4    8192
Big          2    8192
XXL          8   32768

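The answer is to use sort. The -n switch tells it to sort numerically (so that "9" comes before "10", for example), and the -k switch gives it keys to sort on. By default, fields are separated by whitespace, which is what we have here, so we just need to pipe the loop's output through "sort -n -k 3 -k 2". This tells it to sort first on the key which starts at column 3 (RAM) and runs to the end of the line, then, where that ties, on the key which starts at column 2 (CPU). The header is printed before the loop, so it does not pass through sort: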

#!/bin/bash
. ./size.tmpl

printf "%-10s%4s%8s\n" Size CPU RAM   # show header

for SIZE in "${!CPU[@]}"
do
  printf "%-10s%4d%8d\n" \
    "${SIZE}" "${CPU[$SIZE]}" "${RAM[$SIZE]}"
done  | sort -n -k 3 -k 2

This now gives more sensibly ordered output:

$ ./sort.sh
Size       CPU     RAM
Tiny         1    2048
Small        1    4096
Medium       2    4096
Big          2    8192
Large        4    8192
XL           4   16384
XXL          8   32768

And so we have a nicely formatted display, sorted by RAM and then by CPU.
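The two switches discussed above can be tried in isolation. The following is only a sketch: the function name sort_by_ram_then_cpu is made up for illustration, and it uses the "-k 3,3n" form, which ends each key at the named column rather than letting it run to the end of the line. For this data it gives the same order as "sort -n -k 3 -k 2".

```shell
#!/bin/bash
# Plain sort compares text, so "10" sorts before "9"; -n compares values.
printf '%s\n' 9 10 | LC_ALL=C sort      # prints 10, then 9
printf '%s\n' 9 10 | sort -n            # prints 9, then 10

# Hypothetical helper: sort rows on stdin by RAM (field 3), then CPU (field 2).
# "-k 3,3n" starts and ends the key at field 3 and compares it numerically.
sort_by_ram_then_cpu() {
  sort -k 3,3n -k 2,2n
}

# printf reuses its format for each group of three arguments.
printf '%-10s%4d%8d\n' Big 2 8192 XL 4 16384 Medium 2 4096 Small 1 4096 |
  sort_by_ram_then_cpu
```

Ending each key explicitly makes it clearer which column a tie is broken on, which helps once the rows gain more columns.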

Bonus Points

For bonus points, we can tell sort more about the input format. If it were CSV, for example, we could use "sort -t," to tell it that the comma separates the fields. Call this version sort-csv.sh:

#!/bin/bash
. ./size.tmpl

echo "Size,CPU,RAM"

for SIZE in "${!CPU[@]}"
do
  printf "%s,%d,%d\n" \
    "${SIZE}" "${CPU[$SIZE]}" "${RAM[$SIZE]}"
done  | sort -t, -n -k 3 -k 2

The script then prints sorted CSV, which you can redirect to a file:

$ ./sort-csv.sh
Size,CPU,RAM
Tiny,1,2048
Small,1,4096
Medium,2,4096
Big,2,8192
Large,4,8192
XL,4,16384
XXL,8,32768
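The same -t and -k switches can also reverse the order. As a sketch only (the function name sort_csv_largest_first is invented for this example), adding "r" to a key reverses just that key, so the biggest machines come first:

```shell
#!/bin/bash
# Hypothetical helper: sort CSV rows on stdin, largest RAM first, then most CPUs.
# -t, sets the field separator; "n" compares numerically, "r" reverses that key.
sort_csv_largest_first() {
  sort -t, -k 3,3nr -k 2,2nr
}

printf '%s\n' 'Tiny,1,2048' 'Big,2,8192' 'Large,4,8192' 'XXL,8,32768' |
  sort_csv_largest_first
```

With these four rows, XXL comes first, then Large ahead of Big (same RAM, more CPUs), and Tiny last.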

