21 Mar 2018
Temporary File Names
Generating temporary (and possibly unique, and/or identifiable) filenames
Some file names are easy to come up with. The
foobar application needs a configuration file? Then
/etc/foobar.conf is probably the best place. It needs a log file? Then
/var/log/foobar/foobar.log would make sense (in its own subdirectory as
/var/log is only writeable by the
root user, but I digress...)
Other files are less obvious. If your script needs to write data temporarily to a small file, then
/tmp is probably a good location for that file, but what should it be called?
Sometimes, you will see
/tmp/foobar.$$ in a script, or maybe
/tmp/foobar.$RANDOM, or, quite commonly, the mktemp command.
What do these all mean, and how do they differ?
Option 1) /tmp/foobar.$$
The $$ variable is a read-only variable, from which a shell script can find its own Process Identifier, or PID. For example, here, my Bash shell's PID is 6046:
echo $$ gives 6046, and
ps -fp 6046 shows the process via the
ps command. Further, the
ps -fp $$ command shows the same result, but without ever having to hard-code the PID itself.
    steve@linux:~$ echo $$
    6046
    steve@linux:~$ ps -fp 6046
    UID        PID  PPID  C STIME TTY          TIME CMD
    steve     6046  2651  0 Mar20 pts/3    00:00:00 bash
    steve@linux:~$ ps -fp $$
    UID        PID  PPID  C STIME TTY          TIME CMD
    steve     6046  2651  0 Mar20 pts/3    00:00:00 bash
    steve@linux:~$
If we create a file using a pattern like
/tmp/foobar.$$, then it can easily be associated with this particular process. If you know that the
foobar script uses files named
/tmp/foobar.$$, then even if there are multiple instances of
foobar running on the system, and multiple
/tmp/foobar.nnnn files (where
nnnn represents an integer PID), you can correlate each file with the instance of foobar that created it.
This could be very good, or it could be very bad.
For example, consider this
foobar script, which asks you some questions, comments on them, and stores your answers:
    #!/bin/bash
    TMPFILE=/tmp/foobar.$$

    read -p "What is your favourite colour? " ANSWER
    echo "That's nice. I like $ANSWER too."
    echo $ANSWER > $TMPFILE

    read -p "What is your favourite number? " ANSWER
    echo "Oh, I don't really like $ANSWER very much."
    echo "Each to their own."
    echo $ANSWER >> $TMPFILE

    read -p "What is your favourite movie? " ANSWER
    echo "That's a good one, isn't it? I do like $ANSWER so much."
    echo $ANSWER >> $TMPFILE

    counter=0
    while read inputword
    do
      let counter++
      echo "Your answer number $counter was: $inputword"
    done < $TMPFILE

    # Email the answers to our support mailbox
    cat $TMPFILE | mailx -s "Answers for process $$" firstname.lastname@example.org
    rm -f $TMPFILE
A sample run of the script might look like this:
    steve@linux:~$ ./foobar.sh
    What is your favourite colour? red
    That's nice. I like red too.
    What is your favourite number? 27
    Oh, I don't really like 27 very much.
    Each to their own.
    What is your favourite movie? The Blues Brothers
    That's a good one, isn't it? I do like The Blues Brothers so much.
    Your answer number 1 was: red
    Your answer number 2 was: 27
    Your answer number 3 was: The Blues Brothers
    steve@linux:~$
Although this script is quite small, if it were to start creating large amounts of data, then knowing which process a large file (e.g.
/tmp/foobar.8404) belonged to would be useful in identifying what was going on.
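As a rough sketch of that kind of detective work, the PID can be pulled back out of the filename and checked against the process table. The function names here (pid_from_tmpfile, is_running, report_tmpfiles) are made up for this illustration, and the check is only a hint: PIDs get recycled, so a running process with that PID is not necessarily the one which created the file.

```shell
#!/bin/bash
# Hypothetical helper: extract the PID suffix from a name like /tmp/foobar.8404
pid_from_tmpfile() {
    printf '%s\n' "${1##*.}"
}

# Hypothetical helper: succeed if a process with this PID currently exists.
# kill -0 sends no signal; ps is the fallback for processes we may not signal.
is_running() {
    kill -0 "$1" 2>/dev/null || ps -p "$1" > /dev/null 2>&1
}

# Report, for each foobar.NNNN file in a directory, whether PID NNNN is alive.
report_tmpfiles() {
    local dir=${1:-/tmp} file pid
    for file in "$dir"/foobar.*; do
        [ -e "$file" ] || continue         # the glob matched nothing
        pid=$(pid_from_tmpfile "$file")
        case $pid in
            ''|*[!0-9]*) continue ;;       # suffix is not a plain integer
        esac
        if is_running "$pid"; then
            echo "$file: process $pid is still running"
        else
            echo "$file: process $pid has gone (stale file?)"
        fi
    done
}

report_tmpfiles /tmp
```

Anything reported as stale is a candidate for cleaning up, although it is worth looking at the file before deleting it.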
However, if someone could write to the file while you were typing in an answer, they could get other data intermingled with the genuine data:
    steve@linux:~$ ./foobar.sh
    What is your favourite colour? yellow
    That's nice. I like yellow too.
    What is your favourite number? 93
    Oh, I don't really like 93 very much.
    Each to their own.
    What is your favourite movie? Star Wars
    That's a good one, isn't it? I do like Star Wars so much.
    Your answer number 1 was: yellow
    Your answer number 2 was: 93
    Your answer number 3 was: Extra, unexpected answer here!
    Your answer number 4 was: Star Wars
    steve@linux:~$
What happened here was that in another shell the attacker inserted some extra data into the file which was about to be emailed, like this:
    steve@linux:~$ ps -eaf|grep foobar.sh
    steve     8404 19523  0 22:47 pts/1    00:00:00 /bin/bash ./foobar.sh
    steve     8409 17559  0 22:47 pts/7    00:00:00 grep foobar.sh
    steve@linux:~$ cat /tmp/foobar.8404
    yellow
    93
    steve@linux:~$ echo "Extra, unexpected answer here!" >> /tmp/foobar.8404
This is a bit of an artificial example, but you can see the principle. If a bad actor knows your PID, they can work out the name of your temporary file, and create or edit files which you will be reading from, writing to, or both. This is quite commonly used in real-world exploits.
Of course, the attacker would have to be able to get on to your system, but if they have "read" permissions on the file, they can steal your data; if they have "write" permissions, they can edit the data as shown above.
See https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=predictable+filename for just a few examples of vulnerabilities in real-world software related to the ability to predict the name of a temporary file used by a process.
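One partial defence against the permissions problem is to make sure that the file is created with no access for anybody else. This is only a sketch: create_private is a name I have made up for the illustration, and it relies on the standard umask behaviour of masking permission bits off newly-created files.

```shell
#!/bin/bash
# Hypothetical helper: create (or truncate) a file which only its owner
# can read or write. The subshell stops the umask change from leaking
# into the rest of the script.
create_private() {
    (
        umask 077      # mask out all group and other permission bits
        : > "$1"
    )
}

TMPFILE=/tmp/foobar.$$
create_private "$TMPFILE"
ls -l "$TMPFILE"       # a newly-created file shows as -rw-------
rm -f "$TMPFILE"
```

Note that umask only applies when the file is actually created. If somebody else has already created a file of that name, its owner and permissions stay exactly as they were, so this does nothing about the predictable name itself.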
Option 2) /tmp/foobar.$RANDOM
The Bash shell, and others, have a
$RANDOM variable, which expands to a pseudo-random integer between 0 and 32767 (that is, 2^15-1) each time it is read. If the script above used
TMPFILE=/tmp/foobar.$RANDOM then nobody would be able to associate the process with the file just from its name (the file would show up in the
/proc/$$/fd/ directory whilst the script has it open for reading or writing, but that is a much less reliable way of working out the association).
Probably the biggest problem with this approach is that two instances of the script could run at the same time, and come up with exactly the same filename to use for their own "personal" use.
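To get a feel for how likely that is, here is a small sketch which builds a batch of names from $RANDOM and counts how many of them repeat an earlier one. The count_collisions function is just an illustration of mine, and it needs Bash 4 or later for the associative array.

```shell
#!/bin/bash
# Count how many of N names built from $RANDOM repeat an earlier name.
count_collisions() {
    local n=$1 i name collisions=0
    local -A seen=()                 # associative array: needs Bash 4+
    for (( i = 0; i < n; i++ )); do
        name=/tmp/foobar.$RANDOM
        if [[ -n ${seen[$name]:-} ]]; then
            collisions=$(( collisions + 1 ))
        fi
        seen[$name]=1
    done
    echo "$collisions"
}

echo "Repeated names among 1000 candidates: $(count_collisions 1000)"
```

With only 32768 possible values, repeats turn up surprisingly quickly. If the values were perfectly uniform, you would expect somewhere around 15 repeats in a batch of 1000, although the exact figure changes on every run.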
Option 3) mktemp
A problem with both
/tmp/foobar.$$ and /tmp/foobar.$RANDOM is that you can't be certain that the filename is unique. In the first case, it is likely to be unique, but only because nobody else is likely to want to create a file of that name. For example, it is entirely possible that a previous instance of
foobar had the same PID (they do get recycled, eventually) and crashed before it could clean up its temporary files. Or that somebody is deliberately trying to manipulate your script. If they know that a long-running process will occasionally try to read from
/tmp/foobar.$$ if it exists, then they can create that file any time, and cause your script to read from it.
In the second case (
/tmp/foobar.$RANDOM), you could try a loop, like this:
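    #!/bin/bash
    TMPFILE=/tmp/foobar.$RANDOM
    # Keep proposing names until we find one which does not already exist
    # (-e matches anything of that name, not just regular files)
    while [ -e "$TMPFILE" ]; do
      TMPFILE=/tmp/foobar.$RANDOM
    done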
This keeps on proposing a random filename until it finds one which does not exist, at which point the script can go on to create the actual file.
There is still the possibility of a race condition, because another process could create a file of that name between the test and the moment the script creates it, but it would be better than blindly assuming that the name you've come up with is definitely available for your use.
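If you really wanted to close that gap by hand, one approach is to let the shell do the test and the creation in a single step, using its noclobber option, which makes a ">" redirection fail if the file already exists. This is a sketch rather than a recommendation: make_random_tmpfile is a hypothetical function name, and the limit of 100 attempts is an arbitrary choice of mine so that it cannot loop forever on an unwritable directory.

```shell
#!/bin/bash
# Keep proposing names until one of them is created successfully.
# Prints the name of the new file, or returns 1 after 100 failed attempts.
make_random_tmpfile() {
    local dir=${1:-/tmp} candidate tries=0
    while [ "$tries" -lt 100 ]; do
        candidate=$dir/foobar.$RANDOM
        # In the subshell, noclobber makes ">" fail if the file already
        # exists, so checking and creating are no longer separate steps.
        if ( umask 077; set -o noclobber; : > "$candidate" ) 2>/dev/null; then
            printf '%s\n' "$candidate"
            return 0
        fi
        tries=$(( tries + 1 ))
    done
    return 1
}

TMPFILE=$(make_random_tmpfile) || exit 1
echo "Created $TMPFILE"
rm -f "$TMPFILE"
```

This is a lot of effort to go to, and the name is still drawn from only 32768 possibilities.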
A third and, for most situations, the best way is to use the
mktemp utility. It takes care of the looping shown above, avoids the race condition by checking for and creating the file in a single step, ensures that it has created a uniquely-named file, and tells you the name of it. Running
mktemp in an interactive shell shows what it does:
    steve@linux:~$ mktemp
    /tmp/tmp.MgfkDneFhR
    steve@linux:~$ ls -l /tmp/tmp.MgfkDneFhR
    -rw------- 1 steve steve 0 Mar 21 23:06 /tmp/tmp.MgfkDneFhR
    steve@linux:~$ rm -f /tmp/tmp.MgfkDneFhR
    steve@linux:~$
mktemp determined that /tmp/tmp.MgfkDneFhR was a valid and non-existent filename, created it, and displayed its name on its standard output. We can then list it via
ls, and see that it is a zero-length file, owned by
steve (the user who called mktemp), with read and write permission for that user only.
mktemp's default pattern is
tmp.XXXXXXXXXX. You can call it with another pattern; here,
mktemp /tmp/foobar.XXXXXXXX resulted in
/tmp/foobar.7iFw5vi1 being created; a second run created /tmp/foobar.l2KCpH2J:
    steve@linux:~$ mktemp /tmp/foobar.XXXXXXXX
    /tmp/foobar.7iFw5vi1
    steve@linux:~$ rm /tmp/foobar.7iFw5vi1
    steve@linux:~$ mktemp /tmp/foobar.XXXXXXXX
    /tmp/foobar.l2KCpH2J
    steve@linux:~$ rm /tmp/foobar.l2KCpH2J
Strictly, to provide a directory as well as a pattern, you should use the -p option, although the end result is the same:
    steve@linux:~$ mktemp -p /tmp foobar.XXXXXXXX
    /tmp/foobar.spa1kuK6
    steve@linux:~$ rm /tmp/foobar.spa1kuK6
mktemp in use
Notice that I have been careful to delete these temporary files; otherwise, you will end up with a load of randomly-named files, with no idea what they are for!
mktemp creates a new file each time you call it. Whilst
$$ will always return your PID, and therefore will be the same every time you call it,
$RANDOM will give a random number each time, and
mktemp is guaranteed to give a different file each time. So the way to use
mktemp in a shell script is to capture its output in a variable at the moment you run it. That way, you always have the file's actual name; there is no other reliable way to get the name once the file has been created. Here is the
foobar script above (shortened), showing how
mktemp is used:
    #!/bin/bash
    # Create the file and capture its name in one move
    TMPFILE=$(mktemp)

    # Now write to the file...
    read -p "What is your favourite colour? " ANSWER
    echo "That's nice. I like $ANSWER too."
    echo "$ANSWER" > "$TMPFILE"

    # And read from the file...
    mailx -s "Answers for process $$" email@example.com < "$TMPFILE"

    # Finally, delete this temporary file:
    rm "$TMPFILE"
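The final rm in a script like this only runs if the script gets that far. One common way to make the cleanup more dependable is an EXIT trap, sketched here together with a check that mktemp actually worked. I have wrapped it in a function whose body is a subshell (run_with_tmpfile is a made-up name) purely so that it can be tried out without the trap affecting the calling shell; in a real script the same lines would simply go at the top.

```shell
#!/bin/bash
# The function body is a subshell, so the EXIT trap fires when it finishes.
run_with_tmpfile() (
    # Stop straight away if mktemp could not create the file
    TMPFILE=$(mktemp) || { echo "mktemp failed" >&2; exit 1; }

    # Delete the file whenever this (sub)shell exits, however that happens
    trap 'rm -f "$TMPFILE"' EXIT

    echo "red" > "$TMPFILE"
    echo "Working with $TMPFILE, which contains: $(cat "$TMPFILE")"
)

run_with_tmpfile
```

A trap cannot help if the process is killed with SIGKILL, so the occasional stray file is still possible.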
/tmp/foobar.$$ gives a nice, reasonably unique but predictable filename, which can easily be associated with the running process (which could be good, bad, neither or both, depending on the situation).
/tmp/foobar.$RANDOM gives a filename which cannot easily be associated with the running process.
mktemp -p /tmp foobar.XXXXXXXX gives a guaranteed-unique filename, which cannot easily be associated with the running process. It also caters for creation of directories.
Check the documentation for mktemp; it has a few other useful switches, such as creating a directory instead of a file.
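As a minimal sketch of the directory option, assuming the -d switch found in the common GNU and BSD versions of mktemp, a script which needs several temporary files can create one private directory and use fixed names inside it. The make_workdir function and the file names are just for illustration.

```shell
#!/bin/bash
# Hypothetical helper: create a private scratch directory and print its name.
make_workdir() {
    mktemp -d /tmp/foobar.XXXXXXXX
}

WORKDIR=$(make_workdir) || exit 1

# Inside the directory, ordinary readable names are fine
echo "red" > "$WORKDIR/colour"
echo "27"  > "$WORKDIR/number"

ls -ld "$WORKDIR"      # the directory itself is only accessible to its owner
ls "$WORKDIR"

# One command cleans up everything
rm -rf "$WORKDIR"
```

Because the directory is created with access for its owner only, other users cannot create or read files inside it, whatever those files are called.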