21 Mar 2018

temp files

Temporary File Names

Generating temporary (and possibly unique, and/or identifiable) filenames

Some file names are easy to come up with. The foobar application needs a configuration file? Then /etc/foobar.conf is probably the best place. It needs a log file? Then /var/log/foobar/foobar.log would make sense (in its own subdirectory as /var/log is only writeable by the root user, but I digress...)

Other files are less obvious. If your script needs to write data temporarily to a small file, then /tmp is probably a good location for that file, but what should it be called?

Sometimes, you will see /tmp/foobar.$$ in a script, or maybe /tmp/foobar.$RANDOM, or quite commonly, TMPFILE=`mktemp`.

What do these all mean, and how do they differ?

Option 1) /tmp/foobar.$$

The $$ variable is a read-only variable from which a shell script can find its own Process Identifier (PID). For example, here, my Bash shell's PID is 6046, so echo $$ gives 6046, and ps -fp 6046 shows the process via the ps command. Further, ps -fp $$ shows the same result, but without ever having to hard-code the PID itself.

steve@linux:~$ echo $$
6046
steve@linux:~$ ps -fp 6046
UID        PID  PPID  C STIME TTY          TIME CMD
steve     6046  2651  0 Mar20 pts/3    00:00:00 bash
steve@linux:~$ ps -fp $$
UID        PID  PPID  C STIME TTY          TIME CMD
steve     6046  2651  0 Mar20 pts/3    00:00:00 bash
steve@linux:~$ 
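One detail worth knowing before building filenames from $$: it keeps the PID of the main shell even inside a subshell, so two subshells of the same script would both compute the same /tmp/foobar.$$ name. The following is a small sketch of that behaviour; the variable names are only for illustration.

```shell
#!/bin/bash
# $$ is fixed when the shell starts; subshells inherit the same value.
parent_pid=$$
subshell_view=$(echo $$)        # command substitution runs in a subshell
child_pid=$(sh -c 'echo $$')    # a separately started shell has its own PID

echo "this shell:        $parent_pid"
echo "inside a subshell: $subshell_view"
echo "new sh process:    $child_pid"
```

The first two values match; only the separately started shell reports a different PID.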

If we create a file using a pattern like /tmp/foobar.$$, then it can easily be associated with this particular process. If you know that the foobar script uses files named /tmp/foobar.$$, then even if there are multiple instances of foobar running on the system, and multiple /tmp/foobar.nnnn files (where nnnn represents an integer PID), you can match each file to the instance of foobar which created it.
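As a rough sketch of that correlation, the function below walks the /tmp/foobar.nnnn files and reports whether a process with that PID can still be signalled. The name report_owners is made up for this example, and kill -0 only tells you that some process with that PID exists and belongs to you; it cannot tell you that it is still the same foobar instance.

```shell
#!/bin/bash
# report_owners DIR: hypothetical helper, not part of the foobar script.
# Lists DIR/foobar.<PID> files and says whether <PID> is currently alive.
report_owners() {
  local dir=${1:-/tmp} file pid
  for file in "$dir"/foobar.*; do
    [ -e "$file" ] || continue      # the glob matched nothing
    pid=${file##*.}                 # text after the last dot
    case $pid in
      ''|*[!0-9]*) continue ;;      # suffix is not a plain number
    esac
    # kill -0 sends no signal; it only tests whether we could signal the PID
    if kill -0 "$pid" 2>/dev/null; then
      echo "$file: process $pid is running"
    else
      echo "$file: no signalable process $pid (possibly stale)"
    fi
  done
}

report_owners /tmp
```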

This could be very good, or it could be very bad.

For example, consider this foobar script, which asks you some questions, comments on them, and stores your answers:

foobar.sh
#!/bin/bash
# Predictable name: anyone who knows our PID knows this path
TMPFILE=/tmp/foobar.$$

read -r -p "What is your favourite colour? " ANSWER
echo "That's nice. I like $ANSWER too."
echo "$ANSWER" > "$TMPFILE"

read -r -p "What is your favourite number? " ANSWER
echo "Oh, I don't really like $ANSWER very much."
echo "Each to their own."
echo "$ANSWER" >> "$TMPFILE"

read -r -p "What is your favourite movie? " ANSWER
echo "That's a good one, isn't it? I do like $ANSWER so much."
echo "$ANSWER" >> "$TMPFILE"

# Read the answers back, one per line
counter=0
while read -r inputword
do
  counter=$((counter + 1))
  echo "Your answer number $counter was: $inputword"
done < "$TMPFILE"

# Email the answers to our support mailbox
mailx -s "Answers for process $$" support@example.com < "$TMPFILE"
rm -f "$TMPFILE"

A sample run of the script might look like this:

steve@linux:~$ ./foobar.sh
What is your favourite colour? red
That's nice. I like red too.
What is your favourite number? 27
Oh, I don't really like 27 very much.
Each to their own.
What is your favourite movie? The Blues Brothers
That's a good one, isn't it? I do like The Blues Brothers so much.
Your answer number 1 was: red
Your answer number 2 was: 27
Your answer number 3 was: The Blues Brothers
steve@linux:~$ 

Although this script is quite small, if it were to start creating large amounts of data, then knowing which process the large file (e.g. /tmp/foobar.8404) was associated with would be useful in identifying what was going on.

However, if someone could write to the file while you were typing in an answer, they could get other data intermingled with the genuine data:

steve@linux:~$ ./foobar.sh
What is your favourite colour? yellow
That's nice. I like yellow too.
What is your favourite number? 93
Oh, I don't really like 93 very much.
Each to their own.
What is your favourite movie? Star Wars
That's a good one, isn't it? I do like Star Wars so much.
Your answer number 1 was: yellow
Your answer number 2 was: 93
Your answer number 3 was: Extra, unexpected answer here!
Your answer number 4 was: Star Wars
steve@linux:~$ 

What happened here was that in another shell the attacker inserted some extra data into the file which was about to be emailed, like this:

steve@linux:~$ ps -eaf|grep foobar.sh
steve     8404 19523  0 22:47 pts/1    00:00:00 /bin/bash ./foobar.sh
steve     8409 17559  0 22:47 pts/7    00:00:00 grep foobar.sh
steve@linux:~$ cat /tmp/foobar.8404
yellow
93
steve@linux:~$ echo "Extra, unexpected answer here!" >> /tmp/foobar.8404

This is a bit of an artificial example, but you can see the principle. If a bad actor knows your PID, they can create or edit files which you will be reading from, writing to, or both. Predictable temporary filenames are a common ingredient in real-world exploits.

Of course, the attacker would have to be able to get on to your system, but if they have "read" permissions on the file, they can steal your data; if they have "write" permissions, they can edit the data as shown above.
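Whether other users get those "read" permissions depends largely on the umask in force when the file is created. The sketch below is only an illustration: it works in a scratch directory (created with mktemp -d purely to avoid touching real files) and compares a typical umask of 022 with a restrictive 077.

```shell
#!/bin/bash
# Scratch directory for the demonstration; removed again at the end.
demo_dir=$(mktemp -d) || exit 1

# Each subshell changes the umask only for the single file it creates.
( umask 022; echo "secret" > "$demo_dir/default.$$" )
( umask 077; echo "secret" > "$demo_dir/private.$$" )

# Keep just the ten-character permission string from ls -l.
default_mode=$(ls -l "$demo_dir/default.$$" | cut -c1-10)
private_mode=$(ls -l "$demo_dir/private.$$" | cut -c1-10)

echo "umask 022: $default_mode"    # -rw-r--r-- : readable by everyone
echo "umask 077: $private_mode"    # -rw------- : owner only

rm -rf -- "$demo_dir"
```

A restrictive umask stops other users reading the data, but it does nothing about a file which somebody else created first under the predictable name.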

See https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=predictable+filename for just a few examples of vulnerabilities in real-world software related to the ability to predict the name of a temporary file used by a process.

Option 2) /tmp/foobar.$RANDOM

The Bash shell, and others, have a $RANDOM variable, which expands to a pseudo-random integer between 0 and 32767 (2^15-1) each time it is read. If the script above used TMPFILE=/tmp/foobar.$RANDOM then nobody would be able to associate the process with the file by its name alone (it would show up in the /proc/$$/fd/ directory whilst the file is open for reading or writing, but that is a much less reliable way of working out the association).

Probably the biggest problem with this approach is that two instances of the script could run at the same time, and come up with exactly the same filename to use for their own "personal" use.
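With only 32768 possible values, clashes turn up sooner than intuition suggests. As a rough illustration (the function name draws_until_repeat is invented for this sketch), the following counts how many times $RANDOM has to be read before it repeats a value it has already produced:

```shell
#!/bin/bash
# Hypothetical helper: read $RANDOM until a value is seen for the second time,
# then print how many reads that took.
draws_until_repeat() {
  local -a seen=()
  local n count=0
  while :; do
    n=$RANDOM
    count=$((count + 1))
    if [ -n "${seen[n]:-}" ]; then
      echo "$count"
      return 0
    fi
    seen[n]=1
  done
}

echo "First repeated value after $(draws_until_repeat) draws"
```

On a typical run the answer is in the low hundreds rather than the tens of thousands, so a busy system which leaves /tmp/foobar.$RANDOM files lying around will see a clash fairly soon.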

Option 3) mktemp

A problem with both /tmp/foobar.$$ and /tmp/foobar.$RANDOM is that you can't be certain that the filename is unique. In the first case, it is likely to be unique, but only because nobody is likely to want to create a file of that name. For example, it is entirely possible that a previous instance of foobar had the same PID (they do get recycled, eventually) and crashed before it could clean up its temporary files. Or that somebody is deliberately trying to manipulate your script. If they know that a long-running process will occasionally try to read from /tmp/foobar.$$ if it exists, then they can create that file any time, and cause your script to read from it.

In the second case (/tmp/foobar.$RANDOM), you could try a loop, like this:

#!/bin/bash
TMPFILE=/tmp/foobar.$RANDOM
# Keep trying until the proposed name does not exist (as any type of file)
while [ -e "$TMPFILE" ]; do
  TMPFILE=/tmp/foobar.$RANDOM
done

This would keep on proposing a random filename until it found one which did not exist. The loop itself does not create the file, though; the script would do that afterwards.

That leaves a race condition: another process could create a file of the same name in the gap between the check and the creation. Even so, it is better than blindly assuming that the name you've come up with is definitely available for your use.
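One way to narrow that race, sketched here rather than recommended, is to let the shell do the check and the creation in a single step using its noclobber option, which makes the > redirection fail if the file already exists. The function name make_random_tmp, the directory argument and the limit of 100 attempts are all assumptions made for this example.

```shell
#!/bin/bash
# make_random_tmp [DIR]: hypothetical helper.
# Creates DIR/foobar.<random> and prints its name; fails after 100 attempts.
make_random_tmp() {
  local dir=${1:-/tmp} name attempts=0
  while [ "$attempts" -lt 100 ]; do
    name=$dir/foobar.$RANDOM
    # In the subshell: owner-only permissions, and refuse to overwrite.
    # The redirection succeeds only if this shell created the file itself.
    if ( umask 077; set -o noclobber; : > "$name" ) 2>/dev/null; then
      echo "$name"
      return 0
    fi
    attempts=$((attempts + 1))
  done
  return 1
}

TMPFILE=$(make_random_tmp) || exit 1
echo "Created $TMPFILE"
rm -f -- "$TMPFILE"
```

The name is still drawn from only 32768 possibilities, so this is about avoiding accidents and pre-created files, not about making the name hard to guess.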

A third and, for most situations, the best way is to use the mktemp utility. It handles the retry loop shown above for you, and it avoids the race condition by creating the file atomically: the existence check and the creation are a single operation, so nobody can slip a file in between them. It ensures that it has created a uniquely-named file, and tells you the name of it.

Running mktemp in an interactive shell shows what it does:

steve@linux:~$ mktemp
/tmp/tmp.MgfkDneFhR
steve@linux:~$ ls -l /tmp/tmp.MgfkDneFhR
-rw------- 1 steve steve 0 Mar 21 23:06 /tmp/tmp.MgfkDneFhR
steve@linux:~$ rm -f /tmp/tmp.MgfkDneFhR
steve@linux:~$ 

mktemp determined that /tmp/tmp.MgfkDneFhR was a valid and non-existent filename, created it, and displayed its name to its standard output. We can then list it via ls, and see that it is a zero-length file, owned by steve (the user who called mktemp), and readable and writable only by that user.

mktemp's default pattern is tmp.XXXXXXXXXX. You can call it with another pattern; here, mktemp /tmp/foobar.XXXXXXXX resulted in /tmp/foobar.7iFw5vi1 being created; a second run created /tmp/foobar.l2KCpH2J:

steve@linux:~$ mktemp /tmp/foobar.XXXXXXXX
/tmp/foobar.7iFw5vi1
steve@linux:~$ rm /tmp/foobar.7iFw5vi1
steve@linux:~$ mktemp /tmp/foobar.XXXXXXXX
/tmp/foobar.l2KCpH2J
steve@linux:~$ rm /tmp/foobar.l2KCpH2J

Strictly, to provide a directory as well as a pattern, you should use this method, although the end result is the same:

steve@linux:~$ mktemp -p /tmp foobar.XXXXXXXX
/tmp/foobar.spa1kuK6
steve@linux:~$ rm /tmp/foobar.spa1kuK6
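In a script, the pattern is a handy way to keep the files identifiable (the prefix says which program made them) without making them predictable. Here is a minimal sketch; the helper name make_named_tmp is invented for the example, and it assumes you want to honour $TMPDIR if the user has set it, falling back to /tmp otherwise.

```shell
#!/bin/bash
# make_named_tmp PREFIX: hypothetical helper.
# Creates a file such as /tmp/PREFIX.a1B2c3D4 and prints its name.
make_named_tmp() {
  local prefix=$1
  mktemp "${TMPDIR:-/tmp}/${prefix}.XXXXXXXX"
}

first=$(make_named_tmp foobar) || exit 1
second=$(make_named_tmp foobar) || exit 1

echo "first:  $first"
echo "second: $second"

# Tidy up both files.
rm -f -- "$first" "$second"
```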

mktemp in use

Notice that I have been careful to delete these temporary files; otherwise, you will end up with a load of randomly-named files, with no idea what they are for!

Of course, mktemp creates a new file each time you call it. $$ always returns your PID, so it is the same every time you use it; $RANDOM gives a different pseudo-random number each time; and mktemp is guaranteed to give a different file each time. So the way to use mktemp in a shell script is to capture its output in a variable in the same step that runs it: there is no other way to get the file's name once it has been created. Here is the foobar script above (shortened), showing how mktemp is used:

#!/bin/bash
# Capture the name as the file is created; stop if mktemp fails
TMPFILE=$(mktemp) || exit 1

# Now write to the file...
read -r -p "What is your favourite colour? " ANSWER
echo "That's nice. I like $ANSWER too."
echo "$ANSWER" > "$TMPFILE"

# And read from the file...
mailx -s "Answers for process $$" support@example.com < "$TMPFILE"

# Finally, delete this temporary file:
rm -f "$TMPFILE"
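The script above only deletes its file if it gets as far as the last line. A common way to make the clean-up more dependable is an EXIT trap, set immediately after the file is created. The sketch below wraps the idea in a function running in a subshell (demo_cleanup, a name made up for this example) so that the early exit can be observed from outside; in a real script the mktemp and trap lines would simply sit at the top.

```shell
#!/bin/bash
# demo_cleanup: hypothetical stand-in for a script which fails part-way through.
# The body runs in a subshell, so "exit" ends only the demonstration.
demo_cleanup() (
  TMPFILE=$(mktemp) || exit 1
  # Remove the file whenever this (sub)shell exits, for whatever reason.
  trap 'rm -f -- "$TMPFILE"' EXIT

  echo "$TMPFILE"                 # report the name so the caller can check it
  echo "some data" > "$TMPFILE"
  exit 3                          # simulate an early failure
)

leftover=$(demo_cleanup)
status=$?
echo "exit status: $status"
if [ -e "$leftover" ]; then
  echo "still there: $leftover"
else
  echo "cleaned up:  $leftover"
fi
```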

Summary

/tmp/foobar.$$ gives a nice, reasonably unique but predictable filename, which can easily be associated with the running process (which could be good, bad, neither or both, depending on the situation).

/tmp/foobar.$RANDOM gives a filename which cannot easily be associated with the running process, but which is not guaranteed to be unique: two instances can pick the same name.

mktemp -p /tmp foobar.XXXXXXXX creates a guaranteed-unique file, readable and writable only by its owner, whose name cannot easily be associated with the running process.

Check the documentation for mktemp; it has a few other useful switches, such as -d, which creates a directory instead of a file.
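As a brief sketch of the directory option: if a script needs several scratch files, it can ask mktemp -d for one private directory and then use whatever filenames it likes inside it. The names WORKDIR, answers.txt and debug.log are only examples.

```shell
#!/bin/bash
# One private directory instead of several separate temporary files.
WORKDIR=$(mktemp -d) || exit 1

# Inside it, ordinary fixed names are fine: nobody else can write here.
echo "red"          > "$WORKDIR/answers.txt"
echo "started at 1" > "$WORKDIR/debug.log"

dir_mode=$(ls -ld "$WORKDIR" | cut -c1-10)
file_count=$(ls "$WORKDIR" | wc -l)
echo "permissions: $dir_mode"      # drwx------ : owner only
echo "files inside: $file_count"

# A single rm -rf removes the directory and everything in it.
rm -rf -- "$WORKDIR"
```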

 

 

