6th October 2018

What happens when you edit a running shell script?

I realised recently that I wasn't confident in my understanding of what happens when you change a shell script which is already running. It turns out that there isn't one single answer to this, as it depends on how you make the change. For this test, I created some small shell scripts and observed what they do under different conditions. For reference, the shell used for this is Bash (version 4.3.30) on an ext4 filesystem. It might be interesting to repeat these tests with other shells and filesystems. Some of these observations are more about the kernel and Unix/Linux's treatment of open files, but it is all informative.

To do the tests, I opened two terminal windows on a GNU/Linux laptop. The first terminal ran the script, the second terminal did something to modify it.

The main script, a.sh, is a simple little loop, plus a few other commands:

#!/bin/bash
echo "Starting... PID is $$"
for i in `seq 1 10`
do
  echo -n "$i ... "
sleep 1
done
echo "Interval..."
sleep 2
echo "Sleep again..."
sleep 2
echo "Done."

This will execute echo "Starting... PID is $$" and then run a for loop. The loop is relevant because, although it spans several lines, the entire for loop is one command as far as the shell is concerned. (Note: the first sleep really should have been indented to match the echo -n above it.)

After the loop it does echo "Interval...", then a sleep, then another echo... you get the picture.

steve@home:~$ ./a.sh
Starting... PID is 24760
1 ... 2 ... 3 ... 4 ... 5 ... 6 ... 7 ... 8 ... 9 ... 10 ... Interval...
Sleep again...
Done.
steve@home:~$ 

Test One: Delete a running script

For my first test, I deleted a script while it was running. The shell interpreter had the file open, so the kernel did not actually delete the file from the disk until the shell process closed it. As a result, the shell script kept on running until it had completed normally. The output was exactly like that shown above.
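The same kernel behaviour can be seen without running a script at all. This is a minimal sketch, assuming Bash on a Linux system; the temporary file, the variable names and the choice of file descriptor 3 are made up for illustration and are not part of the original test.

```shell
# Sketch only: "demo" and fd 3 are arbitrary choices for this illustration.
demo=$(mktemp)
echo "still readable" > "$demo"

exec 3< "$demo"      # open the file for reading on file descriptor 3
rm "$demo"           # remove its only name; the data survives while fd 3 is open

# Record whether the name has really gone from the directory.
if [ ! -e "$demo" ]; then name_gone=yes; else name_gone=no; fi

read -r line <&3     # the contents can still be read through the open descriptor
exec 3<&-            # closing the descriptor lets the kernel free the data

echo "name gone: $name_gone, read back: $line"
```

The shell running a.sh is in the same position as file descriptor 3 here: it opened the script before the name was removed, so it can keep reading.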

Test Two: Truncate a running script

It is possible to truncate a file: it remains the same file, keeping its inode number, ownership, permissions and so on, but its contents are removed. As above, the shell interpreter has the file open. This time, though, when the currently-running command in the script has completed, the shell interpreter finds that there are no more commands to read from the script, and it exits with a return code of zero. In this case, I truncated the script whilst the loop was running (at about the time it was printing "3 ..."). The loop continued until it had finished, but the echo "Interval..." never happened, as if that had always been the way that the a.sh script had ended.

steve@home:~$ ./a.sh
Starting... PID is 24814
1 ... 2 ... 3 ... 4 ... 5 ... 6 ... 7 ... 8 ... 9 ... 10 ... steve@home:~$
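The post does not say which command did the truncating, so the following is only one possible way, sketched under the assumption of Bash with standard tools. It uses a throwaway file in place of a.sh and checks that the inode number is unchanged while the size drops to zero. All names are hypothetical.

```shell
# Sketch only: "demo" is a throwaway file standing in for a.sh.
demo=$(mktemp)
printf 'echo one\necho two\n' > "$demo"

inode_before=$(ls -i "$demo" | awk '{print $1}')

: > "$demo"          # truncate in place; GNU "truncate -s 0 FILE" has the same effect

inode_after=$(ls -i "$demo" | awk '{print $1}')
size_after=$(( $(wc -c < "$demo") ))   # arithmetic strips any padding from wc

echo "inode before: $inode_before, after: $inode_after, size now: $size_after"
rm -f "$demo"
```

Because the inode is the same, a shell that already has the file open sees the new, empty contents the next time it tries to read.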

Test Three: Append to a running script

What happens if you append code to an already running script? Well, it gets read and executed, as if it had been there all along. The script can even do that to itself. This script runs three echo commands, but one of them is echo "date" >> c.sh, which appends the word "date" to the script itself.

steve@home:~$ cat c.sh
#!/bin/bash
echo "Hello world"
echo "date" >> c.sh
echo "That was fun!"
steve@home:~$ ./c.sh
Hello world
That was fun!
Sat  6 Oct 22:57:29 BST 2018
steve@home:~$

The script had appended the "date" command to itself, so at the end, it displayed the current date and time, even though that was not in the original script. When we display the script afterwards, we can confirm that it has modified itself:

steve@home:~$ cat c.sh
#!/bin/bash
echo "Hello world"
echo "date" >> c.sh
echo "That was fun!"
date
steve@home:~$ 

Test Four: Edit a file with vim

When vim (or sed -i, or many other editors) edits a file, it typically saves a new file and deletes the original. You can confirm this by checking the inode of the file before and after you edit it (use ls -il a.sh to check the inode of a.sh). So the effect here is the same as in Test One above: the running shell still has the original file open, and the script continues exactly as before. But to any other process, the script has been modified.
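The inode check can be scripted. This sketch does not run vim or sed; it imitates the "write a new file, then put it in place of the original" pattern with plain echo and mv, which is an assumption about how such editors behave rather than a trace of what they do. The directory and file names are placeholders.

```shell
# Sketch only: imitates "write a new file, then rename it over the original".
# The directory and file names are placeholders, not what vim or sed use.
dir=$(mktemp -d)
echo "echo original" > "$dir/script.sh"
inode_before=$(ls -i "$dir/script.sh" | awk '{print $1}')

echo "echo edited" > "$dir/script.sh.new"   # a brand new file with its own inode
mv "$dir/script.sh.new" "$dir/script.sh"    # the name now points at the new file

inode_after=$(ls -i "$dir/script.sh" | awk '{print $1}')
echo "inode before: $inode_before, after: $inode_after"
rm -rf "$dir"
```

The two numbers differ, which is why a shell that opened the old file carries on reading the old contents.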

Test Five: Copy a new file over the existing file

This was actually the thing I was interested in when I started this investigation. It turns out that once the current command has completed, Bash looks for the next character in the file to read.

For this test, I created another script, b.sh, which is very simple, but makes it easy to see how far through its execution it has got:

#!/bin/bash
echo 1
sleep 1
echo 2
sleep 1
echo 3
sleep 1
echo 4
sleep 1
echo 5
sleep 1
echo 6
sleep 1
echo 7
sleep 1
echo 8
sleep 1
echo 9
sleep 1
echo 10

I ran the same a.sh script again and, while the loop was running, went to another terminal and copied b.sh over a.sh (cp b.sh a.sh). With the cp command (unlike vim and sed in Test Four above), a.sh keeps its inode number; it is the same file as far as the filesystem is concerned, but the contents are now different.
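The inode claim can be checked with a short sketch like this one, which assumes Bash with standard tools and uses two tiny stand-in files in a temporary directory rather than the real a.sh and b.sh.

```shell
# Sketch only: tiny stand-ins for a.sh and b.sh in a scratch directory.
dir=$(mktemp -d)
echo "echo old" > "$dir/a.sh"
echo "echo new" > "$dir/b.sh"

inode_before=$(ls -i "$dir/a.sh" | awk '{print $1}')
cp "$dir/b.sh" "$dir/a.sh"     # overwrites the contents of the existing file
inode_after=$(ls -i "$dir/a.sh" | awk '{print $1}')
contents=$(cat "$dir/a.sh")

echo "inode before: $inode_before, after: $inode_after, contents: $contents"
rm -rf "$dir"
```

Same inode, new contents: exactly the combination that lets a running shell wander into the new script part way through.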

steve@home:~$ ./a.sh
Starting... PID is 24179
1 ... 2 ... 3 ... 4 ... 5 ... 6 ... 7 ... 8 ... 9 ... 10 ... ./a.sh: line 8: ep: command not found
7
8
9
10
steve@home:~$ 

The shell threw out an obscure error message, and then started running from some apparently arbitrary point in b.sh. What exactly happened here?

When we convert the linebreaks to spaces for easy observation, we see the two scripts like this, with character offsets counted from zero. Both scripts have "h echo" starting at offset 10. At offset 20, a.sh has "arting..." and b.sh has "leep 1 echo 2":

          10        20        30        40
01234567890123456789012345678901234567890
#!/bin/bash echo "Starting... PID is $$"              a.sh
#!/bin/bash echo 1 sleep 1 echo 2 sleep 1             b.sh

Moving further into the scripts, we see why the shell complained that there is no command named "ep" with its message "line 8: ep: command not found". The shell has just finished the loop, so it is at the done command, which ends at character 95 of the script:

90        100       110       120       130
012345678901234567890123456789012345678901234567890
1 done echo "Interval..." sleep 2 echo "Sleep again   a.sh
o 6 sleep 1 echo 7 sleep 1 echo 8 sleep 1 echo 9 sl   b.sh

Note that the interpreter at this point thinks that it is on line 8 of a.sh, but actually it is at the 4th character of line 13 of b.sh (counting the #!/bin/bash line as line 1), which is the "sleep 1" that follows "echo 6". We can see this because the next thing it does is "echo 7", which is line 14 of b.sh. What the shell has actually kept track of is its position in the file: it had read as far as the newline after done, at offset 96, so it continued reading from offset 97, and found the invalid ep 1 command, which caused it to complain "ep: command not found".
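The arithmetic can be reproduced with standard tools. The sketch below rebuilds both scripts as listed in this post, counts the bytes of a.sh up to and including the newline after done, and then prints what b.sh contains from that position onwards. It assumes, as observed above, that Bash resumes at the byte offset it had reached; the variable names and the temporary directory are made up for the illustration.

```shell
dir=$(mktemp -d)

# a.sh exactly as listed (the quoted 'EOF' stops the shell expanding $$ and backticks).
cat > "$dir/a.sh" <<'EOF'
#!/bin/bash
echo "Starting... PID is $$"
for i in `seq 1 10`
do
  echo -n "$i ... "
sleep 1
done
echo "Interval..."
sleep 2
echo "Sleep again..."
sleep 2
echo "Done."
EOF

# b.sh as listed: "echo N" and "sleep 1" alternating, ending with "echo 10".
{
  echo '#!/bin/bash'
  for n in 1 2 3 4 5 6 7 8 9; do
    echo "echo $n"
    echo "sleep 1"
  done
  echo "echo 10"
} > "$dir/b.sh"

# Bytes of a.sh up to and including the newline after "done" (its first 7 lines).
consumed=$(( $(head -n 7 "$dir/a.sh" | wc -c) ))

# tail -c +N counts from 1, so skipping "consumed" bytes means starting at consumed+1.
next=$(tail -c +"$((consumed + 1))" "$dir/b.sh" | head -n 1)

echo "bytes consumed from a.sh: $consumed"
echo "next line Bash would read from b.sh: $next"
rm -rf "$dir"
```

The first remaining line of b.sh from that position is the fragment that Bash tried, and failed, to run as a command.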

Summary and Conclusion

So the answer is that it depends entirely on how you modify the script in question, but the interpreter will keep on reading the next character from whatever file it has open, until there is nothing left to read.

None of these behaviours may be quite what you expected, so it is useful to understand what is happening, and why it happens differently in these different cases.

 

 

