6th October 2018
What happens when you edit a running shell script?
I realised recently that I wasn't confident in my understanding of what happens when you change a shell script which is already running. It turns out that there isn't one single answer to this, as it depends on how you make the change. For this test, I created some small shell scripts and observed what they do under different conditions. For reference, the shell used for this is Bash (version 4.3.30) on an ext4 filesystem. It might be interesting to repeat these tests with other shells and filesystems. Some of these observations are more about the kernel and Unix/Linux's treatment of open files, but it is all informative.
To do the tests, I opened two terminal windows on a GNU/Linux laptop. The first terminal ran the script, the second terminal did something to modify it.
The main script is a simple little loop, plus a few other commands:
#!/bin/bash
echo "Starting... PID is $$"
for i in `seq 1 10`
do
  echo -n "$i ... "
sleep 1
done
echo "Interval..."
sleep 2
echo "Sleep again..."
sleep 2
echo "Done."
This will execute echo "Starting... PID is $$" then do a for loop. This is relevant because although it traverses many lines, the entire for loop is one command, as far as the shell is concerned. (Note: the first sleep really should have been indented to match the echo -n above it.) Then it does another echo "Interval..."... you get the picture.
steve@home:~$ ./a.sh
Starting... PID is 24760
1 ... 2 ... 3 ... 4 ... 5 ... 6 ... 7 ... 8 ... 9 ... 10 ... Interval...
Sleep again...
Done.
steve@home:~$
Test One: Delete a running script
For my first test, I deleted a script while it was running. The shell interpreter has the file open, so the kernel removes the file's name from the directory but does not actually free its contents on disk until the shell process closes it. As a result, the shell script kept on running until it had completed normally. The output is exactly like that shown above.
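The same kernel behaviour can be seen without a running script at all. The following is a minimal sketch, assuming a scratch file from mktemp and file descriptor 3 (both chosen only for illustration): it opens a file, deletes it, and then still reads the contents through the open descriptor.

```shell
# Sketch: a deleted file stays readable through a descriptor that is already open.
tmp=$(mktemp)                          # illustrative scratch file
printf 'line one\nline two\n' > "$tmp"

exec 3< "$tmp"                         # open the file for reading on descriptor 3
rm "$tmp"                              # unlink it: the name disappears immediately

if [ -e "$tmp" ]; then still_exists=yes; else still_exists=no; fi
contents=$(cat <&3)                    # the data is still there for the open descriptor
exec 3<&-                              # close it; only now can the kernel free the data

echo "name exists after rm: $still_exists"
echo "$contents"
```

This is what the shell relies on in Test One: it keeps the script open for as long as it is running, so deleting the name does not affect it.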
Test Two: Truncate a running script
It is possible to truncate a file; that is, it is the same file, it keeps its inode number, ownership, permissions, etc, but its contents are removed. As above, the shell interpreter has the file open, but this time, when the currently-running command in the script has completed, the shell interpreter then finds that there are no more commands to read from the script, and it exits, with a return code of zero. In this case, I truncated the script whilst the loop was running (actually about when it was printing "3 ..."), but the loop continued until it had finished, then the echo "Interval" never happened, as if that had always been the way that the a.sh script had ended.
steve@home:~$ ./a.sh
Starting... PID is 24814
1 ... 2 ... 3 ... 4 ... 5 ... 6 ... 7 ... 8 ... 9 ... 10 ... steve@home:~$
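The article does not say which command was used to truncate the script, so the sketch below assumes the common `: > file` idiom. It mimics what the interpreter experiences: a reader that is part-way through a file finds nothing more to read once the file has been truncated in place, even though the inode is unchanged. The scratch file, descriptor 3 and the inode helper are illustrative names only.

```shell
# Sketch: truncating a file in place leaves an existing reader at end-of-file.
tmp=$(mktemp)                                  # illustrative scratch file
printf 'first\nsecond\nthird\n' > "$tmp"

inode() { ls -i "$1" | awk '{print $1}'; }     # helper: print a file's inode number

exec 3< "$tmp"                                 # open the file, as the shell does with a script
read -r first <&3                              # consume the first line only
inode_before=$(inode "$tmp")

: > "$tmp"                                     # truncate in place (assumed method)
inode_after=$(inode "$tmp")

# The descriptor's offset is now past the end of the (empty) file.
if read -r second <&3; then after=got-data; else after=eof; fi
exec 3<&-

echo "first line read: $first"
echo "after truncation: $after"
echo "inode before/after: $inode_before / $inode_after"
rm -f "$tmp"
```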
Test Three: Append to a running script
What happens if you append code to an already running script? Well, it gets read and executed, as if it had been there all along. The script can even do that to itself. This script consists of three echo commands, but one of them happens to be echo "date" >> c.sh. This appends the word "date" to the script itself.
steve@home:~$ cat c.sh
#!/bin/bash
echo "Hello world"
echo "date" >> c.sh
echo "That was fun!"
steve@home:~$ ./c.sh
Hello world
That was fun!
Sat 6 Oct 22:57:29 BST 2018
steve@home:~$
The script had appended the "date" command to itself, so at the end, it displayed the current date and time, even though that was not in the original script. When we display the script afterwards, we can confirm that it has modified itself:
steve@home:~$ cat c.sh
#!/bin/bash
echo "Hello world"
echo "date" >> c.sh
echo "That was fun!"
date
steve@home:~$
Test Four: Edit a file with vim
When vim (or sed -i, or many other editors) edit a file, they actually save a new file and delete the original. This can be confirmed by checking the inode of a file as you edit it (use ls -il a.sh to check the inode of a.sh). So the effect here is the same as in Test One above: the original script continues exactly as before, because the running shell still has the old file open. But to any other process, the script has been modified.
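The inode check can be scripted. This is a rough sketch, assuming GNU sed (whose -i option writes a temporary file and renames it over the original) and scratch files under mktemp -d; the file names and the inode helper are illustrative. It also shows, for contrast, that cp onto an existing file keeps the inode.

```shell
# Sketch: compare inode numbers after an in-place edit and after a copy.
dir=$(mktemp -d)                               # illustrative scratch directory
printf 'echo one\n' > "$dir/a.sh"
printf 'echo two\n' > "$dir/b.sh"

inode() { ls -i "$1" | awk '{print $1}'; }     # helper: print a file's inode number

before=$(inode "$dir/a.sh")

sed -i 's/one/uno/' "$dir/a.sh"                # GNU sed: new file renamed into place
after_sed=$(inode "$dir/a.sh")

cp "$dir/b.sh" "$dir/a.sh"                     # cp rewrites the existing file's contents
after_cp=$(inode "$dir/a.sh")

echo "original inode:   $before"
echo "after sed -i:     $after_sed"
echo "after cp b.sh:    $after_cp"
rm -rf "$dir"
```

With an editor such as vim the result can depend on its backup settings, so it is worth checking on your own system rather than assuming.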
Test Five: Copy a new file over the existing file
This was actually the thing I was interested in when I started this investigation. It turns out that once the current command has completed, Bash simply carries on reading from the position it had reached in the file, whatever is now stored there.
For this test, I created another script, b.sh, which is very simple, but makes it easy to see how far through its execution it is:
#!/bin/bash
echo 1
sleep 1
echo 2
sleep 1
echo 3
sleep 1
echo 4
sleep 1
echo 5
sleep 1
echo 6
sleep 1
echo 7
sleep 1
echo 8
sleep 1
echo 9
sleep 1
echo 10
I ran the same a.sh script again, and while the loop was running, I went to another terminal and copied b.sh over it (cp b.sh a.sh). With the cp command (unlike sed in Test Four above), a.sh keeps its inode number; it is the same file, as far as the filesystem is concerned, but the contents are now different.
steve@home:~$ ./a.sh
Starting... PID is 24179
1 ... 2 ... 3 ... 4 ... 5 ... 6 ... 7 ... 8 ... 9 ... 10 ... ./a.sh: line 8: ep: command not found
7
8
9
10
steve@home:~$
The shell threw out an obscure error message, and then started running from some apparently arbitrary point in b.sh. What exactly happened here?
When we convert the linebreaks to spaces for easy observation, we see the two scripts like this. From the 10th character, both scripts read "h echo". By about the 20th character they have diverged: a.sh is part-way through its "Starting..." message, while b.sh reads "sleep 1 echo 2".
          10        20        30        40
01234567890123456789012345678901234567890
#!/bin/bash echo "Starting... PID is $$"   a.sh
#!/bin/bash echo 1 sleep 1 echo 2 sleep 1  b.sh
Moving further into the scripts, we see why the shell complained that there is no command named "ep" with its message "line 8: ep: command not found". The shell has just finished the loop, so it's at the done command, which ends at character 95 into the script:
90        100       110       120       130
012345678901234567890123456789012345678901234567890
1 done echo "Interval..." sleep 2 echo "Sleep again  a.sh
o 6 sleep 1 echo 7 sleep 1 echo 8 sleep 1 echo 9 sl  b.sh
Note that the interpreter at this point thinks that it is on line 8 of a.sh, but actually, it is on the 4th character of line 13 of b.sh (the "sleep 1" that follows "echo 6"). We can see this because the next thing it does is "echo 7", which is line 14 of b.sh. What the shell has actually kept track of is its position in the file: it had read everything up to character 96 (the line break after done), so it continued reading from character 97, and found the invalid ep 1 command, which caused it to complain "ep: command not found."
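The offset arithmetic can be checked without waiting for the scripts to run. The sketch below rebuilds both files in a scratch directory, counts how many bytes the shell has consumed by the time it has read the whole for loop (the first seven lines of a.sh), and prints what b.sh contains from that byte onwards. It assumes the echo -n line is indented by two spaces, which is what the character ruler above implies; the directory and variable names are illustrative. Offsets here count from zero, as the ruler does.

```shell
# Sketch: find where in b.sh the shell resumes after finishing a.sh's loop.
dir=$(mktemp -d)                               # illustrative scratch directory

# a.sh as in the article (two-space indent on the echo -n line is an assumption).
cat > "$dir/a.sh" <<'EOF'
#!/bin/bash
echo "Starting... PID is $$"
for i in `seq 1 10`
do
  echo -n "$i ... "
sleep 1
done
echo "Interval..."
sleep 2
echo "Sleep again..."
sleep 2
echo "Done."
EOF

# b.sh as in the article: echo 1..10 with "sleep 1" between each.
{
  echo '#!/bin/bash'
  for n in 1 2 3 4 5 6 7 8 9; do
    echo "echo $n"
    echo "sleep 1"
  done
  echo "echo 10"
} > "$dir/b.sh"

# Bytes consumed once lines 1-7 (through "done" and its line break) have been read.
offset=$(( $(head -n 7 "$dir/a.sh" | wc -c) ))

# tail -c +N counts from 1, so byte offset N (from zero) is +N+1.
resume=$(tail -c +"$((offset + 1))" "$dir/b.sh" | head -n 1)
next=$(tail -c +"$((offset + 1))" "$dir/b.sh" | sed -n '2p')

echo "bytes consumed from a.sh: $offset"
echo "first thing read from b.sh: $resume"
echo "next line after that: $next"
rm -rf "$dir"
```

Under those assumptions the resumed text is the tail end of a "sleep 1" line, followed by "echo 7", which matches the error message and the 7 8 9 10 output seen above.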
Summary and Conclusion
So the answer is that it depends entirely on how you modify the script in question. In every case, though, the interpreter keeps reading from its current position in the file it has open, until there is nothing left to read.
None of these results may be quite what you expected, so it is useful to understand what is happening, and why it happens differently in these different cases.