Linux Error Handling Using Traps

Unexpected Behaviors

Error Propagation Through Pipes

By default, errors do not propagate through pipes: if the last command in the pipe succeeds but an earlier one fails, that failure is not trapped. Take the simple example script below:

trap 'echo "ERROR"; exit 1' ERR
false | echo "latter most command in pipe"
echo "Completed without a problem"

If this script is run, the following output is seen.

latter most command in pipe
Completed without a problem

Oops! To rectify the problem we must tell bash to propagate the error status through the pipe:

set -o pipefail  ## Propagate error status through pipes
trap 'echo "ERROR"; exit 1' ERR
false | echo "latter most command in pipe"
echo "Completed without a problem"

Now the script behaves as expected and prints the error message, because set -o pipefail is in effect. With this option set, the return value of a pipeline is the status of the last (rightmost) command to exit with a non-zero status, or zero if every command in the pipeline exits successfully.
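The rule is easier to see when the stages of the pipe exit with distinct codes. The following is a minimal sketch rather than part of the scripts above: run_pipeline is a hypothetical helper whose stages exit with 3, 5 and 0, and the script prints the pipeline status with pipefail off and then on.

```shell
#!/usr/bin/env bash
# Sketch: how pipefail changes the exit status of a pipeline.
# run_pipeline is a hypothetical helper; its stages exit with 3, 5 and 0.
run_pipeline() {
    (exit 3) | (exit 5) | true
}

set +o pipefail
status=0; run_pipeline || status=$?
echo "without pipefail: $status"   # status of the last stage only

set -o pipefail
status=0; run_pipeline || status=$?
echo "with pipefail: $status"      # status of the rightmost failing stage
```

The first line reports 0 and the second reports 5: the middle stage wins over the earlier exit 3 because it is further to the right in the pipe.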

ERR Trap Is Not Inherited By Shell Functions

Sigh, this is a gotcha too! Have a look at the following test script.

trap 'echo "ERROR"; exit 1' ERR
function test() {
    false
    echo "Function completed"
}
test
echo "Completed without a problem"

When the script is run, the following output is seen.

Function completed
Completed without a problem

The error generated in the function has not been trapped!!

The Bash man page has the following to say about shell functions:

  1. When executed, the exit status of a function is the exit status of the last command executed in the body.
  2. Functions are executed in the context of the current shell ... [most] aspects of the shell execution environment are identical between a function and its caller with these exceptions: ... the ERR trap is not inherited unless the -o errtrace shell option has been enabled

So, we see both the reason and the solution: it is the combination of the two points. Note that if the function test() were defined as follows, the ERR would be trapped.

function test() {
    echo "Function completed"
    false
}

The reason it is trapped in this case is that the return value of a shell function is the exit status of the last command executed, which here is false. The ERR is trapped at the main script level, at the call to test.

In the original case, however, the last command was echo ..., so the function's exit code would be 0. So, no error could be trapped at the main script level. The trapping of the ERR has to be inherited by the function, which is not what one would expect. To fix this use set -o errtrace:

set -o errtrace  ## Ensure ERR is trapped in functions too
trap 'echo "ERROR"; exit 1' ERR
function test() {
    false
    echo "Function completed"
}
test
echo "Completed without a problem"

Now, when the script is run, the expected output is seen.

ERROR

Note that the ERR handler must itself exit if you do not want the error to be effectively ignored; otherwise the script calls the ERR handler and then carries on.
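To make the difference concrete, here is a small sketch under stated assumptions: run_with_handler is a hypothetical helper that runs the same three-line script in a child bash, using whatever handler text is passed to it. The child process is there only so that the exit ends the demonstration and not the calling script.

```shell
#!/usr/bin/env bash
# run_with_handler is a hypothetical helper: it runs a fixed three-line
# script in a child bash, installing the handler text given as $1.
run_with_handler() {
    bash -c '
        trap "$1" ERR
        false
        echo "still running"
    ' _ "$1"
}

# Handler without exit: the error is reported, then the script continues.
run_with_handler 'echo "ERROR"'

# Handler with exit: the error is reported and the script stops.
status=0
run_with_handler 'echo "ERROR"; exit 1' || status=$?
echo "second run exited with status $status"
```

The first run prints ERROR followed by "still running"; the second prints only ERROR and the child exits with status 1.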

DEBUG & RETURN Traps Are Not Inherited By Shell Functions

In exactly the same way that ERR is not inherited by shell functions, neither are the DEBUG and RETURN traps. To allow these to be inherited, use set -o functrace.
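The effect can be observed with a DEBUG trap that prints each command before it runs. This is only a sketch: debug_trace and work are hypothetical names, and the test script runs in a child bash so that the trap does not leak into the calling shell.

```shell
#!/usr/bin/env bash
# debug_trace is a hypothetical helper. It runs a tiny script in a child bash
# with a DEBUG trap that prints each command before it executes.
# Pass "on" as $1 to enable functrace in the child.
debug_trace() {
    bash -s "$1" <<'EOF'
if [ "$1" = on ]; then set -o functrace; fi
work() { : inside-function; }
trap 'echo "debug: $BASH_COMMAND"' DEBUG
work
EOF
}

echo "functrace off:"; debug_trace off
echo "functrace on:";  debug_trace on
```

The line for the command inside work should appear only in the second run, because without functrace the DEBUG trap is not active inside the function body.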

A Generic Error Handler With Stack Trace

Bash has some magic array variables, FUNCNAME, BASH_LINENO and BASH_SOURCE, as well as a builtin command, caller.

The builtin caller provides the easiest way to create a stack trace.

function testA() {
    caller
    echo "--------"
    i=0; while caller $i; do (( i=i+1 )); done
}

function testB() {
    testA
}

function testC() {
    testB
}

testC

The output of this script is:

8 ./test6.sh
--------
8 testB ./test6.sh
12 testC ./test6.sh
15 main ./test6.sh

It's not a pretty stack trace, but it is a stack trace.

Let's put this into a trap handler:

function err_handler() {
    echo -e "\\n-----------------------------------------"
    echo "An error occurred with status $1"
    echo "Stack trace is:"
    i=0; while caller $i; do (( i=i+1 )); done
}
trap 'err_handler $?' ERR  ## Single quotes: $? is expanded when the trap fires, not when it is set
set -o pipefail
set -o errtrace

false

function a1() {
    false
}

function a2() {
    a1
}

a2

The error handler deliberately does not exit so that we can see how all the errors are processed. The output of the script is as follows, annotated with shell-style comments:

-----------------------------------------
An error occurred with status 1
Stack trace is:
11 main test6.sh    #<: This is the false before the function definitions

-----------------------------------------
An error occurred with status 1
Stack trace is:
14 a1 test6.sh      #<: This is the fail from the function chain main -> a2 -> a1
18 a2 test6.sh
21 main test6.sh

-----------------------------------------
An error occurred with status 1
Stack trace is:
18 a2 test6.sh      #<: This is _interesting_! The error handler in a2 has trapped the
21 main test6.sh    #<: error generated by a1... this could execute because the handler
                    #<: did not exit
-----------------------------------------
An error occurred with status 1
Stack trace is:
21 main test6.sh    #<: This is _interesting_! The error handler in main has trapped the
                    #<: error generated by a2... this could execute because the handler
                    #<: did not exit

Yay! We have an ERR trap handler that prints out a stack trace. Wonderful! We could make the stack trace prettier either by parsing the caller output or by using the builtin Bash variables mentioned above.
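As one possible take on the second option, here is a minimal sketch that builds the trace from FUNCNAME, BASH_SOURCE and BASH_LINENO instead of caller. The names print_stack, inner and outer are hypothetical, and the output format is just one choice among many.

```shell
#!/usr/bin/env bash
# print_stack is a hypothetical helper built on FUNCNAME, BASH_SOURCE
# and BASH_LINENO.
print_stack() {
    local i
    # Index 0 describes print_stack itself, so start at its caller.
    for (( i = 1; i < ${#FUNCNAME[@]}; i++ )); do
        # BASH_LINENO[i-1] is the line in FUNCNAME[i] that made the next call down.
        printf '  at %s (%s:%s)\n' \
            "${FUNCNAME[i]}" "${BASH_SOURCE[i]:-?}" "${BASH_LINENO[i-1]}"
    done
}

# Usage: a small call chain with hypothetical function names.
inner() { print_stack; }
outer() { inner; }
outer
```

Each frame is printed as "at function (file:line)", innermost first. If print_stack were called from an ERR handler instead, the handler's own frame would appear as the first line of the trace.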