Instrumenting All Function Calls

Page Contents

References

Why Instrumenting All Function Calls

I had a complex program test that failed. I wanted to see what lead up to the failure. Its a long execution path and I wanted to be able to see what the execution path was that lead to the failure.

Small Modification of Freedom Embedded's Article

I was Google'ing about for a way to do this and found Freedom Embedded's Article [Ref]. It almost worked straight out of the box for me, which was good.

The problem I was that my program was relocated so the addresses it recorded would not be the same is in the OBJ files because Linux applies a randomised load offset to the program when it loads it.

To get around this I modified his constructor function to print the address of main(), so that the first line of trace.out had an address of a known function. By looking up main inthe objdump of the executable I could then figure out the load offset and subtract this from all the addresses in trace.out.

I also borrowed from A. Melinte's article and book [Ref][Ref] and added as output the thread ID.

The resulting C is this:

#if defined(__GNUC__) || defined(__GNUG__)
extern "C" {
    // See https://balau82.wordpress.com/2010/10/06/trace-and-profile-function-calls-with-gcc/
    // This is a derivative of the above article.
    #include <stdio.h>
    #include <pthread.h>

    static FILE *fp_trace;

    void
    __attribute__((constructor))
    trace_begin(void)
    {
        fp_trace = fopen("trace.out", "w");
        if (fp_trace != NULL)
        {
            extern int main(int, char **);
            fprintf(fp_trace, "ADDRESS_OF_MAIN %p\n", main);
        }
    }

    void
    __attribute__((destructor))
    trace_end(void)
    {
        if (fp_trace != NULL) {
            fclose(fp_trace);
        }
    }

    void
    __cyg_profile_func_enter(void *func, void *caller)
    {
        if (fp_trace != NULL) {
            const pthread_t self = pthread_self();
            fprintf(fp_trace, "e %p %p %p\n", self, func, caller);
        }
    }

    void
    __cyg_profile_func_exit(void *func, void *caller)
    {
        if (fp_trace != NULL) {
            const pthread_t self = pthread_self();
            fprintf(fp_trace, "x %p %p %p\n", self, func, caller);
        }
    }
}
#endif

The shell script became this:

#/bin/bash

if test ! -f "$1"; then
 echo "Error: executable $1 does not exist."
 exit 1
fi

if test ! -f "$2"; then
 echo "Error: trace log $2 does not exist."
 exit 1
fi

EXECUTABLE="$1"
TRACELOG="$2"

PMA_MAIN=$(objdump -d "$EXECUTABLE" | grep "<main>:" | cut -d" " -f1 | sed -e 's/^0*\([1-9][0-9]*\)/\1/g')
VMA_MAIN=$(head -1 "$TRACELOG" | cut -d" " -f2)
VMA_MAIN=${VMA_MAIN:2:-1}
VMA_OFFSET=$((16#$VMA_MAIN - 16#$PMA_MAIN))

INDENT=2
COUNT=0

while read LINETYPE TID FADDR CADDR; do
    NEWFADDR="$((16#${FADDR:2:-1} - $VMA_OFFSET))"
    FADDR="$(printf "0x%x", "$NEWFADDR")"

    NEWCADDR="$((16#${CADDR:2:-1} - $VMA_OFFSET))"
    CADDR="$(printf "0x%x", "$NEWCADDR")"

    FNAME="$(addr2line -f -e ${EXECUTABLE} ${FADDR}|head -1)"
    if [ "$FNAME" == "??" ]; then FNAME=$FADDR; fi

    if test "${LINETYPE}" = "e"
    then
        CNAME="$(addr2line -f -e ${EXECUTABLE} ${CADDR} | head -1)"
        CLINE="$(addr2line -s -e ${EXECUTABLE} ${CADDR})"
        SPACES="$(printf "%0.s " $(seq 1 $COUNT))"
        echo "${TID}: ${SPACES}Enter ${FNAME}, called from ${CNAME} (${CLINE})"
        COUNT=$((COUNT + INDENT))
    fi
    if test "${LINETYPE}" = "x"
    then
        COUNT=$((COUNT - INDENT))
        SPACES="$(printf "%0.s " $(seq 1 $COUNT))"
        echo "${TID}: ${SPACES}Exit  ${FNAME}"
    fi
done < "${TRACELOG}"

Remembering to compile with options -O0 -g -finstrument-functions and it all worked like a charm. Thank you Balua and Melinte!

WARNING: This will produce a tonne of debug into. The file trace.out will potentially be many many GIGABYTES of information.

NOTE: For anything but really small examples the shell script will be REALLY SLOW.

Parse Function Traces Efficiently

As noted Balua's excellent example will run slowly. I liked his idea of just dumping addresses to a trace file as it doesn't require the program-to-be-traced to link against anything new really... the code just slots in. I also like Melinte's use of the thread ID, but didn't want to have all that extra code in by GCC trace functions and also didn't manage to find all his functions.

So... I decided to to modify Balua's example in two ways. Firstly, get it to output a binary trace format, to try, in vain it seems, to reduce the trace.out file size. Secondly, to remove the bash script which has to spawn a new addre2line process for each line, to something that looks up the whole trace in one processes, like addr2line, but for the trace output file.

The result was a fairly length investigation into how addr2line works, the BFD library and the result is the this GitHub repo. Enjoy :)