Instrumenting All Function Calls
References
- Trace and profile function calls with GCC, Freedom Embedded.
- Monitoring Function Calls, By Aurelian Melinte.
- The addr2line source code, GitHub.
- Binary File Descriptor Library, Wikipedia.
- LIB BFD, the Binary File Descriptor Library.
- Linux Applications Debugging Techniques, Aurelian Melinte, see page 20, chapter "The Interposition Library".
- Trace and profile function calls fast, JehTech code.
Why Instrumenting All Function Calls
I had a complex program test that failed. It's a long execution path, and I wanted to be able to see exactly which calls led up to the failure.
Small Modification of Freedom Embedded's Article
I was Googling for a way to do this and found Freedom Embedded's article [Ref]. It almost worked straight out of the box for me, which was good.
The problem was that my program was relocated, so the addresses it recorded were not the same as in the OBJ files, because Linux applies a randomised load offset to the program when it loads it.
To get around this I modified his constructor function to print the address of main(),
so that the first line of trace.out had an address of a known function. By looking
up main in the objdump of the executable I could then figure out the load offset and
subtract this from all the addresses in trace.out.
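The arithmetic can be sketched in C as follows. This is only an illustration: the helper names load_offset and to_link_address are made up for this example, and on this page the subtraction is actually done by the shell script rather than by C code.

```c
#include <assert.h>
#include <stdint.h>

/* Load offset: where main() ended up at run time minus the address
 * that objdump reports for main in the executable.
 * (Hypothetical helper, named for this sketch only.) */
uint64_t load_offset(uint64_t runtime_main, uint64_t link_main)
{
    /* A relocated program is loaded at or above its link-time address. */
    assert(runtime_main >= link_main);
    return runtime_main - link_main;
}

/* Map an address recorded in trace.out back to the address space
 * used by objdump and addr2line. */
uint64_t to_link_address(uint64_t traced_addr, uint64_t offset)
{
    return traced_addr - offset;
}
```

For example, if trace.out reports main at 0x555555555149 and objdump shows main at 0x1149, the offset is 0x555555554000, and a traced address of 0x55555555513d maps back to 0x113d.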
I also borrowed from A. Melinte's article and book [Ref][Ref] and added the thread ID to the output.
The resulting C is this:
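#if defined(__GNUC__) || defined(__GNUG__)
// See https://balau82.wordpress.com/2010/10/06/trace-and-profile-function-calls-with-gcc/
// This is a derivative of the above article.
#include <stdio.h>
#include <pthread.h>

// Only wrap in extern "C" when built as C++, so the file also compiles as plain C.
#ifdef __cplusplus
extern "C" {
#endif

// Keep the hooks themselves from being instrumented, otherwise they would
// recurse if this file is also built with -finstrument-functions.
#define NO_INSTRUMENT __attribute__((no_instrument_function))

static FILE *fp_trace;

void NO_INSTRUMENT __attribute__((constructor))
trace_begin(void)
{
    fp_trace = fopen("trace.out", "w");
    if (fp_trace != NULL)
    {
        // The first line records where main() was loaded, so the load
        // offset can be worked out afterwards.
        extern int main(int, char **);
        fprintf(fp_trace, "ADDRESS_OF_MAIN %p\n", (void *)main);
    }
}

void NO_INSTRUMENT __attribute__((destructor))
trace_end(void)
{
    if (fp_trace != NULL) {
        fclose(fp_trace);
    }
}

// Called by GCC-instrumented code on entry to every function.
void NO_INSTRUMENT
__cyg_profile_func_enter(void *func, void *caller)
{
    if (fp_trace != NULL) {
        // pthread_t is an integer type on Linux; cast it so it matches %p.
        const pthread_t self = pthread_self();
        fprintf(fp_trace, "e %p %p %p\n", (void *)self, func, caller);
    }
}

// Called by GCC-instrumented code on exit from every function.
void NO_INSTRUMENT
__cyg_profile_func_exit(void *func, void *caller)
{
    if (fp_trace != NULL) {
        const pthread_t self = pthread_self();
        fprintf(fp_trace, "x %p %p %p\n", (void *)self, func, caller);
    }
}

#ifdef __cplusplus
}
#endif
#endif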
The shell script became this:
#!/bin/bash
if test ! -f "$1"; then
    echo "Error: executable $1 does not exist."
    exit 1
fi
if test ! -f "$2"; then
    echo "Error: trace log $2 does not exist."
    exit 1
fi
EXECUTABLE="$1"
TRACELOG="$2"
# Link-time address of main, taken from the disassembly (hex, no 0x prefix).
PMA_MAIN=$(objdump -d "$EXECUTABLE" | grep "<main>:" | cut -d" " -f1 | sed -e 's/^0*\([1-9][0-9]*\)/\1/g')
# Run-time address of main, taken from the first line of the trace log.
VMA_MAIN=$(head -1 "$TRACELOG" | cut -d" " -f2)
VMA_MAIN=${VMA_MAIN:2}    # strip the leading "0x"
VMA_OFFSET=$((16#$VMA_MAIN - 16#$PMA_MAIN))
INDENT=2
COUNT=0
while read -r LINETYPE TID FADDR CADDR; do
    # Skip the ADDRESS_OF_MAIN header and anything that is not a trace record.
    if [ "$LINETYPE" != "e" ] && [ "$LINETYPE" != "x" ]; then continue; fi
    # Convert the run-time addresses back to link-time addresses.
    NEWFADDR="$((16#${FADDR:2} - VMA_OFFSET))"
    FADDR="$(printf "0x%x" "$NEWFADDR")"
    NEWCADDR="$((16#${CADDR:2} - VMA_OFFSET))"
    CADDR="$(printf "0x%x" "$NEWCADDR")"
    FNAME="$(addr2line -f -e "${EXECUTABLE}" "${FADDR}" | head -1)"
    if [ "$FNAME" == "??" ]; then FNAME=$FADDR; fi
    if test "${LINETYPE}" = "e"
    then
        CNAME="$(addr2line -f -e "${EXECUTABLE}" "${CADDR}" | head -1)"
        CLINE="$(addr2line -s -e "${EXECUTABLE}" "${CADDR}")"
        SPACES="$(printf "%0.s " $(seq 1 $COUNT))"
        echo "${TID}: ${SPACES}Enter ${FNAME}, called from ${CNAME} (${CLINE})"
        COUNT=$((COUNT + INDENT))
    fi
    if test "${LINETYPE}" = "x"
    then
        COUNT=$((COUNT - INDENT))
        SPACES="$(printf "%0.s " $(seq 1 $COUNT))"
        echo "${TID}: ${SPACES}Exit ${FNAME}"
    fi
done < "${TRACELOG}"
Remember to compile with the options -O0 -g -finstrument-functions. With that, it all
worked like a charm. Thank you Balau and Melinte!
WARNING: This will produce a tonne of debug info. The file trace.out will
potentially be many, many GIGABYTES of information.
NOTE: For anything but really small examples the shell script will be REALLY SLOW.
Parse Function Traces Efficiently
As noted, Balau's excellent example will run slowly. I liked his idea of just dumping addresses to a trace file, as it doesn't really require the program-to-be-traced to link against anything new... the code just slots in. I also liked Melinte's use of the thread ID, but didn't want to have all that extra code in my GCC trace functions, and also didn't manage to find all his functions.
So... I decided to modify Balau's example in two ways. Firstly, get it to output a
binary trace format to try, in vain it seems, to reduce the trace.out
file size. Secondly, replace the bash script, which has to spawn a new addr2line
process for each line, with something that looks up the whole trace in one process, like
addr2line, but for the trace output file.
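As a rough illustration of the first change, a fixed-size binary record could look like the sketch below. The struct layout and the names trace_record, trace_write and trace_read are made up for this example and are not taken from the repo, whose actual format may well differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical on-disk record: four 64-bit fields, so there is no padding. */
typedef struct {
    uint64_t type;    /* 'e' = function entry, 'x' = function exit */
    uint64_t tid;     /* thread ID */
    uint64_t func;    /* address of the instrumented function */
    uint64_t caller;  /* address of the call site */
} trace_record;

/* Every record occupies exactly 32 bytes in the file. */
static_assert(sizeof(trace_record) == 32, "unexpected padding in trace_record");

/* Append one record to the trace file. Returns 0 on success, -1 on error. */
int trace_write(FILE *fp, const trace_record *rec)
{
    return fwrite(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}

/* Read the next record. Returns 0 on success, -1 at end of file or on error. */
int trace_read(FILE *fp, trace_record *rec)
{
    return fread(rec, sizeof *rec, 1, fp) == 1 ? 0 : -1;
}
```

With this layout each record is 32 bytes, whereas a text line holding three 64-bit pointers printed with %p is typically somewhere around 45 to 50 bytes, which may be part of why the saving turned out to be modest. Note that a raw struct dump like this is only readable on a machine with the same endianness.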
The result was a fairly lengthy investigation into how addr2line and the
BFD library work, and the outcome is this GitHub repo. Enjoy :)