Analyze Program Performance using gprof

When optimizing software, it's crucial to know where the program actually spends its time. If you're working with C or C++ programs compiled with gcc or g++, a useful profiling tool is gprof, which is part of GNU Binutils. gprof reports how much time each function takes and how often it is called, helping you identify time-consuming routines and bottlenecks. This tutorial explains how to analyze program performance using gprof.

Suppose we have the following C or C++ code saved in a file named main.c or main.cpp:

int func1() {
    int result = 0;
    for (int i = 0; i < 2000000000; ++i) {
        ++result;
    }

    return result;
}

int func2() {
    int result = 0;
    for (int i = 0; i < 1000000000; ++i) {
        ++result;
    }

    return result;
}

int main() {
    func1();
    func2();

    return 0;
}

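To enable profiling, compile and link the program with the -pg option. This adds instrumentation that records which functions are called and how often, and enables periodic sampling so that gprof can estimate the time spent in each function. Use the first command for C and the second for C++: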

gcc -pg main.c -o my_program
g++ -pg main.cpp -o my_program

This step produces an instrumented executable called my_program.
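One caveat applies if you later add optimization flags such as -O2: the compiler is then free to replace simple counting loops like those in func1 and func2 with a constant, leaving nothing to measure. The following is a minimal sketch of one way around that, using a volatile counter. The helper busy_loop is hypothetical and not part of the original program.

```c
/* Hypothetical helper, not part of the original program: counts up to n.
   The volatile qualifier forces every increment to really happen, so an
   optimizing compiler cannot replace the loop with a constant. */
long busy_loop(long n) {
    volatile long result = 0;
    for (long i = 0; i < n; ++i) {
        result = result + 1;
    }
    return result;
}
```

Under this assumption, func1 could simply return busy_loop(2000000000L) and func2 could return busy_loop(1000000000L), keeping the 2:1 ratio of work between them.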

Run the program as usual. When it exits normally, it writes a file named gmon.out to the current working directory:

./my_program

The gmon.out file contains the raw profiling data collected during the program's execution. Each run overwrites any existing gmon.out.
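Note that gmon.out is only written when the program exits normally, that is, by returning from main or calling exit. A program stopped with Ctrl+C is killed by a signal and leaves no profile behind. For long-running programs, one possible approach is to turn the interrupt into a normal return, as in the sketch below. The names stop_requested, on_sigint, and run_until_stopped are hypothetical and chosen only for illustration.

```c
#include <signal.h>

/* Set by the signal handler; checked by the work loop. */
static volatile sig_atomic_t stop_requested = 0;

/* Handler only records the request, which is safe inside a signal handler. */
static void on_sigint(int signo) {
    (void)signo;
    stop_requested = 1;
}

/* Hypothetical work loop: runs until max_iterations is reached or Ctrl+C
   is pressed, then returns the number of iterations completed. */
long run_until_stopped(long max_iterations) {
    signal(SIGINT, on_sigint);
    long done = 0;
    while (!stop_requested && done < max_iterations) {
        ++done;
    }
    return done;
}
```

If main calls run_until_stopped and then returns, the program exits normally after Ctrl+C, so the profiling data should still be written.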

Now use gprof to process gmon.out and produce a readable report:

gprof my_program gmon.out > report.txt

You can open report.txt in any text editor to analyze the results.

Here's an excerpt from a sample gprof report:

  • Flat profile
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 65.52      0.38     0.38        1   380.00   380.00  func1
 34.48      0.58     0.20        1   200.00   200.00  func2

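This section shows how much time was spent in each function and how many times each was called. The self seconds column is the time spent in the function itself, and cumulative seconds is the running total down the list. Here func1 accounts for about two thirds of the run time, which matches its loop running twice as many iterations as the loop in func2.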
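The derived columns of the flat profile can be reproduced from the self seconds and calls columns. The sketch below shows the arithmetic. The function names percent_time and self_ms_per_call are hypothetical and are not part of gprof.

```c
/* Share of the total run time spent in one function, as a percentage
   (the "% time" column of the flat profile). */
double percent_time(double self_seconds, double total_seconds) {
    return 100.0 * self_seconds / total_seconds;
}

/* Average self time per call in milliseconds (the "self ms/call" column). */
double self_ms_per_call(double self_seconds, long calls) {
    return 1000.0 * self_seconds / (double)calls;
}
```

With the sample values, percent_time(0.38, 0.58) gives roughly 65.52 for func1, and self_ms_per_call(0.38, 1) gives roughly 380, in line with the report.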

  • Call graph
index % time    self  children    called     name
                                                 <spontaneous>
[1]    100.0    0.00    0.58                 main [1]
                0.38    0.00       1/1           func1 [2]
                0.20    0.00       1/1           func2 [3]
-----------------------------------------------
                0.38    0.00       1/1           main [1]
[2]     65.5    0.38    0.00       1         func1 [2]
-----------------------------------------------
                0.20    0.00       1/1           main [1]
[3]     34.5    0.20    0.00       1         func2 [3]
-----------------------------------------------

This section shows calling relationships between functions and how time is distributed across them, making it easier to see which paths consume the most time.
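Unlike the flat profile, the % time column in the call graph counts the time spent in a function together with the time spent in the functions it calls. That is why main shows 100.0 even though its own self time is 0.00. A minimal sketch of that calculation follows. The struct graph_entry and the function inclusive_percent are hypothetical names used only for illustration.

```c
/* One primary line of the call graph: time spent in the function itself
   and time spent in the functions it calls, both in seconds. */
struct graph_entry {
    double self;
    double children;
};

/* "% time" in the call graph: self plus children, relative to the total. */
double inclusive_percent(struct graph_entry e, double total_seconds) {
    return 100.0 * (e.self + e.children) / total_seconds;
}
```

For the sample report, an entry of {0.00, 0.58} for main yields about 100, and {0.38, 0.00} for func1 yields about 65.5.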
