Profiling Code Using clock_gettime

After raising the issue of the low-resolution problem of the timer provided by clock() in Resolution Problems in clock(), I ended the post by mentioning two more functions that should provide high-resolution timers suitable for profiling code. In this post, I will discuss one of them, clock_gettime().

The clock_gettime() function provides access to several useful timers with a resolution of nanoseconds. First, the prototype for the function is as follows:

int clock_gettime(clockid_t clk_id, struct timespec *tp);

The clk_id allows us to select a specific clock from the several
offered by the system, which include:

CLOCK_REALTIME, a system-wide realtime clock.
CLOCK_PROCESS_CPUTIME_ID, a high-resolution timer provided by the CPU for each process.
CLOCK_THREAD_CPUTIME_ID, a high-resolution timer provided by the CPU for each thread.

Usually, there are more clocks provided, but I find these three the most useful, as they allow us to get the execution time spent at the system level, process level, and thread level.

The current clock time for the chosen clock is stored in the struct provided by the *tp pointer. The timespec struct is defined as follows:

struct timespec {
    time_t tv_sec; /* seconds */
    long tv_nsec; /* nanoseconds */
};

To time how long some function took to process, one should call clock_gettime() twice: once before the function call and once right after it, and subtract the returned timings to get the actual runtime.

Getting the difference between two timespec structs isn’t very complicated and can be achieved using the function diff() defined below:

timespec diff(timespec start, timespec end)
{
    timespec temp;
    if ((end.tv_nsec-start.tv_nsec)<0) {
        temp.tv_sec = end.tv_sec-start.tv_sec-1;
        temp.tv_nsec = 1000000000+end.tv_nsec-start.tv_nsec;
    } else {
        temp.tv_sec = end.tv_sec-start.tv_sec;
        temp.tv_nsec = end.tv_nsec-start.tv_nsec;
    }
    return temp;
}

Now let’s move to a real example:

#include <iostream>
#include <time.h>
using namespace std;

timespec diff(timespec start, timespec end);

int main()
{
    timespec time1, time2;
    int temp;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time1);
    for (int i = 0; i< 242000000; i++)
        temp+=temp;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time2);
    cout<<diff(time1,time2).tv_sec<<":"<<diff(time1,time2).tv_nsec<<endl;
    return 0;
}

timespec diff(timespec start, timespec end)
{
    timespec temp;
    if ((end.tv_nsec-start.tv_nsec)<0) {
        temp.tv_sec = end.tv_sec-start.tv_sec-1;
        temp.tv_nsec = 1000000000+end.tv_nsec-start.tv_nsec;
    } else {
        temp.tv_sec = end.tv_sec-start.tv_sec;
        temp.tv_nsec = end.tv_nsec-start.tv_nsec;
    }
    return temp;
}

To use clock_gettime, you need to include time.h and link to librt.a. If you use gcc, just make sure you add -lrt to your list of arguments.

Play a bit with the length of the for loop. As you can see, clock_gettime provides much more accurate results and can register very short processing times too. Just remember that, as is the case with any profiling function, this function adds a little overhead to your program, so make sure you disable the profiling code, using preprocessor commands for example, in the production release.

26/9/2007 – Update

You may want to take a look at The Revised String Iteration Benchmark post for another, larger, example of using clock_gettime to time code performance.

60 thoughts on “Profiling Code Using clock_gettime”

Returning a local variable in diff function? Isn’t that a call for disaster?

Actually not, returning a struct is part of the C specification.

Hi Guy,

Can you explain more detail as following statement ?

temp.tv_sec = end.tv_sec-start.tv_sec-1;
temp.tv_nsec = 1000000000+end.tv_nsec-start.tv_nsec;

Thanks

It preforms the carry operation of the subtraction.

Pingback: Como obter o tempo do sistema operacional em nanosegundos ou milisegundos utilizando o Qt Creator? | DL-UAT

cout<<diff(time1,time2).tv_sec<<“:”<<diff(time1,time2).tv_nsec<<endl;
may cause problems if the number of nanoseconds is not represented by exactly 9 digits.
%09ld formatting may solve things if the number of nanoseconds is represented by less than 9 digits,
but not in case of more than 9 digits.

Better be safe and use something like
printf(“Elapsed time equals %.4f seconds”, diff(time1,time2).tv_sec + diff(time1,time2).tv_nsec/1e9);

Hey Guy,

Part of the code from your example:

int temp; for (int i = 0; i< 242000000; i++) temp+=temp;

has undefined behaviour of using uninitialized variable and signed integer overflow. Plus it can be easily optimized out as a dead code.

Thank you for the post, though!

Pingback: Beating the optimizer | Spin Foam

diff function must be

struct timespec Lib_Utils_diff_time (struct timespec start, struct timespec end)
{
// stuff
}

Pingback: Como obter o tempo do sistema operacional em nanosegundos ou milisegundos utilizando o Qt Creator? - c c++ qt - Perguntas e Respostas

Profiling Code Using `clock_gettime`

26/9/2007 – Update

60 thoughts on “Profiling Code Using `clock_gettime`”

Leave a Reply

26/9/2007 – Update

Share this:

60 thoughts on “Profiling Code Using clock_gettime”

Leave a Reply

60 thoughts on “Profiling Code Using `clock_gettime`”