A few days ago I decided to check which method of iterating over strings in C++ is really the fastest. As the string class I chose std::string from the STL, as it is very popular and provides several ways to iterate over it. So how can one iterate over an std::string?
- By using indexes, e.g. str[i], running i from zero to the length of the string.
- By using the at method. string::at(size_t pos) provides an interface similar to indexes, except that it checks whether the given position is past the end of the string and, if so, throws an exception. One may see it as the safe version of the regular index.
- By treating the string as a sequence of characters and iterating over it using iterators.
- By using string::c_str() to get a pointer to a regular C string representation of the string stored in the std::string and treating it as an array, e.g. using indexes to go over it.
- By getting a pointer to a C string representation using string::c_str() and advancing the pointer itself to iterate over the string.
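The five approaches listed above can be sketched roughly as follows. This is only an illustration, not the actual benchmark code: the function names, the Counts type and the bump helper are hypothetical, and the sketch assumes a C++11 compiler (for std::array) and input made up only of the letters ‘a’ to ‘z’.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Letter histogram: counts[0] is for 'a', counts[25] is for 'z'.
using Counts = std::array<std::size_t, 26>;

// Hypothetical helper: record one character, assumed to be in 'a'..'z'.
inline void bump(Counts& counts, char c) {
    assert(c >= 'a' && c <= 'z');
    ++counts[static_cast<std::size_t>(c - 'a')];
}

// 1. Plain indexes: no range check.
Counts count_by_index(const std::string& str) {
    Counts counts{};
    for (std::string::size_type i = 0; i < str.length(); ++i)
        bump(counts, str[i]);
    return counts;
}

// 2. at(): like indexes, but every access is range-checked.
Counts count_by_at(const std::string& str) {
    Counts counts{};
    for (std::string::size_type i = 0; i < str.length(); ++i)
        bump(counts, str.at(i));
    return counts;
}

// 3. Iterators: the generic STL way of walking a sequence.
Counts count_by_iterator(const std::string& str) {
    Counts counts{};
    for (std::string::const_iterator it = str.begin(); it != str.end(); ++it)
        bump(counts, *it);
    return counts;
}

// 4. c_str() treated as an array and indexed.
Counts count_by_cstr_index(const std::string& str) {
    Counts counts{};
    const char* s = str.c_str();
    const std::string::size_type n = str.length();
    for (std::string::size_type i = 0; i < n; ++i)
        bump(counts, s[i]);
    return counts;
}

// 5. c_str() with the pointer itself advanced until the terminating '\0'.
Counts count_by_cstr_pointer(const std::string& str) {
    Counts counts{};
    for (const char* p = str.c_str(); *p != '\0'; ++p)
        bump(counts, *p);
    return counts;
}
```

Each function returns the same histogram for the same input, so calling, say, count_by_index("abcabca") and count_by_cstr_pointer("abcabca") should give equal results. Note that the pointer-advancing version stops at the first '\0', which is fine here because the input contains only letters.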
The third method is the native way of iterating over objects in the STL, and like the last two it can’t be used if the iteration changes the string itself (e.g. by inserting or deleting characters). The first and second methods are similar to the fourth (treating the pointer to the C string as an array), except that they are not as problematic as the latter when the string changes. The second method is the safest, as it is the only one that does range checks and throws an exception when trying to access positions outside the string.
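To make the range-checking point concrete, here is a small sketch of how at() behaves at the boundary. The helper names at_rejects and demo_range_check are made up for this example; the exception type, std::out_of_range, is what the standard library specifies for string::at.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if str.at(pos) throws std::out_of_range.
bool at_rejects(const std::string& str, std::size_t pos) {
    try {
        (void)str.at(pos);  // range-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Usage: the last valid position is accepted, one past it is rejected.
void demo_range_check() {
    const std::string str = "abc";
    assert(!at_rejects(str, 2));  // 'c' is at position 2
    assert(at_rejects(str, 3));   // position 3 is past the end
}
```

The plain index str[i] performs no such check, which is exactly the extra work one would expect at() to pay for on every access.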
To find out which method is the fastest, I created a huge string of random characters ranging from ‘a’ to ‘z’ and five executables, each implementing one of the above iteration methods to do a simple task (counting the number of occurrences of each letter). The string is fifty million characters long, since the longer the string, the less important the overhead becomes.
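How the test string was generated isn't described here, so the following is just one possible way to build such an input. The function name make_random_letters, the use of std::mt19937 and the fixed seed are all assumptions of this sketch (C++11 or later), not details of the original benchmark.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: build a string of `length` random letters in 'a'..'z'.
// A fixed seed makes the same input reproducible across runs.
std::string make_random_letters(std::size_t length, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> letter(0, 25);  // inclusive bounds

    std::string str;
    str.reserve(length);  // avoid repeated reallocation for huge strings
    for (std::size_t i = 0; i < length; ++i)
        str.push_back(static_cast<char>('a' + letter(gen)));

    assert(str.length() == length);
    return str;
}
```

A call such as make_random_letters(50000000, 42) would produce an input of the size used in the benchmark; feeding the same string to every executable keeps the comparison between the methods fair.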
The benchmark executables for every version were compiled with the default settings of g++ (without optimization, as the compiler might change the iteration methods when optimizing). They were timed by using the time command and redirecting the executables' output to /dev/null. The tests were run both on 64-bit Gentoo (with 1 GB RAM) and on 32-bit Kubuntu (with 512 MB RAM), to make sure the overall results (which method is better, not the runtime itself) aren’t system dependent.