Black Magic Code

Saturday, May 31, 2008

About outperforming STL

I can't really say that I'm chocked by the conclusion in the article by Sergey Solyanik called STL vector performance. But it is a fine piece on what is wrong with microbenchmarking. It is also a case study of people who critique C++/STL should actually learn it before they critique it.

I've spent a great deal of time in learning C++ and STL. It is complex language and library where there a number of finer points to remember. There are a number of other things that I cringe on but I'll limit myself to "microbenchmarking" and actually learning the library/language.

You can make any language/library look bad with a microbenchmark. Because it limits itself to a couple of features and a particular way of solving a problem. The biggest problem in the STL vector performance post is that he is comparing a narrow problem of filling an array with values and filling the same values into a much more complex and capable vector template. All that complexity is adding abstractions that cost a fair bit of cycles that in the case of integers can be outperformed with realloc or if you have known size preallocating. The post revolves around a very unclear problem. What exactly is vector bad at? This is where the language and library knowledge comes in.

As a C++ programmer I am in the habit of using vectors and arrays the same way. Because the STL library implementation gives a nice little detail with vectors. The elements of a vector are stored in a continuous area of memory exactly like an array. You can access the internal array of a vector through the [] and it will have same speed as a regular array access. But to keep this continuous array of objects this way it needs to reallocate the memory when you need more space for your objects. i.e. insertions of objects at the end of vector is not constant. This the complexity of push_back, check if you need more space before you insert and actually doing an insert. For a vector of int:s it could be quite large compared to a array of int:s reallocated with realloc(). If I'm ready to sacrifice access speed to the elements I could use something like deque from STL instead. Insertion in the front and back is guaranteed to be "constant", because it does not provide the guarantee of continuous storage. Internally it looks quite different, read all about it in vector vs. deque.

What if you need fast access by index? You should use a STL vector and fine tune your growth with reserve to increase capacity. Adding to the vector with push_back might still be costly depending on what you are pushing. Because when you reserve you don't actually add anything to the vector you just make sure it has enough memory(a.k.a. capacity) when you add more objects onto it. Read all about it in Uses and Abuses of Vector.