Posts Tagged ‘performance’

Eric Niebler: out parameters, move semantics, and stateful algorithms

October 16, 2013 Leave a comment

Eric Niebler: out parameters, move semantics, and stateful algorithms

This post from Eric, inspired by a prompt from Andrei Alexcandrescu, examines the interface of std::getline appraising its usability and performance. Eric looks at a few different designs, in particular those that leverage the move semantics introduced in C++11.

This is a particularly good exploration of a small and simple API – almost too simple to bother with – and Eric’s range-based solution is far more natural and still efficient.  The crux of his design approach stems from the insightful realisation that “getline is a curious example because what looks at first blush like a pure out parameter is, in fact, an in/out parameter; on the way in, getline uses the passed-in buffer’s capacity to make it more efficient. This puts getline into a large class of algorithms that work better when they have a chance to cache or precompute something.

The discussion in the comments below the post also merits reading, particularly Bartosz Milewski’s suggestion that front be thought of as take 1 (take n xs is a function in Haskell that returns a list of the first n elements of the list xs.

Why processing a sorted array might be faster than unsorted

I’ve just read this week’s stackoverflow newsletter, and was intrigued by a recent top question “Why is processing a sorted array faster than an unsorted array?”

Anyone writing performance-critical processing code should be aware of this; the C++ comparison showed that processing the sorted array was nearly 6x faster than with the unsorted, which is certainly an eyebrow-raising result. A straight Java equivalent showed a similar, albeit less extreme, difference.

The reason for this potentially surprising behaviour is that in the code with the unsorted array, the CPU’s branch predictor fails 50% of the time and stalls the pipeline. I’ll hand you over to GManNickG to show you the code and Mysticial to explain, rather well, what is going on: Why is processing a sorted array faster than an unsorted array?

This is something I was aware of but have never taken into account, so I’ll be trying to remember Mysticial’s general rule of thumb to “avoid data-dependent branching in critical loops“.


Get every new post delivered to your Inbox.