I’ve just read this week’s stackoverflow newsletter, and was intrigued by a recent top question “Why is processing a sorted array faster than an unsorted array?”
Anyone writing performance-critical processing code should be aware of this; the C++ comparison showed that processing the sorted array was nearly 6x faster than with the unsorted, which is certainly an eyebrow-raising result. A straight Java equivalent showed a similar, albeit less extreme, difference.
The reason for this potentially surprising behaviour is that in the code with the unsorted array, the CPU’s branch predictor fails 50% of the time and stalls the pipeline. I’ll hand you over to GManNickG to show you the code and Mysticial to explain, rather well, what is going on: Why is processing a sorted array faster than an unsorted array?
This is something I was aware of but have never taken into account, so I’ll be trying to remember Mysticial’s general rule of thumb to “avoid data-dependent branching in critical loops“.