In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.
Indeed, the array is partitioned in a contiguous zone with data < 128
and another with data >= 128
. So you should find the partition point with a dichotomic searchdichotomic search (using Lg(arraySize) = 15
comparisons), then do a straight accumulation from that point.
Something like (unchecked)
int i= 0, j, k= arraySize;
while (i < k)
{
j= (i + k) >> 1;
if (data[j] >= 128)
k= j;
else
i= j;
}
sum= 0;
for (; i < arraySize; i++)
sum+= data[i];
or, slightly more obfuscated
int i, k, j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
sum+= data[i];
A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536;
(assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)