In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichotomic searchdichotomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, k, j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichotomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, k, j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichotomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, k, j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

Mod Removes Wiki by Shog9
Post Made Community Wiki by George Stocker
added 1 character in body
Source Link
Yves Daoust
  • 52.9k
  • 8
  • 41
  • 94

In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichtomicdichotomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, k, j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichtomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, k, j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichotomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, k, j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

added 14 characters in body
Source Link
Yves Daoust
  • 52.9k
  • 8
  • 41
  • 94

In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichtomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, jk, k;j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichtomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, j, k;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

In the sorted case, you can do better than relying on successful branch prediction or any branchless comparison trick: completely remove the branch.

Indeed, the array is partitioned in a contiguous zone with data < 128 and another with data >= 128. So you should find the partition point with a dichtomic search (using Lg(arraySize) = 15 comparisons), then do a straight accumulation from that point.

Something like (unchecked)

int i= 0, j, k= arraySize;
while (i < k)
{
  j= (i + k) >> 1;
  if (data[j] >= 128)
    k= j;
  else
    i= j;
}
sum= 0;
for (; i < arraySize; i++)
  sum+= data[i];

or, slightly more obfuscated

int i, k, j= (i + k) >> 1;
for (i= 0, k= arraySize; i < k; (data[j] >= 128 ? k : i)= j)
  j= (i + k) >> 1;
for (sum= 0; i < arraySize; i++)
  sum+= data[i];

A yet faster approach, that gives an approximate solution for both sorted or unsorted is: sum= 3137536; (assuming a truly uniform distribution, 16384 samples with expected value 191.5) :-)

deleted 1 characters in body
Source Link
Lance Roberts
  • 21.7k
  • 30
  • 108
  • 129
Loading
added 206 characters in body
Source Link
Yves Daoust
  • 52.9k
  • 8
  • 41
  • 94
Loading
added 198 characters in body
Source Link
Yves Daoust
  • 52.9k
  • 8
  • 41
  • 94
Loading
Source Link
Yves Daoust
  • 52.9k
  • 8
  • 41
  • 94
Loading