9 events
when toggle format what by license comment
Apr 21, 2022 at 5:31 comment added Peter Cordes Note that benchmarking / profiling un-optimized ("debug mode") code is normally a bad idea. With optimization, that version without branch misses would be faster by a much larger margin, not stalling on store/reload latency for local variables. The actual branch mispredict rate for the critical branch should be about the same, though (assuming there is one: modern compilers can vectorize this or otherwise make branchless asm). Loop unrolling could change the overall miss rate by running a predictable loop branch less often.
Jan 29, 2016 at 1:24 history wiki removed Shog9
Jan 28, 2016 at 20:56 history made wiki Post Made Community Wiki by George Stocker
Dec 9, 2013 at 4:29 comment added caf @tall.b.lo: The 25% is of all branches - there are two branches in the loop, one for data[c] >= 128 (which has a 50% miss rate as you suggest) and one for the loop condition c < arraySize which has ~0% miss rate.
Dec 9, 2013 at 4:00 comment added TallBrianL This is scary, in the unsorted list, there should be 50% chance of hitting the add. Somehow the branch prediction only has a 25% miss rate, how can it do better than 50% miss?
Oct 18, 2012 at 19:20 history edited Peter Mortensen CC BY-SA 3.0
Dressed the naked link.
S Oct 15, 2012 at 9:57 history suggested peter CC BY-SA 3.0
Added a section about perf, which may a better choice for linux users than cachegrind.
Oct 15, 2012 at 9:57 review Suggested edits
S Oct 15, 2012 at 9:57
Oct 12, 2012 at 5:53 history answered caf CC BY-SA 3.0