I can't, for the life of me, remember exactly what our teacher said that day, and I'm hoping you might know.

The module is "Data Structures and Algorithms" and he told us something along the lines of:

The if statement is the most expensive [something]. [something] registers [something].

Yes, I do have a horrible memory and I'm really really sorry, but I've been googling for hours and nothing has come up. Any ideas?

  • 23
    Is asking your teacher an option? – Michael Myers Nov 24 '08 at 20:22
  • 4
    Why don't you email your teacher? It's unlikely anyone on SO knows what your teacher said, unless they were there at the time (or your teacher himself reads SO). – Bill Karwin Nov 24 '08 at 20:23
  • 9
    And of course a link to the obligatory railroad answer – bobobobo Jan 6 '13 at 22:49
  • If statements, or especially "? :" expressions in C-influenced curly-bracket languages, can be implemented by special conditional execution instructions on e.g. x86 and ARM processors. These are instructions that either do or do not do some operation based on a prior test. Using these excellent instructions avoids the need for conditional jump / branch / 'goto' instructions altogether. A huge performance improvement in some situations by making the program flow completely predictable as it just ploughs on straight with no (possibly unpredictable) jumping around to different points in the code. – Cecil Ward Aug 30 '17 at 17:57
  • A good compiler might sometimes need a bit of a push in the right direction so that it uses conditional instructions instead of being dumb and using conditional jumps, by reorganising code and possibly using clever arithmetic in an expression or a ? : expression. Don't play with this unless you really know your asm and have read e.g. Agner Fog's optimisation guides. Compilers sometimes get it right regardless of whether if statements or ? : expressions are used. – Cecil Ward Aug 30 '17 at 18:01

16 Answers

up vote 146 down vote accepted

At the very lowest level (in the hardware), yes, ifs are expensive. In order to understand why, you have to understand how pipelines work.

The current instruction to be executed is stored in something typically called the instruction pointer (IP) or program counter (PC); these terms are synonymous, but different terms are used with different architectures. For most instructions, the PC of the next instruction is just the current PC plus the length of the current instruction. For most RISC architectures, instructions are all a constant length, so the PC can be incremented by a constant amount. For CISC architectures such as x86, instructions can be variable-length, so the logic that decodes the instruction has to figure out how long the current instruction is to find the location of the next instruction.

For branch instructions, however, the next instruction to be executed is not the next location after the current instruction. Branches are gotos - they tell the processor where the next instruction is. Branches can either be conditional or unconditional, and the target location can be either fixed or computed.

Conditional vs. unconditional is easy to understand - a conditional branch is only taken if a certain condition holds (such as whether one number equals another); if the branch is not taken, control proceeds to the next instruction after the branch like normal. For unconditional branches, the branch is always taken. Conditional branches show up in if statements and the control tests of for and while loops. Unconditional branches show up in infinite loops, function calls, function returns, break and continue statements, the infamous goto statement, and many more (these lists are far from exhaustive).

The branch target is another important issue. Most branches have a fixed branch target - they go to a specific location in code that is fixed at compile time. This includes if statements, loops of all sorts, regular function calls, and many more. Computed branches compute the target of the branch at runtime. This includes switch statements (sometimes), returning from a function, virtual function calls, and function pointer calls.
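
For example, here is a small sketch in C of branches whose targets are only known at runtime (the function names are made up purely for illustration):

    /* Computed (indirect) branch targets: the address jumped to is only
     * known at runtime. All names here are illustrative. */
    typedef int (*op_fn)(int, int);

    static int add(int a, int b) { return a + b; }
    static int sub(int a, int b) { return a - b; }

    int apply(op_fn f, int a, int b)
    {
        return f(a, b);            /* indirect call through a function pointer */
    }

    int dispatch(int opcode, int a, int b)
    {
        switch (opcode) {          /* with dense case values this often becomes */
        case 0: return add(a, b);  /* a jump table -- another computed branch   */
        case 1: return sub(a, b);
        default: return 0;
        }
    }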

So what does this all mean for performance? When the processor sees a branch instruction appear in its pipeline, it needs to figure out how to continue to fill up its pipeline. In order to figure out what instructions come after the branch in the program stream, it needs to know two things: (1) if the branch will be taken and (2) the target of the branch. Figuring this out is called branch prediction, and it's a challenging problem. If the processor guesses correctly, the program continues at full speed. If instead the processor guesses incorrectly, it just spent some time computing the wrong thing. It now has to flush its pipeline and reload it with instructions from the correct execution path. Bottom line: a big performance hit.

Thus, if statements are expensive because of branch mispredictions. This is only at the lowest level. If you're writing high-level code, you don't need to worry about these details at all. You should only care about this if you're writing extremely performance-critical code in C or assembly. If that is the case, writing branch-free code can often be superior to code that branches, even if several more instructions are needed. There are some cool bit-twiddling tricks you can do to compute things such as abs(), min(), and max() without branching.
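
For example, here is a sketch of those tricks in C. It assumes two's complement arithmetic and an arithmetic right shift for signed values, which holds on mainstream compilers and targets but is not guaranteed by the C standard:

    #include <stdint.h>

    /* Branchless min/max/abs for 32-bit signed integers
     * (classic bit-twiddling; illustrative, not the only way). */

    static int32_t min_branchless(int32_t x, int32_t y)
    {
        /* -(x < y) is all ones when x < y, so the mask selects x; otherwise y. */
        return y ^ ((x ^ y) & -(int32_t)(x < y));
    }

    static int32_t max_branchless(int32_t x, int32_t y)
    {
        return x ^ ((x ^ y) & -(int32_t)(x < y));
    }

    static int32_t abs_branchless(int32_t x)
    {
        int32_t mask = x >> 31;       /* 0 if x >= 0, all ones (-1) if x < 0 */
        return (x + mask) ^ mask;     /* undefined for INT32_MIN, like abs() */
    }

Whether these actually beat a plain if or the compiler's own output depends on the target and the data, so it's worth measuring before committing to them.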

  • 16
    It's not just branch mispredicts. Branches also inhibit instruction reordering, at the compiler level, and also to some extent on the CPU level (for an out-of-order CPU, of course). Nice detailed answer though. – jalf Nov 24 '08 at 20:58
  • 4
    If high-level languages are ultimately translated down into low-level languages and you're writing very performance-centric code, do you still gain nothing by writing code that avoids if statements? Does this concept not carry into higher-level languages? – c.. May 27 '15 at 13:37

"Expensive" is a very relative term, especially with relationship to an "if" statement since you also have to take into the account the cost of the condition. That could range anywhere from a few short cpu instructions to testing the result of a function that calls out to a remote database.

I wouldn't worry about it. Unless you're doing embedded programming you probably shouldn't be concerned about the cost of "if" at all. For most programmers it's just not going to ever be the driving factor in your app's performance.

  • 1
    Definitely relative... cmp/cond jmp is still faster than a mul on many processors. – Brian Knoblauch Nov 24 '08 at 20:26
  • 2
    Yes, I agree that I shouldn't be concerned about it. I'm not trying to optimize anything here. I'm just trying to find out and learn. ;) – pek Nov 24 '08 at 20:28

Branches, especially on RISC architecture microprocessors, are some of the most expensive instructions. This is because on many architectures the compiler predicts which path of execution will most likely be taken and puts those instructions next in the executable, so they'll already be in the CPU cache when the branch happens. If the branch goes the other way, it has to go back out to main memory and fetch the new instructions -- that's fairly expensive. On many RISC architectures, all instructions are one cycle except for branch (which is often 2 cycles).

We're not talking about a major cost here, so don't worry about it. Also, the compiler will optimize better than you do 99% of the time. :)

One of the really awesome things about the EPIC architecture (Itanium is an example) is that it caches (and begins processing) instructions from both sides of the branch, then discards the set it doesn't need once the outcome of the branch is known. This saves the extra memory access of a typical architecture in the event that it branches along the unpredicted path.

Check out the article Better Performance Through Branch Elimination on Cell Performance. Another fun one is this post about branchless selections on the Real Time Collision Detection Blog.

In addition to the excellent answers already posted in response to this question, I'd like to put in a reminder that although "if" statements are considered expensive low-level operations, trying to utilize branch-free programming techniques in a higher level environment, such as a scripting language or a business logic layer (regardless of language), may be ridiculously inappropriate.

The vast majority of the time, programs should be written for clarity first and optimized for performance second. There are numerous problem domains where performance is paramount, but the simple fact is that most developers are not writing modules for use deep in the core of a rendering engine or a high performance fluid dynamics simulation that runs for weeks on end. When the top priority is for your solution to "just work" the last thing on your mind should be whether or not you can save on the overhead of a conditional statement in your code.

At the lowest possible level, an if consists of (after computing all the application-specific prerequisites for that particular if):

  • some test instruction
  • a jump to some place in the code if the test succeeds; otherwise execution proceeds forwards.
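
For example, a minimal sketch of how a compiler might lower a simple if on x86-64 (the exact instructions vary with compiler, flags, and target; the comments only illustrate the test-plus-jump pattern):

    int clamp_to_zero(int x)
    {
        if (x < 0)        /*   test edi, edi   ; the test instruction       */
            x = 0;        /*   jns  .skip      ; conditional jump if x >= 0 */
                          /*   xor  edi, edi   ; the if-body: x = 0         */
        return x;         /* .skip:                                         */
                          /*   mov  eax, edi   ; return value               */
                          /*   ret                                          */
    }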

Costs associated with that:

  • a low level comparison -- usually 1 cpu operation, super cheap
  • potential jump -- which can be expensive

Reasons why jumps are expensive:

  • you can jump to arbitrary code that lives anywhere in memory; if it turns out that it is not cached by the CPU, we have a problem, because we need to access main memory, which is slower
  • modern CPUs do branch prediction. They try to guess whether the if will succeed or not and execute code ahead in the pipeline to speed things up. If the prediction fails, all the computation done ahead of time by the pipeline has to be invalidated. That is also an expensive operation

So to sum up:

  • if can be expensive, if you really, really, really care about performance.
  • You should care about it if and only if you are writing a real-time raytracer, a biological simulation, or something similar. There is no reason to care about it in most of the real world.
  • Take this to the next level: what about nested and/or compound if statements? The expense can become quite noticeable quickly if someone writes a lot of if statements like this. And since to most developers if statements seem like such a fundamental operation, avoiding the convoluted conditional branching is often relegated to a stylistic concern. Stylistic concerns are still important, but often in the heat of the moment they can be the first concern to be ignored. – jaydel Oct 18 '17 at 14:47

Modern processors have long execution pipelines which means that several instructions are executed in various stages at the same time. They may not always know the outcome of one instruction when the next one begins to run. When they run into a conditional jump (if) they sometimes have to wait until the pipeline is empty before they can know which way the instruction pointer should go.

I think of it as a long freight train. It can carry a lot of cargo fast in a straight line, but it corners badly.

Pentium 4 (Prescott) had a famously long pipeline of 31 stages.

More on Wikipedia

  • 3
    +1 for the freight train metaphor -- I'll remember that for the next time I need to explain processor pipelines. – Daniel Pryden Aug 31 '09 at 8:03

Maybe the branching kills the CPU instruction prefetching?

  • Upon my... "research" I learned about jump tables and branching for the switch statements but nothing about the if statements. Could you elaborate a little on that? – pek Nov 24 '08 at 20:24
  • IIRC, the CPU usually prefetches instructions along a single probable execution path, but an 'if' statement that causes a branch away from the predicted execution path will invalidate the prefetched instructions, and the prefetching will have to restart. – activout.se Nov 24 '08 at 20:28
  • Any decent processor should have branch prediction capabilities that will try to guess whether a branch will be taken or not, and prefetch instructions based on the prediction (which is generally quite good). GCC even has C extensions that allow a programmer to provide hints for branch predictors. – mipadi Nov 24 '08 at 20:30
  • 2
    Moreover, the CPU usually looks ahead to start executing upcoming instructions early (not just prefetch them), and the compiler tries to reorder instructions, and that becomes dangerous across branches, so you can really kill instruction scheduling with too many branches. Which hurts performance. – jalf Nov 24 '08 at 20:32

The only thing I can imagine this might be referring to is the fact that an if statement generally can result in a branch. Depending on the specifics of the processor architecture, branches can cause pipeline stalls or other less than optimal situations.

However, this is extremely situation specific - most modern processors have branch prediction capabilities that attempt to minimize the negative effects of branching. Another example would be how the ARM architecture (and probably others) can handle conditional logic - the ARM has instruction level conditional execution, so simple conditional logic results in no branching - the instructions simply execute as NOPs if the conditions are not met.
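
For example, simple conditional logic like the sketch below is often lowered to a conditional select (CMOV on x86, CSEL on AArch64, or a predicated instruction on classic ARM) rather than a branch, although whether that actually happens depends on the compiler and optimization level:

    /* Frequently compiles to a compare plus a conditional move/select,
     * i.e. no branch at all; not guaranteed, so check the generated code. */
    int select_max(int a, int b)
    {
        return (a > b) ? a : b;
    }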

All that said - get your logic correct before worrying about this stuff. Incorrect code is as unoptimized as you can get.

  • I've heard that ARM's conditional instructions inhibit ILP so they might just be pushing the problem around. – Jon Harrop Nov 27 '12 at 14:10

if in itself is not slow. Slowness is always relative; I bet my life that you haven't ever actually felt the "overhead" of an if statement. If you are going to write high-performance code, you might want to avoid branches anyway. What makes if slow is that the processor preloads code from after the if based on some heuristic or other. It will also stop the pipeline from executing code directly after the if branch instruction in the machine code, since the processor doesn't know yet which path will be taken (in a pipelined processor, multiple instructions are interleaved and executed). Speculatively executed code may have to be rolled back and discarded (if the other branch was taken; this is called a branch misprediction), or NOPs may be inserted at those places so that this doesn't happen.

If if is evil, then switch is evil too, and &&, || too. Don't worry about it.

CPUs are deeply pipelined. Any branch instruction (if/for/while/switch/etc) means that the CPU doesn't really know what instruction to load and run next.

The CPU either stalls while waiting to know what to do, or the CPU takes a guess. In the case of an older CPU, or if the guess is wrong, you'll have to suffer a pipeline stall while it goes and loads the correct instruction. Depending on the CPU this can be as high as 10-20 instructions worth of stall.

Modern CPUs try to avoid this by doing good branch prediction, and by executing multiple paths at the same time, and only keeping the actual one. This helps out a lot, but can only go so far.

Good luck in the class.

Also, if you have to worry about this in real life, you're probably doing OS design, realtime graphics, scientific computing, or something similarly CPU-bound. Profile before worrying.

As pointed out by many, conditional branches can be very slow on a modern computer.

That being said, there are a whole lot of conditional branches that don't live in if statements, you can't always tell what the compiler will come up with, and worrying about how long basic statements will take is virtually always the wrong thing to do. (If you can tell what the compiler will generate reliably, you may not have a good optimizing compiler.)

Also note that a conditional branch inside a loop is not necessarily very expensive, since the branch predictor tends to guess correctly when the branch behaves consistently from one iteration to the next.

A modern CPU assumes, upon the first visit to an if-statement, that the "if-body" will be taken (or, put the other way: it also assumes a loop body will be taken multiple times) (*). On the second and further visits, it (the CPU) can look into the Branch History Table and see how the condition went the last time (was it true? was it false?). If it was false the last time, then speculative execution will proceed to the "else" of the if, or beyond the loop.

(*) The rule is actually "forward branch not taken, backward branch taken". In an if-statement, there is only a [forward] jump (to the point after the if-body) if the condition evaluates to false (remember: the CPU assumes forward branches are not taken), but in a loop, there may be a forward branch to the position after the loop (not to be taken) and a backward branch upon repetition (to be taken).

This is also one of the reasons why a call to a virtual function or a function-pointer call is not as bad as many assume (http://phresnel.org/blog/)
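
As a rough illustration of the static rule from the footnote above (the code layout shown is the typical one a compiler produces, but it is not guaranteed):

    long sum_positive(const int *a, int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++) {   /* backward branch at the bottom of the
                                           loop: statically predicted taken    */
            if (a[i] > 0)               /* forward branch that skips the body:
                                           statically predicted not taken, i.e.
                                           the if-body is assumed to execute   */
                sum += a[i];
        }
        return sum;
    }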

Write your programs in the clearest, simplest, cleanest way that isn't obviously inefficient. That makes the best use of the most expensive resource: you, whether you are writing the program or later debugging it (which requires understanding it). If the performance isn't enough, measure where the bottlenecks are and see how to mitigate them. Only on extremely rare occasions will you have to worry about individual (source) instructions when doing so. Performance is about selecting the right algorithms and data structures in the first place, careful programming, and getting a fast enough machine. Use a good compiler; you'd be surprised at the kind of code restructuring a modern compiler does. Restructuring code for performance is a sort of last-resort measure: the code gets more complex (and thus buggier), harder to modify, and thus all-around more expensive.

I had this argument with a friend of mine once. He was using a very naive circle algorithm, but claimed his to be faster than mine (the kind that only calculates 1/8th of the circle) because mine used if. In the end, the if statement was replaced with sqrt, and somehow that was faster. Perhaps because the FPU has sqrt built in?

Some CPUs (such as x86) provide branch prediction hints at the programming level to help avoid such branch misprediction latency.

Some compilers (such as GCC) expose these as an extension to higher-level programming languages (such as C/C++).

Refer to likely()/unlikely() macros in the Linux kernel - how do they work? What's their benefit?
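
For reference, those macros boil down to GCC's __builtin_expect; a minimal sketch in C:

    /* The Linux kernel's branch-prediction hints (GCC/Clang builtins). */
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    int process(int err)
    {
        if (unlikely(err))     /* tells the compiler the error path is rare,   */
            return -1;         /* so the hot path can be laid out fall-through */
        return 0;
    }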

The most expensive in terms of ALU usage? It uses up CPU registers to store the values to be compared, and it takes time to fetch and compare the values each time the if statement is run.

Therefore, one optimization is to do the comparison once and store the result in a variable before the loop runs.
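
For example, a hypothetical sketch in C (the helper names are made up, and most optimizing compilers will hoist such a loop-invariant test on their own):

    /* Hypothetical helpers, declared only so the sketch is self-contained. */
    void log_item(int item);
    void process(int item);

    void process_all(const int *items, int n, int mode)
    {
        const int verbose = (mode != 0);   /* comparison done once, up front  */

        for (int i = 0; i < n; i++) {
            if (verbose)                   /* instead of re-testing mode != 0 */
                log_item(items[i]);        /* on every iteration              */
            process(items[i]);
        }
    }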

Just trying to interpret your missing words.
