Revisions to Why is processing a sorted array faster than processing an unsorted array?

added 17 characters in body

Source Link

edited Feb 10, 2020 at 2:05

10.4k
5
46
65

As what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is a concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses a diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on the described scenario, I have written an animation demo to show how instructions are executed in a pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have a reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the live demo source code from BranchPredictorDemo

As what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is a concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses a diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on the described scenario, I have written an animation demo to show how instructions are executed in a pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have a reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the demo from BranchPredictorDemo

As what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is a concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses a diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on the described scenario, I have written an animation demo to show how instructions are executed in a pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have a reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the live demo source code from BranchPredictorDemo

added 28 characters in body

Source Link

edited Jun 4, 2019 at 16:20

Eugene

10.4k
5
46
65

As what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is a concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses a diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on the described scenario, I have written an animation demo to show how instructions are executed in a pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have a reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the demo from git BranchPredictorDemo

As what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is a concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses a diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on the described scenario, I have written an animation demo to show how instructions are executed in a pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have a reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the demo from git

As what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is a concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses a diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on the described scenario, I have written an animation demo to show how instructions are executed in a pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have a reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the demo from BranchPredictorDemo

Rollback to Revision 2

Source Link

edited Jan 29, 2019 at 8:26

Marc Glisse

7.4k
2
29
52

1212121As what has alreadyAs what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is a concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses a diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on the described scenario, I have written an animation demo to show how instructions are executed in a pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have a reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the demo from git

1212121As what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on described scenario, I have written an animation demo to show how instructions are executed in pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the demo from git

As what has already been mentioned by others, what behind the mystery is Branch Predictor.

I'm not trying to add something but explaining the concept in another way. There is a concise introduction on the wiki which contains text and diagram. I do like the explanation below which uses a diagram to elaborate the Branch Predictor intuitively.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch (e.g. an if-then-else structure) will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high effective performance in many modern pipelined microprocessor architectures such as x86.

Two-way branching is usually implemented with a conditional jump instruction. A conditional jump can either be "not taken" and continue execution with the first branch of code which follows immediately after the conditional jump, or it can be "taken" and jump to a different place in program memory where the second branch of code is stored. It is not known for certain whether a conditional jump will be taken or not taken until the condition has been calculated and the conditional jump has passed the execution stage in the instruction pipeline (see fig. 1).

Based on the described scenario, I have written an animation demo to show how instructions are executed in a pipeline in different situations.

Without the Branch Predictor.

Without branch prediction, the processor would have to wait until the conditional jump instruction has passed the execute stage before the next instruction can enter the fetch stage in the pipeline.

The example contains three instructions and the first one is a conditional jump instruction. The latter two instructions can go into the pipeline until the conditional jump instruction is executed.

It will take 9 clock cycles for 3 instructions to be completed.

Use Branch Predictor and don't take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 7 clock cycles for 3 instructions to be completed.

Use Branch Predictor and take a conditional jump. Let's assume that the predict is not taking the conditional jump.

It will take 9 clock cycles for 3 instructions to be completed.

The time that is wasted in case of a branch misprediction is equal to the number of stages in the pipeline from the fetch stage to the execute stage. Modern microprocessors tend to have quite long pipelines so that the misprediction delay is between 10 and 20 clock cycles. As a result, making a pipeline longer increases the need for a more advanced branch predictor.

As you can see, it seems we don't have a reason not to use Branch Predictor.

It's quite a simple demo that clarifies the very basic part of Branch Predictor. If those gifs are annoying, please feel free to remove them from the answer and visitors can also get the demo from git