Operand forwarding

Operand forwarding (also called data forwarding) is an optimization in pipelined CPUs that reduces the performance loss from pipeline stalls caused by data hazards. A data hazard can stall the pipeline when the current operation has to wait for the result of an earlier operation that has not yet finished.
It is very common that an instruction requires a value computed by the immediately preceding instruction. It may take a few clock cycles to write a result to the register file and then read it back for the subsequent instruction. To improve performance, the register file write/read is bypassed. The result of an instruction is forwarded directly to the execute stage of a subsequent instruction.

Example

ADD A B C  # A = B + C
SUB D C A  # D = C - A

If these two pseudo-assembly instructions run in a pipeline without forwarding, the pipeline stalls once the second instruction has been fetched and decoded: SUB cannot read its operands until the result of the ADD has been written to A and can be read back.

Without operand forwarding (columns are clock cycles):

1	2	3	4	5	6	7	8
Fetch ADD	Decode ADD	Read Operands ADD	Execute ADD	Write result
	Fetch SUB	Decode SUB	stall	stall	Read Operands SUB	Execute SUB	Write result

With operand forwarding (columns are clock cycles):

1	2	3	4	5	6	7
Fetch ADD	Decode ADD	Read Operands ADD	Execute ADD	Write result
	Fetch SUB	Decode SUB	stall	Read Operands SUB: use result from previous operation	Execute SUB	Write result

In some cases, operand forwarding eliminates every stall caused by such read-after-write (RAW) data hazards:

1	2	3	4	5	6
Fetch ADD	Decode ADD	Read Operands ADD	Execute ADD	Write result
	Fetch SUB	Decode SUB	Read Operands SUB: use result from previous operation	Execute SUB	Write result
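The three schedules above differ only in the clock cycle at which SUB can first obtain the value of A. The following Python sketch reproduces the SUB row of each table from that one number. The function name schedule_sub and the parameter result_ready_cycle are hypothetical, and the values 6, 5 and 4 are read off the tables rather than taken from any particular CPU.

```python
from typing import List

def schedule_sub(result_ready_cycle: int) -> List[str]:
    """Return SUB's activity per clock cycle (index 0 is cycle 1).

    result_ready_cycle is the first cycle in which SUB can obtain A
    (an assumption of this sketch, read off the tables above).
    """
    # SUB is fetched one cycle after ADD, so cycle 1 is empty for it.
    timeline = ["", "Fetch SUB", "Decode SUB"]
    earliest_read = 4  # first cycle after Decode SUB
    stalls = max(0, result_ready_cycle - earliest_read)
    timeline += ["stall"] * stalls
    timeline += ["Read Operands SUB", "Execute SUB", "Write result"]
    return timeline

no_forwarding = schedule_sub(6)    # A readable only after ADD's write-back
late_forwarding = schedule_sub(5)  # A forwarded during ADD's write-back
early_forwarding = schedule_sub(4) # A forwarded during ADD's execute

for row in (no_forwarding, late_forwarding, early_forwarding):
    print(len(row), "cycles:", row)
```

Each stall removed shortens the two-instruction sequence by one cycle, which is the whole benefit forwarding provides in this example.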

Technical realization

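The CPU control unit must implement logic to detect dependencies that forwarding can resolve, that is, cases where an instruction reads a register that an earlier instruction still in the pipeline is about to write. A multiplexer then selects where the operand is read from: the register file, or the flip-flop that holds the earlier result before it is written back.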
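As a rough software model of that detection and selection logic, the sketch below compares the source register of the reading instruction with the destination of one in-flight result and picks the operand source accordingly. The names InFlight and read_operand, the single forwarding source, and the register values are assumptions made for illustration. Real hardware performs this comparison with comparators and multiplexers, usually against several pipeline stages at once.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class InFlight:
    """Result of an earlier instruction not yet written to the register file."""
    dest: str   # destination register name
    value: int  # computed result, held in a pipeline flip-flop

def read_operand(reg: str, regfile: Dict[str, int],
                 in_flight: Optional[InFlight]) -> int:
    """Model the operand multiplexer."""
    # Dependency detection: does the pending result target this register?
    if in_flight is not None and in_flight.dest == reg:
        return in_flight.value  # forwarding path
    return regfile[reg]         # normal register file read

# Hypothetical register contents for the ADD/SUB example.
regfile = {"A": 0, "B": 5, "C": 3, "D": 0}

# ADD A B C has executed, but A has not been written back yet.
pending = InFlight(dest="A", value=regfile["B"] + regfile["C"])

# SUB D C A reads its operands through the multiplexer.
d = read_operand("C", regfile, pending) - read_operand("A", regfile, pending)
print(d)
```

Without the forwarding path, read_operand would return the stale value of A still stored in the register file, which is why a pipeline lacking forwarding has to stall instead.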