Scoreboarding
Scoreboarding is a centralized method, first used in the CDC 6600 computer, for dynamically scheduling instructions so that they can execute out of order when there are no conflicts and the hardware is available.
In a scoreboard, the data dependencies of every instruction are logged, tracked and strictly observed at all times. Instructions are released only when the scoreboard determines that there are no conflicts with previously issued instructions. If an instruction is stalled because it is unsafe to issue, the scoreboard monitors the flow of executing instructions until all dependencies have been resolved before the stalled instruction is issued. In essence: reads proceed on the absence of write hazards, and writes proceed in the absence of read hazards.
Scoreboarding is essentially a hardware implementation of the same underlying algorithm seen in dataflow languages, creating a directed Acyclic Graph, where the same logic is applied in the programming language runtime.
Stages
Instructions are decoded in order and go through the following four stages.- Issue: The system checks which registers will be read and written by this instruction and where conflicts write after read and read after write and write after write are detected. RAW and WAR hazards are recorded using a Dependency Matrix as it will be needed in the following stages. Simultaneously, an entry is recorded in a second Matrix, which records the instruction order as a Directed Acyclic Graph. In order to avoid output dependencies the instruction is stalled until instructions intending to write to the same register are completed. The instruction is also stalled when required functional units are currently busy. No instruction is ever issued unless it is fully trackable from start to finish.
- Read operands: After an instruction has been issued and correctly allocated to the required hardware module, the Unit waits until all operands become available. The read only proceeds when write dependencies have been dropped from all other Units. To avoid Register File Port contention, a Priority Picker selects one Computational Unit.
- Execution: When all operands have been fetched, the Computation Unit starts its execution. After the result is ready, the scoreboard is notified.
- Write Result: In this stage the result is ready but has not yet been written to its destination register. The write may not proceed until the Unit is clear of all WAR hazards. The only additional delays here are based on availability of register file ports: in the 6600 a Priority Picker was used to select one result per write port. Once written the unit is marked as no longer busy, and all hazards and state is dropped. Note that only in advanced scoreboards with "Shadow" capability will the Write Result phase be prevented. The original 6600 did not have this capability.
Data structure
To control the execution of the instructions, the scoreboard maintains three status tables:Instruction Status: Indicates, for each instruction being executed, which of the four stages it is in.Functional Unit Status: Indicates the state of each functional unit. Each function unit maintains 9 fields in the table:- * Busy: Indicates whether the unit is being used or not
- * Op: Operation to perform in the unit
- * Fi: Destination register
- * Fj,Fk: Source-register numbers
- * Qj,Qk: Functional units that will produce the source registers Fj, Fk
- * Rj,Rk: Flags that indicates when Fj, Fk are ready for and are not yet read. Register Status: Indicates, for each register, which function unit will write results into it.
The original 6600 algorithm
The detailed algorithm for the scoreboard control, outlined in the original patent, is described below:function issue
wait until ; // FU can be any functional unit that can execute operation op
Busy ← Yes;
Op ← op;
Fi ← dst;
Fj ← src1;
Fk ← src2;
Qj ← Result;
Qk ← Result;
Rj ← Qj 0;
Rk ← Qk 0;
Result ← FU;
function read_operands
wait until ;
Rj ← No;
Rk ← No;
function execute
// Execute whatever FU must do
function write_back
wait until
foreach f do
if Qj=FU then Rj ← Yes;
if Qk=FU then Rk ← Yes;
Result ← No;
Remarks
Thornton's book pre-dates modern computing terminology.- Function Units were called "Computation Units".
- "First Order Conflict" covered both stall due to all Units being busy and also covered WAW conflict.
- "Second Order Conflict" was the term used for RAW conflict.
- "Third Order Conflict" covered WAR conflict.
An analysis of both algorithms was carried out by Luke Leighton and a transformation process outlined which shows equivalence between the Tomasulo algorithm and the 6600 Scoreboard algorithm. WAW hazards resolution is indeed missing from the original algorithm: the 6600 would stall at the first occurrence of a Write Hazard.