Tuesday, November 13, 2018 11:08 AM

CS 61C Pipelining
Fall 2018 Discussion 12: November 12, 2018

## 1 Pipelining Registers

In order to pipeline, we add registers between the five datapath stages. Label each of the five stages (IF, ID, EX, MEM, and WB) on the diagram below.



What is the purpose of the new registers? Keep a "snapshot" of the instruction in each stage, so that new data doesn't overwrite values still in use.

1.2 Why do we add +4 to the PC again in the memory stage?

don't need extra registers

Preserve our full "snapshot"; we need control to be in sync.

## 2 Performance Analysis

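| Component         | Delay | Component    | Delay  | Component     | Delay  |
|-------------------|-------|--------------|--------|---------------|--------|
| Register clk-to-q | 30 ps | Branch comp. | 75 ps  | Memory write  | 200 ps |
| Register setup    | 20 ps | ALU          | 200 ps | RegFile read  | 150 ps |
| Mux               | 25 ps | Memory read  | 250 ps | RegFile setup | 20 ps  |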

- 2.1 With the delays provided above for each of the datapath components, what would be the fastest possible clock time for a single cycle datapath?
- 2.2 What is the fastest possible clock time for a pipelined datapath?
- 2.3 What is the speedup from the single cycle datapath to the pipelined datapath? Why is the speedup less than 5?
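One way to work through 2.1 to 2.3 is to add up the delays along the critical path. The sketch below uses the delays listed above, but the path itself is an assumption: it takes `lw` as the slowest instruction in the single-cycle datapath (PC register, instruction memory, RegFile read, mux, ALU, data memory, mux, RegFile setup) and assumes each pipeline stage starts at a pipeline register's clk-to-q and ends at a setup time. The dictionary keys and function names are made up for illustration.

```python
# Component delays in picoseconds, taken from the worksheet's delay list.
DELAY_PS = {
    "clk_to_q": 30,
    "reg_setup": 20,
    "mux": 25,
    "branch_comp": 75,
    "alu": 200,
    "mem_read": 250,
    "mem_write": 200,
    "regfile_read": 150,
    "regfile_setup": 20,
}


def single_cycle_period(d):
    # ASSUMPTION: lw is the critical path of the single-cycle datapath:
    # PC clk-to-q -> IMEM read -> RegFile read -> mux -> ALU
    # -> DMEM read -> mux -> RegFile setup.
    return (d["clk_to_q"] + d["mem_read"] + d["regfile_read"] + d["mux"]
            + d["alu"] + d["mem_read"] + d["mux"] + d["regfile_setup"])


def pipeline_stage_delays(d):
    # ASSUMPTION: every stage begins with a register's clk-to-q delay
    # and ends with a register (or RegFile) setup time.
    return {
        "IF": d["clk_to_q"] + d["mem_read"] + d["reg_setup"],
        "ID": d["clk_to_q"] + d["regfile_read"] + d["reg_setup"],
        "EX": d["clk_to_q"] + d["mux"] + d["alu"] + d["reg_setup"],
        "MEM": d["clk_to_q"] + d["mem_read"] + d["reg_setup"],
        "WB": d["clk_to_q"] + d["mux"] + d["regfile_setup"],
    }


single = single_cycle_period(DELAY_PS)
stages = pipeline_stage_delays(DELAY_PS)
pipelined = max(stages.values())  # the slowest stage sets the clock period
speedup = single / pipelined

print(single, pipelined, round(speedup, 2))
```

Under these assumptions the speedup stays below 5 for two reasons: every stage pays the clk-to-q and setup overhead of the added registers, and the stages are not balanced, so the slowest one (a memory access) sets the clock for all five.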

## 3 Hazards

One of the costs of pipelining is that it introduces three types of pipeline hazards: structural hazards, data hazards, and control hazards.

Structural: not enough hardware. Data: data not ready. Control: do I want to branch?

#### Structural Hazards

Structural hazards occur when more than one instruction needs to use the same datapath resource at the same time. There are two main causes of structural hazards:

**Register File** The register file is accessed both during ID, when it is read, and during WB, when it is written to. We can solve this by having separate read and write ports. To account for reads and writes to the same register, processors usually write to the register during the first half of the clock cycle, and read from it during the second half. This is also known as double pumping.

**Memory** Memory is accessed for both instructions and data. Having a separate instruction memory (abbreviated IMEM) and data memory (abbreviated DMEM) solves this hazard.

Something to remember about structural hazards is that they can always be resolved by adding more hardware.

write, then read

#### Data Hazards

Data hazards are caused by data dependencies between instructions. In CS 61C, where we always assume that instructions go through the processor in order, we see data hazards when an instruction **reads** a register before a previous instruction has finished **writing** to that register.

1) addi 2) nop 3) nop 4) and

#### Forwarding

Most data hazards can be resolved by forwarding, which is when the result of the EX or MEM stage is sent to the EX stage for a following instruction to use.

3.1 Look for data hazards in the code below, and figure out how forwarding could be used to solve them.

| Instruction         | C1 | C2 | C3 | C4  | C5  | C6  | C7 |
|---------------------|----|----|----|-----|-----|-----|----|
| 1. addi t0, a0, -1  | IF | ID | EX | MEM | WB  |     |    |
| 2. and s2, t0, a0   |    | IF | ID | EX  | MEM | WB  |    |
| 3. sltiu a0, t0, 5  |    |    | IF | ID  | EX  | MEM | WB |

3.2 Imagine you are a hardware designer working on a CPU's forwarding control logic. How many instructions after the first addi instruction above could be affected by a potential data hazard created by this addi instruction?





You have the signals rs1, rs2, RegWEn, and rd for two instructions, instruction n and instruction n + 1. Write a condition you can check to see if there is a data hazard between the two instructions, in terms of these signals.

$$\text{if } \big((\mathrm{rs1}(n+1) == \mathrm{rd}(n) \;\|\; \mathrm{rs2}(n+1) == \mathrm{rd}(n)) \;\&\&\; \mathrm{RegWEn}(n) == 1\big)$$
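The check can also be written as a small function: there is a data hazard only when instruction n actually writes a register (RegWEn is set) and instruction n + 1 reads that same register through rs1 or rs2. This is a minimal sketch; the `Signals` class and `has_data_hazard` name are hypothetical, and registers are given by their RISC-V register numbers (t0 = x5, a0 = x10, s2 = x18).

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Decoded fields of one instruction (hypothetical container)."""
    rs1: int
    rs2: int
    rd: int
    reg_w_en: bool


def has_data_hazard(inst_n: Signals, inst_n1: Signals) -> bool:
    # Hazard: instruction n writes rd, and instruction n+1 reads it.
    reads_rd = inst_n1.rs1 == inst_n.rd or inst_n1.rs2 == inst_n.rd
    return inst_n.reg_w_en and reads_rd


# addi t0, a0, -1  (rs2 is unused by addi; 0 is a placeholder here)
addi = Signals(rs1=10, rs2=0, rd=5, reg_w_en=True)
# and s2, t0, a0
and_inst = Signals(rs1=5, rs2=10, rd=18, reg_w_en=True)

print(has_data_hazard(addi, and_inst))
```

A real design would likely also require `rd != 0`, since x0 is hardwired to zero in RISC-V and a write to it never changes what a later instruction reads.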

#### Stalls

3.4 Look for data hazards in the code below. One of them cannot be solved with forwarding—why? What can we do to solve this hazard?

| Instruction       | C1 | C2 | C3 | C4  | C5  | C6  | C7  | C8 |
|-------------------|----|----|----|-----|-----|-----|-----|----|
| 1. addi s0, s0, 1 | IF | ID | EX | MEM | WB  |     |     |    |
| 2. addi t0, t0, 4 |    | IF | ID | EX  | MEM | WB  |     |    |
| 3. lw t1, 0(t0)   |    |    | IF | ID  | EX  | MEM | WB  |    |
| 4. add t2, t1, x0 |    |    |    | IF  | ID  | EX  | MEM | WB |

- 3.5 Say you are the compiler and can re-order instructions to minimize data hazards while guaranteeing the same output. How can you fix the code above?
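To see what reordering buys for the four instructions in 3.4, the sketch below counts load-use stalls before and after moving one instruction. It assumes a five-stage pipeline with full forwarding, where the only remaining stall is one cycle when an instruction reads a register that the `lw` immediately before it is loading. The tuple layout `(op, rd, sources)` and the function name are made up for illustration.

```python
def count_load_use_stalls(program):
    """Count one stall for each lw whose result is read by the very next instruction.

    ASSUMPTION: full forwarding, so a load followed directly by a
    consumer of its rd is the only case that still needs a stall.
    """
    stalls = 0
    for prev, nxt in zip(program, program[1:]):
        op, rd, _ = prev
        if op == "lw" and rd in nxt[2]:
            stalls += 1
    return stalls


# (op, rd, source registers)
original = [
    ("addi", "s0", ("s0",)),       # 1. addi s0, s0, 1
    ("addi", "t0", ("t0",)),       # 2. addi t0, t0, 4
    ("lw",   "t1", ("t0",)),       # 3. lw t1, 0(t0)
    ("add",  "t2", ("t1", "x0")),  # 4. add t2, t1, x0
]

# Instruction 1 touches only s0, so it can move between lw and add
# without changing the result.
reordered = [original[1], original[2], original[0], original[3]]

print(count_load_use_stalls(original), count_load_use_stalls(reordered))
```

The move is safe because `addi s0, s0, 1` neither reads nor writes t0, t1, or t2, so it fills the slot after the load with useful work instead of a nop.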

#### Control Hazards

Control hazards are caused by **jump and branch instructions**, because for all jumps and some branches, the next PC is not PC + 4, but the result of the computation completed in the EX stage. We could stall the pipeline for control hazards, but this decreases performance.

3.6 Besides stalling, what can we do to resolve control hazards?

Branch predictor

## Extra for Experience

3.7 Given the RISC-V code below and a pipelined CPU with no forwarding, how many hazards would there be? What type is each hazard? Consider all possible hazards from all pairs of instructions.

How many stalls would there need to be in order to fix the data hazard(s)? What about the control hazard(s)?

| Instruction        | C1 | C2 | C3 | C4  | C5  | C6  | C7  | C8  | C9 |
|--------------------|----|----|----|-----|-----|-----|-----|-----|----|
| 1. sub t1, s0, s1  | IF | ID | EX | MEM | WB  |     |     |     |    |
| 2. or s0, t0, t1   |    | IF | ID | EX  | MEM | WB  |     |     |    |
| 3. sw s1, 100(s0)  |    |    | IF | ID  | EX  | MEM | WB  |     |    |
| 4. bgeu s0, s2, 1  |    |    |    | IF  | ID  | EX  | MEM | WB  |    |
| 5. add t2, x0, x0  |    |    |    |     | IF  | ID  | EX  | MEM | WB |

Already loaded: flush the pipeline if we branch.
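The data hazards in 3.7 can be cross-checked by scanning each pair of nearby instructions. This sketch assumes no forwarding and a double-pumped register file (written in the first half of the cycle, read in the second), so only the two instructions directly after a writer can read a stale value. The tuple layout and the `find_data_hazards` name are hypothetical, and the sketch covers data hazards only; the branch adds a control hazard on top.

```python
def find_data_hazards(program, window=2):
    """Return (writer index, reader index, register) triples, 1-indexed.

    ASSUMPTION: without forwarding but with a double-pumped register
    file, only the next `window` = 2 instructions can read a stale value.
    """
    hazards = []
    for i, (_, rd, _) in enumerate(program):
        if rd is None or rd == "x0":
            continue  # stores, branches, and writes to x0 create no hazard
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if rd in program[j][2]:
                hazards.append((i + 1, j + 1, rd))
            if program[j][1] == rd:
                break  # a later write to rd takes over the dependency
    return hazards


# (op, rd, source registers); rd is None when nothing is written
program = [
    ("sub",  "t1", ("s0", "s1")),  # 1. sub t1, s0, s1
    ("or",   "s0", ("t0", "t1")),  # 2. or s0, t0, t1
    ("sw",   None, ("s1", "s0")),  # 3. sw s1, 100(s0)
    ("bgeu", None, ("s0", "s2")),  # 4. bgeu s0, s2, 1
    ("add",  "t2", ("x0", "x0")),  # 5. add t2, x0, x0
]

for writer, reader, reg in find_data_hazards(program):
    print(f"instruction {reader} reads {reg} written by instruction {writer}")
```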