

#### **Propagation Delay**

- Main Idea: All digital logic circuits have propagation delay
  - Time it takes for output to change when inputs are changed



#### **Combinational Logic Functions**

- Map input combinations of n-bits to desired m-bit output
- Can describe function with a truth table and then find its circuit implementation
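As a minimal sketch of going from a truth table to an implementation, the snippet below builds a sum-of-products function (an OR of AND terms, one per row whose output is 1). The `sum_of_products` helper and the 3-input majority example are hypothetical illustrations, not taken from the slides, and the sketch covers only a single output bit (m = 1).

```python
from itertools import product


def sum_of_products(truth_table, n_inputs):
    """Return a function implementing the truth table as an OR of AND minterms."""
    # Keep only the input rows whose output is 1 (the minterms).
    minterms = [row for row, out in truth_table.items() if out == 1]

    def circuit(*bits):
        if len(bits) != n_inputs:
            raise ValueError("wrong number of inputs")
        # Each minterm ANDs all inputs (inverted where the row holds a 0);
        # the final output ORs the minterms together.
        return int(any(
            all(b == want for b, want in zip(bits, row))
            for row in minterms
        ))

    return circuit


# Hypothetical example: output is 1 when two or more of the three inputs are 1.
majority_table = {bits: int(sum(bits) >= 2) for bits in product((0, 1), repeat=3)}
majority = sum_of_products(majority_table, 3)
```

An m-bit output could be modeled the same way by building one such function per output bit.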






**Block Diagram of a Register**: data input (could be many bits, n), clock pulse (CP), and data output Q (could be many bits, n).

**USC**Viterbi


#### ALUs

- Perform a selected operation on two input numbers.
  - FS[5:0] select the desired operation



| Func. Code | Op.              | Func. Code | Op.     |
|------------|------------------|------------|---------|
| 00_0000    | A SHL B          | 10_0000    | A + B   |
| 00_0010    | A SHR B          | 10_0010    | A - B   |
| 00_0011    | A SAR B          |            |         |
|            |                  | 10_0100    | A AND B |
| 01_1000    | A * B            | 10_0101    | A OR B  |
| 01_1001    | A * B (unsigned) | 10_0110    | A XOR B |
| 01_1010    | A / B            | 10_0111    | A NOR B |
| 01_1011    | A / B (unsigned) |            |         |
|            |                  | 10_1010    | A < B   |
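A rough software model of the function-select idea is sketched below: a 6-bit code picks which operation is applied to A and B. Several details are assumptions not stated in the slides: 32-bit operands, shift amounts taken from the low 5 bits of B, and a signed comparison for `A < B`. Multiply and divide are left out because the width of their results is not given.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit operands and results


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu(fs, a, b):
    """Apply the operation selected by the 6-bit function code fs to a and b."""
    a &= MASK
    b &= MASK
    shamt = b & 31  # assumption: only the low 5 bits of B give the shift amount
    ops = {
        0b000000: lambda: a << shamt,                       # A SHL B
        0b000010: lambda: a >> shamt,                       # A SHR B (logical)
        0b000011: lambda: to_signed(a) >> shamt,            # A SAR B (arithmetic)
        0b100000: lambda: a + b,                            # A + B
        0b100010: lambda: a - b,                            # A - B
        0b100100: lambda: a & b,                            # A AND B
        0b100101: lambda: a | b,                            # A OR B
        0b100110: lambda: a ^ b,                            # A XOR B
        0b100111: lambda: ~(a | b),                         # A NOR B
        0b101010: lambda: int(to_signed(a) < to_signed(b)), # A < B (assumed signed)
    }
    if fs not in ops:
        raise ValueError(f"unsupported function code: {fs:06b}")
    return ops[fs]() & MASK  # truncate the result to 32 bits
```

For example, `alu(0b100000, 2, 3)` selects addition, while `alu(0b100010, 2, 3)` selects subtraction and wraps around to the 32-bit pattern for -1.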

#### Sequential Devices (Registers)

- A register captures the D input value when a control input (aka the clock signal) transitions (a clock edge, e.g. from 0 to 1) and holds that value at the Q output until the next clock edge
- A register is like a variable in software. It stores a value for later use.
- We can choose to clock the register only at the times when we want it to capture a new value (i.e. when it is the destination of an instruction)
- Key Idea: Registers hold data while we operate on those values
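The behavior described above can be mimicked in software with a small class. This is only a sketch: the `Register` class and its `tick` method are hypothetical names, and capturing on the 0-to-1 (rising) edge is an assumption made for illustration.

```python
class Register:
    """Software model of an edge-triggered register (assumed rising-edge)."""

    def __init__(self, initial=0):
        self.q = initial      # value currently visible at the Q output
        self._prev_clk = 0    # last clock level seen, used to detect an edge

    def tick(self, d, clk):
        """Present data d and clock level clk; return the Q output."""
        # Capture D only when the clock goes from 0 to 1.
        if self._prev_clk == 0 and clk == 1:
            self.q = d
        self._prev_clk = clk
        return self.q


reg = Register()
reg.tick(5, 0)   # clock low: Q keeps its initial value
reg.tick(5, 1)   # rising edge: Q captures 5
reg.tick(9, 1)   # clock stays high: D changed, but Q still holds 5
```

Changing D between edges has no effect on Q, which is what lets the register hold a value while other logic operates on it.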








#### Processor Execution (store)




#### Processor Execution (branch/jump)











#### Forwarding



#### Solving Data Hazards

- Key Point: Data dependencies (i.e. instructions needing values produced by earlier ones) limit performance
- Forwarding solves many of the data hazards (data dependencies) that exist
  - It allows instructions to continue to flow through the pipeline without the need to stall and waste time
  - The cost is additional hardware and added complexity
- Even forwarding cannot solve all the issues
  - A data hazard still exists when a load instruction produces a value needed by the next instruction
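To make the effect of forwarding concrete, the sketch below counts stall cycles for a short instruction sequence with and without forwarding. It assumes a classic five-stage pipeline (IF, ID, EX, MEM, WB) in which the register file can be written and then read in the same cycle; the `Instr` tuple, the `count_stalls` function, and the sample program are hypothetical and not taken from the slides.

```python
from collections import namedtuple

# op: mnemonic, dest: destination register (or None), srcs: source registers
Instr = namedtuple("Instr", "op dest srcs")


def count_stalls(program, forwarding):
    """Count stall cycles caused by data dependencies (assumed 5-stage pipeline)."""
    if not program:
        return 0
    last_writer = {}  # register name -> index of the most recent writer
    ex = []           # ex[i] = cycle in which instruction i occupies EX
    for i, ins in enumerate(program):
        earliest = ex[-1] + 1 if ex else 0  # one instruction per cycle at best
        for src in ins.srcs:
            j = last_writer.get(src)
            if j is None:
                continue
            if not forwarding:
                gap = 3   # wait until the producer's WB, then read in ID
            elif program[j].op == "ld":
                gap = 2   # loaded value is only available after MEM
            else:
                gap = 1   # ALU result is forwarded straight from EX
            earliest = max(earliest, ex[j] + gap)
        ex.append(earliest)
        if ins.dest is not None:
            last_writer[ins.dest] = i
    # Stalls are the cycles beyond the ideal one-instruction-per-cycle schedule.
    return ex[-1] - ex[0] - (len(program) - 1)


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r1", "r5")),
    Instr("ld",  "r6", ("r4",)),
    Instr("add", "r7", ("r6", "r1")),
]
print(count_stalls(program, forwarding=False))  # 6
print(count_stalls(program, forwarding=True))   # 1
```

Under these assumptions, forwarding removes every stall except the one between the load and the instruction that uses the loaded value.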

#### LD + Dependent Instruction Hazard

- Even forwarding cannot prevent the need to stall when a load (LD) instruction produces a value needed by the instruction immediately behind it






- We would need to introduce one stall cycle (nop) into the pipeline to get the timing correct
- Keep this in mind as we move through the next slides
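The stall can be pictured as a nop placed between the load and its dependent instruction. The sketch below does this in software, assuming a single bubble is enough (as in a classic five-stage pipeline with forwarding). The `Instr` tuple, `NOP` constant, and `insert_load_use_nops` function are hypothetical names used only for illustration.

```python
from collections import namedtuple

# op: mnemonic, dest: destination register (or None), srcs: source registers
Instr = namedtuple("Instr", "op dest srcs")
NOP = Instr("nop", None, ())


def insert_load_use_nops(program):
    """Insert one nop after each load whose result the very next instruction reads."""
    out = []
    for ins in program:
        prev = out[-1] if out else None
        # Load-use hazard: the previous instruction is a load and this
        # instruction reads the register that the load writes.
        if prev is not None and prev.op == "ld" and prev.dest in ins.srcs:
            out.append(NOP)
        out.append(ins)
    return out


program = [
    Instr("ld",  "r1", ("r2",)),
    Instr("add", "r3", ("r1", "r4")),   # uses r1 right after the load
    Instr("sub", "r5", ("r3", "r1")),
]
padded = insert_load_use_nops(program)
print([ins.op for ins in padded])  # ['ld', 'nop', 'add', 'sub']
```

If an independent instruction already sits between the load and its consumer, no nop is inserted, since that instruction fills the cycle with useful work.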











#### Summary 1

- Pipelining is an effective and important technique to improve the throughput of a processor
- Overlapping execution creates hazards which lead to stalls or wasted cycles
  - Data, Control, Structural
  - More hardware can be invested to mitigate the stalls (e.g. forwarding)
- The compiler can help reorder code to avoid stalls and perform useful work (e.g. delay slots)