The latencies of individual stages in five-stage MIPS (Microprocessor without In


Question

The latencies of individual stages in five-stage MIPS (Microprocessor without Interlocked Pipeline Stages) Architecture are given below.

Stage                          Latency
Instruction Fetch              200ps
Instruction Decode             100ps
Arithmetic Logic Unit (ALU)    200ps
Memory Access                  300ps
Write Back                     100ps

(10 pts) What is the clock cycle time in a pipelined and non-pipelined processor?

Pipelined version : ______________

Non-pipelined version : ______________

The classic five-stage MIPS pipeline is used to execute the code fragments below. Assume the following:

Register writes are done in the first half of the clock cycle; register reads are performed in the second half of the clock cycle.

Branches are resolved in the fourth stage of the pipeline, and the architecture does not use any branch prediction mechanism.

Forwarding is not supported.
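The no-forwarding rules above can be turned into a small scheduling sketch. This is only an illustration under stated assumptions: the function name `schedule_no_forwarding` and the tuple format are made up for this example, an instruction that needs a register stalls in ID until the cycle in which the producer is in WB (write in the first half, read in the second half), and control hazards after a branch are not modeled.

```python
def schedule_no_forwarding(program):
    """program: list of (text, dest_register_or_None, [source registers]).

    Returns a list of (text, {stage: cycle}) with 1-based cycle numbers.
    "IF" is the first fetch cycle and "ID" is the cycle in which the
    register read succeeds; any cycles in between are stalls.
    """
    wb_of = {}          # register -> WB cycle of its most recent writer
    prev_id_entry = 0   # cycle in which the previous instruction entered ID
    prev_id_done = 0    # cycle in which the previous instruction left ID
    rows = []
    for text, dest, sources in program:
        # IF becomes free once the previous instruction has moved into ID.
        fetch = prev_id_entry if rows else 1
        id_entry = max(fetch + 1, prev_id_done + 1)
        # Without forwarding, ID waits until every producer has reached WB.
        id_done = max([id_entry] + [wb_of[r] for r in sources if r in wb_of])
        cycles = {"IF": fetch, "ID": id_done, "EX": id_done + 1,
                  "MEM": id_done + 2, "WB": id_done + 3}
        if dest is not None:
            wb_of[dest] = cycles["WB"]
        prev_id_entry, prev_id_done = id_entry, id_done
        rows.append((text, cycles))
    return rows


if __name__ == "__main__":
    fragment = [
        ("add R1, R2, R3", "R1", ["R2", "R3"]),
        ("add R4, R5, R6", "R4", ["R5", "R6"]),
        ("beq R1, R4, target", None, ["R1", "R4"]),
    ]
    for text, cycles in schedule_no_forwarding(fragment):
        print(f"{text:22s} {cycles}")
```

Under these assumptions the `beq` is fetched in cycle 3 but cannot finish ID until cycle 6, when the second `add` writes R4, so it stalls for two cycles.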

(5 pts) Assuming there is no dependence other than the one(s) given in the code, show the pipeline diagram.

Clock Cycle →

add R1, R2, R3

add R4, R5, R6

beq R1, R4, target

(5 pts) Assuming there is no dependence other than the one(s) given in the code, show the pipeline diagram.

Clock Cycle →

add R4, R5, R6

lw R1, 0(R2)

beq R1, R4, target

(5 pts) Assuming there is no dependence other than the one(s) given in the code, show the pipeline diagram.

Clock Cycle →

add R1, R2, R3

add R1, R1, R4

add R1, R1, R5

(5 pts) Assuming there is no dependence other than the one(s) given in the code, show the pipeline diagram.

Clock Cycle →

lw R1, 4(R2)

sw R1, 0(R2)

The classic five-stage MIPS pipeline is used to execute the code fragments below. Assume the following:

Register writes are done in the first half of the clock cycle; register reads are performed in the second half of the clock cycle.

Branches are resolved in the second stage of the pipeline, and the architecture does not use any branch prediction mechanism.

Forwarding is fully supported.
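With full forwarding, the stall count for a producer/consumer pair depends on when the result exists and when the consumer needs it. The sketch below is a simplified model, not a definitive answer: the names `READY_AFTER`, `NEEDED_AT`, and `stalls_with_forwarding` are invented for illustration, ALU results are assumed ready at the end of EX and load data at the end of MEM, a branch resolved in the second stage is assumed to need its operands at the start of ID, and "full forwarding" is assumed to include a MEM-to-MEM path for store data.

```python
# Cycle offsets measured from an instruction's own ID cycle.
READY_AFTER = {"alu": 1, "load": 2}        # value exists at end of EX / end of MEM
NEEDED_AT = {"branch": 0, "alu": 1, "store_data": 2}  # needed at start of ID / EX / MEM


def stalls_with_forwarding(producer, consumer, distance=1):
    """Stall cycles for a consumer issued `distance` instructions after
    the producer, assuming no other stalls occur in between."""
    # Put the producer's ID in cycle 0. Its value can first be used in the
    # cycle after it becomes ready.
    earliest_use = READY_AFTER[producer] + 1
    # Without stalls, the consumer's ID is in cycle `distance`.
    unstalled_use = distance + NEEDED_AT[consumer]
    return max(0, earliest_use - unstalled_use)


if __name__ == "__main__":
    print("add -> beq (adjacent):", stalls_with_forwarding("alu", "branch"))
    print("lw  -> beq (adjacent):", stalls_with_forwarding("load", "branch"))
    print("add -> add (adjacent):", stalls_with_forwarding("alu", "alu"))
    print("lw  -> sw data       :", stalls_with_forwarding("load", "store_data"))
```

Under these assumptions an `add` followed immediately by a dependent `beq` costs one stall, a `lw` followed by a dependent `beq` costs two, and back-to-back dependent `add` instructions cost none. If the course's pipeline does not forward into ID or between MEM stages, the offsets in `NEEDED_AT` would need to change.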

(5 pts) Assuming there is no dependence other than the one(s) given in the code, show the pipeline diagram.

Clock Cycle →

add R1, R2, R3

add R4, R5, R6

beq R1, R4, target

(5 pts) Assuming there is no dependence other than the one(s) given in the code, show the pipeline diagram.

Clock Cycle →

add R4, R5, R6

lw R1, 0(R2)

beq R1, R4, target

(5 pts) Assuming there is no dependence other than the one(s) given in the code, show the pipeline diagram.

Clock Cycle →

add R1, R2, R3

add R1, R1, R4

add R1, R1, R5

(5 pts) Assuming there is no dependence other than the one(s) given in the code, show the pipeline diagram.

Clock Cycle →

lw R1, 4(R2)

sw R1, 0(R2)

a) (18 pts) A 64 KB L1 cache has a 32-byte block size and is 8-way set-associative.

How many sets does the cache have?

How many bits are used for the offset, index, and tag, assuming that the CPU provides 32-bit addresses?

How large is the tag array, including one valid bit per entry?
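The arithmetic behind part a) can be sketched as follows. The function name `cache_geometry` is hypothetical, 64 KB is taken to mean 64 × 1024 bytes, and the tag array is assumed to hold exactly one tag and one valid bit per block (no dirty or replacement bits).

```python
def cache_geometry(cache_bytes, block_bytes, ways, address_bits=32):
    """Derive set count and address field widths for a set-associative cache.
    Sizes are assumed to be powers of two."""
    blocks = cache_bytes // block_bytes
    sets = blocks // ways
    offset_bits = block_bytes.bit_length() - 1   # log2 of the block size
    index_bits = sets.bit_length() - 1           # log2 of the number of sets
    tag_bits = address_bits - index_bits - offset_bits
    # Assumption: one tag plus one valid bit stored per block.
    tag_array_bits = blocks * (tag_bits + 1)
    return {
        "blocks": blocks,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
        "tag_array_bits": tag_array_bits,
    }


if __name__ == "__main__":
    print(cache_geometry(64 * 1024, 32, 8))
```

For the cache in the question this gives 2048 blocks in 256 sets, a 5-bit offset, an 8-bit index, a 19-bit tag, and a tag array of 2048 × 20 = 40960 bits (5 KB).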

b) (16 pts) Consider a program that can execute with no stalls and a CPI of 1 if the underlying processor can service every load instruction with a 2-cycle L1 cache hit. In practice, 10% of all load instructions suffer from an L1 cache miss. Every cache miss results in a 300-cycle stall while data is fetched from memory. What is the CPI for this program if 20% of the program's instructions are load instructions?
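One common way to set up part b) is to add the average miss stall per instruction to the base CPI. The sketch below assumes the 2-cycle hit latency is already included in the base CPI of 1 and that the 300-cycle stall is paid in full on every load miss; `effective_cpi` is a name chosen for illustration.

```python
def effective_cpi(base_cpi, load_fraction, miss_rate, miss_penalty):
    """Base CPI plus the average stall cycles contributed per instruction."""
    stall_per_instruction = load_fraction * miss_rate * miss_penalty
    return base_cpi + stall_per_instruction


if __name__ == "__main__":
    # 20% loads, 10% of loads miss, 300-cycle stall per miss.
    print(effective_cpi(1.0, 0.20, 0.10, 300))
```

With the numbers from the question, the stall term is 0.2 × 0.1 × 300 = 6 cycles per instruction, so the CPI comes out to about 7 under these assumptions.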

c) (16 pts) Consider an L1 cache that has 16 sets, is direct-mapped (1-way), and supports a block size of 16 bytes. For the following memory access pattern (shown as byte addresses), show which accesses are hits and misses. For each case, indicate the set number.
0, 8, 16, 24, 32, 40, 48, 256, 28, 8, 36, 12, 20, 260.
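The access pattern in part c) can be checked with a small direct-mapped cache simulation. This is a sketch that assumes the cache starts empty; the function name `simulate_direct_mapped` is made up for this example.

```python
def simulate_direct_mapped(addresses, num_sets=16, block_bytes=16):
    """Return (address, set_number, "hit" or "miss") for each byte address."""
    stored_tag = [None] * num_sets   # None marks an invalid (empty) line
    results = []
    for addr in addresses:
        block = addr // block_bytes  # which memory block the byte belongs to
        index = block % num_sets     # set the block maps to
        tag = block // num_sets      # identifies the block within that set
        hit = stored_tag[index] == tag
        stored_tag[index] = tag      # on a miss the new block replaces the old
        results.append((addr, index, "hit" if hit else "miss"))
    return results


if __name__ == "__main__":
    pattern = [0, 8, 16, 24, 32, 40, 48, 256, 28, 8, 36, 12, 20, 260]
    for addr, index, outcome in simulate_direct_mapped(pattern):
        print(f"address {addr:3d} -> set {index:2d}: {outcome}")
```

Addresses 0 and 256 both map to set 0 with different tags, which is why the second access to address 8 and the access to 260 miss even though those blocks were loaded earlier.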

Explanation / Answer

Pipelined: cycle time determined by slowest stage: 300ps.
Non-pipelined: cycle time determined by sum of all stages: 900ps.
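The two figures above can be reproduced with a short calculation. The stage labels IF, ID, EX, MEM, and WB are the conventional five-stage names and are an assumption here, as is the mapping of the five latencies to the stages in pipeline order.

```python
# Latencies from the question, assumed to be listed in pipeline order.
latencies_ps = {"IF": 200, "ID": 100, "EX": 200, "MEM": 300, "WB": 100}

# Pipelined: every stage gets one cycle, so the slowest stage sets the clock.
pipelined_ps = max(latencies_ps.values())

# Non-pipelined (single-cycle): one cycle must cover all stages in sequence.
non_pipelined_ps = sum(latencies_ps.values())

print(f"pipelined: {pipelined_ps}ps, non-pipelined: {non_pipelined_ps}ps")
```

The results do not depend on which stage carries which latency, only on the maximum and the sum of the five values.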

Please post each of the remaining questions separately.
