Question

PLEASE ONLY ANSWER NUMBER 2!

#1) A particular program expressed in a particular ISA executes 200 ALU instructions, 10 Loads, 16 Stores, and 4 Branches. A simple, non-pipelined, implementation of that ISA takes 8 CPI for each ALU instruction, 20 CPI for each load, 10 CPI for each Store, and 10 CPI for each Branch. The original clock frequency is 2GHz. How many clock cycles would the program take to execute? How many microseconds would the program take to execute?

Type     Count   CPI
ALU      200     8
Load     10      20
Store    16      10
Branch   4       10

Cycles = (200*8) + (10*20) + (16*10) + (4*10)
       = 1600 + 200 + 160 + 40 = 2000
T = 1/f = 1 / (2 GHz) = 0.5 ns
Time = 2000 * 0.5 ns = 1000 ns = 1 us
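
The arithmetic for question 1 can be double-checked with a short script. This is only a sketch: the names `mix`, `clock_mhz`, and `total_cycles` are made up for illustration, and the numbers are copied from the question.

```python
# Instruction mix from question 1: type -> (instruction count, CPI).
# All names here are illustrative, not part of the original problem.
mix = {
    "ALU": (200, 8),
    "Load": (10, 20),
    "Store": (16, 10),
    "Branch": (4, 10),
}
clock_mhz = 2000  # 2 GHz expressed in MHz, i.e. cycles per microsecond


def total_cycles(mix):
    """Sum count * CPI over every instruction type."""
    return sum(count * cpi for count, cpi in mix.values())


cycles = total_cycles(mix)
time_us = cycles / clock_mhz  # cycles / (cycles per microsecond)

print(f"cycles = {cycles}, time = {time_us} us")
```

Working in MHz keeps the time directly in microseconds, which matches the unit the question asks for.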

#2) Given the circumstances described in question 1 above, which of the following changes by itself would yield at least 2X speedup?

a) A clever compiler is able to eliminate all the Branch instructions
b) An improved ALU design reduces ALU instruction CPI from 8 to 2
c) Rewriting the program reduces the number of ALU instructions to 100
d) Adding a cache reduces Load CPI from 20 to 5 and Store CPI from 10 to 5
e) New VLSI fabrication technology halves the clock period, but doesn't change memory speed so Load takes 40 CPI

Explanation / Answer

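Answer:
b) An improved ALU design reduces ALU instruction CPI from 8 to 2

Baseline: 2000 cycles * 0.5 ns = 1000 ns. Speedup = 1000 ns / new time.

a) 1600 + 200 + 160 + 0 = 1960 cycles * 0.5 ns = 980 ns, speedup 1.02X
b) (200*2) + 200 + 160 + 40 = 800 cycles * 0.5 ns = 400 ns, speedup 2.5X
c) (100*8) + 200 + 160 + 40 = 1200 cycles * 0.5 ns = 600 ns, speedup 1.67X
d) 1600 + (10*5) + (16*5) + 40 = 1770 cycles * 0.5 ns = 885 ns, speedup 1.13X
e) 1600 + (10*40) + 160 + 40 = 2200 cycles * 0.25 ns = 550 ns, speedup 1.82X

Only (b) reaches 2X. Option (e) halves the clock period, but Load rising to 40 CPI adds 200 cycles, so it falls short at about 1.82X.

Note on option (b):
An ALU CPI of 3 is the break-even point: (200*3) + 200 + 160 + 40 = 1000 cycles, which is exactly 2X. The stated reduction from 8 to 2 clears that threshold.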
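
The speedup of each option in question 2 can be checked numerically. The sketch below is an illustration under the problem's stated numbers; the names `exec_time_ns`, `base`, and `options` are hypothetical, and option (e) is modeled by assuming only the Load CPI changes to 40 while every other CPI stays as given.

```python
def exec_time_ns(mix, period_ns):
    """Execution time = sum(count * CPI) * clock period."""
    return sum(count * cpi for count, cpi in mix.values()) * period_ns


# Baseline from question 1: type -> (instruction count, CPI), 0.5 ns period.
base = {"ALU": (200, 8), "Load": (10, 20), "Store": (16, 10), "Branch": (4, 10)}
base_time = exec_time_ns(base, 0.5)

# Each option: (modified instruction mix, clock period in ns).
options = {
    "a": ({**base, "Branch": (0, 10)}, 0.5),                 # no branches
    "b": ({**base, "ALU": (200, 2)}, 0.5),                   # ALU CPI 8 -> 2
    "c": ({**base, "ALU": (100, 8)}, 0.5),                   # 100 ALU instructions
    "d": ({**base, "Load": (10, 5), "Store": (16, 5)}, 0.5), # cache
    "e": ({**base, "Load": (10, 40)}, 0.25),                 # assumed: only Load CPI changes
}

speedups = {
    name: base_time / exec_time_ns(mix, period)
    for name, (mix, period) in options.items()
}

for name, s in speedups.items():
    flag = "  <-- at least 2X" if s >= 2 else ""
    print(f"{name}) speedup = {s:.2f}X{flag}")
```

Under these assumptions only option (b) is flagged; option (e) comes out near 1.82X because the extra Load cycles offset part of the faster clock.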
