
Translate the above code (bottom using nested loops) using our DLX vector instruction set


Question

Translate the above code (bottom using nested loops) using our DLX vector instruction set. Assume:

Vector registers of length 8

Load unit has a startup of L clocks

Adder unit has a startup of A clocks

Multiplier unit has a startup of M clocks

For vectors of length N, compute the number of clock cycles needed to execute the inner loop (the vector operations), first for normal execution and then when chaining of loads, stores, additions, and multiplications is allowed. How much speedup do we achieve with chaining?

low = 0;
VL = (n % MVL); /* find odd-size piece using modulo op % */
for (j = 0; j <= (n/MVL); j = j+1) { /* outer loop */
    for (i = low; i < (low+VL); i = i+1) /* runs for length VL */
        Y[i] = a * x[i] + Y[i]; /* main operation */
    low = low + VL; /* start of next vector */
    VL = MVL; /* reset the length to maximum vector length */
}
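To make the strip-mining structure of the loop above easier to follow, here is a minimal Python sketch of the same idea. It is only an illustration, not the DLX vector translation the question asks for: the function name `daxpy_strip_mined`, the `pieces` return value, and the default `mvl=8` (taken from the stated vector register length) are assumptions made for this example.

```python
def daxpy_strip_mined(a, x, y, mvl=8):
    """Update y[i] = a * x[i] + y[i] in place, in pieces of at most mvl elements.

    Returns the list of piece lengths processed (hypothetical helper output,
    added only so the strip-mining pattern can be inspected).
    """
    n = len(x)
    low = 0
    vl = n % mvl          # odd-size first piece; may be 0 when mvl divides n
    pieces = []
    for _ in range(n // mvl + 1):      # outer loop: odd piece + full pieces
        # Stand-in for one set of vector operations on elements low..low+vl-1.
        y[low:low + vl] = [a * xi + yi
                           for xi, yi in zip(x[low:low + vl], y[low:low + vl])]
        pieces.append(vl)
        low += vl         # start of next vector
        vl = mvl          # reset the length to the maximum vector length
    return pieces
```

With 19 elements and a maximum vector length of 8, the sketch handles one piece of 3 elements followed by two full pieces of 8, which is the order the outer loop in the question produces.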

Explanation / Answer

• LOOP: LD R4, 0(R1) 11

• LD R5, 0(R2) 11

• ADD R6,R4,R5 4

• SR R6, R6, 1 1

• ST R6, 0(R3) 11

• ADDI R1,R1,4 1

• ADDI R2, R2, 4 1

• ADDI R3, R3, 4 1

• ADDI R0, R0, 1 1

• BEQZ R0, LOOP 2

• Chaining: a dependent vector instruction does not need to wait until the whole vector register has been written. It can start as soon as the first element of its source operand is ready.
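The effect of chaining on the cycle count can be sketched with a deliberately simplified model. Assume a sequence of dependent vector operations on N elements, each with its own startup time. Without chaining, each operation waits for the previous one to finish, so the total is the sum of (startup + N) over the operations. With chaining, the startups add up but the N element cycles overlap, giving (sum of startups) + N. For a load, multiply, add chain this gives (L + N) + (M + N) + (A + N) unchained versus L + M + A + N chained. The function names below are hypothetical, and the model ignores the second load, the store, and contention for a single load/store unit, so it is not a full answer to the question.

```python
def chain_cycles(startups, n, chained):
    """Cycles for a sequence of dependent vector ops under a simplified model.

    startups: startup latency of each functional unit, in program order
    n:        vector length (number of elements)
    chained:  True if each op may start as soon as its first operand
              element is available
    """
    if chained:
        # Startups accumulate, but element processing overlaps.
        return sum(startups) + n
    # Each op must wait for the previous one to complete all n elements.
    return sum(s + n for s in startups)


def chaining_speedup(startups, n):
    """Ratio of unchained to chained cycle counts under the same model."""
    return chain_cycles(startups, n, False) / chain_cycles(startups, n, True)
```

As an illustration with made-up startup values L = 12, M = 7, A = 6 and the vector length of 8 from the question, `chain_cycles([12, 7, 6], 8, False)` gives 49 cycles and `chain_cycles([12, 7, 6], 8, True)` gives 33, a speedup of 49/33, or about 1.48. Substituting the symbolic L, M, and A from the question gives the general expressions stated above.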
