Question
I am having trouble with the problem below. I attached the screenshot to this thread; in case the image is unclear, I also uploaded it to Wikisend at the following link:
http://wikisend.com/download/310314/1.png
Consider a sequence of instructions executing on a RISC processor (Table 6.1). The instruction syntax consists of an operation code (Op Code) followed by the destination operand (Op1) and one or two source operands (Op2 and Op3). Assume the use of a four-stage pipeline: fetch, decode, execute, and write back. Assume that all pipeline stages take one clock cycle except the execute stage. For simple integer arithmetic and logical instructions, the execute stage also takes one cycle, but for LOAD, five cycles are consumed (due to the delay in accessing main memory). Only LOAD instructions access main memory, and in this case the notation [Rn] indicates register-direct addressing, in which Rn holds the address of the operand.

If we have a simple scalar pipeline which allows out-of-order execution, we can construct a table showing the four pipeline stages and the clock cycle during which each stage is scheduled for each instruction. The columns RAW, WAR, and WAW indicate the Read-After-Write, Write-After-Read, and Write-After-Write dependencies.

For the processor with out-of-order scheduling capability, an instruction is scheduled for a given pipeline stage at the earliest possible opportunity when the following conditions are met:
1. The previous pipeline stage for that instruction is complete. That is, the ordering Fetch -> Decode -> Execute -> Write must be preserved. Realize that this makes perfect sense, since an instruction cannot be decoded until after it has been fetched, it can't be executed until it's been decoded, and results cannot be written until execution is complete.
2. Dependencies have been fulfilled.
o RAW - read-after-write: one or both operands are written by previous instructions, and the current instruction cannot read its operands (which occurs during the Execute stage) until the dependency instruction(s) have completed the Write stage.
o WAR - write-after-read: the destination operand is read by a previous instruction, and the current instruction cannot write its result to its destination register until the dependency instruction has completed its Execute stage.
o WAW - write-after-write: the destination operand is written by a previous instruction, and the current instruction cannot write its result to its destination register until the dependency instruction has completed its Write stage.

For example, for instruction I3, register R6 is read after it is written in instruction I1. Therefore, the RAW column in the row for instruction I3 contains I1. Because of this dependency, the execute stage of I3 must occur after the write-back stage for I1. Otherwise, R6 will not be correctly updated before instruction I3 uses its value. Remember to search backwards from the current instruction when looking for dependencies, as only the most recent modification of a register is relevant.

The last 4 columns of the table show the pipeline staging of the same sequence of instructions, but for a processor without out-of-order capability. Scheduling for this processor has another constraint that the out-of-order processor does not: stages cannot be scheduled out of order. For example, the execute cycle for any instruction must occur after the execute cycle for the previous instruction.

Reference: Videos: Pipelines

Your task is to complete Table 6.1, indicating all the dependencies, and schedule the pipeline for the two different processors. If an instruction doesn't have a given dependency, place a dash (-) in that cell, as shown in the pre-filled rows.

Now assume that the out-of-order processor operates at 2 GHz and the in-order processor operates at 2.4 GHz.
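The RAW, WAR, and WAW rules described above can be checked mechanically. Here is a minimal Python sketch, not the official solution. The four-instruction program is made up for illustration (the real Table 6.1 is only in the screenshot), and `Instr` and `find_dependencies` are hypothetical names. It assumes each column holds the most recent earlier instruction that matches, found by searching backwards, with a dash when there is none.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    name: str               # label such as "I1"
    op: str                 # operation code
    dest: str               # destination register (Op1)
    srcs: Tuple[str, ...]   # source registers (Op2, Op3); for LOAD, the address register


def find_dependencies(program):
    """Return {instruction name: {"RAW": ..., "WAR": ..., "WAW": ...}}."""
    table = {}
    for i, cur in enumerate(program):
        earlier = program[:i][::-1]  # search backwards: most recent instruction first

        # RAW: for each source register, the most recent earlier writer of it.
        raw = []
        for src in cur.srcs:
            writer = next((p.name for p in earlier if p.dest == src), None)
            if writer is not None and writer not in raw:
                raw.append(writer)

        # WAR: most recent earlier instruction that reads our destination.
        war = next((p.name for p in earlier if cur.dest in p.srcs), None)

        # WAW: most recent earlier instruction that writes our destination.
        waw = next((p.name for p in earlier if p.dest == cur.dest), None)

        table[cur.name] = {
            "RAW": ",".join(raw) or "-",
            "WAR": war or "-",
            "WAW": waw or "-",
        }
    return table


# Hypothetical program, NOT the one in Table 6.1. It only mirrors the worked
# example in the text: I3 reads R6, which I1 writes.
program = [
    Instr("I1", "LOAD", "R6", ("R2",)),
    Instr("I2", "ADD", "R1", ("R2", "R3")),
    Instr("I3", "SUB", "R2", ("R6", "R4")),
    Instr("I4", "ADD", "R6", ("R2", "R5")),
]

deps = find_dependencies(program)
for name, row in deps.items():
    print(name, row)
```

For this made-up program, I3 gets RAW = I1 (through R6), matching the example in the problem text. How the WAR column should be filled when several earlier instructions read the same register is an assumption here; check it against the pre-filled rows in the table.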
From the results in Table 6.1, find the number of cycles required for each processor to execute the benchmark program. Notice that the cycles are numbered beginning with 0, so be sure to include that cycle when counting how many clock cycles are required. Also calculate how long each processor takes to execute the above benchmark instructions (in ps).
2 GHz out-of-order processor: ___ cycles, execute in ___ ps (round to nearest ps)
2.4 GHz in-order processor: ___ cycles, execute in ___ ps (round to nearest ps)
Notice that the out-of-order processor may be faster, even though it operates at a slower clock speed.
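The timing part is plain arithmetic once the table is filled in: the cycle count is the last scheduled cycle number plus one (because numbering starts at 0), and one cycle lasts 1000 / f picoseconds for a clock of f GHz. A small Python sketch follows. `execution_time_ps` is a hypothetical helper, and the last-cycle numbers passed to it are placeholders, not the answers for Table 6.1.

```python
def execution_time_ps(last_cycle_index, clock_ghz):
    """Return (cycle count, run time in ps rounded to the nearest ps)."""
    cycles = last_cycle_index + 1          # cycles are numbered from 0
    time_ps = cycles * 1000.0 / clock_ghz  # 1 GHz clock -> 1000 ps per cycle
    return cycles, round(time_ps)


# Placeholder inputs only: replace 19 and 24 with the last write-back cycle
# read off the completed table for each processor.
print(execution_time_ps(19, 2.0))  # (20, 10000): 20 cycles at 500 ps each
print(execution_time_ps(24, 2.4))  # (25, 10417): 25 cycles at about 416.67 ps each
```

With these placeholder numbers the 2 GHz machine finishes first despite the slower clock, which is the effect the last sentence of the problem points at.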
Explanation / Answer
HEY IT IS SO TOUGH