Question
2. Consider the following loop nest, which we would like to run on two different machines: a MIMD machine and a SIMD machine.
for (i=0; i<1000; i++)
    for (j=0; j<1000; j++)
        X[i][j] = Y[j][j] + 100;
2.1 The above loop nest can be parallelized on both MIMD and SIMD machines without correctness issues. Why is that?
2.2 If the above loop nest is run on a quad-core MIMD machine, how would you partition the loop iterations among the cores? Provide pseudocode that shows your parallelization scheme. What is the potential speedup on this machine?
2.3 Assuming the above loop nest is run on a SIMD machine with 128-bit vector registers, write pseudocode, using your own SIMD instructions, that parallelizes the loop.
Explanation / Answer
1.
Every iteration (i, j) writes a different element, X[i][j], and reads only Y[j][j], which the loop never writes. Assuming X and Y do not overlap in memory, there are no loop-carried dependences: no iteration consumes a value produced by another iteration.
Because the iterations are independent, they can run in any order, or at the same time, and still give the sequential result. A MIMD machine can run different iterations on different cores, and a SIMD machine can apply the same add to several elements at once.
In practice, an OpenMP loop directive is the easiest way to express the MIMD version. This independence is why the loop nest can be parallelized on both MIMD and SIMD without correctness issues.
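One way to check the independence argument is to run the same iterations in a different order and compare the results. The sketch below is only an illustration: the function names and the test data are assumptions, not part of the question.

```c
#include <string.h>

#define N 1000  /* loop bound from the question */

/* Reference: the loop nest exactly as given, in sequential order. */
void run_forward(int X[N][N], int Y[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            X[i][j] = Y[j][j] + 100;
}

/* Same iterations, visited in reverse order. */
void run_reversed(int X[N][N], int Y[N][N]) {
    for (int i = N - 1; i >= 0; i--)
        for (int j = N - 1; j >= 0; j--)
            X[i][j] = Y[j][j] + 100;
}

/* Returns 1 if both orders produce the same X, 0 otherwise. */
int orders_agree(void) {
    static int Y[N][N], A[N][N], B[N][N];
    for (int r = 0; r < N; r++)
        for (int c = 0; c < N; c++)
            Y[r][c] = r * 7 + c;   /* arbitrary test data (assumption) */
    run_forward(A, Y);
    run_reversed(B, Y);
    return memcmp(A, B, sizeof A) == 0;
}
```

Agreement between two orders on one input is not a proof, but it is consistent with the dependence argument: no iteration reads anything another iteration writes.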
2.
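The ultimate goal is to keep all four cores busy for as much of the execution as possible. With a poorly balanced workload, some cores sit idle while others finish, and scheduling, context switching, and synchronization add further overhead.
A poorly balanced workload is usually caused by variation in the computing time of the loop iterations, which is often hard to determine by examining the code. Here there is no such variation: every iteration performs one load, one add, and one store. A static partition is therefore enough. Split the 1000 values of i into four contiguous blocks of 250 and give core c the rows 250c through 250c + 249. Each core runs the full inner loop over j for its own rows, and no synchronization is needed until all cores have finished.
The potential speedup is close to 4x, the number of cores. In practice it may be lower, because the loop does very little arithmetic per memory access and the cores share memory bandwidth.
Variable-sized partitions are meant for loops whose iteration times differ. For example, given a loop with A = 786, Z = 5, and O = 30, the partition of the loop is {200, 189, 175, 96, 73, 25, 82, 128}; when (pi)4 is smaller than 85, it gets clipped to 76. The loop in this question does not need such a scheme.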
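A static block partition for the quad-core case can be sketched as follows. This is a minimal illustration: `core_work`, `run_all_cores`, and the 0-based `core_id` are hypothetical names, and the four calls run one after another here only to keep the example portable.

```c
#include <assert.h>

#define N     1000   /* loop bound from the question */
#define CORES 4      /* quad-core MIMD machine */

/* Work assigned to one core: a contiguous block of N / CORES rows of X.
 * core_id is a hypothetical 0-based core index. */
void core_work(int core_id, int X[N][N], int Y[N][N]) {
    assert(N % CORES == 0);            /* 1000 / 4 = 250 rows per core */
    int rows  = N / CORES;
    int first = core_id * rows;
    for (int i = first; i < first + rows; i++)
        for (int j = 0; j < N; j++)
            X[i][j] = Y[j][j] + 100;
}

/* Stand-in for launching the four cores. A real MIMD program would run
 * each core_work call on its own thread and join them afterwards. */
void run_all_cores(int X[N][N], int Y[N][N]) {
    for (int c = 0; c < CORES; c++)
        core_work(c, X, Y);
}
```

Because each core writes a disjoint set of rows, the calls could run concurrently without locks. Partitioning on i rather than j keeps each core's writes to X in contiguous memory.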
3. Pseudocode for SIMD
Sum of the variables T[0], T[1], …, T[a-1]
SIMD pseudocode
a := ⌈log2 n⌉;
b := ⌈N/2⌉;
q := a mod 2;
for k := 1 step 1 until a begin
    T[j×2^k] := T[j×2^k] + T[j×2^k + 2^(k-1)], (0 ≤ j < b);
    q := (b + q) mod 2;
    b := ⌈(b + q)/2⌉;
    q := q;
end;
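The pseudocode above is a tree reduction for a sum. The loop nest in the question has no reduction, so it can be vectorized directly along j. The sketch below emulates made-up SIMD instructions in plain C, assuming 32-bit integers so that a 128-bit register holds four elements. The type `vec4` and the operations `vsplat`, `vload_strided`, `vadd`, and `vstore` are hypothetical, not the instruction set of any real machine.

```c
#include <assert.h>

#define N     1000   /* loop bound from the question */
#define LANES 4      /* 128-bit register / 32-bit int (assumption) */

/* Hypothetical 128-bit vector register holding four 32-bit ints. */
typedef struct { int lane[LANES]; } vec4;

/* VSPLAT: copy one scalar into every lane. */
vec4 vsplat(int s) {
    vec4 v;
    for (int k = 0; k < LANES; k++) v.lane[k] = s;
    return v;
}

/* VLOAD_STRIDED: load LANES ints spaced `stride` elements apart. */
vec4 vload_strided(const int *p, int stride) {
    vec4 v;
    for (int k = 0; k < LANES; k++) v.lane[k] = p[k * stride];
    return v;
}

/* VADD: lane-wise addition. */
vec4 vadd(vec4 a, vec4 b) {
    vec4 v;
    for (int k = 0; k < LANES; k++) v.lane[k] = a.lane[k] + b.lane[k];
    return v;
}

/* VSTORE: store LANES ints to consecutive addresses. */
void vstore(int *p, vec4 v) {
    for (int k = 0; k < LANES; k++) p[k] = v.lane[k];
}

/* Vectorized loop nest: four values of j per vector instruction. */
void run_simd(int X[N][N], int Y[N][N]) {
    assert(N % LANES == 0);            /* 1000 / 4 = 250, no remainder loop */
    vec4 c100 = vsplat(100);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j += LANES) {
            /* Y[j][j], Y[j+1][j+1], ... lie N + 1 elements apart. */
            vec4 y = vload_strided(&Y[j][j], N + 1);
            vstore(&X[i][j], vadd(y, c100));
        }
}
```

The strided load is needed because the loop reads the diagonal Y[j][j], whose elements are not adjacent in memory. The inner loop now runs 250 vector iterations instead of 1000 scalar ones.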