b. Determine whether the inner loop of the row-oriented algorithm can be paralle
ID: 3759768 • Letter: B
Question
b. Determine whether the inner loop of the row-oriented algorithm can be
parallelized.
c. Determine whether the (second) outer loop of the column-oriented algorithm
can be parallelized.
d. Determine whether the inner loop of the column-oriented algorithm can be
parallelized.
e. Write one OpenMP program for each of the loops that you determined could
be parallelized. You may find the single directive useful—when a block
of code is being executed in parallel and a sub-block should be executed by
only one thread, the sub-block can be modified by a #pragma omp single
directive. The threads in the executing team will block at the end of the
directive until all of the threads have completed it.
f. Modify your parallel loop with a schedule(runtime) clause and test the
program with various schedules. If your upper triangular system has 10,000
variables, which schedule gives the best performance?
Any help would be greatly appreciated! ( part a is in the picture as well as problem discription)
290 (290 of 391) 150% 295 CStart Presentation Leave Fullscreen 5.4. Recall that when we solve a large linear system, we often use Gaussian elim- ination followed by backward substitution. Gaussian elimination converts an n × n linear systern into an upper triangular linear system by using the "row operations." Add a multiple of one row to another row . Swap two rows .Multiply one row by a nonzero constant An upper triangular system has zeroes below the "diagonal" extending from the upper left-hand corner to the lower right-hand corner For example, the linear system 4x0-5x1 + x2 = 7 200- XI-3X2 = 5 can be reduced to the upper triangular form 2x0- 3xi and this system can be easily solved by first finding x2 using the last equation, then finding xi using the second equation, and finally finding xo using the first equation We can devise a couple of serial algorithms for back substitution. The "row oriented" version is for (row= n-1 ; row)20: row-) { x[row] -b[row]: for (col = row+1; colExplanation / Answer
a)
>The outer loop of the row-oriented algorithm cannot be parallelized.
>This is because the inner loop depends on x[col] where col=row+1.
>If the outer loop was parallelized for two threads where n = 4, then thread1 would assign x[3] and x[2] and thread2 would would assign x[1] and x[0].
>The problem with this is that x[1] is dependent on x[2] (x[col] where col = row+1) and x[2] would not have been assigned yet if thread2 went before thread1.
b)
The inner loop of the row-oriented algorithm can be parallelized,because x[row] - x[n] have been assigned by the outer loop so x[col] (where col = row+1) is not dependent on any of the other threads.
c)
>The second outer loop of the column-oriented algorithm can be parallelized.
>Since x[0..n] have been set in the first outerloop, the only line that had an dependent variable is x[row] -= A[row][col]*x[col] in the inner loop.
>However since x[col] is set by the same thread no threads should need to depend on each other
d)
>The inner loop of the column-oriented algorithm can also be parallelized. This is for the same reason why the second outer loop of the column-oriented algorithm can be parallelized.
>x[col] is the only thing that the inner loop depends on, and since x[0-n] are already assigned in the first outer loop the inner loop has to be parallelizable.
f)
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <omp.h>
int n = 1000;
int column_orient(int A[n][n], int b[n], int x[n], int thread_count)
{
int col;
int row;
int d = 0;
while(d < 50)
{
for(row = 0; row < n; row++)
x[row] = b[row];
for(col = n-1; col >= 0; col--)
{
x[col] /= A[col][col];
for(row = 0; row < col; row++)
x[row] -= A[row][col]*x[col];
}
}
return 0;
}
int row_orient(int A[n][n], int b[n], int x[n], int thread_count)
{
int col;
int row;
int d = 0;
while(d < 50)
{
for(row = n-1; row >= 0; row--)
{
x[row] = b[row];
for(col=row+1; col < n; col++)
{
x[row] -= A[row][col]*x[col];
}
x[row] /= A[row][row];
}
d++;
}
return 0;
}
int main(int argc, char *argv[])
{
int x[n];
int b[n];
int A[n][n];
int i;
int t;
srand(time(NULL));
int r;
for(i = 0; i < n; i++)
{
r = rand() ;
b[i] = r;
}
for(i = 0; i < n; i++)
{
for(t = 0; t < n ; t++)
{
r = rand();
A[i][t] = r;
}
}
int thread_count = strtol(argv[1], NULL, 10);
float start = omp_get_wtime();
if(row_orient(A, b, x, thread_count) == 0)
printf("row orient is done ");
float end = omp_get_wtime();
printf("Time row orient is: %f ", end-start);
start = omp_get_wtime();
if(column_orient(A, b, x, thread_count) == 0)
printf("column orient is done ");
end = omp_get_wtime();
printf("Time column orient is: %f ", end-start);
return 0;
}
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.