parallel programming: Consider the loop #pragma omp parallel for num_threads(2)
ID: 3676927 • Letter: P
Question
parallel programming:
Consider the loop
#pragma omp parallel for num_threads(2)
for(k = 1; k <= 16; ++k)
a[k] = f(k); // this line costs T(k).
The cost function is as follows:
T(1) = 1, T(5)=5, T(9) = 9, T(13)=13
T(2) = 1, T(6)=3, T(10)=5, T(14)=7
T(3) = 3, T(7)=7, T(11)=11, T(15)=15
T(4) = 1, T(8)=1, T(12)=3, T(16)=1
Given this cost function, calculate the parallel runtime for each of the following schedules, assuming all threads start at the same instant, there is no overhead, and evaluations of function f() are completely independent of each other.
OMP_SCHEDULE="static,1"
OMP_SCHEDULE="static,2"
OMP_SCHEDULE="static,3"
OMP_SCHEDULE="static,16"
OMP_SCHEDULE="dynamic,1"
OMP_SCHEDULE="dynamic,2"
OMP_SCHEDULE="dynamic,3"
OMP_SCHEDULE="dynamic,4"
PS: Please explain what the chunk size is in schedule(). Thank you!
Explanation / Answer
Static-even scheduling
By default, an OpenMP parallel for or worksharing for loop uses static-even scheduling. This means the iterations of a loop are distributed among the threads in a roughly equal number of iterations. If m iterations and N threads are in the thread team, each thread gets m/N iterations, and the compiler and runtime library correctly handles the case when m is not evenly divisible by N.
Here m/N =16/2 = 8 iterations and The format for the OMP_SCHEDULE environment variable is
schedule(kind [, chunk-size])
In OMP_SCHEDULE="static,1"
here, kind of thread is static and chunk size is 1 .So T(1) to T(8) will run in Thread 1 and T(9) to T(16) will run in Thread 2.
Parallel RunTime will be max(Cost function of (1 to 8) ,Cost function of(9 to 16)) = max(22,64) = 64
In OMP_SCHEDULE="static,2"
here, Kind of thread is static and chunk size is 2(16 chunks of 1 b .So T(1) to T(8) will run in Thread 1 and T(9) to T(16) will run in Thread 2.
Parallel RunTime will be max(Cost function of (1 to 8) ,Cost function of(9 to 16)) = max(22,64) = 64
This will "chop" iteration space into blocks of two iterations and there will be 4 such blocks in total.
The chunk size basically tells what will be the size of queue that will hold all schedules which can be static or dynamic.
Schedule Kind Chunk Size Chunks Distribution Static 1 16 chunks of size 1 Static 2 8 chunks of size 2 Static 3 5 chunks of size 3, 6th chunk of size 1 Static 16 1 chunk of size 16 Dynamic 1 16 chunks of size 1 Dynamic 2 8 chunks of size 2 Dynamic 3 5 chunks of size 1, 1 chunk of size 1 Dynamic 4 4 chunks of size 4Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.