Problem 1 (10 pts): Discuss the high-level goals of the partition phase for opti
Question
Problem 1 (10 pts): Discuss the high-level goals of the partition phase for optimizing parallel performance, and the conflicts among them.
Problem 2 (15 pts): Discuss the tradeoffs between the two partitioning schemes, i.e., block vs. strip partitioning, in the grid solver problem. The grid is of size n by n and p processors are used to parallelize the sequential problem. HINT: compare the amount of inherent communication, the communication-to-computation ratio, and the amount of artifactual communication between the two schemes. You can assume that the original grid is stored in a 2-D array and that the underlying multiprocessor system has an extended memory hierarchy.
Problem 3 (10 pts): Discuss why finite replication capacity (e.g., CPU cache and memory) can lead to artifactual communication.
Problem 4 (10 pts): For each of the following sources of memory system traffic, suggest the mitigation technique that is most effective at reducing that source of traffic in a shared address space machine: 1) compulsory cache misses, or cold-start traffic; 2) inherent communication; 3) extra data communicated on a miss; 4) capacity-generated communication.
Problem 5 (15 pts): Under what conditions would the sum of busy-useful time across processes not equal the busy-useful time of the sequential program, assuming both the sequential and parallel programs are deterministic? Give examples in which the parallel busy-useful time is larger and smaller than the sequential busy-useful time, respectively.
Problem 6: Suppose a parallel application with n data points has a communication-to-computation ratio of O(IN). Suppose n = 104 words and the average communication latency in a 1000 Mbps (megabits per second) Ethernet for a word of data is 100 processor cycles. Answer the following questions, assuming that a word is 8 bytes, a byte is 8 bits, and a processor cycle is 2 ns.
1) (10 pts) With no latency hidden, for what fraction of the execution time is the processor stalled due to communication latency?
2) (10 pts) What would be the impact on execution time of halving the communication latency?
Explanation / Answer
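For Problem 2, the inherent communication of the two schemes can be compared numerically. The following is a minimal sketch, not a full answer. It assumes a nearest-neighbor grid solver on an n-by-n grid, p processors (with p a perfect square for the block case), and interior partitions that exchange one boundary row or column with each neighbor per sweep. The function names strip_ratio and block_ratio are hypothetical.

```python
import math


def strip_ratio(n: int, p: int) -> float:
    """Communication-to-computation ratio for strip (row) partitioning.

    Assumption: an interior strip exchanges 2 boundary rows of n elements
    and updates n*n/p elements per sweep.
    """
    comm = 2 * n
    comp = n * n / p
    return comm / comp  # algebraically 2p/n


def block_ratio(n: int, p: int) -> float:
    """Communication-to-computation ratio for block partitioning.

    Assumption: p is a perfect square, so each block has side n/sqrt(p);
    an interior block exchanges 4 edges and updates side*side elements.
    """
    side = n / math.sqrt(p)
    comm = 4 * side
    comp = side * side
    return comm / comp  # algebraically 4*sqrt(p)/n


if __name__ == "__main__":
    n, p = 1024, 16
    print("strip:", strip_ratio(n, p))
    print("block:", block_ratio(n, p))
```

Under these assumptions, block partitioning has the lower inherent ratio once p exceeds 4. The sketch ignores artifactual communication: with a row-major 2-D array, a block's column boundaries are not contiguous in memory, so whole cache lines are transferred for single elements, whereas a strip's boundary rows are contiguous.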
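3) Artifactual communication is determined by the program implementation and its interactions with the architecture, not by the algorithm itself.
Communication due to finite replication capacity is the most fundamental artifact. It is analogous to the relationship between cache size and miss rate (or memory traffic) in a uniprocessor: when the data a process reuses does not fit in its local cache or memory, data that was already fetched is evicted and must be communicated again on its next use.
The extended memory hierarchy view, which treats remote memory as a further level of the hierarchy, is useful for reasoning about this relationship.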
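The effect of finite capacity can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: a fully associative cache with LRU replacement, and a hypothetical access trace in which one processor sweeps over the same 8 remote blocks three times. The function name count_misses and the trace are illustrative, not taken from the problem.

```python
from collections import OrderedDict


def count_misses(trace, capacity):
    """Count misses in a fully associative LRU cache of `capacity` blocks.

    Each miss stands for one transfer of a block from remote memory.
    """
    cache = OrderedDict()  # keys are block ids, ordered oldest to newest
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)  # mark as most recently used
        else:
            misses += 1
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return misses


if __name__ == "__main__":
    # Hypothetical trace: three sweeps over the same 8 remote blocks.
    trace = list(range(8)) * 3
    print(count_misses(trace, capacity=8))  # 8: only the first touches miss
    print(count_misses(trace, capacity=4))  # 24: every access misses
```

With room for all 8 blocks, only the 8 first-touch transfers occur. With room for 4, each block is evicted before it is reused, so the same data is transferred on every sweep. The 16 extra transfers are artifactual communication caused purely by limited replication capacity.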