


Question

General-purpose processors are optimized for general-purpose computing. That is, they are optimized for behavior that is generally found across a large number of applications. However, once the domain is restricted somewhat, the behavior found across a large number of the target applications may differ from that of general-purpose applications. One such application is deep learning, or neural networks. Deep learning can be applied to many different applications, but the fundamental building block of inference (using the learned information to make decisions) is the same across them all. Inference operations are largely parallel, so they are currently performed on graphics processing units, which are specialized toward this type of computation in general and not toward inference in particular. In a quest for more performance per watt, Google has created a custom chip using tensor processing units to accelerate inference operations in deep learning. This approach can be used for speech recognition and image recognition, for example. This problem explores the trade-offs between this custom chip (the TPU), a general-purpose processor (Haswell E5-2699 v3), and a GPU (NVIDIA K80) in terms of performance and cooling. If heat is not removed from the computer efficiently, the fans will blow hot air back onto the computer, not cold air. Note: The differences go beyond the processor; on-chip memory and DRAM also come into play. Therefore, statistics are at a system level, not a chip level.

(a) If Google’s data center spends 70% of its time on workload A and 30% of its time on workload B when running GPUs, what is the speedup of the TPU system over the GPU system?

(b) If Google’s data center spends 70% of its time on workload A and 30% of its time on workload B when running GPUs, what percentage of Max IPS does it achieve for each of the three systems?

(c) Building on (b), assuming that the power scales linearly from idle to busy power as IPS grows from 0% to 100%, what is the performance per watt of the TPU system over the GPU system?

(d) If another data center spends 40% of its time on workload A, 10% of its time on workload B, and 50% of its time on workload C, what are the speedups of the GPU and TPU systems over the general-purpose system?

(e) A cooling door for a rack costs $4000 and dissipates 14 kW (into the room; additional cost is required to get it out of the room). How many Haswell-, NVIDIA-, or Tensor-based servers can you cool with one cooling door, assuming the TDP values in Figures 1 and 2?

(f) Typical server farms can dissipate a maximum of 200 W per square foot. Given that a server rack requires 11 square feet (including front and back clearance), how many servers from part (e) can be placed on a single rack, and how many cooling doors are required?

Figure 1: Power usage of the three systems in Problem 4

| System | Chip | TDP | Idle Power | Busy Power |
|---|---|---|---|---|
| General-purpose | Haswell E5-2699 v3 | 504 W | 159 W | 455 W |
| GPU | NVIDIA K80 | 1838 W | 357 W | 991 W |
| Custom ASIC | TPU | 861 W | 290 W | 384 W |


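Figure 2: Performance characteristics of the three systems in Problem 4

| System | Chip | Throughput A | Throughput B | Throughput C | % Max IPS A | % Max IPS B | % Max IPS C |
|---|---|---|---|---|---|---|---|
| General-purpose | Haswell E5-2699 v3 | 5482 | 13194 | 12000 | 42% | 100% | 90% |
| GPU | NVIDIA K80 | 13461 | 36465 | 15000 | 37% | 100% | 40% |
| Custom ASIC | TPU | 225000 | 280000 | 2000 | 80% | 100% | 1% |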

Explanation / Answer

Figure 1 gives the hardware characteristics of the general-purpose processor-based, graphics processing unit-based, and custom ASIC-based systems, including measured power (cite ISCA paper).
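The cooling questions, parts (e) and (f), depend only on the TDP column of Figure 1, so the arithmetic can be sketched in a few lines of Python. This is one possible reading, not a definitive solution: it assumes every server dissipates its full TDP, that partial servers are rounded down, and that the rack's power budget is simply its floor area times the 200 W per square foot limit. The function and constant names are illustrative only.

```python
import math

# TDP per server in watts, copied from Figure 1.
TDP_W = {"Haswell": 504, "NVIDIA K80": 1838, "TPU": 861}

COOLING_DOOR_W = 14_000   # one cooling door dissipates 14 kW
RACK_AREA_SQFT = 11       # floor area per rack, including clearance
MAX_W_PER_SQFT = 200      # dissipation limit of a typical server farm


def servers_per_door(tdp_w):
    """Whole servers whose combined TDP fits within one cooling door (part e)."""
    return COOLING_DOOR_W // tdp_w


def servers_per_rack(tdp_w):
    """Whole servers that fit within the rack's floor-area power budget (part f)."""
    return (RACK_AREA_SQFT * MAX_W_PER_SQFT) // tdp_w


def doors_per_rack(tdp_w):
    """Cooling doors needed for the heat of one rack (assumption: full TDP each)."""
    heat_w = servers_per_rack(tdp_w) * tdp_w
    return math.ceil(heat_w / COOLING_DOOR_W)


per_door = {name: servers_per_door(tdp) for name, tdp in TDP_W.items()}
per_rack = {name: servers_per_rack(tdp) for name, tdp in TDP_W.items()}
doors = {name: doors_per_rack(tdp) for name, tdp in TDP_W.items()}

for name in TDP_W:
    print(f"{name}: {per_door[name]} per door, "
          f"{per_rack[name]} per rack, {doors[name]} door(s) per rack")
```

Under these assumptions one door cools 27 Haswell, 7 K80, or 16 TPU servers, while the 2200 W floor-area budget limits a rack to 4, 1, or 2 servers respectively, so a single cooling door is more than enough per rack.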

Figure 2 gives the performance characteristics of the general-purpose processor-based, graphics processing unit-based, and custom ASIC-based systems on two neural-net workloads (cite ISCA paper). Workloads A and B are from published results. Workload C is a fictional, more general-purpose application.
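For the performance questions, parts (a) through (d), a minimal Python sketch follows. It rests on several assumptions that the problem statement leaves open: the time fractions are measured on the baseline system (the GPU in part (a), the general-purpose system in part (d)), the speedup is combined Amdahl-style by scaling each time slice by the throughput ratio, and in parts (b) and (c) the same 70/30 time split is applied to every system. All names are illustrative.

```python
# Throughput and % of max IPS per workload, copied from Figure 2.
THROUGHPUT = {
    "Haswell": {"A": 5482, "B": 13194, "C": 12000},
    "GPU": {"A": 13461, "B": 36465, "C": 15000},
    "TPU": {"A": 225000, "B": 280000, "C": 2000},
}
PCT_MAX_IPS = {
    "Haswell": {"A": 42, "B": 100, "C": 90},
    "GPU": {"A": 37, "B": 100, "C": 40},
    "TPU": {"A": 80, "B": 100, "C": 1},
}
# (idle, busy) power in watts, copied from Figure 1.
POWER_W = {"Haswell": (159, 455), "GPU": (357, 991), "TPU": (290, 384)}


def speedup(mix, base, new):
    """Speedup of `new` over `base`; `mix` holds time fractions on `base`."""
    new_time = sum(frac * THROUGHPUT[base][w] / THROUGHPUT[new][w]
                   for w, frac in mix.items())
    return 1.0 / new_time


def utilization(mix, system):
    """Time-weighted fraction of max IPS (assumes the same mix on every system)."""
    return sum(frac * PCT_MAX_IPS[system][w] for w, frac in mix.items()) / 100


def power(mix, system):
    """Linear interpolation between idle and busy power at the mix's utilization."""
    idle, busy = POWER_W[system]
    return idle + (busy - idle) * utilization(mix, system)


mix_ab = {"A": 0.7, "B": 0.3}
mix_abc = {"A": 0.4, "B": 0.1, "C": 0.5}

tpu_vs_gpu = speedup(mix_ab, "GPU", "TPU")                         # part (a)
util = {s: utilization(mix_ab, s) for s in THROUGHPUT}             # part (b)
perf_per_watt = tpu_vs_gpu * power(mix_ab, "GPU") / power(mix_ab, "TPU")  # part (c)
gpu_vs_cpu = speedup(mix_abc, "Haswell", "GPU")                    # part (d)
tpu_vs_cpu = speedup(mix_abc, "Haswell", "TPU")                    # part (d)

print(f"(a) TPU over GPU speedup: {tpu_vs_gpu:.2f}")
print("(b) % of max IPS:", {s: round(100 * u, 1) for s, u in util.items()})
print(f"(c) TPU over GPU performance per watt: {perf_per_watt:.2f}")
print(f"(d) GPU over Haswell: {gpu_vs_cpu:.2f}, TPU over Haswell: {tpu_vs_cpu:.2f}")
```

With this reading the TPU system is roughly 12.4 times faster than the GPU system on the 70/30 mix and delivers about 23.7 times the performance per watt. On the second mix the GPU is about 1.67 times faster than the general-purpose system, while the TPU is slower (about 0.33 times) because workload C dominates its run time. A different interpretation of the time fractions would change these numbers.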

