Operation of Two-Level Memory The locality property can be exploited in the form


Question

Operation of Two-Level Memory

The locality property can be exploited in the formation of a two-level memory. The upper-level memory (M1) is smaller, faster, and more expensive (per bit) than the lower-level memory (M2). M1 is used as a temporary store for part of the contents of the larger M2. When a memory reference is made, an attempt is made to access the item in M1. If this succeeds, then a quick access is made. If not, then a block of memory locations is copied from M2 to M1 and the access then takes place via M1. Because of locality, once a block is brought into M1, there should be a number of accesses to locations in that block, resulting in fast overall service.

To express the average time to access an item, we must consider not only the speeds of the two levels of memory, but also the probability that a given reference can be found in M1. We have

Ts = H * T1 + (1 - H) * (T1 + T2) = T1 + (1 - H) * T2    (4.2)

where Ts = average (system) access time

T1 = access time of M1 (e.g., cache, disk cache)

T2 = access time of M2 (e.g., main memory, disk)

H = hit ratio (fraction of time reference is found in M1)
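
As a minimal sketch of equation (4.2), the function below evaluates the average access time from H, T1 and T2. The function name and the sample values (T1 = 0.1, T2 = 1.0, H = 0.95, in arbitrary but matching time units) are assumptions chosen for illustration and do not come from the text.

```python
def two_level_access_time(hit_ratio: float, t1: float, t2: float) -> float:
    """Average access time Ts from equation (4.2).

    hit_ratio -- H, fraction of references found in M1
    t1, t2    -- access times of M1 and M2, in the same time unit
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must lie in [0, 1]")
    # A hit costs T1; a miss costs T1 + T2, which simplifies to the line below.
    return t1 + (1.0 - hit_ratio) * t2


# Hypothetical values, not taken from the text.
H, T1, T2 = 0.95, 0.1, 1.0
ts = two_level_access_time(H, T1, T2)

# The unsimplified form of (4.2) should give the same result.
ts_long_form = H * T1 + (1.0 - H) * (T1 + T2)
print(round(ts, 3), round(ts_long_form, 3))
```

With a high hit ratio, Ts stays close to T1 even though T2 is ten times larger, which is the effect the locality argument predicts.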

Let TM be the time to access an item in main memory

Let H1 be the hit rate for level 1 cache.

Let M2 be the miss rate for level 2 cache.

Also consider that the average memory access time per instruction is

AMAT = Time for a hit + Miss rate x Miss penalty
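
The AMAT relation in its standard form, hit time plus miss rate times miss penalty (the form the worked solution further down also uses), can be sketched as a one-line function. The function name and the sample numbers are hypothetical and are deliberately different from the values in the questions that follow.

```python
def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Average memory access time.

    hit_time and miss_penalty must share a unit (cycles or ns);
    miss_rate is a fraction between 0 and 1.
    """
    return hit_time + miss_rate * miss_penalty


# Hypothetical cache: 1-cycle hit, 2% miss rate, 50-cycle miss penalty.
example_cycles = amat(hit_time=1.0, miss_rate=0.02, miss_penalty=50.0)
print(example_cycles)
```

The result is in whatever unit the hit time and miss penalty were given in; multiply by the clock cycle time to convert cycles to nanoseconds.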

a. Express the AMAT for a memory access that is a miss in L1 but a hit in L2 in terms of H1, T1 and T2:

_________________________________________________

b. Find the AMAT for a processor with a 1 ns clock cycle time, a miss penalty of 20 clock cycles, a miss rate of 0.05 misses per instruction, and a cache access time (including hit detection) of 1 clock cycle. Assume that the read and write miss penalties are the same and ignore other write stalls. Answer in clock cycles and show the equation you used.

_______________________________________________________

c. If you have a 4 GHz system, how long is a clock cycle in nanoseconds?

__________________________________________________________________

d. Assume you have a 4 GHz CPU with a CPI of 1.0 (instructions on this CPU require 1.0 cycles per instruction on average) and a main memory access time of 100 ns. How many clock cycles are required for a main memory access?

__________________________________________________________________

e. If the system in (d) misses 2.5% of the time, what’s the new effective CPI with one level of cache? Use the equation Total CPI = Base CPI + Memory-stall cycles per instruction

Total CPI = _________________________________________ cycles per instruction

Add to the system in (d) a second level of cache. Now a miss in L1 can be satisfied by a hit in L2 or by main memory. This L2 cache has an access time of 5 ns for either a hit or a miss, and is large enough to reduce the miss rate to main memory to 0.8%.

f. How many clock cycles are there for a miss penalty for an access to the second-level cache?

__________________________________________________________________

Continuing with the same system, if a miss is satisfied in the secondary cache, then this is the entire miss penalty. If the miss needs to go to main memory, then the total miss penalty is the sum of the secondary cache access time and the main memory access time.

g. What is the total access time for an access that is a miss in L1 and L2 but a hit in main memory? Expand the Total CPI equation from (e) above to add memory stalls for L2.
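
One way to read the two-level penalty rule described above is sketched here: every L1 miss pays the L2 access time, and the fraction of instructions that also miss in L2 pays the main memory access time on top of that. This is a sketch under stated assumptions, not the intended answer key. It assumes both miss rates are expressed per instruction and that the "miss rate to main memory" is a global rate. The function names and all sample values are hypothetical and differ from the numbers in the question.

```python
def ns_to_cycles(time_ns: float, clock_ghz: float) -> float:
    """Convert a latency in ns to clock cycles (1 GHz = 1 cycle per ns)."""
    return time_ns * clock_ghz


def total_cpi_two_level(base_cpi: float,
                        l1_miss_rate: float,
                        l2_access_cycles: float,
                        global_miss_rate: float,
                        mem_access_cycles: float) -> float:
    """Total CPI = Base CPI + L2 stall cycles + main-memory stall cycles.

    Assumption: miss rates are per instruction; global_miss_rate is the
    fraction of instructions that miss in both L1 and L2.
    """
    l2_stalls = l1_miss_rate * l2_access_cycles        # paid by every L1 miss
    mem_stalls = global_miss_rate * mem_access_cycles  # paid in addition by L2 misses
    return base_cpi + l2_stalls + mem_stalls


# Hypothetical system: 2 GHz clock, 10 ns L2, 150 ns main memory.
l2_cycles = ns_to_cycles(10.0, 2.0)    # 20 cycles
mem_cycles = ns_to_cycles(150.0, 2.0)  # 300 cycles
cpi = total_cpi_two_level(1.0, 0.04, l2_cycles, 0.01, mem_cycles)
print(round(cpi, 2))
```

Under this reading, an access that misses both caches costs l2_cycles + mem_cycles in total, which matches the rule that the full penalty is the sum of the secondary cache and main memory access times.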

---------------------------------------------------------------------------------

Explanation / Answer

(4 pts) Consider a processor with a 2 ns clock cycle, a miss penalty of 20 clock cycles, a miss rate of 0.05 misses per instruction, and a cache access time (hit time) of 1 clock cycle. Assume that the read and write miss penalties are the same.

a) (1 pt) Find the average memory access time (AMAT).

b) (1 pt) Suppose we can improve the miss rate to 0.03 misses per instruction by doubling the cache size. However, this causes the cache access time to increase to 1.2 cycles. Using the AMAT as a metric, determine if this is a good trade-off.

c) (2 pts) If the cache access time determines the processor’s clock cycle time, which is often the case, AMAT may not correctly indicate whether one cache organization is better than another. If the processor’s clock cycle time must be changed to match that of a cache, is this a good trade-off? Assume that the processors in part (a) and (b) are identical, except for the clock rate and the cache miss rate. Assume 1.5 references per instruction (for both I-cache and D-cache) and a CPI without cache misses of 2. The miss penalty is 20 cycles for both processors.

Solution:

a) AMAT = Hit time + Miss rate × Miss penalty = 2 ns + 0.05 × (20 × 2 ns) = 4 ns

b) AMAT = 1.2 × 2 ns + 0.03 × 20 × 2 ns = 2.4 ns + 1.2 ns = 3.6 ns

Yes, this is a good trade-off.

c) CPU time = Clock cycle × IC × (CPI_ideal-cache + cache stall cycles per instruction)

CPU time(a) = 2 ns × IC × (2 + 1.5 × 20 × 0.05) = 7 ns × IC

CPU time(b) = 2.4 ns × IC × (2 + 1.5 × 20 × 0.03) = 6.96 ns × IC

The CPU times in parts (a) and (b) are almost identical. Hence, doubling the cache size to improve the miss rate at the expense of stretching the clock cycle results in essentially no net gain.
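
The arithmetic in this solution can be checked with a short script. All numeric inputs below are the ones stated in the problem (2 ns clock, 20-cycle miss penalty, miss rates 0.05 and 0.03, hit times 1 and 1.2 cycles, 1.5 references per instruction, base CPI of 2). The variable names are illustrative only.

```python
CLOCK_NS = 2.0             # base clock cycle time
MISS_PENALTY_CYCLES = 20   # same for both designs
REFS_PER_INSTR = 1.5       # memory references per instruction
BASE_CPI = 2.0             # CPI without cache misses

# Parts (a) and (b): AMAT = hit time + miss rate x miss penalty, in ns.
amat_a = 1.0 * CLOCK_NS + 0.05 * MISS_PENALTY_CYCLES * CLOCK_NS
amat_b = 1.2 * CLOCK_NS + 0.03 * MISS_PENALTY_CYCLES * CLOCK_NS

# Part (c): the clock cycle stretches to match the slower cache in design (b).
clock_a = CLOCK_NS
clock_b = 1.2 * CLOCK_NS
time_per_instr_a = clock_a * (BASE_CPI + REFS_PER_INSTR * MISS_PENALTY_CYCLES * 0.05)
time_per_instr_b = clock_b * (BASE_CPI + REFS_PER_INSTR * MISS_PENALTY_CYCLES * 0.03)

print(f"AMAT (a) = {amat_a:.2f} ns, AMAT (b) = {amat_b:.2f} ns")
print(f"CPU time per instruction: (a) {time_per_instr_a:.2f} ns, "
      f"(b) {time_per_instr_b:.2f} ns")
```

AMAT alone favors the larger cache by about 10%, while CPU time per instruction differs by well under 1%, which is why the solution calls the trade-off essentially a wash once the clock cycle is stretched.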
