Operation of Two-Level Memory
Question
The locality property can be exploited in the formation of a two-level memory. The upper-level memory (M1) is smaller, faster, and more expensive (per bit) than the lower-level memory (M2). M1 is used as a temporary store for part of the contents of the larger M2. When a memory reference is made, an attempt is made to access the item in M1. If this succeeds, then a quick access is made. If not, then a block of memory locations is copied from M2 to M1 and the access then takes place via M1. Because of locality, once a block is brought into M1, there should be a number of accesses to locations in that block, resulting in fast overall service.

To express the average time to access an item, we must consider not only the speeds of the two levels of memory, but also the probability that a given reference can be found in M1. We have

Ts = H * T1 + (1 - H) * (T1 + T2) = T1 + (1 - H) * T2    (4.2)

where Ts = average (system) access time
T1 = access time of M1 (e.g., cache, disk cache)
T2 = access time of M2 (e.g., main memory, disk)
H = hit ratio (fraction of time reference is found in M1)
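
As a quick check of equation (4.2), the following Python sketch computes Ts from H, T1, and T2. The function name and the sample values (H = 0.95, T1 = 1 ns, T2 = 100 ns) are illustrative assumptions, not figures from the text.

```python
def average_access_time(hit_ratio: float, t1: float, t2: float) -> float:
    """Average access time Ts of a two-level memory, following equation (4.2)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # A hit costs T1; a miss costs T1 (the failed lookup in M1) plus T2.
    return hit_ratio * t1 + (1.0 - hit_ratio) * (t1 + t2)


# Assumed sample values, chosen only for illustration.
ts = average_access_time(0.95, 1.0, 100.0)
print(f"Ts = {ts:.2f} ns")
```

The same value follows from the simplified form T1 + (1 - H) * T2, since the T1 term is paid on every reference.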
Let TM be the time to access an item in main memory
Let H1 be the hit rate for level 1 cache.
Let M2 be the miss rate for level 2 cache.
Also consider that the average memory access time per instruction is
AMAT = Time for a hit + Miss rate x Miss penalty
a. Express the AMAT for a memory access that is a miss in L1 but a hit in L2 in terms of
H1, T1 and T2:
_________________________________________________
b. Find the AMAT for a processor with a 1 ns clock cycle time, a miss penalty of 20 clock
cycles, a miss rate of 0.05 misses per instruction, and a cache access time (including hit
detection) of 1 clock cycle. Assume that the read and write miss penalties are the same
and ignore other write stalls. Answer in clock cycles and show the equation you used.
_______________________________________________________
c. If you have a 4 GHz system, how long is a clock cycle in nanoseconds?
__________________________________________________________________
d. Assume you have a 4 GHz CPU with a CPI of 1.0 (instructions on this CPU require 1.0
cycles per instruction on average) and a main memory access time of 100 ns. How
many clock cycles are required for a main memory access?
__________________________________________________________________
e. If the system in (d) misses 2.5% of the time, what’s the new effective CPI with one
level of cache? Use the equation Total CPI = Base CPI + Memory-stall cycles per
instruction
Total CPI = _________________________________________ cycles per instruction
Add to the system in (d) a second level of cache. Now a miss in L1 can be satisfied by a
hit in L2 or by main memory. This L2 cache has an access time of 5 ns for either a hit or
a miss, and is large enough to reduce the miss rate to main memory to 0.8%.
f. How many clock cycles are there for a miss penalty for an access to the second-level
cache?
__________________________________________________________________
Continuing with the same system, if a miss is satisfied in the secondary cache, then this
is the entire miss penalty. If the miss needs to go to main memory, then the total miss
penalty is the sum of the secondary cache access time and the main memory access
time.
g. What is the total access time for an access that is a miss in L1 and L2 but a hit in main
memory? Expand the Total CPI equation from (e) above to add memory stalls for L2.
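
One possible way to set up the arithmetic for parts (c) through (g) is sketched below in Python. It rests on two assumptions that the problem statement does not spell out: the 2.5% figure is treated as L1 misses per instruction, and the 0.8% figure is treated as the fraction of instructions that go all the way to main memory. All variable and function names are hypothetical.

```python
def cycles(time_ns: float, clock_ghz: float) -> float:
    """Convert a latency in nanoseconds to clock cycles at the given clock rate."""
    # One cycle lasts 1 / clock_ghz nanoseconds, so cycles = time_ns * clock_ghz.
    return time_ns * clock_ghz


CLOCK_GHZ = 4.0
BASE_CPI = 1.0
MAIN_MEMORY_NS = 100.0
L2_ACCESS_NS = 5.0
L1_MISS_RATE = 0.025       # assumption: misses per instruction
GLOBAL_MISS_RATE = 0.008   # assumption: fraction of instructions reaching main memory

cycle_ns = 1.0 / CLOCK_GHZ                           # part (c)
main_penalty = cycles(MAIN_MEMORY_NS, CLOCK_GHZ)     # part (d)
cpi_one_level = BASE_CPI + L1_MISS_RATE * main_penalty  # part (e)
l2_penalty = cycles(L2_ACCESS_NS, CLOCK_GHZ)         # part (f)

# Part (g): every L1 miss pays the L2 access; misses that also miss in L2
# pay the main memory access on top of it.
cpi_two_level = (BASE_CPI
                 + L1_MISS_RATE * l2_penalty
                 + GLOBAL_MISS_RATE * main_penalty)

print(f"cycle time          = {cycle_ns:.2f} ns")
print(f"main memory penalty = {main_penalty:.0f} cycles")
print(f"L2 penalty          = {l2_penalty:.0f} cycles")
print(f"CPI, L1 only        = {cpi_one_level:.2f}")
print(f"CPI, L1 + L2        = {cpi_two_level:.2f}")
```

Under these assumptions, the penalty for an access that misses in both caches is the sum of the two penalties, l2_penalty + main_penalty cycles.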
---------------------------------------------------------------------------------
Explanation / Answer
(4 pts) Consider a processor with a 2 ns clock cycle, a miss penalty of 20 clock cycles, a miss rate of 0.05 misses per instruction, and a cache access time (hit time) of 1 clock cycle. Assume that the read and write miss penalties are the same.
a) (1 pt) Find the average memory access time (AMAT).
b) (1 pt) Suppose we can improve the miss rate to 0.03 misses per instruction by doubling the cache size. However, this causes the cache access time to increase to 1.2 cycles. Using the AMAT as a metric, determine if this is a good trade-off.
c) (2 pts) If the cache access time determines the processor’s clock cycle time, which is often the case, AMAT may not correctly indicate whether one cache organization is better than another. If the processor’s clock cycle time must be changed to match that of a cache, is this a good trade-off? Assume that the processors in part (a) and (b) are identical, except for the clock rate and the cache miss rate. Assume 1.5 references per instruction (for both I-cache and D-cache) and a CPI without cache misses of 2. The miss penalty is 20 cycles for both processors.

Solution:
a) AMAT = Hit time + Miss rate × Miss penalty = 2 ns + 0.05 × (20 × 2 ns) = 4 ns
b) AMAT = 1.2 × 2 ns + 0.03 × 20 × 2 ns = 2.4 ns + 1.2 ns = 3.6 ns
Yes, this is a good trade-off.
c) CPU time = Clock cycle × IC × (CPI_ideal-cache + cache stall cycles per instruction)
CPU time(a) = 2 ns × IC × (2 + 1.5 × 20 × 0.05) = 7 ns × IC
CPU time(b) = 2.4 ns × IC × (2 + 1.5 × 20 × 0.03) = 6.96 ns × IC
The CPU times in parts (a) and (b) are almost identical. Hence, doubling the cache size to improve the miss rate at the expense of stretching the clock cycle results in essentially no net gain.
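
The arithmetic in the solution above can be reproduced with a short Python sketch. The two helper functions and their names are illustrative; the numeric inputs are the ones stated in the worked problem.

```python
def amat_ns(hit_cycles, miss_rate, miss_penalty_cycles, cycle_ns):
    """AMAT = hit time + miss rate * miss penalty, converted to nanoseconds."""
    return (hit_cycles + miss_rate * miss_penalty_cycles) * cycle_ns


def cpu_time_per_instruction_ns(cycle_ns, base_cpi, refs_per_instr,
                                miss_rate, miss_penalty_cycles):
    """CPU time divided by instruction count (IC), in nanoseconds."""
    stall_cycles = refs_per_instr * miss_rate * miss_penalty_cycles
    return cycle_ns * (base_cpi + stall_cycles)


# Part (a): 2 ns clock, 1-cycle hit, 0.05 miss rate, 20-cycle penalty.
amat_a = amat_ns(1.0, 0.05, 20, 2.0)
# Part (b): larger cache, 1.2-cycle hit, 0.03 miss rate, same 2 ns clock.
amat_b = amat_ns(1.2, 0.03, 20, 2.0)

# Part (c): the slower cache stretches the clock cycle from 2 ns to 2.4 ns.
cpu_a = cpu_time_per_instruction_ns(2.0, 2.0, 1.5, 0.05, 20)
cpu_b = cpu_time_per_instruction_ns(2.4, 2.0, 1.5, 0.03, 20)

print(f"AMAT (a) = {amat_a:.2f} ns, AMAT (b) = {amat_b:.2f} ns")
print(f"CPU time per instruction: (a) {cpu_a:.2f} ns, (b) {cpu_b:.2f} ns")
```

AMAT alone favors the larger cache, while the CPU time comparison shows the two designs nearly tied once the longer clock cycle is charged to every instruction.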