4.9 For its on-chip cache, the Intel 80486 uses a replacement algorithm referred
ID: 665392 • Letter: 4
Question
4.9 For its on-chip cache, the Intel 80486 uses a replacement algorithm referred to as
pseudo least recently used. Associated with each of the 128 sets of four lines (labeled
L0, L1, L2, L3) are three bits B0, B1, and B2. The replacement algorithm works as follows:
When a line must be replaced, the cache will first determine whether the most
recent use was from L0 and L1 or L2 and L3. Then the cache will determine which
of the pair of blocks was least recently used and mark it for replacement. Figure 4.20
illustrates the logic.
a. Specify how the bits B0, B1, and B2 are set and then describe in words how they
are used in the replacement algorithm depicted in Figure 4.20.
b. Show that the 80486 algorithm approximates a true LRU algorithm. Hint: Consider
the case in which the most recent order of usage is L0, L2, L3, L1.
c. Demonstrate that a true LRU algorithm would require 6 bits per set.
Explanation / Answer
The bits are set according to the following rules with each access to the set
1. If the access is to L0 or L1, B0 <-1.
2. If the access is to L0, B1<-1.
3. If the access is to L1,B1<-0.
4. If the access is to L2 or L3, B0<-0.
5.If the access is to L2, B2<-1.
6. If the access is to L3, B2<-0.
When a line must be replaced, the cache will first determine whether the most recent use was from L0 and L1 or L2 and L3. Then the cache will determine which of the pair of blocks was least recently used and mark it for replacement. When the cache is initialized or flushed all 128 sets of three LRU bits are set to zero.
Least Recently Used (LRU)Discards the least recently used items first. This algorithm requires keeping track of what was used when, which is expensive if one wants to make sure the algorithm always discards the least recently used item. General implementations of this technique require keeping "age bits" for cache-lines and track the "Least Recently Used" cache-line based on age-bits. In such an implementation, every time a cache-line is used, the age of all other cache-lines changes. LRU is actually a family of caching algorithms with members including 2Q by Theodore Johnson and Dennis Shasha,[3] and LRU/K by Pat O'Neil, Betty O'Neil and Gerhard Weikum
b) The 80486 divides the four lines in a set into two pairs (L0, L1 and L2, L3). Bit B0 is used to select the pair that has been least recently used. Within each pair, one bit is used to determine which member of the pair was least recently used. However, the ultimate selection only approximates LRU. Consider the case in which the order of use was: L0, L2, L3, L1. The least recently used pair is (L2,L3) and the least recently used member of that pair is L2, which is selected for replacement. However, the least recently used line of all is L0. Depending on the access history, the algorithm will always pick the least recently used entry or the second least recently used entry.
c) The most straight forward way to implement true LRU for a four line set is to associate a two bit counter with each line. When an access occurs, the counter for that block is set to 0; all counters with values lower than the original value for the accessed block are incremented by 1. When a miss occurs and the set is not full, a new block is brought in, its counter is set to 0 and all other counters are incremented by 1. When a miss occurs and the set is full, the block with counter value 3 is replaced; its counter is set to 0 and all other counters are incremented by 1. This approach requires a total of 8 bits. In general, for a set of N blocks, the above approach requires 2N bits. Amore efficient scheme can be designed which requires only N(N–1)/2 bits. The scheme operates as follows. Consider a matrix R with N rows and N columns, and take the upper-right triangular portion of the matrix, not counting the diagonal. For N = 4, we have the following layout:
R(1,2)
R(1,3)
R(1,4)
R(2,3)
R(2,4)
R(3,4)
When l ine I is referenced, row I of R(I,J) is set to 1, and column I of R(J,I) is set to 0. The LRU block is the one for which the row is entirely equal to 0 (for those bits in the row; the row may be empty) and for which the column is entirely 1 (for all the bits in the column; the column may be empty). As can be seen for N= 4, a total of 6 bits are required
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.