Table of contents
1. Introduction to LRU
2. Matrix implementation of LRU
3. Using the matrix method to realize the Verilog code of LRU
1. Introduction to LRU
The LRU algorithm is used in cache management and in any other situation where access rights need to be updated periodically. Based on time and space considerations, data items that will be used in the near future are kept in the cache. When the cache is full and a new data item arrives, an existing data item must be cleared from the cache to make room for it. The algorithm commonly used for this is LRU (Least Recently Used): it finds the data item that has gone unused for the longest time, the cache clears that item, and the new data item is written in its place.
Another place where the LRU algorithm is used is the routing table management circuit in a network device. The address space of the routing table is very large, while the memory used to store the routing table in the device is relatively small, so only part of the routing table entries can be held in the CAM (Content Addressable Memory). The LRU algorithm is therefore used to find the entry in the CAM that has gone unused for the longest time. That entry is cleared, a new entry is written in its place, and the new entry becomes the most recently used one.
2. Matrix implementation of LRU
There are several ways to implement the LRU algorithm at the RTL level. One method of implementing LRU in hardware is the matrix method. As an example, consider a table that can store 4 entries and currently holds the entries A, B, C and D. The goal is to determine which entry has gone unaccessed for the longest time. The steps are as follows:
- Build a 4x4 matrix of memory cells (each memory cell is a flip-flop).
- Initialize all flip-flops to zero.
- Whenever a table entry is accessed, set every bit in its row to 1, then set every bit in its column to 0.
- Repeat the previous step each time a table entry is accessed.
- The entry whose row is all zeros is the least recently used entry, and is the one to be replaced by a new entry.
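The update rule in the steps above can also be written compactly. As a sketch, let M_ij be the bit in row i, column j of the N×N matrix, and let a be the index of the accessed entry. These symbols are introduced here for illustration only and do not come from the source.

```latex
M_{ij} \leftarrow
\begin{cases}
0      & \text{if } j = a \\
1      & \text{if } i = a \text{ and } j \neq a \\
M_{ij} & \text{otherwise}
\end{cases}
\qquad
\text{LRU candidates} = \{\, i : M_{ij} = 0 \ \text{for all } j \,\}
```

With this rule, M_ij is 1 exactly when entry i has been accessed since reset and entry j has not been accessed since then. Once every entry has been accessed at least once, exactly one row is all zeros. Before that, every never-accessed entry also has an all-zero row.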
Suppose the access order is A, D, C, A, B. In this case, D is the least recently used entry and it should be replaced. The above algorithm is explained below using a 4×4 matrix.
Initial conditions
  | A | B | C | D |
A | 0 | 0 | 0 | 0 |
B | 0 | 0 | 0 | 0 |
C | 0 | 0 | 0 | 0 |
D | 0 | 0 | 0 | 0 |
Reference sequence: A
  | A | B | C | D |
A | 0 | 1 | 1 | 1 |
B | 0 | 0 | 0 | 0 |
C | 0 | 0 | 0 | 0 |
D | 0 | 0 | 0 | 0 |
Reference sequence: A, D
  | A | B | C | D |
A | 0 | 1 | 1 | 0 |
B | 0 | 0 | 0 | 0 |
C | 0 | 0 | 0 | 0 |
D | 1 | 1 | 1 | 0 |
Reference sequence: A, D, C
  | A | B | C | D |
A | 0 | 1 | 0 | 0 |
B | 0 | 0 | 0 | 0 |
C | 1 | 1 | 0 | 1 |
D | 1 | 1 | 0 | 0 |
Reference sequence: A, D, C, A
  | A | B | C | D |
A | 0 | 1 | 1 | 1 |
B | 0 | 0 | 0 | 0 |
C | 0 | 1 | 0 | 1 |
D | 0 | 1 | 0 | 0 |
Reference sequence: A, D, C, A, B
  | A | B | C | D |
A | 0 | 0 | 1 | 1 |
B | 1 | 0 | 1 | 1 |
C | 0 | 0 | 0 | 1 |
D | 0 | 0 | 0 | 0 |
Row D is all zeros, so D is the least recently used entry. Row B has the most 1s, so B is the most recently used entry.
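The walk-through above can be reproduced in simulation. The following is a minimal, simulation-only sketch and not the RTL from the next section. The module name `lru_matrix_walkthrough`, the task name `touch`, and the mapping of indices 0 to 3 onto entries A to D are assumptions made for this example.

```verilog
// Simulation-only sketch of the 4-entry walk-through.
// Indices 0..3 stand for entries A..D (an assumption of this example).
module lru_matrix_walkthrough;
  reg [0:3] m [0:3];   // m[row][col]; declared [0:3] so %b prints column A first
  integer r;

  // One access: set the accessed row to all 1s, then clear the accessed column.
  task touch;
    input [1:0] idx;
    integer n;
    begin
      m[idx] = 4'b1111;
      for (n = 0; n < 4; n = n + 1)
        m[n][idx] = 1'b0;
      $display("after accessing entry %0d:", idx);
      for (n = 0; n < 4; n = n + 1)
        $display("  row %0d = %b", n, m[n]);
    end
  endtask

  initial begin
    // Initial condition: every cell is zero.
    for (r = 0; r < 4; r = r + 1)
      m[r] = 4'b0000;

    touch(2'd0);  // A
    touch(2'd3);  // D
    touch(2'd2);  // C
    touch(2'd0);  // A
    touch(2'd1);  // B

    // Compare against the final table in the text (rows A, B, C, D).
    if (m[0] === 4'b0011 && m[1] === 4'b1011 &&
        m[2] === 4'b0001 && m[3] === 4'b0000)
      $display("PASS: row D is all zeros, so D is the least recently used entry");
    else
      $display("FAIL: matrix does not match the final table");
    $finish;
  end
endmodule
```

Each printed row lists the columns in the order A, B, C, D, so the output can be compared line by line with the tables above.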
3. Using the matrix method to realize the Verilog code of LRU
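// Matrix-based LRU tracker. The 3-bit index ports assume SIZE = 8.
module matrix_lru #(parameter SIZE = 8)
(clk, rstb,
 update_the_entry,
 update_index,
 lru_index);
  input        clk;
  input        rstb;              // asynchronous reset, active low
  input        update_the_entry;  // high when an entry is accessed
  input  [2:0] update_index;      // index of the accessed entry
  output [2:0] lru_index;         // index of the least recently used entry
  // matrix[j][k] == 1 means entry j was accessed more recently than entry k
  reg [(SIZE - 1):0] matrix     [0:(SIZE - 1)];
  reg [(SIZE - 1):0] matrix_nxt [0:(SIZE - 1)];
  reg [2:0] lru_index;
  reg [2:0] lru_index_nxt;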
  // Matrix state registers, one always block per row
  generate
    genvar i;
    for (i = 0; i < SIZE; i = i + 1) begin : g_matrix_reg
      always @(posedge clk or negedge rstb) begin
        if (!rstb)
          matrix[i] <= 0;
        else
          matrix[i] <= matrix_nxt[i];
      end
    end
  endgenerate
  // Next-state logic: on an access, set the accessed row to 1
  // and clear the accessed column
  generate
    genvar j, k;
    for (j = 0; j < SIZE; j = j + 1) begin : g_row
      for (k = 0; k < SIZE; k = k + 1) begin : g_col
        always @(*) begin
          matrix_nxt[j][k] = matrix[j][k];
          if (update_the_entry && (j == update_index) && (k != update_index))
            matrix_nxt[j][k] = 1'b1;
          else if (update_the_entry && (k == update_index))
            matrix_nxt[j][k] = 1'b0;
        end
      end
    end
  endgenerate
  // Select the LRU entry: the lowest-numbered row that is all zeros.
  // This chain covers rows 0 to 7, so it assumes SIZE = 8.
  always @(*) begin
    lru_index_nxt = lru_index;
    if (matrix[0] == 8'b0)
      lru_index_nxt = 0;
    else if (matrix[1] == 8'b0)
      lru_index_nxt = 1;
    else if (matrix[2] == 8'b0)
      lru_index_nxt = 2;
    else if (matrix[3] == 8'b0)
      lru_index_nxt = 3;
    else if (matrix[4] == 8'b0)
      lru_index_nxt = 4;
    else if (matrix[5] == 8'b0)
      lru_index_nxt = 5;
    else if (matrix[6] == 8'b0)
      lru_index_nxt = 6;
    else if (matrix[7] == 8'b0)
      lru_index_nxt = 7;
  end
  // Register the LRU index (clocked flip-flops with asynchronous reset)
  always @(posedge clk or negedge rstb) begin
    if (!rstb)
      lru_index <= 0;
    else
      lru_index <= lru_index_nxt;
  end
endmodule
A screenshot of the simulation is shown below.
The above content comes from "Verilog Advanced Digital System Design Technology and Example Analysis"
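The listing above fixes the index width at 3 bits and checks exactly eight rows, so the SIZE parameter cannot really be changed. The following sketch shows one way the same matrix rule might be written so that SIZE is honoured. It is not taken from the book. The names `matrix_lru_param`, `matrix_lru_param_tb`, `W` and `touch` are hypothetical, `$clog2` assumes a Verilog-2005 tool, and `lru_index` is combinational here rather than registered as in the listing.

```verilog
// Hypothetical parameterized variant of the matrix LRU (a sketch only).
module matrix_lru_param #(
  parameter SIZE = 4,
  parameter W    = $clog2(SIZE)   // index width derived from SIZE; not meant to be overridden
) (
  input  wire         clk,
  input  wire         rstb,              // asynchronous reset, active low
  input  wire         update_the_entry,
  input  wire [W-1:0] update_index,
  output reg  [W-1:0] lru_index          // combinational in this sketch
);
  reg [SIZE-1:0] matrix [0:SIZE-1];
  integer r, c, n;

  // Matrix update: clear the accessed column, set the rest of the accessed row.
  always @(posedge clk or negedge rstb) begin
    if (!rstb) begin
      for (r = 0; r < SIZE; r = r + 1)
        matrix[r] <= {SIZE{1'b0}};
    end else if (update_the_entry) begin
      for (r = 0; r < SIZE; r = r + 1)
        for (c = 0; c < SIZE; c = c + 1)
          if (c == update_index)
            matrix[r][c] <= 1'b0;
          else if (r == update_index)
            matrix[r][c] <= 1'b1;
    end
  end

  // Priority search: the lowest-numbered all-zero row wins, like the if/else chain.
  always @(*) begin
    lru_index = {W{1'b0}};
    for (n = SIZE - 1; n >= 0; n = n - 1)
      if (matrix[n] == {SIZE{1'b0}})
        lru_index = n;
  end
endmodule

// Self-checking testbench: replays the access order A, D, C, A, B as 0, 3, 2, 0, 1.
module matrix_lru_param_tb;
  reg        clk  = 1'b0;
  reg        rstb = 1'b0;
  reg        upd  = 1'b0;
  reg  [1:0] idx  = 2'd0;
  wire [1:0] lru;

  matrix_lru_param #(.SIZE(4)) dut (
    .clk(clk), .rstb(rstb),
    .update_the_entry(upd), .update_index(idx), .lru_index(lru));

  always #5 clk = ~clk;

  // Drive one access for a single clock cycle, changing inputs on the falling edge.
  task touch;
    input [1:0] i;
    begin
      @(negedge clk); idx = i; upd = 1'b1;
      @(negedge clk); upd = 1'b0;
    end
  endtask

  initial begin
    #12 rstb = 1'b1;             // release reset after the first rising clock edge
    touch(2'd0); touch(2'd3); touch(2'd2); touch(2'd0); touch(2'd1);
    if (lru === 2'd3)
      $display("PASS: lru_index = %0d (entry D)", lru);
    else
      $display("FAIL: lru_index = %0d", lru);
    $finish;
  end
endmodule
```

The loop-based search keeps the same priority as the if/else chain, so the two versions should pick the same row whenever several rows are all zeros, for example before every entry has been accessed.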