7.4 Virtual Memory and Translation Lookaside Buffers (TLBs)
Virtual memory is a game-changer in computer architecture. It lets programs think they have a huge chunk of memory, even if the physical memory is limited. This clever trick uses secondary storage to store inactive parts of programs.
Translation Lookaside Buffers (TLBs) are the unsung heroes of virtual memory. They speed up address translation by caching recent entries. TLBs work hand-in-hand with caches to make memory access faster and more efficient.
Virtual memory concept and benefits
Virtual memory overview
Virtual memory is a memory management technique that provides an abstraction of the main memory
Allows programs to operate as if they have access to a large, contiguous address space
Enables the execution of programs that require more memory than the available physical memory
Utilizes secondary storage (hard disk drives, SSDs) to store inactive portions of the program
Benefits and advantages
Increased multiprogramming capabilities
Allows multiple processes to coexist in memory, even if their combined memory requirements exceed the physical memory capacity
Enables efficient utilization of system resources and improved overall system performance
Simplified memory allocation and management
Each process is given its own virtual address space, independent of the physical memory layout
Eliminates the need for complex memory allocation strategies and simplifies programming
Enhanced security through memory isolation
Prevents processes from accessing or modifying memory regions belonging to other processes or the operating system
Provides a layer of protection against unauthorized access and potential security vulnerabilities
Supports large address spaces
Allows programs to use large virtual address spaces, even if the physical memory is limited
Enables the development and execution of memory-intensive applications (scientific simulations, databases)
Address translation and page tables
Address translation process
Address translation is the process of converting a virtual address, used by a program, into a corresponding physical address
Virtual address space is divided into pages, and physical memory is divided into frames
Page size is typically a power of two (4KB, 8KB) to simplify address calculations
Virtual address is divided into two parts: the virtual page number (VPN) and the page offset
VPN is used as an index into the page table to locate the corresponding physical frame number (PFN)
Page offset remains unchanged and is concatenated with the PFN to form the complete physical address (see the sketch after this list)
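To make the split concrete, here is a minimal sketch in C, assuming a hypothetical 32-bit virtual address, 4 KB pages (a 12-bit offset), and a made-up page-table result:

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT  12                      /* 4 KB pages -> 12 offset bits */
#define PAGE_SIZE   (1u << PAGE_SHIFT)
#define OFFSET_MASK (PAGE_SIZE - 1)

int main(void) {
    uint32_t vaddr  = 0x00403A17;           /* example virtual address */
    uint32_t vpn    = vaddr >> PAGE_SHIFT;  /* virtual page number     */
    uint32_t offset = vaddr & OFFSET_MASK;  /* page offset, unchanged  */

    /* Pretend the page table maps this VPN to frame 0x2F (hypothetical). */
    uint32_t pfn   = 0x2F;
    uint32_t paddr = (pfn << PAGE_SHIFT) | offset;

    printf("VPN=0x%X offset=0x%X -> physical 0x%X\n", vpn, offset, paddr);
    return 0;
}
```

Because the page size is a power of two, the split is just a shift and a mask, which is exactly why power-of-two page sizes simplify address calculations.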
Page tables and page table entries
Page tables are data structures maintained by the operating system to store the mappings between virtual page numbers and physical frame numbers
Each process has its own page table, allowing for independent virtual address spaces and memory protection
Page table entry (PTE) contains information such as:
Physical frame number (PFN) corresponding to the virtual page
Valid/invalid bit indicating whether the page is currently in physical memory
Access permissions (read, write, execute) for the page
Other control bits (dirty bit, accessed bit) for memory management purposes
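One way to picture a PTE is as a single machine word holding the PFN plus control bits. The layout below is a hypothetical 32-bit encoding for illustration, not any particular architecture's format:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical 32-bit PTE: low bits are control flags, upper bits hold the PFN. */
#define PTE_VALID     (1u << 0)   /* page is resident in physical memory */
#define PTE_WRITE     (1u << 1)   /* page may be written                 */
#define PTE_EXEC      (1u << 2)   /* page may be executed                */
#define PTE_ACCESSED  (1u << 3)   /* set by hardware on any access       */
#define PTE_DIRTY     (1u << 4)   /* set by hardware on a write          */
#define PTE_PFN_SHIFT 12

static inline uint32_t pte_pfn(uint32_t pte)   { return pte >> PTE_PFN_SHIFT; }
static inline bool     pte_valid(uint32_t pte) { return pte & PTE_VALID; }
```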
If the requested page is not present in physical memory (page fault), the operating system handles the fault by:
Loading the page from secondary storage into an available physical frame
Updating the page table with the new mapping
Resuming the execution of the program from the faulting instruction
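In outline, the fault path might look like the sketch below. The single-level page table, alloc_frame, and disk_read_page are toy stand-ins for OS internals, stubbed out so the sketch compiles:

```c
#include <stdint.h>

#define PTE_VALID     (1u << 0)
#define PTE_PFN_SHIFT 12
#define NUM_PAGES     1024

static uint32_t page_table[NUM_PAGES];       /* toy single-level page table */

/* Hypothetical stand-ins for OS internals. */
static uint32_t alloc_frame(void) { static uint32_t next; return next++; }
static void disk_read_page(uint32_t vpn, uint32_t frame) { (void)vpn; (void)frame; }

/* Sketch of the page-fault path: load the page, install the mapping,
   and let the hardware re-execute the faulting instruction on return. */
void handle_page_fault(uint32_t faulting_vpn) {
    uint32_t frame = alloc_frame();          /* find (or evict for) a free frame */
    disk_read_page(faulting_vpn, frame);     /* bring the page in from swap      */
    page_table[faulting_vpn] = (frame << PTE_PFN_SHIFT) | PTE_VALID;
}
```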
TLB design and performance impact
TLB overview
Translation Lookaside Buffer (TLB) is a cache memory that stores recently used page table entries
Accelerates address translation by reducing the need to access main memory for page table lookups
A TLB hit avoids the costly process of walking the page table in memory
A TLB miss requires accessing the page table in memory, incurring a performance penalty
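As a toy model, a fully associative TLB lookup can be sketched as a linear scan over a handful of entries (real hardware compares all entries in parallel); the sizes here are made up:

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 8                        /* tiny, for illustration */

struct tlb_entry { uint32_t vpn, pfn; bool valid; };
static struct tlb_entry tlb[TLB_ENTRIES];

/* Fully associative lookup: compare the VPN against every entry.
   Returns true on a hit and writes the cached PFN; on a miss the
   hardware or OS falls back to a page-table walk in memory. */
bool tlb_lookup(uint32_t vpn, uint32_t *pfn_out) {
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *pfn_out = tlb[i].pfn;           /* hit: no memory access needed */
            return true;
        }
    }
    return false;                            /* miss: walk the page table */
}
```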
TLB size and associativity
TLB size determines the number of page table entries that can be stored in the TLB
Larger TLB can hold more entries, reducing the likelihood of TLB misses and improving performance
Increasing TLB size comes with hardware complexity and power consumption trade-offs
TLB associativity refers to the number of ways or sets in the TLB
Fully associative TLBs allow any virtual page to be mapped to any entry, providing the best flexibility but requiring a more complex hardware search mechanism
Set-associative TLBs divide the TLB into sets, and each set can hold multiple entries
Virtual page number is used to determine the set, and tag bits are compared within the set to find a match
Direct-mapped TLBs map each virtual page to exactly one TLB entry, simplifying the hardware but potentially increasing conflict misses
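For a set-associative TLB, the low bits of the VPN pick the set and the remaining bits form the tag, so tags are compared only within one set. A sketch with hypothetical parameters (16 sets, 4 ways):

```c
#include <stdint.h>

#define TLB_SETS 16                          /* hypothetical: 16 sets    */
#define TLB_WAYS 4                           /* hypothetical: 4 ways/set */
#define SET_MASK (TLB_SETS - 1)
#define SET_BITS 4                           /* log2(TLB_SETS)           */

struct way { uint32_t tag, pfn; int valid; };
static struct way sets[TLB_SETS][TLB_WAYS];

/* Low VPN bits select the set; the rest of the VPN is the tag. */
int set_assoc_lookup(uint32_t vpn, uint32_t *pfn_out) {
    uint32_t set = vpn & SET_MASK;
    uint32_t tag = vpn >> SET_BITS;
    for (int w = 0; w < TLB_WAYS; w++) {     /* compare tags within one set only */
        if (sets[set][w].valid && sets[set][w].tag == tag) {
            *pfn_out = sets[set][w].pfn;
            return 1;                        /* hit */
        }
    }
    return 0;                                /* miss */
}
```

A direct-mapped TLB is the degenerate case with one way per set, which is why it needs no search loop but suffers when two hot pages collide on the same set.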
TLB replacement policies
TLB replacement policy determines which entry is evicted when a new entry needs to be added to a full TLB
Common replacement policies include:
Least recently used (LRU): Evicts the entry that has been accessed least recently
First-in, first-out (FIFO): Evicts the entry that was added to the TLB earliest
Random replacement: Randomly selects an entry to be evicted
Choice of replacement policy affects the TLB hit rate and overall performance
LRU policy tends to perform well in many workloads by keeping frequently accessed entries in the TLB
FIFO and random policies are simpler to implement but may not be as effective in capturing locality
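One simple way to model LRU is a per-entry timestamp, with the victim being the entry touched longest ago. The sketch below uses a toy logical clock, not a real hardware mechanism:

```c
#include <stdint.h>

#define TLB_ENTRIES 8                        /* toy size */

struct entry { uint32_t vpn, pfn; uint64_t last_used; int valid; };
static struct entry tlb[TLB_ENTRIES];
static uint64_t now;                         /* toy logical clock, bumped per insert */

/* Pick a victim: reuse any invalid slot first, else the least recently used. */
static int lru_victim(void) {
    int victim = 0;
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (!tlb[i].valid) return i;         /* free slot: no eviction needed */
        if (tlb[i].last_used < tlb[victim].last_used) victim = i;
    }
    return victim;
}

void tlb_insert(uint32_t vpn, uint32_t pfn) {
    tlb[lru_victim()] = (struct entry){ vpn, pfn, ++now, 1 };
}
```

Real TLBs usually approximate LRU with far cheaper schemes (a few age bits per set), since exact timestamps are too expensive in hardware.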
Virtual vs cache memory interaction
Memory hierarchy integration
Memory hierarchy in a computer system consists of multiple levels of cache memory, main memory (DRAM), and secondary storage (hard disk, SSD)
Virtual memory and cache memory work together to provide fast and efficient access to data and instructions
When a program accesses a memory location:
Virtual address is first translated to a physical address using the TLB and page tables
Physical address is then used to access the cache memory hierarchy
If the requested data is found in the cache (cache hit), it is quickly retrieved and supplied to the processor
If the data is not found in the cache (cache miss), the request is forwarded to the next level of the memory hierarchy (main memory)
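Putting the pieces together, a load can be sketched as translate-then-cache. The helper functions below are hypothetical stubs standing in for the TLB, page walker, cache, and DRAM:

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT  12
#define OFFSET_MASK ((1u << PAGE_SHIFT) - 1)

/* Hypothetical stubs for the hardware structures described above. */
static bool tlb_lookup(uint32_t vpn, uint32_t *pfn)     { (void)vpn; (void)pfn; return false; }
static uint32_t page_table_walk(uint32_t vpn)           { return vpn; /* toy identity map */ }
static bool cache_lookup(uint32_t paddr, uint32_t *out) { (void)paddr; (void)out; return false; }
static uint32_t dram_read(uint32_t paddr)               { (void)paddr; return 0; }

/* Translate first (TLB, then page table), then probe the cache, then DRAM. */
uint32_t load(uint32_t vaddr) {
    uint32_t vpn = vaddr >> PAGE_SHIFT, pfn, data;
    if (!tlb_lookup(vpn, &pfn))
        pfn = page_table_walk(vpn);          /* TLB miss: costly walk    */
    uint32_t paddr = (pfn << PAGE_SHIFT) | (vaddr & OFFSET_MASK);
    if (cache_lookup(paddr, &data))
        return data;                         /* cache hit: fast path     */
    return dram_read(paddr);                 /* cache miss: main memory  */
}
```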
Performance considerations
Interaction between virtual memory and cache memory can impact performance
TLB misses can stall the processor while the address translation is performed, affecting the overall memory access latency
Larger TLBs and higher associativity can help reduce TLB misses and improve performance
Page faults can incur significant overhead due to the need to access secondary storage and transfer data to main memory
Effective page replacement algorithms (LRU, Clock) can minimize the occurrence of page faults
Locality of reference exhibited by programs influences the effectiveness of both caches and virtual memory
Good temporal and spatial locality leads to higher cache hit rates and fewer page faults
Techniques such as prefetching, memory interleaving, and cache-conscious data placement can exploit locality and optimize performance
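These costs can be rolled into a back-of-the-envelope effective access time. Assuming a single-level page table, a TLB hit rate $h$, a TLB lookup time $t_{TLB}$, and a memory access time $t_{mem}$ (one extra memory reference per page-table lookup on a miss):

$$
t_{\text{eff}} = h\,(t_{TLB} + t_{mem}) + (1 - h)\,(t_{TLB} + 2\,t_{mem})
$$

For example, with $h = 0.98$, $t_{TLB} = 1\,\text{ns}$, and $t_{mem} = 100\,\text{ns}$, the estimate is $0.98 \times 101 + 0.02 \times 201 = 103\,\text{ns}$, showing how even a 2% TLB miss rate visibly inflates average latency.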
Coherence and consistency
Virtual memory and cache memory must maintain coherence and consistency
Coherence ensures that multiple copies of the same data in different caches are kept up to date
Write-back caches update main memory only when a modified block is evicted, reducing memory traffic
Write-through caches immediately update main memory on every write, ensuring consistency but increasing memory traffic
Consistency models define the ordering and visibility of memory operations across processors or cores
Sequential consistency guarantees that memory operations appear to execute in program order
Relaxed consistency models allow for reordering of memory operations to improve performance, but require explicit synchronization primitives to enforce ordering when necessary
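As a concrete example of such a primitive, C11's release/acquire atomics (stdatomic.h) let a producer publish data through a flag so the consumer never observes the flag without the data; a minimal sketch:

```c
#include <stdatomic.h>

int data;                                    /* plain shared payload */
atomic_int ready;                            /* synchronization flag */

void producer(void) {
    data = 42;                               /* write the payload first            */
    atomic_store_explicit(&ready, 1,         /* release: the payload write cannot  */
                          memory_order_release); /* be reordered after the flag    */
}

int consumer(void) {
    while (!atomic_load_explicit(&ready,     /* acquire: later reads cannot move   */
                                 memory_order_acquire)) /* before the flag read    */
        ;                                    /* spin until published */
    return data;                             /* guaranteed to read 42 */
}
```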