5.2 Cache memory: design, mapping, and replacement policies
4 min read • August 13, 2024
Cache memory is a crucial component in modern computer systems, bridging the performance gap between fast processors and slower main memory. It leverages the principle of locality to store frequently accessed data, reducing memory access latency and improving overall system performance.
Designing an effective cache involves carefully balancing parameters like cache size, block size, and associativity. Various mapping techniques and replacement policies are used to optimize cache performance, each with its own trade-offs between simplicity, flexibility, and efficiency.
Cache Memory Purpose and Benefits
Bridging the Performance Gap
Cache memory is a small, fast memory located close to the processor that stores frequently accessed data and instructions
Reduces the average time to access memory by providing faster access to frequently used data
Helps bridge the performance gap between the fast processor and slower main memory
Reducing Memory Access Latency
Cache memory reduces the number of accesses to main memory
Improves overall system performance by reducing memory access latency
Increases the effective memory bandwidth by servicing more memory requests from the cache
Principle of Locality
The principle of locality, which includes both temporal and spatial locality, makes cache memory effective
Temporal locality refers to the tendency of a processor to access the same memory locations repeatedly within a short period (loops, frequently called functions)
Spatial locality refers to the tendency of a processor to access memory locations that are close to each other (sequential instructions, array elements)
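Both kinds of locality can be seen in ordinary code. In this illustrative sketch (the function name is our own), the accumulator exhibits temporal locality and the in-order array traversal exhibits spatial locality:

```python
def sum_array(values):
    total = 0                      # 'total' is reused every iteration: temporal locality
    for i in range(len(values)):   # adjacent elements read in order: spatial locality
        total += values[i]
    return total

print(sum_array([1, 2, 3, 4]))  # -> 10
```

A hardware cache benefits here twice: the loop body is fetched repeatedly from the instruction cache, and each miss on the array brings in a whole block of adjacent elements.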
Cache Memory Design Parameters
Cache Size
Cache size refers to the total capacity of the cache memory, typically measured in bytes or kilobytes (4KB, 8KB, 16KB)
A larger cache size can store more data but may increase access latency and power consumption
The optimal cache size depends on factors such as the processor architecture, workload characteristics, and power and area constraints
Block Size
Block size, also known as line size, is the unit of data transfer between the main memory and the cache
A larger block size can exploit spatial locality by bringing in more adjacent data from memory
However, a larger block size may lead to a higher miss penalty and increased memory traffic due to bringing in unnecessary data
Common block sizes include 32 bytes, 64 bytes, and 128 bytes
Associativity
Associativity determines how many possible locations in the cache a particular block from main memory can be mapped to
Direct-mapped cache: Each block in main memory maps to only one possible location in the cache
Fully associative cache: A block can be placed anywhere in the cache, providing the most flexibility but requiring more complex hardware for tag comparison
N-way set-associative cache: The cache is divided into sets, each containing N blocks (2-way, 4-way, 8-way)
A block can be placed in any of the N locations within a specific set, providing a balance between flexibility and hardware complexity
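These three parameters together determine how a memory address is split into tag, set index, and block offset bits. The sketch below assumes a hypothetical cache whose size, block size, and set count are all powers of two (the function name and the example address are our own):

```python
import math

def address_fields(address, cache_bytes, block_bytes, ways):
    """Split an address into (tag, set index, block offset) for a
    hypothetical N-way set-associative cache with power-of-two sizes."""
    num_sets = cache_bytes // (block_bytes * ways)
    offset_bits = int(math.log2(block_bytes))
    index_bits = int(math.log2(num_sets))
    offset = address & (block_bytes - 1)              # low bits select the byte
    set_index = (address >> offset_bits) & (num_sets - 1)  # middle bits pick the set
    tag = address >> (offset_bits + index_bits)       # remaining high bits are the tag
    return tag, set_index, offset

# 4 KB cache, 64-byte blocks, 4-way => 16 sets (6 offset bits, 4 index bits)
print(address_fields(0x1A2B3C, 4096, 64, 4))  # -> (1674, 12, 60)
```

Note that for a direct-mapped cache `ways` is 1, and for a fully associative cache there is a single set, so the index field disappears entirely.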
Cache Mapping Techniques: Direct vs Associative vs Set-Associative
Direct Mapping
Direct mapping is the simplest cache mapping technique, where each block in main memory maps to a unique location in the cache based on its memory address
Offers fast access times but may lead to higher conflict misses
The cache location is determined by the formula: Cache_Index = Block_Address mod Number_of_Blocks_in_Cache
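The formula above can be computed directly; this small sketch (function name is our own) also shows how two different block addresses can collide on the same index, producing a conflict miss:

```python
def direct_mapped_index(block_address, num_blocks):
    # Cache_Index = Block_Address mod Number_of_Blocks_in_Cache
    return block_address % num_blocks

# With 8 cache blocks, block addresses 3 and 11 both map to index 3,
# so alternating accesses to them evict each other (conflict misses)
print(direct_mapped_index(3, 8))   # -> 3
print(direct_mapped_index(11, 8))  # -> 3
```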
Associative Mapping
Associative mapping allows a block to be placed anywhere in the cache, providing the most flexibility in terms of block placement
Requires comparing the tag of each block in the cache simultaneously, which increases hardware complexity and power consumption
Provides the lowest miss rate but at the cost of higher access latency and hardware overhead
Set-Associative Mapping
Set-associative mapping is a compromise between direct and associative mapping
The cache is divided into sets, and each set contains multiple blocks (2-way, 4-way, 8-way)
A block can be placed in any location within its corresponding set, determined by a portion of its memory address
Reduces conflict misses compared to direct mapping while maintaining lower hardware complexity than fully associative mapping
The cache location is determined by the formulas:
Set_Index = Block_Address mod Number_of_Sets
Tag = Block_Address / Number_of_Sets (integer division)
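These two formulas can be sketched in a few lines (the function name and the 16-set example are our own):

```python
def set_index_and_tag(block_address, num_sets):
    # Set_Index = Block_Address mod Number_of_Sets
    # Tag       = Block_Address // Number_of_Sets (integer division)
    return block_address % num_sets, block_address // num_sets

# 16 sets: block address 37 lands in set 5, stored with tag 2
print(set_index_and_tag(37, 16))  # -> (5, 2)
```

On a lookup, the hardware computes the set index, then compares the tag against the N tags stored in that set in parallel.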
Cache Replacement Policies: LRU vs FIFO vs Random
Least Recently Used (LRU)
LRU policy replaces the block that has been accessed least recently
Assumes that blocks that have been recently accessed are more likely to be accessed again in the near future
Generally provides good performance by exploiting temporal locality
Requires additional hardware to track the access order of blocks (timestamp or stack)
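The behavior of LRU within one cache set can be modeled with an ordered map, which plays the role of the hardware's access-order tracking (this is an illustrative toy model, not a hardware description; the class name is our own):

```python
from collections import OrderedDict

class LRUSet:
    """Toy model of one N-way cache set with LRU replacement."""
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # tags ordered from LRU (front) to MRU (back)

    def access(self, tag):
        if tag in self.blocks:               # hit: promote to most-recently-used
            self.blocks.move_to_end(tag)
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)  # evict the least recently used tag
        self.blocks[tag] = None
        return False

s = LRUSet(ways=2)
print([s.access(t) for t in ["A", "B", "A", "C", "B"]])
# -> [False, False, True, False, False]  (accessing A again protects it; B is evicted by C)
```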
First-In, First-Out (FIFO)
FIFO policy replaces the block that was brought into the cache earliest, regardless of its access history
Simpler to implement than LRU but may not perform as well in exploiting locality
Maintains a queue of blocks in the order they were brought into the cache
The oldest block in the queue is replaced when a new block needs to be cached
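A FIFO set is the same toy model with a queue instead of an access-ordered map (again an illustrative sketch, class name our own). The key difference from LRU is that a hit does not update the order:

```python
from collections import deque

class FIFOSet:
    """Toy model of one N-way cache set with FIFO replacement."""
    def __init__(self, ways):
        self.ways = ways
        self.queue = deque()        # tags in arrival order; leftmost is oldest

    def access(self, tag):
        if tag in self.queue:       # hit: order is NOT updated under FIFO
            return True
        if len(self.queue) >= self.ways:
            self.queue.popleft()    # evict the oldest block, even if recently used
        self.queue.append(tag)
        return False

s = FIFOSet(ways=2)
print([s.access(t) for t in ["A", "B", "A", "C", "A"]])
# -> [False, False, True, False, False]
```

On this trace, FIFO evicts A when C arrives because A was brought in first, even though it was just reused; LRU on the same trace would have evicted B and turned the final access to A into a hit.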
Random Replacement
The random replacement policy selects a block at random to be replaced when the cache is full
Simplest to implement but may not effectively capture locality, leading to suboptimal performance
Does not require any additional hardware or tracking mechanisms
May be used in low-cost or low-power systems where simplicity is prioritized over performance
Advanced Replacement Policies
Advanced replacement policies, such as Pseudo-LRU and Least Frequently Used (LFU), aim to balance performance and hardware complexity
Pseudo-LRU approximates the behavior of LRU using a binary tree structure, reducing the hardware overhead
LFU replaces the block that has been accessed the least number of times, prioritizing frequently accessed blocks
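One common form of Pseudo-LRU is tree-PLRU, where a 4-way set needs only 3 bits instead of a full LRU ordering: each bit points toward the half of the set that should be victimized next, and an access flips the bits on its path to point away from itself. The sketch below is a toy model under that assumption (class and method names are our own):

```python
class TreePLRUSet:
    """Toy 4-way cache set with tree Pseudo-LRU: 3 bits approximate LRU."""
    def __init__(self):
        self.tags = [None, None, None, None]
        self.bits = [0, 0, 0]  # bits[0]: root, bits[1]: ways 0/1, bits[2]: ways 2/3

    def _touch(self, way):
        # Point the bits on the accessed path AWAY from this way
        if way < 2:
            self.bits[0], self.bits[1] = 1, 1 - way
        else:
            self.bits[0], self.bits[2] = 0, 3 - way

    def _victim(self):
        # Follow the bits to the approximate least-recently-used way
        if self.bits[0] == 0:
            return 0 if self.bits[1] == 0 else 1
        return 2 if self.bits[2] == 0 else 3

    def access(self, tag):
        if tag in self.tags:                 # hit: refresh the tree bits
            self._touch(self.tags.index(tag))
            return True
        way = self.tags.index(None) if None in self.tags else self._victim()
        self.tags[way] = tag                 # miss: fill a free way or evict the victim
        self._touch(way)
        return False

s = TreePLRUSet()
for t in ["A", "B", "C", "D"]:
    s.access(t)          # fill the set
s.access("A")            # touching A points the tree bits away from it
s.access("E")            # miss: the victim chosen is not A
```

With only 3 bits the tree cannot represent every LRU ordering, so the evicted block is not always the true least-recently-used one, which is exactly the performance-versus-hardware trade-off these policies make.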
The choice of replacement policy depends on factors such as cache size, associativity, workload characteristics, and design constraints (performance, power, area)