Abstract
Cache memory is nearest to the CPU and stores all the recent instructions. The term cache means a safe place for hiding and storing things.
Scope
This article talks about cache in detail.
Definition
Cache memory is a small fast memory that holds copies of recently accessed instructions and data.
The effectiveness of the cache mechanism is based on the property of locality of reference. It means many instructions for local areas of the program are executed repeatedly. The active segments of the program are placed in cache memory by reducing the total execution time.
When the processor makes a request for memory reference, the request is first sought in the cache. If we get the memory reference that is requested we call it “CACHE HIT” otherwise “CACHE MISS”.
In the cache miss, the requested element is brought from a subsequent memory level from the memory hierarchy and placed in the cache.

A block of elements is transferred from main memory to cache memory by expecting that the next requested element will be residing in the neighboring locality of the current requested element (spatial locality) and this has to happen under one memory access time.
The performance of cache memory is calculated in terms of hit ratio.
Hit ratio = hit/(hit + miss) = no. of hits/ total access.
Cache performance can be improved by using:
- Higher cache block size
- Higher associativity
- Reduce miss rate
- Reduce miss penalty
- Reduce the time to hit in the cache.
Cache underlying principle is the locality of reference
In the locality of reference, huge data is placed in a small area (data which is needed) so that the access time will be less and the performance will be higher.
The locality of reference is of two types:
- Spatial locality: The adjacent words in the block are referenced by the CPU in the near future.
- Temporal locality: The same word in the block are referenced by the CPU in the near future
The cache is organized not in bytes but as blocks of cache lines with each line containing the same no. of bytes. (16-64)
Cache lines do not have a fixed address, which enables the cache system to populate each cache line with a unique (non-contiguous) address.
There are three methods for filling a cache lines
- Fully associated – The most flexible.
- Direct mapped – The most basic
- Self associated – A combination of the two.
Summary
- This is a special memory used for speeding up the transfer of data and instruction.
- The cache stores the data and programs that are frequently used by the CPU. So any time some data is required it is first checked in the cache memory.
Special thanks to Ami Jangid for contributing to this article on takeUforward. If you also wish to share your knowledge with the takeUforward fam, please check out this article