A cache is a small, fast store that keeps copies of data you access often, so the system does not have to fetch that data the slow way every time. The first time you read something, it comes from the slow original source; the cache holds onto a copy, and the next read comes back almost instantly from that copy. That is the whole idea: trade a little memory for a lot of speed. Caches matter because they make software feel fast: your processor, your browser, your apps, and the network all cache constantly. This explainer covers how caching works, where it lives, a concrete example, and the classic pitfalls.
How caching works
A cache sits between a consumer and a slower data source. When data is requested, the system first checks the cache. A hit means the copy is there and is returned fast. A miss means it is not, so the system fetches from the slow source, returns it, and usually stores a copy for next time. Because cache space is limited, old or rarely used entries get evicted to make room, following a policy such as discarding the least recently used item.
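The hit, miss, and eviction steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a production cache: the `LRUCache` class, its `fetch` parameter, and the `slow_square` stand-in for a slow data source are all hypothetical names chosen for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny read-through cache that evicts the least recently used entry."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch            # hypothetical slow loader: key -> value
        self.entries = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Hit: return the stored copy and mark it as recently used.
            self.hits += 1
            self.entries.move_to_end(key)
            return self.entries[key]
        # Miss: go to the slow source, then keep a copy for next time.
        self.misses += 1
        value = self.fetch(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Out of room: evict the least recently used entry.
            self.entries.popitem(last=False)
        return value


def slow_square(n):
    # Stand-in for a slow source such as a database or a remote API.
    return n * n


lru = LRUCache(capacity=2, fetch=slow_square)
lru.get(1)  # miss
lru.get(2)  # miss
lru.get(1)  # hit
lru.get(3)  # miss; the cache is full, so key 2 is evicted
```

Key 2 is evicted rather than key 1 because the hit on key 1 made it the more recently used of the two.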
The catch is freshness. A cached copy can drift out of date if the original changes. So caches attach a lifetime or an expiry rule, and the system decides when to refresh or discard the copy. Deciding that correctly is the famously tricky part. If you want to see this pattern applied to web delivery, a content delivery network is essentially a cache spread across the globe.
Where caches live
| Layer | What it caches | Why |
| --- | --- | --- |
| CPU cache | Recently used memory | Memory is slow relative to the processor |
| Browser cache | Images, scripts, pages | Avoid re-downloading on every visit |
| Application cache | Query results, computed values | Skip repeating expensive work |
| Database cache | Frequent query results | Reduce disk and compute load |
| CDN | Static files near users | Cut network distance and latency |
The pattern repeats at every layer: keep a fast copy of the slow thing, close to whoever needs it.
A concrete example
Picture a web page that shows a list of top articles, built from a database query that takes a moment to run. Without a cache, every visitor triggers that query, and under load the database strains. With a cache, the first visitor triggers the query; the result is stored for, say, sixty seconds; every visitor in that window gets the stored copy instantly. The database does a tiny fraction of the work, and the page feels snappy. After sixty seconds the entry expires and the next request refreshes it.
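The sixty-second scenario above might look roughly like this in Python. It is a sketch under stated assumptions: `TTLCache`, `FakeClock`, and `query_top_articles` are hypothetical names, the article list is made-up placeholder data, and the fake clock exists only so the example can jump forward in time without actually waiting.

```python
import time

TTL_SECONDS = 60  # lifetime taken from the example above


class TTLCache:
    """Stores each value together with the time at which it expires."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.store = {}  # key -> (value, expires_at)

    def get_or_load(self, key, load):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: serve the stored copy
        value = load()       # missing or expired: run the slow query
        self.store[key] = (value, now + self.ttl)
        return value


class FakeClock:
    """Hypothetical clock for the demo, so time can be advanced by hand."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


stats = {"queries": 0}


def query_top_articles():
    # Stand-in for the slow database query; counts how often it runs.
    stats["queries"] += 1
    return ["article-a", "article-b", "article-c"]


clock = FakeClock()
page_cache = TTLCache(TTL_SECONDS, clock=clock)

# A thousand visitors inside the sixty-second window share one query.
for _ in range(1000):
    page_cache.get_or_load("top_articles", query_top_articles)
queries_in_window = stats["queries"]

# After the entry expires, the next request refreshes it.
clock.now = 61.0
page_cache.get_or_load("top_articles", query_top_articles)
queries_after_expiry = stats["queries"]
```

In this run the thousand requests inside the window cost a single query, and the first request after expiry costs one more. In real code the default `time.monotonic` clock would be used instead of the fake one.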
Common misconceptions
- "Caching is free speed." It costs memory and adds complexity, and a stale cache can serve wrong data. It is a trade-off, not a freebie.
- "Cache everything." Caching rapidly changing or rarely repeated data wastes space and risks staleness. Cache the slow, repeated reads.
- "A cache and main memory are the same." A cache is a strategy of keeping fast copies; it can live in memory, on disk, or across a network.
- "Invalidation is easy." Knowing when a copy is no longer valid is the classic hard problem. Most caching bugs are really invalidation bugs.
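To make the invalidation point concrete, here is a small sketch of the kind of bug it refers to, along with one common remedy (dropping the cached copy whenever the original is written). The `UserStore` class, its methods, and the email addresses are hypothetical, and the dictionary named `database` merely stands in for a slow source.

```python
class UserStore:
    """Hypothetical data store with a simple cache in front of it."""

    def __init__(self):
        self.database = {"alice": "alice@old.example"}  # stand-in for the slow source
        self.cache = {}

    def get_email(self, user):
        if user not in self.cache:
            self.cache[user] = self.database[user]  # miss: copy from the source
        return self.cache[user]

    def update_email_buggy(self, user, email):
        # Bug: the original changes, but the cached copy is left in place.
        self.database[user] = email

    def update_email(self, user, email):
        self.database[user] = email
        self.cache.pop(user, None)  # invalidate so the next read refetches


store = UserStore()
store.get_email("alice")                                 # warms the cache
store.update_email_buggy("alice", "alice@new.example")
stale = store.get_email("alice")                         # still the old address
store.update_email("alice", "alice@newer.example")
fresh = store.get_email("alice")                         # refetched after invalidation
```

Invalidating on write is only one approach; it works here because every write goes through the same object that owns the cache, which is often not the case in larger systems.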
FAQ
What is the point of a cache?
Speed. By keeping a fast copy of data you use repeatedly, a cache avoids redoing slow work (a database query, a download, a computation) every single time.
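As a quick illustration of avoiding repeated work, Python's standard library ships a ready-made in-memory cache for function results, `functools.lru_cache`. The `slow_total` function and the `calls` list below are hypothetical, added only to show when the real work runs.

```python
from functools import lru_cache

calls = []  # records each time the function body actually runs


@lru_cache(maxsize=128)
def slow_total(n):
    # Stand-in for an expensive computation.
    calls.append(n)
    return sum(range(n + 1))


first = slow_total(1000)   # computed
second = slow_total(1000)  # served from the cache; the body does not run again
info = slow_total.cache_info()
```

After these two calls, `calls` holds a single entry and `info` reports one hit and one miss.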
What is cache invalidation?
Deciding when a cached copy is out of date and must be refreshed or removed. It is hard because the original data can change at any time, and serving a stale copy causes subtle bugs.
Is a cache the same as memory?
Not exactly. A cache is the idea of storing fast copies of frequently used data. It often lives in memory, but it can also be on disk or distributed across servers.
What should I not cache?
Data that changes constantly, is sensitive, or is cheap to fetch anyway. Caching those adds risk and complexity without much speed benefit.
Where to go next
Learn what a CDN is, understand what a cookie does in web apps, and see what a load balancer does.