LFU Cache (Least Frequently Used)
“Designing an LFU Cache is the ultimate exercise in composite data structures, it forces you to synchronize multiple hash maps and linked lists to achieve O(1) performance for a complex priority problem.”
TL;DR
LFU Cache evicts the least frequently used key, with LRU as the tiebreaker. Achieve O(1) get and put by combining a key-to-node hash map, a frequency-to-doubly-linked-list map, and a min_freq tracker. On access, promote nodes to higher frequency tiers. On eviction, pop from the tail of the min_freq list. This frequency-tiered design powers Redis caching, CDN edge nodes, and is the data structure foundation for persistent memory in AI agents.
1. Introduction: The Evolution of Caching
In the world of high-performance computing, memory is your most precious resource. Whether you are building a web server, a database engine, or a machine learning inference pipeline, you will eventually hit the “Memory Wall.” You cannot fit everything in RAM. You must choose what to keep and what to discard.
This choice is governed by Cache Replacement Policies.
- The FIFO Approach (Queue): Discard what came in first. Simple, but often wrong because old data can be popular.
- The LRU Approach (Least Recently Used): Discard what hasn’t been used for the longest time. This works well for “Temporal Locality”, if you used it recently, you’ll likely use it again soon.
- The LFU Approach (Least Frequently Used): Discard what is used least often. This is superior for items that have long-term popularity but might be accessed sporadically.
Today we tackle the LFU Cache, one of the most challenging data structures to implement with O(1) efficiency. Unlike LRU, which needs a single list, LFU typically requires managing a dynamic collection of lists or a heap. We will explore why it is hard, how to solve it with a “Frequency-Tiered Doubly Linked List” (Linked-List-of-Linked-Lists), and how it connects to the future of persistent intelligence in AI.
1.1 Why This Problem Matters
- System Design Interview Staple: It tests your ability to combine data structures.
- Real-world Impact: Redis, CDN Edge Nodes, and CPU Branch Predictors all use variations of LFU.
- Complexity Theory: It demonstrates how Amortized Analysis allows O(1) even for complex operations.
2. The Problem Statement
Implement the LFUCache class:
LFUCache(int capacity): Initializes the object with the capacity of the data structure.int get(int key): Returns the value of thekeyif it exists in the cache. Otherwise, returns-1.void put(int key, int value): Updates the value of thekeyif present, or inserts it if not.- If the cache reaches its
capacity, it must invalidate and remove the least frequently used key. - If there is a tie (multiple keys with the same frequency), the least recently used (LRU) key among them must be removed.
- If the cache reaches its
Constraint: Both get and put must run in O(1) average time.
3. Why LFU is Hard: The O(1) Trap
If you’ve solved the LRU Cache, you know that a Hash Map combined with a Doubly Linked List (DLL) gives you O(1) because moving a node to the head of the list is a constant-time pointer operation.
In LFU, we have two axes of priority:
- Frequency: How many times was the key accessed?
- Recency: If frequencies are tied, which was used longest ago?
The Naive (and Slow) Approach
You could use a Min-Heap where each node stores (frequency, last_access_time, value).
get(key): O(log N) to update the heap.put(key): O(log N) to insert/evict. For a high-performance system (like a CDN handling millions of hits), O(log N) is often unacceptable. We need the constant-time performance of a hash table.
4. The Data Structure Design: The Frequency-Tiered DLL
To solve LFU in O(1), we must avoid sorting. Instead, we use a linked list of linked lists.
4.1 The Three Core Components
key_to_node(Hash Map): Maps thekeyto aNodeobject. Each node stores itsvalueand its currentfrequency.freq_to_list(Hash Map): Maps eachfrequency(an integer) to a Doubly Linked List. All keys infreq_to_list[5]have been accessed exactly 5 times.min_freq(Global Variable): Tracks the minimum frequency currently present in the cache. This is our “shortcut” to find the eviction candidate instantly.
4.2 The “Promotion” Flow
When a key is accessed:
- Identify its current frequency
F. - Remove it from the
freq_to_list[F]. - If
freq_to_list[F]is now empty andFwas themin_freq, incrementmin_freq. - Move the key to
freq_to_list[F+1]. - Update its internal frequency counter to
F+1.
5. Implementation in Python
We first define a robust Node and DoublyLinkedList helper.
class Node:
"""A single node representing a cache entry."""
def __init__(self, key, value):
self.key = key
self.val = value
self.freq = 1
self.prev = None
self.next = None
class DoublyLinkedList:
"""A standard DLL with dummy head and tail for easy removal."""
def __init__(self):
self.head = Node(None, None)
self.tail = Node(None, None)
self.head.next = self.tail
self.tail.prev = self.head
self.size = 0
def add_to_front(self, node):
"""Adds a node directly after the dummy head (MRU position)."""
node.next = self.head.next
node.prev = self.head
self.head.next.prev = node
self.head.next = node
self.size += 1
def remove(self, node):
"""Removes a node from its current position in the list."""
node.prev.next = node.next
node.next.prev = node.prev
self.size -= 1
def pop_tail(self):
"""Removes and returns the node just before the dummy tail (LRU position)."""
if self.size == 0:
return None
node = self.tail.prev
self.remove(node)
return node
The LFUCache Orchestrator
class LFUCache:
def __init__(self, capacity: int):
self.capacity = capacity
self.size = 0
self.min_freq = 0
self.key_to_node = {} # key -> Node
self.freq_to_list = {} # freq -> DoublyLinkedList
def _update_node(self, node):
"""Internal helper to promote a node to a higher frequency tier."""
old_freq = node.freq
# 1. Remove from current frequency list
self.freq_to_list[old_freq].remove(node)
# 2. Update global min_freq if necessary
if self.freq_to_list[old_freq].size == 0 and old_freq == self.min_freq:
self.min_freq += 1
# 3. Increment node frequency and add to new list
node.freq += 1
new_freq = node.freq
if new_freq not in self.freq_to_list:
self.freq_to_list[new_freq] = DoublyLinkedList()
self.freq_to_list[new_freq].add_to_front(node)
def get(self, key: int) -> int:
if self.capacity == 0 or key not in self.key_to_node:
return -1
node = self.key_to_node[key]
self._update_node(node)
return node.val
def put(self, key: int, value: int) -> None:
if self.capacity == 0:
return
if key in self.key_to_node:
# Case 1: Key exists, update value and promote frequency
node = self.key_to_node[key]
node.val = value
self._update_node(node)
else:
# Case 2: New key
if self.size >= self.capacity:
# Eviction: Get the list for the current min_freq
evict_candidate = self.freq_to_list[self.min_freq].pop_tail()
del self.key_to_node[evict_candidate.key]
self.size -= 1
# Create and add the new node
new_node = Node(key, value)
self.key_to_node[key] = new_node
# For a brand new node, frequency is always 1
if 1 not in self.freq_to_list:
self.freq_to_list[1] = DoublyLinkedList()
self.freq_to_list[1].add_to_front(new_node)
self.min_freq = 1
self.size += 1
6. Implementation Deep Dive: Synchronizing the Maps
The beauty and the danger of this design lie in the synchronization. If you remove a node from a DLL but forget to update key_to_node, you have a Memory Leak. If you update min_freq incorrectly, you will evict the wrong item, leading to a Bypass of the LFU logic.
6.1 The “Dummy” Pattern
Why do we use dummy head and tail nodes in the DLL?
- It eliminates “if-else” checks for
node.next is None. self.head.next = self.tailat initialization means the code foradd_to_frontworks identical for the first node and the thousandth node.
6.2 The min_freq Reset
Crucially, when a new key is added, min_freq is always reset to 1. Many students fail by trying to “calculate” if 1 is really the min frequency. It is, by definition, because the new key is the newest frequency-1 item.
7. Comparative Performance Analysis
| Operation | LRU (LinkedHashMap) | LFU (Tiered DLL) | LFU (Min-Heap) |
|---|---|---|---|
| Get | O(1) | O(1) | O(log N) |
| Put | O(1) | O(1) | O(log N) |
| Eviction | O(1) | O(1) | O(1) |
| Space | O(N) | O(N) | O(N) |
| Complexity | Simple | High | Medium |
Hidden Constants
While both LRU and Tiered-DLL LFU are O(1), the Hidden Constant for LFU is significantly higher. You are managing multiple hash map lookups.
8. Beyond LFU: Adaptive Replacement Cache (ARC)
Is LFU the ultimate policy? No. LFU has a fatal flaw: its memory is “sticky.” If a key was very popular in the morning but is never used again, it will stay in the cache forever because its frequency is too high.
This is why advanced systems like PostgreSQL and ZFS use ARC (Adaptive Replacement Cache).
- ARC maintains two lists: one for recency (LRU) and one for frequency (LFU).
- It dynamically resizes the “target size” for each list.
9. Real-World Applications
9.1 Database Buffer Pools
Databases like SQLite or MySQL use LFU-like logic to keep the “Indices” in memory while allowing the “Data Pages” to be evicted quickly.
9.2 Machine Learning Feature Stores
In the ML System Design world, we use LFU for Feature Caching.
- Globally popular features stay in the fastest cache layer.
9.3 Speech Technology
In Speech Tech, we use LFU to cache Phonetic Embeddings. Common words have their acoustic-to-text mappings cached in an LFU tier to save GPU inference time.
10. Interview Strategy
- Start with LRU: Briefly explain why LRU is insufficient (it ignores long-term popularity).
- Propose the Tiered Map: Draw a diagram showing
freq -> DLL. This is the clearest way to explain O(1) LFU. - Explain the Eviction Tie-break: Mention that you use LRU within each frequency tier.
- Discuss min_freq: Explain how you maintain the shortcut to the least frequent list.
11. Edge Cases and Resilience
When implementing LFU in a production environment (not just LeetCode), you face edge cases that break naive implementations.
11.1 The Zero-Capacity Cache
What if capacity is 0?
- Naive code might try to
put()and then immediatelyevict(). - If you initialize
min_freq = 1blindly, you might crash when trying to accessfreq_to_list[1]. - Production fix: Always guard with
if capacity <= 0: return.
11.2 Frequency Overflow
In a long-running Redis instance, a key might be accessed 2 billion times.
- If your frequency counter is a 32-bit signed integer, it might wrap around to negative.
- Impact: The “Most Popular” item suddenly becomes the “Least Frequent” (negative billion) and gets evicted immediately.
- Fix: Implementation of “Frequency Aging” (Halving all frequencies every week) or using 64-bit counters.
11.3 Memory Alignment and Pointers
In Python, a Node object is heavy (PyObject structure). In C++, a struct Node is small.
- However, the Pointer Overhead in LFU is 2x that of LRU.
- LRU:
prev,next. - LFU:
prev,next,key,value,freq. - This reduced Cache Locality (CPU Cache, not LFU Cache) can make LFU slower in practice than LRU, even if Big-O is same.
12. Concurrency: Making LFU Thread-Safe
The implementation above is single-threaded. In a real web server (e.g., using Go or Java), multiple threads will call get and put simultaneously.
12.1 The Global Lock (Coarse-Grained)
The simplest fix is to wrap every method in mutex.lock().
- Pros: Correctness is guaranteed.
- Cons: Massive contention. If thread A is evicting (updating 6 pointers), thread B is blocked. This destroys throughput.
12.2 Lock Stripping (Fine-Grained)
We can shard the LFU cache.
- Create 16 separate
LFUCacheinstances. shard_id = hash(key) % 16.- Lock only the specific shard.
- Trade-off: The “Least Frequently Used” is now only local to the shard, not global. You might evict a frequent item in Shard A while Shard B holds garbage. This is an acceptable trade-off for speed (used in ConcurrentHashMap).
12.3 Lock-Free LFU (The Holy Grail)
Research structures like Window-TinyLFU use probabilistic counters (Count-Min Sketch) which can be updated atomically without locks.
- Instead of a precise “Frequency=19,203”, we accept “Frequency ~ 19k”.
- This allows
get()operations to be wait-free.
13. Advanced Variant: Window-TinyLFU
Most modern databases (like Cassandra and Dgraph) rely on Caffeine, a Java cache library that implements Window-TinyLFU.
13.1 The Problem with Standard LFU
LFU is slow to adapt to new trends.
- Old viral content has
freq=10,000. - New breaking news has
freq=10. - The new content is evicted because
10 < 10,000. This is bad.
13.2 The TinyLFU Architecture
- Admission Window: A small LRU region (1% of size) acts as a “Doorman.” New items enter here.
- Main Cache: Segmented into Protected (hot) and Probation (cold) regions.
- The Combat: When the Window is full, the victim from the Window “fights” the victim from the Probation region.
- We compare their approximate frequencies using a Count-Min Sketch.
- If the newcomer is more popular than the incumbent, the incumbent is evicted.
- If the newcomer is weak, it is rejected (not cached at all).
This “Doorman” approach prevents “One-Hit Wonders” (items accessed once) from polluting the main cache history.
14. Complexity Analysis: A Mathematical Proof
Let’s rigorously prove the Time Complexity.
Space Complexity
key_to_node: Stores N nodes.O(N).freq_to_list: Stores N nodes distributed across lists.O(N).- Total Space =
2Npointers +Nintegers. -> O(N).
Time Complexity (Put)
- Check Hash Map:
O(1)amortized. - Create Node:
O(1). - Link to DLL Head:
O(1)(pointer assignment). - If Full, Pop Tail:
O(1)(pointer assignment). - Remove from Map:
O(1). - Update
min_freq:O(1)(assignment). - Total: O(1).
Time Complexity (Get)
- Check Hash Map:
O(1). - Unlink Node:
O(1)(pointer assignment). - Link to New List:
O(1). - Total: O(1).
Why Average vs Worst Case?
In Python/Java, hash map collisions can degrade to O(K) or O(log K).
- Therefore, strictly speaking, LFU is
O(log N)in the worst case of Hash Collisions. - However, with a good cryptographic hash, we assume
O(1).
15. Summary of Optimization patterns
When designing high-performance data structures, we see recurring themes in LFU:
- Space-Time Tradeoff: We use 2x memory (Maps + Lists) to buy 100x speed (O(1) vs O(log N)).
- Lazy Balancing: We don’t sort the lists. We assume simple appending (MRU) is “good enough” for the internal ordering.
- Amortization: We don’t clean up empty frequency buckets immediately (unless we want to save space), allowing operations to flow faster.
16. Historical Context: From cache_mem to TinyLFU
The history of LFU is a history of trying to fix its “Stickiness.”
1960s-1990s: Mathematical Idealism
In the early days of caching (IBM Mainframes), LFU was known as the “Optimum” policy under certain static probability distributions. If you know that ‘A’ is accessed 20% of the time, and ‘B’ 5% of the time, LFU is provably optimal.
- Problem: Real-world distributions change.
- Solution: Ignoring it. RAM was small; LRU was used because LFU was too expensive to implement (Heap overhead).
2000s: The LIRS/ARC Revolution
Researchers realized that “Scan Resistance” was the killer feature.
- ARC (Adaptive Replacement Cache): Invented by IBM. It tracked “Ghosts” (recently evicted items) to decide whether to enlarge the Frequency or Recency list.
- LIRS (Low Inter-reference Recency Set): Used in Linux Kernel and MySQL. It defines “hot” not by count, but by the “distance” between the last two accesses.
2015-Present: The Probabilistic Era (W-TinyLFU)
In 2015, Gil Einziger strictly solved the Space Overhead problem.
- Bloom Filters: Using approximate counting (Count-Min Sketch) fits the frequency of millions of items into Kilobytes.
- Window-TinyLFU: Used in Caffeine (Java), Ristretto (Go). This is currently the state-of-the-art. It combines a small LRU window (to capture new bursts) with a large LFU main region (to capture long-term popularity).
17. The Comprehensive Test Suite
When you write this in an interview code-pad, you need to verify it. Do not just write put(1,1). Write a suite that targets the edge cases.
import unittest
class TestLFUCache(unittest.TestCase):
def test_basic_functional(self):
# The standard Example
lfu = LFUCache(2)
lfu.put(1, 1) # State: [1:1]
lfu.put(2, 2) # State: [2:1, 1:1]
self.assertEqual(lfu.get(1), 1) # State: [1:2, 2:1]
lfu.put(3, 3) # Evicts 2 (freq 1). State: [3:1, 1:2]
self.assertEqual(lfu.get(2), -1)
self.assertEqual(lfu.get(3), 3) # State: [3:2, 1:2]
lfu.put(4, 4) # Tie-break. 1 has freq 2, 3 has freq 2.
# 1 was accessed at step 3. 3 was accessed at step 6.
# 1 is LRU among freq-2 items. Evict 1.
self.assertEqual(lfu.get(1), -1)
self.assertEqual(lfu.get(3), 3) # State: [3:3, 4:1]
self.assertEqual(lfu.get(4), 4) # State: [4:2, 3:3]
def test_frequency_promotion(self):
# Ensure items move up the ladder correctly
lfu = LFUCache(3)
lfu.put(1, 10)
# Access 100 times
for i in range(100):
lfu.get(1)
self.assertEqual(lfu.key_to_node[1].freq, 101)
# Add new items
lfu.put(2, 20) # freq 1
lfu.put(3, 30) # freq 1
# This update should NOT evict 1, even though it's old
lfu.put(4, 40)
# Cache full (3 items). Candidates: 2 (freq 1), 3 (freq 1). 1 (freq 101).
# Evict 2 (LRU of freq 1).
self.assertEqual(lfu.get(2), -1)
self.assertEqual(lfu.get(1), 10)
def test_zero_capacity(self):
lfu = LFUCache(0)
lfu.put(1, 1)
self.assertEqual(lfu.get(1), -1)
def test_update_value_does_not_reset_freq(self):
lfu = LFUCache(2)
lfu.put(1, 1)
lfu.get(1) # freq 2
lfu.put(1, 100) # update value. freq should be 3 now!
self.assertEqual(lfu.key_to_node[1].freq, 3)
self.assertEqual(lfu.get(1), 100)
def test_tie_breaking(self):
lfu = LFUCache(2)
lfu.put(1, 1) # freq 1
lfu.put(2, 2) # freq 1
lfu.get(1) # 1 promoted to freq 2
lfu.get(2) # 2 promoted to freq 2
# Both freq 2. 1 is LRU because 2 was accessed last.
lfu.put(3, 3)
self.assertEqual(lfu.get(1), -1) # 1 should be gone
self.assertEqual(lfu.get(2), 2) # 2 should stay
if __name__ == '__main__':
unittest.main()
17.1 How to Debug LFU Instability
If you implement this and it fails randomly:
- Check
min_freqmaintenance: This is the #1 bug. If you remove the last node frommin_freqlist, do you definitely incrementmin_freq? - Check Empty List cleanup: Do you delete the DLL from the hash map when it’s empty? It’s not strictly necessary but saves memory.
- Check Node pointers: When moving a node from List A to List B, do you reset
node.prevandnode.next? TheDoublyLinkedList.add_to_frontmethod should handle this overwriting safely, but verify.
18. Conclusion
When designing systems, LFU reminds us that metadata (frequency) is as important as the data itself. Start with the Linked Hash Map. Add the Frequency dimension. Handle the edge cases. You have now mastered one of the hardest data structures in the interview repertoire.
19. Key Takeaways
- States are Synchronized: Designing complex structures is about maintaining invariants across multiple containers.
- Frequency vs. Recency: LFU is about the “Volume” of access; LRU is about the “Proximity” of access.
- Future-Proofing: As AI agents move toward long-term memory, the algorithms of LFU will become the foundation of “Thinking Architectures.”
FAQ
How does LFU Cache achieve O(1) for both get and put?
Three data structures work in concert: a hash map for O(1) key lookup, a frequency-to-DLL map for O(1) node promotion between frequency tiers, and a min_freq variable for O(1) eviction candidate identification. All doubly linked list pointer operations are constant time.
How does LFU break ties when multiple keys have the same frequency?
Within each frequency tier, keys are ordered by recency in a doubly linked list. Most recently accessed keys sit at the front. On eviction, the tail node (least recently used among the least frequent) is removed. This provides LRU behavior as the tiebreaker.
Why does min_freq reset to 1 when inserting a new key?
A newly inserted key has frequency 1 by definition, making it always the minimum possible frequency. Regardless of existing keys’ frequencies, the new key becomes the lowest-frequency candidate, so min_freq is unconditionally set to 1 on every insertion.
What is Window-TinyLFU and why is it better than standard LFU?
Standard LFU is sticky: old popular items accumulate high frequency counts and never get evicted even after becoming irrelevant. Window-TinyLFU adds an LRU admission window and uses a Count-Min Sketch for approximate frequency tracking. New items compete against incumbents based on recent popularity, preventing cache pollution by stale entries.
Originally published at: arunbaby.com/dsa/0060-lfu-cache