A hash table is a data structure that stores key-value pairs and can find any value by its key in roughly constant time. Ask it for the value behind the key "email" and it returns the answer almost instantly, no matter whether the table holds ten entries or ten million. It is the structure powering Python dictionaries, JavaScript Maps, and Java HashMaps, and it is one of the most important ideas a programmer can understand.
How a hash table works
The trick is a hash function: a small piece of math that turns a key into a number. That number, reduced to fit the table size, becomes an index into an internal array. So instead of scanning every entry to find "email," the table computes where "email" must live and jumps straight there.
That direct jump is why lookup, insert, and delete are all O(1) on average. There is no searching; there is computing an address and reading it.
-- a Python dict is a hash table
prices = {"apple": 1.20, "banana": 0.50}
prices["cherry"] = 3.00 -- insert
print(prices["banana"]) -- lookup, near-instant
del prices["apple"] -- delete
Why lookups are so fast
| Operation |
Hash table (average) |
Plain array (unsorted) |
| Find by key |
O(1) |
O(n), scan everything |
| Insert |
O(1) |
O(1) at end |
| Delete by key |
O(1) |
O(n), find then shift |
| Find by position |
not its job |
O(1) |
The hash table trades away ordering and positional access to win on key lookups. An array is the reverse: great by position, slow by value. This is exactly the kind of trade-off learning Python fast makes concrete, since its dictionaries are hash tables you will use daily.
Collisions and how they are handled
Different keys sometimes hash to the same slot. That is a collision, and it is normal, not a bug. Tables resolve it in two main ways:
- Chaining. Each slot holds a small list of entries that landed there. A lookup hashes to the slot, then scans the short list.
- Open addressing. If a slot is taken, the table probes nearby slots by a fixed rule until it finds space.
As long as the table is not too full, collisions stay rare and lookups stay fast. When it fills past a threshold, the table resizes and rehashes everything into a bigger array. That occasional resize is why the guarantee is "constant time on average," not "always."
Where the trade-offs hide
- No reliable ordering. A classic hash table does not keep keys in sorted order. Some languages preserve insertion order as a separate feature, but do not assume sorting.
- No fast range queries. Asking for "all keys between 10 and 20" means scanning everything; a sorted tree handles that better.
- Worst case is O(n). With a poor hash function and adversarial keys, everything can land in one slot. Modern standard libraries use strong hash functions to avoid this.
- Memory overhead. A hash table keeps spare capacity so collisions stay rare, so it uses more memory than a tightly packed array.
Common misconceptions
- "Hash tables are sorted." They are not, by design. If you need sorted output, sort the keys when you read them or use a tree-based map.
- "Collisions mean data is lost." No. Collisions are handled internally; you never lose a value because two keys collided.
- "A hash table is the same as encryption." Hashing for a table is about fast lookup, not security. Cryptographic hashing is a different goal with different functions.
What to skip
- Writing your own hash table for production. The built-in
dict, Map, or HashMap is faster, tested, and uses a good hash function. Build one only to learn.
- Using a hash table when you need order. Reach for a sorted structure instead of fighting the table.
- Worrying about collisions in everyday code. The standard library handles them; you rarely need to think about them.
FAQ
What is the difference between a hash table and a dictionary?
In most languages they are the same thing. "Dictionary" is the name Python uses, "Map" is JavaScript and Java, and "hash table" is the underlying technique they all use.
Why is a hash table faster than searching a list?
A list search checks items one by one until it finds a match. A hash table computes where the key should be and jumps there directly, skipping the scan.
What makes a good hash function?
It should spread keys evenly across slots and be quick to compute. Standard libraries ship strong ones, so you almost never write your own.
Can two keys have the same value?
Yes. Keys must be unique, but many keys can map to the same value. The uniqueness rule applies only to the keys.
Where to go next
See what a data structure is, what an array is and how it works, and Big O notation explained simply.