Hash Tables — Dictionary & HashMap Guide

DodaTech Updated 2026-06-24 6 min read

In this tutorial, you'll learn about Hash Tables. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.

A Hash Table (also called hash map or dictionary) is a data structure that maps keys to values using a hash function. It provides O(1) average time for insertions, deletions, and lookups — making it one of the most powerful data structures.

What You'll Learn

You will learn how hash functions work, collision resolution strategies (chaining and open addressing), load factor and resizing, and how to implement a Hash Table from scratch.

Why It Matters

Hash tables power databases, caches, symbol tables in compilers, and every dictionary/map in modern programming languages. They are the backbone of fast lookups in applications handling millions of records.

Real-World Use

Durga Antivirus Pro uses a Hash Table for its malware signature database. When scanning a file, the hash of each file chunk is computed and looked up in the signature Hash Table. A match in O(1) time identifies a known threat instantly.

How Hash Tables Work

graph TD
    K1[Key: 'name'] --> H1[Hash: 4] --> B1[Bucket 4: 'Alice']
    K2[Key: 'age'] --> H2[Hash: 1] --> B2[Bucket 1: 30]
    K3[Key: 'city'] --> H3[Hash: 4] --> C1[Collision! Chaining]
    C1 --> N1[Node: 'Paris']
    style B1 fill:#4a90d9,stroke:#333,color:#fff
    style B2 fill:#4a90d9,stroke:#333,color:#fff
    style C1 fill:#e67e22,stroke:#333,color:#fff

The hash function converts a key into an array index. Ideally, keys are distributed uniformly across buckets. When two keys hash to the same index, a collision occurs.

Hash Table Implementation with Chaining

Chaining stores multiple entries in the same bucket using a Linked List:

class HashTable:
    def __init__(self, capacity=10):
        self.capacity = capacity
        self.table = [[] for _ in range(capacity)]
        self.size = 0

    def _hash(self, key):
        return hash(key) % self.capacity

    def put(self, key, value):
        index = self._hash(key)
        bucket = self.table[index]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.size += 1
        if self.size > self.capacity * 0.75:
            self._resize()

    def get(self, key):
        index = self._hash(key)
        for k, v in self.table[index]:
            if k == key:
                return v
        raise KeyError(key)

    def remove(self, key):
        index = self._hash(key)
        bucket = self.table[index]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.size -= 1
                return
        raise KeyError(key)

    def _resize(self):
        old_table = self.table
        self.capacity *= 2
        self.table = [[] for _ in range(self.capacity)]
        self.size = 0
        for bucket in old_table:
            for key, value in bucket:
                self.put(key, value)

    def __len__(self):
        return self.size

    def __contains__(self, key):
        try:
            self.get(key)
            return True
        except KeyError:
            return False

ht = HashTable()
ht.put("name", "Alice")
ht.put("age", 30)
ht.put("city", "Paris")
print("name:", ht.get("name"))
print("age:", ht.get("age"))
print("Has 'city':", "city" in ht)
ht.remove("age")
print("Has 'age':", "age" in ht)
print("Size:", len(ht))

Expected output:

name: Alice
age: 30
Has 'city': True
Has 'age': False
Size: 2

class HashTable {
    constructor(capacity = 10) {
        this.capacity = capacity;
        this.table = new Array(capacity).fill(null).map(() => []);
        this.size = 0;
    }

    _hash(key) {
        let hash = 0;
        for (let i = 0; i < String(key).length; i++) {
            hash = (hash * 31 + String(key).charCodeAt(i)) >>> 0;
        }
        return hash % this.capacity;
    }

    put(key, value) {
        const index = this._hash(key);
        const bucket = this.table[index];
        for (let i = 0; i < bucket.length; i++) {
            if (bucket[i][0] === key) {
                bucket[i][1] = value;
                return;
            }
        }
        bucket.push([key, value]);
        this.size++;
        if (this.size > this.capacity * 0.75) this._resize();
    }

    get(key) {
        const index = this._hash(key);
        for (const [k, v] of this.table[index]) {
            if (k === key) return v;
        }
        return undefined;
    }

    remove(key) {
        const index = this._hash(key);
        const bucket = this.table[index];
        for (let i = 0; i < bucket.length; i++) {
            if (bucket[i][0] === key) {
                bucket.splice(i, 1);
                this.size--;
                return;
            }
        }
    }

    _resize() {
        const oldTable = this.table;
        this.capacity *= 2;
        this.table = new Array(this.capacity).fill(null).map(() => []);
        this.size = 0;
        for (const bucket of oldTable) {
            for (const [key, value] of bucket) {
                this.put(key, value);
            }
        }
    }
}

let ht = new HashTable();
ht.put("name", "Alice");
ht.put("age", 30);
ht.put("city", "Paris");
console.log("name:", ht.get("name"));
console.log("Has 'city':", ht.get("city") !== undefined);
ht.remove("age");
console.log("Has 'age':", ht.get("age") !== undefined);

Expected output:

name: Alice
Has 'city': true
Has 'age': false

Collision Resolution Strategies

Strategy	Description	Pros	Cons
Chaining	Linked List per bucket	Simple, handles high load	Extra memory for pointers
Linear Probing	Find next empty slot	Cache-friendly	Primary clustering
Quadratic Probing	Jump by i² slots	Reduces clustering	May miss empty slots
Double Hashing	Second hash for step	Minimal clustering	More computation

Load Factor and Resizing

The load factor is size / capacity. When it exceeds a threshold (typically 0.75), the table resizes (doubles capacity) to maintain O(1) performance. Resizing is O(n) but happens infrequently.

Built-in Hash Tables

# Python dict
d = {"name": "Alice", "age": 30}
d["city"] = "Paris"
print(d.get("name"))
print(d.get("missing", "default"))

// JavaScript Map
let map = new Map();
map.set("name", "Alice");
map.set("age", 30);
console.log(map.get("name"));
console.log(map.has("missing"));

Expected output:

Alice
default
Alice
false

Common Pitfalls

1. Using Mutable Keys

Lists, dicts, or objects that can change are not valid hash keys. Modifying them after insertion makes them unretrievable.

2. Poor Hash Function Design

A bad hash function causes collisions, degrading performance to O(n). Use built-in hash implementations when possible.

3. Ignoring Load Factor

If the load factor gets too high, performance collapses. Always resize before the load factor exceeds 0.7–0.75.

4. Thread Safety

Python dicts and JavaScript Maps are not thread-safe. Concurrent modifications can corrupt internal state.

5. Hash Table Ordering

Standard hash tables do not guarantee insertion order. Python 3.7+ dicts preserve insertion order; JavaScript Maps do too. Older versions do not.

Practice Questions

1. Find the first non-repeating character in a string.

def first_non_repeating(s):
    counts = {}
    for ch in s:
        counts[ch] = counts.get(ch, 0) + 1
    for ch in s:
        if counts[ch] == 1:
            return ch
    return None

print(first_non_repeating("swiss"))

Expected output: w

2. Two Sum — find indices of two numbers that add to a target.

def two_sum(nums, target):
    seen = {}
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return []

print(two_sum([2, 7, 11, 15], 9))

Expected output: [0, 1]

3. Group anagrams together from a list of strings.

4. Challenge: Implement an LRU cache that supports get and put in O(1) time, using a Hash Table and doubly Linked List.

5. Real-world task: Durga Antivirus Pro's threat detection system tracks file hashes. Write a function that reads a list of file paths, computes SHA-256 hashes, stores them in a Hash Table, and detects duplicates.

What's Next

Binary Trees — Traversals & Operations

Trie Data Structure — Prefix Tree Guide

Queues Guide

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.

← Previous Queues — FIFO, Priority & Deque Guide Next → Binary Trees — Traversals & Operations Guide

Built by the developers of DodaTech

Doda Browser, DodaZIP & Durga Antivirus Pro

Home Browse Data Structures Algorithms