You have 3 free guides left 😟
Unlock your guides
You have 3 free guides left 😟
Unlock your guides

9.3 Hash Table Implementation and Analysis

4 min readjuly 19, 2024

Hash tables are powerful data structures that offer lightning-fast lookups and insertions. They use clever hashing techniques to map keys to array indices, enabling constant-time operations on average. This efficiency makes them ideal for various applications.

However, hash tables aren't without challenges. Collisions can occur when multiple keys map to the same index. Techniques like and help resolve these conflicts, maintaining the structure's performance even under heavy loads.

Hash Table Implementation

Hash table implementation with arrays

Top images from around the web for Hash table implementation with arrays
Top images from around the web for Hash table implementation with arrays
  • maps keys to indices in the hash table
    • Modulo hash function: hash(key)=keymodmhash(key) = key \bmod m (m is size of hash table)
    • Multiplication hash function: hash(key)=m(kAmod1)hash(key) = \lfloor m(kA \bmod 1) \rfloor (A is a constant, 0 < A < 1)
    • Universal hashing: randomly choose hash function from a family of hash functions to minimize collisions
  • techniques handle keys mapping to the same index
    • Separate chaining uses linked lists to store colliding keys at each index
      • Colliding keys are stored in the same linked list at the corresponding index
      • Linked list allows for dynamic growth of colliding keys
    • Open addressing probes for the next empty slot in the hash table
      • : probe the next slot until an empty slot is found (increment by 1)
      • : use a quadratic function to determine the probing sequence (i2i^2, where i is the number of attempts)
      • Double hashing: use a second hash function to determine the probing step size

Time complexity of hash operations

  • is O(1) for , deletion, and search
    • Assumes a well-distributed hash function and low (α<0.75\alpha < 0.75)
    • Constant time operations on average, making hash tables efficient
  • Worst case complexity is O(n) for insertion, deletion, and search
    • Occurs when all keys map to the same index, causing collisions
    • Degrades to linear search in the worst case scenario
  • Load factor (α\alpha) affects the performance of hash table operations
    • α=nm\alpha = \frac{n}{m}, where n is the number of keys and m is the size of the hash table
    • Higher load factors increase the likelihood of collisions
    • Keeping α\alpha low (e.g., < 0.75) ensures good performance and reduces collisions

Hash Table Optimization and Applications

Rehashing and dynamic resizing

  • rebuilds the hash table with a larger size when the load factor exceeds a threshold
    • Involves creating a new hash table with a larger size (e.g., double the size)
    • Reinserting all keys from the old hash table into the new hash table
    • Rehashing is an expensive operation but necessary to maintain performance
  • Dynamic resizing adjusts the size of the hash table based on the load factor
    • Doubling the size of the hash table when the load factor exceeds a threshold (e.g., 0.75)
    • Halving the size of the hash table when the load factor falls below a threshold (e.g., 0.25)
    • Ensures optimal performance by maintaining a low load factor and reducing collisions

Hash tables vs other data structures

  • Hash tables vs. arrays
    • Hash tables provide O(1) average case for search, insertion, and deletion
    • Arrays provide O(1) access by index but O(n) search, insertion, and deletion (due to shifting elements)
    • Hash tables are more efficient for search-heavy workloads
  • Hash tables vs. balanced binary search trees (BSTs)
    • Hash tables provide O(1) average case for search, insertion, and deletion
    • Balanced BSTs (AVL, Red-Black) provide O(log n) worst case for search, insertion, and deletion
    • Balanced BSTs maintain keys in sorted order, enabling efficient range queries
    • Hash tables are faster for individual key lookups, while BSTs are better for range queries and sorted order

Applications of hash tables

  • Caching: store frequently accessed data in memory using hash tables
    • Reduces the need for expensive disk or network operations (database queries, API calls)
    • Improves performance by serving data from the cache instead of recomputing or fetching from slower storage
  • Indexing: create an index using hash tables for fast data retrieval
    • Example: indexing words in a document for quick search
    • Hash table maps words to their locations in the document, enabling efficient keyword search
  • Counting: use hash tables to count the frequency of elements in a dataset
    • Example: counting the number of occurrences of each word in a text
    • Hash table maps elements (words) to their counts, updating the counts as elements are processed
    • Useful for analytics, data mining, and statistics
© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
Glossary
Glossary