SwissTables: High Performance HashMaps
Understand how SwissTables provide a major performance improvement using its innovative control word metadata and SIMD-powered lookups.
Hash maps (or dictionaries) are fundamental data structures which offer efficient key-value storage and retrieval. With Go 1.24, there's been a significant upgrade under the hood: the introduction of a Swiss Table-based map implementation.
But what are Swiss Tables, and why is this a big deal? Let's dive in!
Chaining: A Refresher
Most traditional hash map implementations, including Go's map before version 1.24, use a technique called chaining to handle hash collisions (when two different keys hash to the same bucket).
Imagine an array of buckets. When you insert a key-value pair:
The key is hashed to determine its bucket index.
If the bucket is empty, the key-value pair is stored there.
If the bucket is already occupied by other key/s (case of collision), the new pair is typically added to a linked list of entries within that bucket.
While chaining offers a great way to handle collisions, it introduces the following performance impacts:
Cache Inefficiency: Traversing linked lists can lead to poor CPU cache locality. Pointers in the list can point to disparate memory locations, causing CPU cache misses and slowing down lookups, especially with many collisions.
Memory Overhead: Each entry in the linked list requires extra space for the "next" pointer, adding to memory consumption.
Open Addressing: A Refresher
Open addressing is another technique used in hash tables to handle collisions.
Unlike chaining, open addressing attempts to find another empty slot directly within the hash table itself for the colliding key. This avoids the use of a separate linked-list and avoids the performance impacts associated with chaining.
Insertion/Retrieval:
When you insert a key-value pair(same algorithm is used for searching as well):
The key is hashed to determine its index.
If the index is empty, the key-value pair is stored there.
If the index is occupied, the algorithm "probes"(searches) for an alternative empty index according to a predefined sequence(linear/quadratic/double hashing etc). This probing continues until an empty slot is found for insertion.
Deletion:
Deletion in open addressing can be tricky. Simply clearing a slot could break the probe sequence for other keys that might have collided with the deleted key and were placed further down the line in the array.
To address this, a special "tombstone" or "deleted" marker is often used.
When a key is deleted, its slot is marked as "deleted" instead of truly empty.
During insertion, a "deleted" slot can be treated as empty and reused.
During search, a "deleted" slot is treated as occupied, and the probing continues past it. This adds some complexity and can degrade performance over time if many deletions occur without subsequent insertions reusing those slots, leading to "false" full slots that still need to be probed.
Disadvantages of Open Addressing
Sensitivity to Load Factor: Performance degrades sharply as the number of elements increases with respect to the array size. It's generally recommended to keep the load factor below a certain threshold (e.g., 0.5 to 0.7 for linear and quadratic probing, slightly higher for double hashing) to maintain good performance. This often means the hash table/map needs to be larger than the actual number of elements.
Deletion Complexity: Deletion is more complex due to the need for "tombstones" to preserve probe sequences, which can impact performance.
Table Resizing (Rehashing): When the load factor exceeds the threshold, the entire table must be resized (typically doubled) and all existing elements must be re-hashed and re-inserted into the new, larger table. This is an expensive operation (O(N) where N is the number of elements) and can cause temporary performance spikes.
Swiss Maps:
Swiss Tables, popularised by Google's Abseil C++ library, takes a different approach, primarily using open addressing(instead of chaining) with a clever variation of linear probing leveraging a dedicated metadata array. This design leads to significant performance and memory improvements.
Here are the key characteristics and advantages:
Open Addressing with Optimised Probing:
Instead of storing collided elements in linked lists, Swiss Tables uses Open Addressing to store all entries directly within the main table (an array). While all the data is stored within the main table, the array is broken into logical groups of 8 slots each.
When a collision occurs, the table probes for the next available group using probing. We’ll dive into how the probing works in the next section of control bytes.
Dense Metadata Array (Control Word):
Alongside the array of key-value slots, Swiss Tables maintain a compact array of "control bytes" called Control Word, which is of 64 bits(8 bytes). Each control byte stores metadata about the a corresponding slot in the group, typically:
Empty: The slot is free.
Deleted (Tombstone): The slot previously held an entry that has been removed. This is important so probing sequences aren't prematurely terminated as discussed in Open Addressing.
Full: The slot contains an active entry. In this case, the control byte also stores the H2 hash, which is the lower 7 bits of the full hash of the key stored in that slot.
H1 & H2 Hash:
H1 and H2 hash are computed out of the hash value of the key, we’re looking to insert/update or delete in the Swiss Map.
Hash_Value = hash_fn(key) // 64 bit hash value
H1 Hash = Upper 57 bits of Hash_Value
H2 Hash = Lower 7 bits of Hash_Value
When you insert a key-value pair:
Compute
hash(key)
and break the hash into H1 and H2 hash.The H1 hash is used to select the group, which can be found using H1 Hash%Number of Groups.
Within a group, we must first determine whether any slot already contains this key, in which case this could an update rather than a new insertion.
If no slot contains the key, then we look for an empty slot to place this key. All slots have equal preference to hold the key in any group.
If no slot is empty, then we continue the probe sequence by searching the next group.
If one or more slots in the group contain the key, we have a list of candidate slots, since we can have cases of collisions. So for such cases, we need to check the key from the actual array.
The magic of SwissTables is in the implementation detail of how it determines if the group contains the key or not. Instead of iterating over all the slots in the group, SwissTables leverage the Control Word!
SwissTables do a byte-by-byte equality comparison within the control word, where we compare each byte in the control word with the H2 hash we computed.
However, instead of doing a byte-by-byte comparison by using multiple instructions, SwissTable implementations use SIMD (Single Instruction, Multiple Data) instructions. Modern CPUs using SIMD can compare a key's H2 hash against multiple control bytes in a single instruction, dramatically speeding up the search for a potential match or an empty slot.
This operation is very powerful, as we have effectively performed 8 steps of a probe sequence at once, in parallel with the help of Control Word and SIMD.
Advantages of SwissTable:
Fast Lookups: When searching for a key, the map first computes the hash and identifies a starting group of slots. It then quickly scans the control bytes for that group. The Control Word metadata is designed to be scanned very efficiently using SIMD instructions, making lookups extremely fast.
By avoiding linked list pointers for each element and packing metadata tightly, Swiss Tables can achieve higher memory density, meaning less wasted space and more data fitting into the cache.
Disadvantages of SwissTable:
Extra 1 byte of metadata space for every slot. While this isn’t a lot, it is something good to keep in mind!
Deletions mark a slot with a tombstone. Just as we discussed with Open Addressing, can degrade performance over time if many deletions occur without subsequent insertions reusing those slots, leading to "false" full slots that still need to be probed.
This concludes this article on the Swiss Tables! Hopefully, this article gives you a good insight into why SwissTables are highly performant when compared to traditional hashmap using Chaining!
Go has implemented some more optimisations over SwissTables. If you want me to dive into that in another article with sample code, share your wish via this poll -
You can find the code and understand how Go leveraged Extendible Hashing to optimise SwissTables further below -
👉 Connect with me here: Pratik Pandey on LinkedIn
nice read!
Gate Smashers video from my college days as a refresher: https://youtu.be/ZEyPqqRTO00?si=L59iBn0SmmMxNUMp