The solution to this problem with SipHash was to reinitialize it every time you create a new HashMap. Specifically: when you create a HashMap, you have to give it something that implements BuildHasher. The default way to do this is to call RandomState::new, which pulls some randomness and stores it. Then, whenever the map goes to hash something, it calls build_hasher on that RandomState, which initializes a SipHasher13 from the stored state; that state differs between hashmaps but is always the same for a given hashmap. FxHashMap uses the other provided way to do this: BuildHasherDefault<H>, which implements BuildHasher as long as H implements Hasher + Default and simply builds each hasher with H::default(), so every map starts from the same state.
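A rough sketch of the two std-provided routes, using only the standard library. The helper names (hash_with, same_state_agrees, default_builder_agrees) are made up for illustration, and DefaultHasher stands in for FxHasher here only because it also implements Hasher + Default:

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Hypothetical helper: hash `value` with a fresh hasher produced by `build`.
fn hash_with<S: BuildHasher>(build: &S, value: &str) -> u64 {
    let mut hasher = build.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

/// One RandomState hands out hashers that all start from the same stored keys,
/// so hashing the same value twice through it gives the same result.
/// (Two separate RandomState::new() values would almost certainly disagree.)
fn same_state_agrees(value: &str) -> bool {
    let state = RandomState::new();
    hash_with(&state, value) == hash_with(&state, value)
}

/// BuildHasherDefault stores nothing: every hasher is just H::default(),
/// so two independently created builders still produce identical hashes.
fn default_builder_agrees(value: &str) -> bool {
    let a = BuildHasherDefault::<DefaultHasher>::default();
    let b = BuildHasherDefault::<DefaultHasher>::default();
    hash_with(&a, value) == hash_with(&b, value)
}
```

The second function is the point: with BuildHasherDefault there is no per-map state at all, which is why every map built that way hashes a given key to the same value.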
But we could do for FxHasher the same thing we do for SipHasher13: initialize it to a value that's randomly chosen when the HashMap is created. That solves the reinsertion performance problem without any impact on hashing performance. There's just currently no simple way to get that functionality into something that implements BuildHasher if all you have is something that implements Hasher. You basically need to copy the code from RandomState, since RandomState isn't generic over the Hasher.
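For what it's worth, here is one way such a generic builder might look, as a sketch rather than anything std or the Fx crates actually provide. RandomSeeded and TinyFx are hypothetical names: TinyFx is a small Fx-style stand-in so the example needs only std, and the seed is borrowed from RandomState instead of copying its internals. Since the Hasher trait has no "start from this seed" constructor, the sketch writes the seed into a default hasher, which costs one extra mixing round per hasher built; a hasher with a real seeded constructor would avoid that.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};
use std::marker::PhantomData;

/// Hypothetical Fx-style hasher, standing in for the real FxHasher.
#[derive(Default, Clone, Copy)]
struct TinyFx {
    state: u64,
}

impl Hasher for TinyFx {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.write_u64(u64::from(b));
        }
    }
    fn write_u64(&mut self, word: u64) {
        // Fx-style mixing step: rotate, xor, multiply.
        self.state = (self.state.rotate_left(5) ^ word).wrapping_mul(0x517c_c1b7_2722_0a95);
    }
    fn finish(&self) -> u64 {
        self.state
    }
}

/// Hypothetical builder: one random seed per instance, fed into every H it builds.
struct RandomSeeded<H> {
    seed: u64,
    _hasher: PhantomData<H>,
}

impl<H> RandomSeeded<H> {
    fn new() -> Self {
        // Borrow randomness from RandomState so the sketch stays std-only.
        let seed = RandomState::new().build_hasher().finish();
        Self { seed, _hasher: PhantomData }
    }
}

impl<H: Hasher + Default> BuildHasher for RandomSeeded<H> {
    type Hasher = H;
    fn build_hasher(&self) -> H {
        let mut hasher = H::default();
        hasher.write_u64(self.seed); // same seed for the whole lifetime of this builder
        hasher
    }
}

/// Hypothetical helper: hash `value` through any BuildHasher.
fn seeded_hash<S: BuildHasher>(build: &S, value: &str) -> u64 {
    let mut hasher = build.build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

/// A map whose Fx-style hasher is seeded once, at creation.
fn seeded_map() -> HashMap<&'static str, u32, RandomSeeded<TinyFx>> {
    let mut map = HashMap::with_hasher(RandomSeeded::new());
    map.insert("a", 1);
    map.insert("b", 2);
    map
}
```

Each call to RandomSeeded::new() picks its own seed, so two maps built this way would generally order their buckets differently, while lookups within one map stay consistent.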