Class Hashes
This class provides a number of hash functions for bit vectors. For this reason, they are sometimes tweaked with respect to the official C code, which can assume at least byte granularity. In particular, we provide:
In particular, SpookyHash is available in the 4-word-state version and in the 12-word-state version. The 12-word-state version delegates to the 4-word-state version for short keys (i.e., less than 12 words).
All functions (with the exception of the 12-word-state version of SpookyHash) provide preprocessing: a linear pass on the bit vector generates a vector storing the internal state of the function after each mixing round. Using the preprocessed state, one can hash any prefix of the bit vector in constant time.
It is your responsibility to use preprocessing correctly: using the state vector of the wrong bit vector will generate unpredictable results.
-
Method Summary
Modifier and TypeMethodDescriptionstatic void
jenkins
(long[] triple, long seed, long[] h) Jenkins 64-bit hashing (all three values produced) for a triple of longs.static long
Jenkins 64-bit hashing.static long
Jenkins 64-bit hashing.static void
Jenkins 64-bit hashing (all three values produced).static long
Constant-time Jenkins 64-bit hashing for any prefix.static void
Constant-time Jenkins 64-bit hashing for any prefix (all three values produced).static void
A simple test to check the relative speed of various hashes on your architecture.static long
MurmurHash 64-bitstatic long
Constant-time MurmurHash 64-bit hashing for any prefix.static long
Constant-time MurmurHash 64-bit hashing reusing precomputed state partially.static long
MurmurHash3 64-bitstatic void
MurmurHash3 128-bitstatic long
Constant-time MurmurHash3 64-bit hashing for any prefix.static long
Constant-time MurmurHash3 64-bit hashing reusing precomputed state partially.static void
Constant-time MurmurHash3 128-bit hashing for any prefix.static void
murmur3
(BitVector bv, long prefixLength, long[] hh1, long[] hh2, long[] cc1, long[] cc2, long lcp, long[] h) Constant-time MurmurHash3 128-bit hashing reusing precomputed state partially.static long[][]
preprocessJenkins
(BitVector bv, long seed) Preprocesses a bit vector so that Jenkins 64-bit hashing can be computed in constant time on all prefixes.static long[]
preprocessMurmur
(BitVector bv, long seed) Preprocesses a bit vector so that MurmurHash 64-bit can be computed in constant time on all prefixes.static long[][]
preprocessMurmur3
(BitVector bv, long seed) Preprocesses a bit vector so that MurmurHash3 can be computed in constant time on all prefixes.static long[]
preprocessSpooky4
(BitVector bv, long seed) Preprocesses a bit vector so that SpookyHash 4-word-state can be computed in constant time on all prefixes.static void
SpookyHash 12-word-state (up to four values produced).static void
spooky4
(long[] triple, long seed, long[] tuple) SpookyHash (up to four values produced) for a triple of longs.static void
spooky4
(long x, long y, long seed, long[] tuple) SpookyHash (up to four values produced) for a pair of longs.static long
SpookyHash 4-word-state (up to four values produced).static void
SpookyHash 4-word-state (up to four values produced).static long
Constant-time SpookyHash hashing reusing precomputed state.static void
Constant-time SpookyHash 4-word-state hashing reusing precomputed state.
-
Method Details
-
jenkins
Jenkins 64-bit hashing (all three values produced).This code is based on the
lookup8.c
, and in particular on the version consuming 64 bits at a time, but it has been slightly modified to work correctly with any bit vector length (not just multiples of 64). Moreover, we return all three values generated by the algorithm.- Parameters:
bv
- a bit vector.seed
- a seed for the hash.h
- a triple of long values in which the three generated hashes will be saved.
-
jenkins
Jenkins 64-bit hashing.- Parameters:
bv
- a bit vector.seed
- a seed for the hash.- Returns:
- the third of the three hash values returned by
jenkins(BitVector, long, long[])
.
-
jenkins
Jenkins 64-bit hashing.- Parameters:
bv
- a bit vector.- Returns:
- the third of the three hash values returned by
jenkins(BitVector, long, long[])
with seed 0.
-
preprocessJenkins
Preprocesses a bit vector so that Jenkins 64-bit hashing can be computed in constant time on all prefixes.- Parameters:
bv
- a bit vector.seed
- a seed for the hash.- Returns:
- an array of three element; each element is an array containing
the state of the variables
a
,b
andc
during the hash computation; these vector must be passed tojenkins(BitVector, long, long[], long[], long[])
(and analogous methods) in this order. - See Also:
-
jenkins
public static void jenkins(BitVector bv, long prefixLength, long[] aa, long[] bb, long[] cc, long[] h) Constant-time Jenkins 64-bit hashing for any prefix (all three values produced).- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which the hash must be computed.aa
- the first state array returned bypreprocessJenkins(BitVector, long)
.bb
- the second state array returned bypreprocessJenkins(BitVector, long)
.cc
- the third state array returned bypreprocessJenkins(BitVector, long)
.h
- a triple of long values in which the three generated hashes will be saved.
-
jenkins
Constant-time Jenkins 64-bit hashing for any prefix.- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which the hash must be computed.aa
- the first state array returned bypreprocessJenkins(BitVector, long)
.bb
- the second state array returned bypreprocessJenkins(BitVector, long)
.cc
- the third state array returned bypreprocessJenkins(BitVector, long)
.- Returns:
- the third of the three hash values returned by
jenkins(BitVector, long, long[], long[], long[], long[])
.
-
jenkins
public static void jenkins(long[] triple, long seed, long[] h) Jenkins 64-bit hashing (all three values produced) for a triple of longs.- Parameters:
triple
- three longs.seed
- a seed for the hash.h
- a triple of long values in which the three generated hashes will be saved.
-
murmur
MurmurHash 64-bitThis code is based on a mix of the sources that can be found at MurmurHash's web site, and in particular on the version consuming 64 bits at a time, which has been merged with the 2A version to obtain an incremental implementation.
- Parameters:
bv
- a bit vector.seed
- a seed for the hash.- Returns:
- the hash.
-
murmur
Constant-time MurmurHash 64-bit hashing for any prefix.- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which the hash must be computed.state
- the state array returned bypreprocessMurmur(BitVector, long)
.- Returns:
- the hash for the prefix of
bv
orprefixLength
bits.
-
murmur
Constant-time MurmurHash 64-bit hashing reusing precomputed state partially.- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which the hash must be computed.state
- the state array returned bypreprocessMurmur(BitVector, long)
.lcp
- the length of the longest common prefix betweenbv
and the vector over whichstate
was computed.- Returns:
- the hash for the prefix of
bv
orprefixLength
bits.
-
preprocessMurmur
Preprocesses a bit vector so that MurmurHash 64-bit can be computed in constant time on all prefixes.- Parameters:
bv
- a bit vector.seed
- a seed for the hash.- Returns:
- an array containing the state of the variables
h
during the hash computation; these vector must be passed tomurmur(BitVector, long, long[])
(and analogous methods) in this order. - See Also:
-
murmur3
MurmurHash3 128-bitThis code is based on a mix of the sources that can be found at MurmurHash's web site.
Warning: this code is still experimental.
- Parameters:
bv
- a bit vector.seed
- a seed for the hash.h
- a pair of long values in which the two generated hashes will be saved.
-
murmur3
MurmurHash3 64-bitThis code is based on a mix of the sources that can be found at MurmurHash's web site.
Warning: this code is still experimental.
- Parameters:
bv
- a bit vector.seed
- a seed for the hash.- Returns:
- a hash.
-
murmur3
public static void murmur3(BitVector bv, long prefixLength, long[] hh1, long[] hh2, long[] cc1, long[] cc2, long[] h) Constant-time MurmurHash3 128-bit hashing for any prefix.Warning: this code is still experimental.
- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which the hash must be computed.hh1
- the first component of the state array returned bypreprocessMurmur3(BitVector, long)
.hh2
- the second component of the state array returned bypreprocessMurmur3(BitVector, long)
.cc1
- the third component of the state array returned bypreprocessMurmur3(BitVector, long)
.cc2
- the fourth component of the state array returned bypreprocessMurmur3(BitVector, long)
.h
- a pair of long values in which the two generated hashes will be saved.
-
murmur3
public static long murmur3(BitVector bv, long prefixLength, long[] hh1, long[] hh2, long[] cc1, long[] cc2) Constant-time MurmurHash3 64-bit hashing for any prefix.Warning: this code is still experimental.
- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which the hash must be computed.hh1
- the first component of the state array returned bypreprocessMurmur3(BitVector, long)
.hh2
- the second component of the state array returned bypreprocessMurmur3(BitVector, long)
.cc1
- the third component of the state array returned bypreprocessMurmur3(BitVector, long)
.cc2
- the fourth component of the state array returned bypreprocessMurmur3(BitVector, long)
.- Returns:
- a hash.
-
murmur3
public static void murmur3(BitVector bv, long prefixLength, long[] hh1, long[] hh2, long[] cc1, long[] cc2, long lcp, long[] h) Constant-time MurmurHash3 128-bit hashing reusing precomputed state partially.Warning: this code is still experimental.
- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which the hash must be computed.hh1
- the first component of the state array returned bypreprocessMurmur3(BitVector, long)
.hh2
- the second component of the state array returned bypreprocessMurmur3(BitVector, long)
.cc1
- the third component of the state array returned bypreprocessMurmur3(BitVector, long)
.cc2
- the fourth component of the state array returned bypreprocessMurmur3(BitVector, long)
.lcp
- the length of the longest common prefix betweenbv
and the vector over whichstate
was computed.h
- a pair of long values in which the two generated hashes will be saved.
-
murmur3
public static long murmur3(BitVector bv, long prefixLength, long[] hh1, long[] hh2, long[] cc1, long[] cc2, long lcp) Constant-time MurmurHash3 64-bit hashing reusing precomputed state partially.Warning: this code is still experimental.
- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which the hash must be computed.hh1
- the first component of the state array returned bypreprocessMurmur3(BitVector, long)
.hh2
- the second component of the state array returned bypreprocessMurmur3(BitVector, long)
.cc1
- the third component of the state array returned bypreprocessMurmur3(BitVector, long)
.cc2
- the fourth component of the state array returned bypreprocessMurmur3(BitVector, long)
.lcp
- the length of the longest common prefix betweenbv
and the vector over whichstate
was computed.- Returns:
- a hash.
-
preprocessMurmur3
Preprocesses a bit vector so that MurmurHash3 can be computed in constant time on all prefixes.Warning: this code is still experimental.
- Parameters:
bv
- a bit vector.seed
- a seed for the hash.- Returns:
- an array of four component arrays containing the state of the
variables
h1
,h2
,c1
andc2
during the hash computation; these vector must be passed tomurmur3(BitVector, long, long[], long[], long[], long[])
(and analogous methods) in this order. - See Also:
-
spooky4
SpookyHash 4-word-state (up to four values produced).- Parameters:
bv
- a bit vector.seed
- a seed for the hash.tuple
- a tuple of longs in which up to four generated hashes will be saved.
-
spooky4
SpookyHash 4-word-state (up to four values produced).- Parameters:
bv
- a bit vector.seed
- a seed for the hash.- Returns:
- the first of the four available hash values.
-
spooky12
SpookyHash 12-word-state (up to four values produced).This method will automatically switch to
spooky4(BitVector, long, long[])
when the provided bit vector is shorter than 768 bits.- Parameters:
bv
- a bit vector.seed
- a seed for the hash.tuple
- a tuple of longs in which up to four generated hashes will be saved.
-
preprocessSpooky4
Preprocesses a bit vector so that SpookyHash 4-word-state can be computed in constant time on all prefixes.- Parameters:
bv
- a bit vector.seed
- a seed for the hash.- Returns:
- an array containing the four internal words of state during the
hash computation; it can be passed to
spooky4(BitVector, long, long, long[], long[])
(and analogous methods). - See Also:
-
spooky4
Constant-time SpookyHash 4-word-state hashing reusing precomputed state.- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which to compute the hash.seed
- a seed for the hash.state
- the state vector returned bypreprocessSpooky4(BitVector, long)
; note thatseed
must be the same.tuple
- a tuple of longs in which up to four generated hashes will be saved.
-
spooky4
Constant-time SpookyHash hashing reusing precomputed state.- Parameters:
bv
- a bit vector.prefixLength
- the length of the prefix ofbv
over which to compute the hash. *seed
- a seed for the hash.state
- the state vector returned bypreprocessSpooky4(BitVector, long)
; note thatseed
must be the same.- Returns:
- the first of the four available hash values.
-
spooky4
public static void spooky4(long[] triple, long seed, long[] tuple) SpookyHash (up to four values produced) for a triple of longs.- Parameters:
triple
- three longs.seed
- a seed for the hash.tuple
- a tuple of longs in which up to four generated hashes will be saved.
-
spooky4
public static void spooky4(long x, long y, long seed, long[] tuple) SpookyHash (up to four values produced) for a pair of longs.- Parameters:
x
- the first long.y
- the second long.seed
- a seed for the hash.tuple
- a tuple of longs in which up to four generated hashes will be saved.
-
main
A simple test to check the relative speed of various hashes on your architecture.- Parameters:
arg
- the length of the bit vector to hash, and then the number of evaluations.
-