5.4.0 - New structures for simple selection on LongBigArrayBitVector instances. - Bulk select methods are now part of the interface (with default implementations). - Replaced LongArrayBitVector.word() with LongBigArrayBitVector.word() in EliasFanoMonotoneBigLongBigList. Previously, big arrays of lower bits with more than 2^31 elements would have caused an assertion failure or an out-of-bounds exception. - To make EliasFanoMonotoneBigLongBigList completely free from limitations, also its upper bits now are implemented using a LongBigArrayBitVector. This change required to bump the serial version UID. Also MappedEliasFanoMonotoneLongBigList had to be adapted, and its UID bumped. - Several problem related to "int vs. long" in the new classes were fixed. 5.3.1 - New EliasFanoMonotoneBigLongBigList which uses a LongBigArrayBitVector for the lower bits and avoids the limitations of EliasFanoMonotoneLongBigList. Instances can be mapped in memory with MappedEliasFanoMonotoneLongBigList. 5.3.0 - New MappedEliasFanoMonotoneBigList class makes it possible to map into memory the lower bits of an Elias-Fano list. - Completely revamped ZFastTrie implementation. 5.2.3 - Thanks to FileLines*Iterable, all classes can read term lists using arbitrary decompressors. - All non-deprecated function classes now support byte arrays. - Official support starts now at Java 9. By mistake, even if compiled for Java 8, Sux4J was using Math.multiplyHigh(), which appears only in the JDK library since Java 9. 5.2.2 - Sux4J is now dually licensed under the Lesser GNU Public License 2.1+ or the Apache Software License 2.0. 5.2.1 - Removed (almost) unused dependencies. - Upgraded to newest versions of Apache Commons. 5.2.0 - Massive (unfortunately, sometimes not backward-compatible) overhauling of the interface specification and of the implementation to reduce the amount of parameter checks and special cases. - The default size of the bucket at creation in a BucketedHashStore has been raised to 256 (it used to be one) to avoid that calls to checkAndRetry() on a newly created store are too expensive. - BucketedHashStore now return the bucket index as a long (this is unfortunately not backward-compatible for some applications). This change was necessary as buckets can be more than Integer.MAX_VALUE. All user classes now throw an exception if the number of keys is so large than the resulting number of buckets would be greater than Integer.MAX_VALUE. 5.1.5 - Fast forward iterations in EliasFanoMonotoneLongBigList/EliasFanoPrefixSumBigList and skipping by value when iterating over an EliasFanoIndexedMonotoneLongBigList. 5.1.4 - Several new methods for EliasFanoIndexedMonotoneLongBigList. 5.1.3 - Released to fix dependency on broken dsiutils 2.1.13. 5.1.2 - New indexed version of EliasFanoMonotoneLongBigList. - OSGi metadata and default modularization. - Manual reduction in strength in arithmetic operations everywhere. 5.1.1 - Added automatic module name. 5.1.0 - GOV/GV functions now allow arbitrary data sizes, thanks to the new LongBigArrayBitVector class in the DSI utilities. - Fixed SLF4J dependencies (again). 5.0.4 - We now use the classifier in naming artifacts. 5.0.3 - Fixed build.xml: moved to Java 9, more explicit dependency naming. 5.0.2 - Fixed old bug in SimpleSelect. Thanks to Richard Webb for reporting this bug. 5.0.1 - New EliasFanoMonotoneLongBigList16 class supporting very large lists with fixed lower-bits width 16 (bypassing the LongArrayBitVector 128Gib limit). 5.0.0 - Major revamp of the it.unimi.dsi.sux4j.mph package. 4.3.0 - Several fixes and improvements. 4.2.0 - Java 8-only. - New compressed functions (GV3CompressedFunction and GV4CompressedFunction) which store a mapping key/value using a number of bits per key close to the empirical entropy of the list of values. - Parallel, multithreaded construction for GOV3Function, GOV4Function, GOVMinimalPerfectHashFunction, GV3CompressedFunction and GV4CompressedFunction). 4.1.0 - We now use in all the new structures of the mph package a modulo-free range reduction based on a integer-simulated floating point computation (see http://www.drdobbs.com/tools/fast-high-quality-parallel-random-number/231000484). This has improved lookup performance by 5-10% but it has made necessary to bump the serial version UIDs. - Fixed wrong mask in Rank12. It could lead to unpredictable errors. 4.0.0 - Complete revamp of the mph package. New algorithms provide better space and faster lookup. Please read carefully the package documentation. - Older maps such as MWHCFunction, MinimalPerfectHashFunction and TwoStepsMWHCFunction have been deprecated, and will be removed sooner or later. - All serial version UIDs have been updated. - We now use SpookyHash. - We now support, for all functions for which it is possible, raw (i.e., non-lexicographical) transformation strategies (as they are faster). - Fixed a lot of documentations, main methods, etc. - Several functions can now be built from the command line using byte-arrays as keys; strings are read without any decoding. 3.2.2 - Improved speed for selection in a word thanks to an idea of Giuseppe Ottaviano. Now we refer to the implementation found in the DSI utilities (it.unimi.dsi.bits.Fast.select()). 3.2.1 - Fixed dependencies. 3.2.0 - New Rank11 and Rank12 implementations with new space/speed tradeoffs. Taken from Simon Gog and Mathias Petri's paper. - New ranking structure for MinimalPerfectHashFunction based on Rank11, which made it necessary to bump the serialVersionUID. - New HashDisplaceCompressMinimalPerfectHashFunction class implementing a new technique by Belazzougui, Botelho and Dietzfelbinger, providing minimal perfect hashing at 2.05 bits per key. Construction time is an order of magnitude larger than MinimalPerfectHashFunction, and queries are ~50% slower, so we will keep the other implementation as a reference for the time being. 3.1.2 - Revised adapter (now renamed SignedFunctionStringMap) to accept generic Object2Long functions, discovering dynamically whether they implement Size64. 3.1.1 - New thin adapter SignedObject2LongFunctionStringMap to adapt the new Sux4J self-signing functions to the StringMap (big) interface. The adapter implements also a constructor based on an Iterable that can be used directly with MG4J's IndexBuilder. - New bulk get() methods for EliasFanoLongBigList, EliasFanoMonotoneLongBigList and select() for SimpleSelect. They are built one upon each other and they are an order of magnitude faster than picking consecutive elements using getLong()/select(). 3.1.0 - MinimalPerfectHashFunction, MWHCFunction, LcpMonotoneMinimalPerfectHashFunction, TwoStepsLcpMonotoneMinimalPerfectHashFunction and ZFastTrieDistributorMonotoneMinimalPerfectHashFunction have now built-in signing. The effort was relatively small, and the gain in speed is huge, as we can use part of the hash computed by the ChunkedHashStore as a signature. - New "dictionary" option in MWHCFunction which stores the signature as a value. It gives the fastest and most compact probabilistic dictionary. - Big switch to builders for MinimalPerfectHashFunction, MWHCFunction, TwoStepsMWHCFunction, LcpMonotoneMinimalPerfectHashFunction, TwoStepsLcpMonotoneMinimalPerfectHashFunction and ZFastTrieDistributorMonotoneMinimalPerfectHashFunction. As a final goal, all multiple constructors will be eliminated in favour of Guava-style static nested Builder classes. - Major speed improvements in ZFastTrieDistributorMonotoneMinimalPerfectHashFunction. 3.0.11 - Now we use XorShift1024StarRandom everywhere. 3.0.10 - New parameter to MWHCFunction makes it possible to build arbitrary functions from the command line. 3.0.9 - Added for efficiency a lowerBitsMask field to EliasFanoMonotoneLongBigList, which made it necessary to bump the serialVersionUID. - Aligned the behaviour of EliasFanoMonotoneLongBigList with that of MG4J's implementation--the upper bound is not necessarily strict. - Fixed subtle bug in ChunkedHashStore that could cause an out-of-bounds access to an array in the (very rare) case of a retry. - The documentation of Jenkins hashes was still stating, erroneously, that we return the first computed value (we return the third one). - Deprecated length() methods have been removed. - Added UTF-32 support to all classes. 3.0.8 - Significantly improved speed of SimpleSelect and all related classes (Elias-Fano lists, etc.). - Improved speed of MinimalPerfectHashFunction. - Integer overflow bugs fixes (thanks to Tim Potter). 3.0.7 - Some monotone hash functions would crash during construction due to an integer overflow. Thanks to Tim Potter for reporting this bug. 3.0.6 - Switched to SLF4J for logging. 3.0.5 - The Ivy dependency file has been Maven-normalized (now we have a default "compile" scope). 3.0.3 - Replaced methods from it.unimi.dsi.bits.Fast with equivalent methods in Integer/Long, as recent JVMs intrinsify such methods. 3.0.2 - Fixed minor inconsistencies in the values returned on empty functions (some implementations would actually throw an exception). Thanks to Valentin Tablan for reporting the problem. 3.0 - WARNING: This release has minor binary incompatibilities with previous releases, mainly due to the move from the interface it.unimi.dsi.util.LongBigList to the now standard it.unimi.dsi.fasutil.longs.LongBigList. It is part of a parallel release of fastutil, the DSI Utilities, Sux4J, MG4J, WebGraph, etc. that were all modified to fit the new interface, and that prepare the way for our "big" versions, that is, supporting >2^31 entries in arrays (simulated), elements in lists, terms, documents, nodes, etc. Please read our (short) "Moving Java to Big Data" document (JavaBig.pdf) for details. - We now require Java 6. - it.unimi.dsi.util.LongBigList is dead. Long live to it.unimi.dsi.fastutil.longs.LongBigList. We're sorry for the nuisance--adapting the code should be very easy (and we warned you anyway :). - MWHCFunction has been re-engineered so that it will use very little space beside that actually required by the function. Previously, a number of large bit vectors were allocated at the same time, and they have been replaced by a judicious use of OfflineIterable. All classes that use MWHCFunction will benefit (at the expense of slightly increased disk usage and access). - All classes should support big collections. They use the new Size64 interface from fastutil. Implementations still support the old (deprecated) length() method for backward compatibility. - New FileLinesBigList class. - We now have a MurmurHash3 full implementation. 2.0.1 - Sux4J is now distributed under the GNU Lesser General Public License 3. - Major rewriting of the hypergraph peeling code. Now we use less memory and we are definitely faster. - MWHCFunction and MinimalPerfectHashFunction accept a temporary directory for the chunked hash store files. - Improved ChunkedHashStore architecture that allow arbitrary values, so functions can be built without keeping values in memory. - Fixed bug in MinimalPerfetHashFunction that was causing exceptions when using more than a billion keys (thanks to Wei Liu for reporting and fixing this bug). - Fixed bug in some rank/select classes that was causing integer overflow errors when building structure over bit vectors with >2Gi bits. - AbstractLongBitVector.equals() now uses getLong() on word boundaries. - Fixed pernicious bug in Select9. 2.0 - General revamp, restructuring, improvements, new coherent names for classes. Most of the code has been rewritten or improved. - New (partial) structures for balanced parentheses. - New build system based on ChunkedHashStore that works for billions of keys. - New structures for balanced parentheses (partially implemented). - Faster Select9 operation: some broadword operations were implemented in a redundant way. - SparseSelect/SparseRank would give an incorrect (or at least little useful) value for numBits() when using shared data. - Fixed problem with SparseSelect: some methods inherited from EliasFanoMonotoneBigList were causing exceptions because the size of the SparseSelect is the number of bits, not the number of ones. - New hollow trie implementation based on balanced parentheses. - Fixed an old and severe bug in MinimalPerfectHashFunction, that was causing the generated functions not to be perfect. 1.0.4 - New progressive hash-computation methods that provide Jenkins hash in constant time on all prefixes (after a linear-time preprocessing). 1.0.3 - New TwoStepsMWHCFunction that records in s