Package it.unimi.dsi.sux4j.bits


package it.unimi.dsi.sux4j.bits
Ranking and selection structures.

This package provides a number of implementations of rank/select queries for bits vectors. Ranking is counting the number of ones in an initial segment of a bit vector. Selection is finding the position of the r-th one. Both operation can be performed in constant time on an array of n bits using o(n) additional bits, but in practice linear data structures with small constants and theoretically non-constant time work much better. Sux4J proposes a number of new, very efficient implementation of rank and select oriented to 64-bit processors (in other words: they will be fairly slow on 32-bit processors). The implementations are based on broadword programming and described in Sebastiano Vigna, “Broadword Implementation of Rank/Select Queries”, in Proc. of the 7th International Workshop on Experimental Algorithms, WEA 2008, volume 5038 of Lecture Notes in Computer Science, pages 154−168. Springer, 2008.

For dense arrays, Rank9 is the basic rank implementation; Rank16 is slightly slower but occupies much less space. Selection can be performed using SimpleSelect for reasonably uniform bit arrays, or using Select9, which occupies more space but guarantees practical constant-time evaluation.

For sparse arrays (e.g., representation of pointers in a bitstream) we provide SparseRank and SparseSelect. Their main feature is that they do not require the original bit array, as they use an EliasFanoMonotoneLongBigList to implement a succint dictionary containing the positions of bits set. If the bit array is sufficiently sparse, such a representation provides significant gains in space occupancy.

All structures can be serialized. Since in some cases the original bit vector is stored inside the structure, to avoid saving and loading twice the same vector we suggest to pack all structures into a RankSelect instance.

Note that all methods in this package are considered low-level and do not perform bound checks on their arguments. Bound checks can be enabled, however, by enabling assertions.