Class ArrayCountingBloomFilter
- java.lang.Object
-
- org.apache.commons.collections4.bloomfilter.ArrayCountingBloomFilter
-
- All Implemented Interfaces:
BitMapExtractor,BloomFilter<CountingBloomFilter>,CellExtractor,CountingBloomFilter,IndexExtractor
public final class ArrayCountingBloomFilter extends java.lang.Object implements CountingBloomFilter
A counting Bloom filter using an int array to track cells for each enabled bit.Any operation that results in negative counts or integer overflow of counts will mark this filter as invalid. This transition is not reversible. The operation is completed in full, no exception is raised and the state is set to invalid. This allows the cells for the filter immediately prior to the operation that created the invalid state to be recovered. See the documentation in
isValid()for details.All the operations in the filter assume the cells are currently valid, for example
cardinalityorcontainsoperations. Behavior of an invalid filter is undefined. It will no longer function identically to a standard Bloom filter that is the merge of all the Bloom filters that have been added to and not later subtracted from the counting Bloom filter.The maximum supported number of items that can be stored in the filter is limited by the maximum array size combined with the
Shape. For example an implementation using aShapewith a false-positive probability of 1e-6 andInteger.MAX_VALUEbits can reversibly store approximately 75 million items using 20 hash functions per item with a memory consumption of approximately 8 GB.- Since:
- 4.5.0-M1
- See Also:
Shape,CellExtractor
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.commons.collections4.bloomfilter.CellExtractor
CellExtractor.CellPredicate
-
-
Field Summary
-
Fields inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
SPARSE
-
-
Constructor Summary
Constructors Constructor Description ArrayCountingBloomFilter(Shape shape)Constructs an empty counting Bloom filter with the specified shape.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description booleanadd(CellExtractor other)Adds the specified CellExtractor to this Bloom filter.int[]asIndexArray()Return a copy of the IndexExtractor data as an int array.intcardinality()Gets the cardinality (number of enabled bits) of this Bloom filter.intcharacteristics()Gets the characteristics of the filter.voidclear()Clears the filter to by resetting it to its initial, unpopulated state.booleancontains(BitMapExtractor bitMapExtractor)Returnstrueif this filter contains the bits specified in the bit maps produced by the bitMapExtractor.booleancontains(IndexExtractor indexExtractor)Returnstrueif this filter contains the indices specified IndexExtractor.ArrayCountingBloomFiltercopy()Creates a new instance of thisArrayCountingBloomFilterwith the same properties as the current one.intgetMaxCell()Gets the maximum allowable value for a cell count in this Counting filter.intgetMaxInsert(CellExtractor cellExtractor)Determines the maximum number of times the Cell Extractor could have been added.ShapegetShape()Gets the shape that was used when the filter was built.booleanisValid()Returnstrueif the internal state is valid.booleanprocessBitMaps(java.util.function.LongPredicate consumer)Each bit map is passed to the predicate in order.booleanprocessCells(CellExtractor.CellPredicate consumer)Performs the given action for eachcellwhere the cell count is non-zero.booleanprocessIndices(java.util.function.IntPredicate consumer)Each index is passed to the predicate.booleansubtract(CellExtractor other)Adds the specified CellExtractor to this Bloom filter.-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BitMapExtractor
asBitMapArray, processBitMapPairs
-
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
contains, contains, estimateIntersection, estimateN, estimateUnion, isEmpty, isFull
-
Methods inherited from interface org.apache.commons.collections4.bloomfilter.CountingBloomFilter
getMaxInsert, getMaxInsert, getMaxInsert, getMaxInsert, merge, merge, merge, merge, remove, remove, remove, remove, uniqueIndices
-
-
-
-
Constructor Detail
-
ArrayCountingBloomFilter
public ArrayCountingBloomFilter(Shape shape)
Constructs an empty counting Bloom filter with the specified shape.- Parameters:
shape- the shape of the filter
-
-
Method Detail
-
add
public boolean add(CellExtractor other)
Description copied from interface:CountingBloomFilterAdds the specified CellExtractor to this Bloom filter.Specifically all cells for the indexes identified by the
otherwill be incremented by their corresponding values in theother.This method will return
trueif the filter is valid after the operation.- Specified by:
addin interfaceCountingBloomFilter- Parameters:
other- the CellExtractor to add.- Returns:
trueif the addition was successful and the state is valid- See Also:
CountingBloomFilter.isValid(),CountingBloomFilter.subtract(CellExtractor)
-
asIndexArray
public int[] asIndexArray()
Description copied from interface:IndexExtractorReturn a copy of the IndexExtractor data as an int array.Indices ordering and uniqueness is not guaranteed.
The default implementation of this method creates an array and populates it. Implementations that have access to an index array should consider returning a copy of that array if possible.
- Specified by:
asIndexArrayin interfaceIndexExtractor- Returns:
- An int array of the data.
-
cardinality
public int cardinality()
Description copied from interface:BloomFilterGets the cardinality (number of enabled bits) of this Bloom filter.This is also known as the Hamming value or Hamming number.
- Specified by:
cardinalityin interfaceBloomFilter<CountingBloomFilter>- Returns:
- the cardinality of this filter
-
characteristics
public int characteristics()
Description copied from interface:BloomFilterGets the characteristics of the filter.Characteristics are defined as bits within the characteristics integer.
- Specified by:
characteristicsin interfaceBloomFilter<CountingBloomFilter>- Returns:
- the characteristics for this bloom filter.
-
clear
public void clear()
Description copied from interface:BloomFilterClears the filter to by resetting it to its initial, unpopulated state.- Specified by:
clearin interfaceBloomFilter<CountingBloomFilter>
-
contains
public boolean contains(BitMapExtractor bitMapExtractor)
Description copied from interface:BloomFilterReturnstrueif this filter contains the bits specified in the bit maps produced by the bitMapExtractor.- Specified by:
containsin interfaceBloomFilter<CountingBloomFilter>- Parameters:
bitMapExtractor- theBitMapExtractorto provide the bit maps.- Returns:
trueif this filter is enabled for all bits specified by the bit maps
-
contains
public boolean contains(IndexExtractor indexExtractor)
Description copied from interface:BloomFilterReturnstrueif this filter contains the indices specified IndexExtractor.Specifically this returns
trueif this filter is enabled for all bit indexes identified by theIndexExtractor.- Specified by:
containsin interfaceBloomFilter<CountingBloomFilter>- Parameters:
indexExtractor- the IndexExtractor to provide the indexes- Returns:
trueif this filter is enabled for all bits specified by the IndexExtractor
-
copy
public ArrayCountingBloomFilter copy()
Creates a new instance of thisArrayCountingBloomFilterwith the same properties as the current one.- Specified by:
copyin interfaceBloomFilter<CountingBloomFilter>- Returns:
- a copy of this BloomFilter.
-
getMaxCell
public int getMaxCell()
Description copied from interface:CountingBloomFilterGets the maximum allowable value for a cell count in this Counting filter.- Specified by:
getMaxCellin interfaceCountingBloomFilter- Returns:
- the maximum allowable value for a cell count in this Counting filter.
-
getMaxInsert
public int getMaxInsert(CellExtractor cellExtractor)
Description copied from interface:CountingBloomFilterDetermines the maximum number of times the Cell Extractor could have been added.- Specified by:
getMaxInsertin interfaceCountingBloomFilter- Parameters:
cellExtractor- the extractor of cells.- Returns:
- the maximum number of times the CellExtractor could have been inserted.
-
getShape
public Shape getShape()
Description copied from interface:BloomFilterGets the shape that was used when the filter was built.- Specified by:
getShapein interfaceBloomFilter<CountingBloomFilter>- Returns:
- The shape the filter was built with.
-
isValid
public boolean isValid()
Returnstrueif the internal state is valid.This flag is a warning that an addition or subtraction of cells from this filter resulted in an invalid cell for one or more indexes. For example this may occur if a cell for an index was set to negative following a subtraction operation, or overflows the value specified by
getMaxCell()following an addition operation.A counting Bloom filter that has an invalid state is no longer ensured to function identically to a standard Bloom filter instance that is the merge of all the Bloom filters that have been added to and not later subtracted from this counting Bloom filter.
Note: The change to an invalid state may or may not be reversible. Implementations are expected to document their policy on recovery from an addition or removal operation that generated an invalid state.
Implementation note
The state transition to invalid is permanent.
This implementation does not correct negative cells to zero or integer overflow cells to
Integer.MAX_VALUE. Thus the operation that generated invalid cells can be reversed by using the complement of the original operation with the same Bloom filter. This will restore the cells to the state prior to the invalid operation. Cells can then be extracted usingprocessCells(CellPredicate).- Specified by:
isValidin interfaceCountingBloomFilter- Returns:
trueif the state is valid
-
processBitMaps
public boolean processBitMaps(java.util.function.LongPredicate consumer)
Description copied from interface:BitMapExtractorEach bit map is passed to the predicate in order. The predicate is applied to each bit map value, if the predicate returnsfalsethe execution is stopped,falseis returned, and no further bit maps are processed.If the extractor is empty this method will return true.
Any exceptions thrown by the action are relayed to the caller.
- Specified by:
processBitMapsin interfaceBitMapExtractor- Parameters:
consumer- the function to execute- Returns:
trueif all bit maps returnedtrue,falseotherwise.
-
processCells
public boolean processCells(CellExtractor.CellPredicate consumer)
Description copied from interface:CellExtractorPerforms the given action for eachcellwhere the cell count is non-zero.Some Bloom filter implementations use a count rather than a bit flag. The term
Cellis used to refer to these counts.Any exceptions thrown by the action are relayed to the caller. The consumer is applied to each cell. If the consumer returns
falsethe execution is stopped,falseis returned, and no further pairs are processed.- Specified by:
processCellsin interfaceCellExtractor- Parameters:
consumer- the action to be performed for each non-zero cell.- Returns:
trueif all cells return true from consumer,falseotherwise.
-
processIndices
public boolean processIndices(java.util.function.IntPredicate consumer)
Description copied from interface:IndexExtractorEach index is passed to the predicate. The predicate is applied to each index value, if the predicate returnsfalsethe execution is stopped,falseis returned, and no further indices are processed.Any exceptions thrown by the action are relayed to the caller.
Indices ordering and uniqueness is not guaranteed.
- Specified by:
processIndicesin interfaceCellExtractor- Specified by:
processIndicesin interfaceIndexExtractor- Parameters:
consumer- the action to be performed for each non-zero bit index.- Returns:
trueif all indexes return true from consumer,falseotherwise.
-
subtract
public boolean subtract(CellExtractor other)
Description copied from interface:CountingBloomFilterAdds the specified CellExtractor to this Bloom filter.Specifically all cells for the indexes identified by the
otherwill be decremented by their corresponding values in theother.This method will return true if the filter is valid after the operation.
- Specified by:
subtractin interfaceCountingBloomFilter- Parameters:
other- the CellExtractor to subtract.- Returns:
trueif the subtraction was successful and the state is valid- See Also:
CountingBloomFilter.isValid(),CountingBloomFilter.add(CellExtractor)
-
-