Class LayeredBloomFilter<T extends BloomFilter<T>>
- java.lang.Object
-
- org.apache.commons.collections4.bloomfilter.LayeredBloomFilter<T>
-
- Type Parameters:
T- The type of Bloom Filter that is used for the layers.
- All Implemented Interfaces:
BitMapExtractor,BloomFilter<LayeredBloomFilter<T>>,BloomFilterExtractor,IndexExtractor
public class LayeredBloomFilter<T extends BloomFilter<T>> extends java.lang.Object implements BloomFilter<LayeredBloomFilter<T>>, BloomFilterExtractor
Layered Bloom filters are described in Zhiwang, Cen; Jungang, Xu; Jian, Sun (2010), "A multi-layer Bloom filter for duplicated URL detection", Proc. 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE 2010), vol. 1, pp. V1-586-V1-591, doi:10.1109/ICACTE.2010.5578947, ISBN 978-1-4244-6539-2, S2CID 3108985In short, Layered Bloom filter contains several bloom filters arranged in layers.
- When membership in the filter is checked each layer in turn is checked and if a match is found
trueis returned. - When merging each bloom filter is merged into the newest filter in the list of layers.
- When questions of cardinality are asked the cardinality of the union of the enclosed Bloom filters is used.
The net result is that the layered Bloom filter can be populated with more items than the Shape would indicate and yet still return a false positive rate in line with the Shape and not the over population.
This implementation uses a LayerManager to handle the manipulation of the layers.
- Level 0 is the oldest layer and the highest level is the newest.
- There is always at least one enclosed filter.
- The newest filter is the
targetinto which merges are performed. - Whenever the target is retrieved, or a
mergeoperation is performed the code checks if any older layers should be removed, and if so removes them. It also checks it a new layer should be added, and if so adds it and sets thetargetbefore the operation.
- Since:
- 4.5.0-M2
-
-
Field Summary
-
Fields inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
SPARSE
-
-
Constructor Summary
Constructors Constructor Description LayeredBloomFilter(Shape shape, LayerManager<T> layerManager)Constructs a new instance.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description intcardinality()Gets the cardinality (number of enabled bits) of this Bloom filter.intcharacteristics()Gets the characteristics of the filter.voidcleanup()Forces the execution of the cleanup Consumer that was provided when the associated LayerManager was built.voidclear()Clears the filter to by resetting it to its initial, unpopulated state.booleancontains(BitMapExtractor bitMapExtractor)Returnstrueif this filter contains the bits specified in the bit maps produced by the bitMapExtractor.booleancontains(BloomFilter other)Returnstrueif this any layer contained by this filter contains the specified filter.booleancontains(BloomFilterExtractor bloomFilterExtractor)Returnstrueif each filter within thebloomFilterExtractorexits within this filter.booleancontains(Hasher hasher)Returnstrueif this filter contains the bits specified in the hasher.booleancontains(IndexExtractor indexExtractor)Returnstrueif this filter contains the indices specified IndexExtractor.LayeredBloomFilter<T>copy()Creates a new instance of thisLayeredBloomFilterwith the same properties as the current one.intestimateN()Estimates the number of items in the Bloom filter.intestimateUnion(BloomFilter other)Estimates the number of items in the union of this Bloom filter with the other bloom filter.int[]find(BitMapExtractor bitMapExtractor)Finds the layers in which the BitMapExtractor is found.int[]find(BloomFilter bf)Finds the layers in which the Bloom filter is found.int[]find(Hasher hasher)Finds the layers in which the Hasher is found.int[]find(IndexExtractor indexExtractor)Finds the layers in which the IndexExtractor is found.SimpleBloomFilterflatten()Create a standard (non-layered) Bloom filter by merging all of the layers.Tget(int depth)Gets the Bloom filter at the specified depthintgetDepth()Gets the depth of the deepest layer.ShapegetShape()Gets the shape that was used when the filter was built.booleanisEmpty()Determines if all the bits are off.booleanmerge(BitMapExtractor bitMapExtractor)Merges the specified hasher into this Bloom filter.booleanmerge(BloomFilter bf)Merges the specified Bloom filter into this Bloom filter.booleanmerge(IndexExtractor indexExtractor)Merges the specified IndexExtractor into this Bloom filter.voidnext()Forces and advance to the next layer.booleanprocessBitMaps(java.util.function.LongPredicate predicate)Each bit map is passed to the predicate in order.booleanprocessBloomFilters(java.util.function.Predicate<BloomFilter> bloomFilterPredicate)Processes the Bloom filters in depth order with the most recent filters first.booleanprocessIndices(java.util.function.IntPredicate predicate)Each index is passed to the predicate.-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BitMapExtractor
asBitMapArray, processBitMapPairs
-
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
estimateIntersection, isFull, merge, uniqueIndices
-
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilterExtractor
asBloomFilterArray, processBloomFilterPair
-
Methods inherited from interface org.apache.commons.collections4.bloomfilter.IndexExtractor
asIndexArray
-
-
-
-
Constructor Detail
-
LayeredBloomFilter
public LayeredBloomFilter(Shape shape, LayerManager<T> layerManager)
Constructs a new instance.- Parameters:
shape- the Shape of the enclosed Bloom filterslayerManager- the LayerManager to manage the layers.
-
-
Method Detail
-
cardinality
public int cardinality()
Description copied from interface:BloomFilterGets the cardinality (number of enabled bits) of this Bloom filter.This is also known as the Hamming value or Hamming number.
- Specified by:
cardinalityin interfaceBloomFilter<T extends BloomFilter<T>>- Returns:
- the cardinality of this filter
-
characteristics
public int characteristics()
Description copied from interface:BloomFilterGets the characteristics of the filter.Characteristics are defined as bits within the characteristics integer.
- Specified by:
characteristicsin interfaceBloomFilter<T extends BloomFilter<T>>- Returns:
- the characteristics for this bloom filter.
-
cleanup
public void cleanup()
Forces the execution of the cleanup Consumer that was provided when the associated LayerManager was built.
-
clear
public final void clear()
Description copied from interface:BloomFilterClears the filter to by resetting it to its initial, unpopulated state.- Specified by:
clearin interfaceBloomFilter<T extends BloomFilter<T>>
-
contains
public boolean contains(BitMapExtractor bitMapExtractor)
Description copied from interface:BloomFilterReturnstrueif this filter contains the bits specified in the bit maps produced by the bitMapExtractor.- Specified by:
containsin interfaceBloomFilter<T extends BloomFilter<T>>- Parameters:
bitMapExtractor- theBitMapExtractorto provide the bit maps.- Returns:
trueif this filter is enabled for all bits specified by the bit maps
-
contains
public boolean contains(BloomFilter other)
Returnstrueif this any layer contained by this filter contains the specified filter.If the
otheris a BloomFilterExtractor each filter within theotheris checked to see if it exits within this filter.- Specified by:
containsin interfaceBloomFilter<T extends BloomFilter<T>>- Parameters:
other- the other Bloom filter- Returns:
trueif this filter contains the other filter.
-
contains
public boolean contains(BloomFilterExtractor bloomFilterExtractor)
Returnstrueif each filter within thebloomFilterExtractorexits within this filter.- Parameters:
bloomFilterExtractor- the BloomFilterExtractor that provides the filters to check for.- Returns:
trueif this filter contains all of the filters contained in thebloomFilterExtractor.
-
contains
public boolean contains(Hasher hasher)
Description copied from interface:BloomFilterReturnstrueif this filter contains the bits specified in the hasher.Specifically this returns
trueif this filter is enabled for all bit indexes identified by thehasher. Using the bit map representations this is effectively(this AND hasher) == hasher.- Specified by:
containsin interfaceBloomFilter<T extends BloomFilter<T>>- Parameters:
hasher- the hasher to provide the indexes- Returns:
- true if this filter is enabled for all bits specified by the hasher
-
contains
public boolean contains(IndexExtractor indexExtractor)
Description copied from interface:BloomFilterReturnstrueif this filter contains the indices specified IndexExtractor.Specifically this returns
trueif this filter is enabled for all bit indexes identified by theIndexExtractor.- Specified by:
containsin interfaceBloomFilter<T extends BloomFilter<T>>- Parameters:
indexExtractor- the IndexExtractor to provide the indexes- Returns:
trueif this filter is enabled for all bits specified by the IndexExtractor
-
copy
public LayeredBloomFilter<T> copy()
Creates a new instance of thisLayeredBloomFilterwith the same properties as the current one.- Specified by:
copyin interfaceBloomFilter<T extends BloomFilter<T>>- Returns:
- a copy of this
LayeredBloomFilter.
-
estimateN
public int estimateN()
Description copied from interface:BloomFilterEstimates the number of items in the Bloom filter.By default this is the rounding of the
Shape.estimateN(cardinality)calculation for the shape and cardinality of this filter.This produces an estimate roughly equivalent to the number of Hashers that have been merged into the filter by rounding the value from the calculation described in the
Shapeclass Javadoc.Note:
- if cardinality == numberOfBits, then result is Integer.MAX_VALUE.
- if cardinality > numberOfBits, then an IllegalArgumentException is thrown.
- Specified by:
estimateNin interfaceBloomFilter<T extends BloomFilter<T>>- Returns:
- an estimate of the number of items in the bloom filter. Will return Integer.MAX_VALUE if the estimate is larger than Integer.MAX_VALUE.
- See Also:
Shape.estimateN(int),Shape
-
estimateUnion
public int estimateUnion(BloomFilter other)
Description copied from interface:BloomFilterEstimates the number of items in the union of this Bloom filter with the other bloom filter.This produces an estimate roughly equivalent to the number of unique Hashers that have been merged into either of the filters by rounding the value from the calculation described in the
Shapeclass Javadoc.estimateUnionshould only be called with Bloom filters of the same Shape. If called on Bloom filters of differing shape this method is not symmetric. Ifotherhas more bits anIllegalArgumentExceptionmay be thrown.- Specified by:
estimateUnionin interfaceBloomFilter<T extends BloomFilter<T>>- Parameters:
other- The other Bloom filter- Returns:
- an estimate of the number of items in the union. Will return Integer.MAX_VALUE if the estimate is larger than Integer.MAX_VALUE.
- See Also:
BloomFilter.estimateN(),Shape
-
find
public int[] find(BitMapExtractor bitMapExtractor)
Finds the layers in which the BitMapExtractor is found.- Parameters:
bitMapExtractor- the BitMapExtractor to search for.- Returns:
- an array of layer indices in which the Bloom filter is found.
-
find
public int[] find(BloomFilter bf)
Finds the layers in which the Bloom filter is found.- Parameters:
bf- the Bloom filter to search for.- Returns:
- an array of layer indices in which the Bloom filter is found.
-
find
public int[] find(Hasher hasher)
Finds the layers in which the Hasher is found.- Parameters:
hasher- the Hasher to search for.- Returns:
- an array of layer indices in which the Bloom filter is found.
-
find
public int[] find(IndexExtractor indexExtractor)
Finds the layers in which the IndexExtractor is found.- Parameters:
indexExtractor- the Index extractor to search for.- Returns:
- an array of layer indices in which the Bloom filter is found.
-
flatten
public SimpleBloomFilter flatten()
Create a standard (non-layered) Bloom filter by merging all of the layers. If the filter is empty this method will return an empty Bloom filter.- Specified by:
flattenin interfaceBloomFilterExtractor- Returns:
- the merged bloom filter.
-
get
public T get(int depth)
Gets the Bloom filter at the specified depth- Parameters:
depth- the depth of the filter to return.- Returns:
- the Bloom filter at the specified depth.
- Throws:
java.util.NoSuchElementException- if depth is not in the range [0,getDepth())
-
getDepth
public final int getDepth()
Gets the depth of the deepest layer. The minimum value returned by this method is 1.- Returns:
- the depth of the deepest layer.
-
getShape
public final Shape getShape()
Description copied from interface:BloomFilterGets the shape that was used when the filter was built.- Specified by:
getShapein interfaceBloomFilter<T extends BloomFilter<T>>- Returns:
- The shape the filter was built with.
-
isEmpty
public boolean isEmpty()
Description copied from interface:BloomFilterDetermines if all the bits are off. This is equivalent tocardinality() == 0.Note: This method is optimized for non-sparse filters. Implementers are encouraged to implement faster checks if possible.
- Specified by:
isEmptyin interfaceBloomFilter<T extends BloomFilter<T>>- Returns:
trueif no bits are enabled,falseotherwise.
-
merge
public boolean merge(BitMapExtractor bitMapExtractor)
Description copied from interface:BloomFilterMerges the specified hasher into this Bloom filter. Specifically all bit indexes that are identified by thebitMapExtractorwill be enabled in this filter.Note: This method should return
trueeven if no additional bit indexes were enabled. Afalseresult indicates that this filter may or may not contain all the indexes enabled in thebitMapExtractor. This state may occur in complex Bloom filter implementations like counting Bloom filters.- Specified by:
mergein interfaceBloomFilter<T extends BloomFilter<T>>- Parameters:
bitMapExtractor- The BitMapExtractor to merge.- Returns:
- true if the merge was successful
-
merge
public boolean merge(BloomFilter bf)
Description copied from interface:BloomFilterMerges the specified Bloom filter into this Bloom filter.Specifically all bit indexes that are identified by the
otherwill be enabled in this filter.Note: This method should return
trueeven if no additional bit indexes were enabled. Afalseresult indicates that this filter may or may not contain theotherBloom filter. This state may occur in complex Bloom filter implementations like counting Bloom filters.- Specified by:
mergein interfaceBloomFilter<T extends BloomFilter<T>>- Parameters:
bf- The bloom filter to merge into this one.- Returns:
- true if the merge was successful
-
merge
public boolean merge(IndexExtractor indexExtractor)
Description copied from interface:BloomFilterMerges the specified IndexExtractor into this Bloom filter. Specifically all bit indexes that are identified by theindexExtractorwill be enabled in this filter.Note: This method should return
trueeven if no additional bit indexes were enabled. Afalseresult indicates that this filter may or may not contain all the indexes of theindexExtractor. This state may occur in complex Bloom filter implementations like counting Bloom filters.- Specified by:
mergein interfaceBloomFilter<T extends BloomFilter<T>>- Parameters:
indexExtractor- The IndexExtractor to merge.- Returns:
- true if the merge was successful
-
next
public void next()
Forces and advance to the next layer. This method will clean-up the current layers and generate a new filter layer. In most cases is it unnecessary to call this method directly.
-
processBitMaps
public boolean processBitMaps(java.util.function.LongPredicate predicate)
Description copied from interface:BitMapExtractorEach bit map is passed to the predicate in order. The predicate is applied to each bit map value, if the predicate returnsfalsethe execution is stopped,falseis returned, and no further bit maps are processed.If the extractor is empty this method will return true.
Any exceptions thrown by the action are relayed to the caller.
- Specified by:
processBitMapsin interfaceBitMapExtractor- Parameters:
predicate- the function to execute- Returns:
trueif all bit maps returnedtrue,falseotherwise.
-
processBloomFilters
public final boolean processBloomFilters(java.util.function.Predicate<BloomFilter> bloomFilterPredicate)
Processes the Bloom filters in depth order with the most recent filters first. Each filter is passed to the predicate in turn. The function exits on the firstfalsereturned by the predicate.- Specified by:
processBloomFiltersin interfaceBloomFilterExtractor- Parameters:
bloomFilterPredicate- the predicate to execute.- Returns:
trueif all filters passed the predicate,falseotherwise.
-
processIndices
public boolean processIndices(java.util.function.IntPredicate predicate)
Description copied from interface:IndexExtractorEach index is passed to the predicate. The predicate is applied to each index value, if the predicate returnsfalsethe execution is stopped,falseis returned, and no further indices are processed.Any exceptions thrown by the action are relayed to the caller.
Indices ordering and uniqueness is not guaranteed.
- Specified by:
processIndicesin interfaceIndexExtractor- Parameters:
predicate- the action to be performed for each non-zero bit index.- Returns:
trueif all indexes return true from consumer,falseotherwise.
-
-