Fastest way to merge multiple integer sets
Posted on 18 September 2016
Problem
You are given multiple integer sets and you need to merge them into a single set in the fastest possible way.
Solution
Suppose we have 3 arrays of integer sets and we need to merge them. The fastest solution would be possible
in time of O(n1 + n2 + n3)
where n1
, n2
, n3
are the lengths of the three sets respectively. The solution
lies in using a bitvector (also called as bitset or a bitarray) to represent elements using the index
and then iterating over them.
 Construct a bitarray
 Iterate over the first array and for each integer set the corresponding bit to true
 Repeat the above process for remaining arrays
 Any duplicate element will result in setting
true
an already set bit  The resultant bitarray is the merge of all the arrays
public void mergeArrays(int[] array1, int[] array2, int[] array3) {
BitSet bitSet = new BitSet();
// start merging
for(int num : array1) {
bitSet.setBit(num, true);
}
for(int num : array2) {
bitSet.setBit(num, true);
}
for(int num : array3) {
bitSet.setBit(num, true);
}
// output
int index = 1;
do {
index = bitSet.getNextSetBit(index);
if(index < 0) {
break;
}
System.out.println(index);
} while(true);
}
The solution above can be extended to as many arrays as are provided in the problem definition. The time to
sort will still remain O(N)
where N
is the sum of total number of elements across all provided arrays.
Optimizations available

One can make use of sparsedbitarrays to reduce memory consumption. Refer to [brettwooldridge/SparseBitSet] (https://github.com/brettwooldridge/SparseBitSet) for one such implementation.

If the arrays are really, really huge  an implementation that uses filebased persistence of a bitarray can be used. Refer to one such implementation available in the jerrycore project.