# Primitive Set Operations in Eclipse Collections

Eclipse Collections has a rich assortment of data structures, and one of them is a Set. Recently, I worked on an issue to implement `union`

, `intersect`

and `difference`

operations in sets for primitive types.

The next few sections will cover each operation’s objective, design considerations, and code implementation. The last section will cover the takeaways.

# Union — What does this operation do?

Method signature: `setA.union(setB)`

`union`

as the name indicates, it takes elements from two sets and combines them into one.

Set A — 1, 2, 3, 4.

Set B — 4, 5.

Union — 1, 2, 3, 4, 5.

# Union — Design considerations

**Size**— My first approach was to take elements from`Set A`

and add all the elements from`Set B`

to the resulting set. But, what if`Set B`

is larger than`Set A`

. Is it then optimal to create a set from`Set A`

first and then merge`Set B`

into the resulting set? In the case of union, we know that the elements from both sets are part of the final set; hence it is reasonable to start with the larger set and iterate over the smaller set to add them to the larger set.**Empty set(s)**— Does it matter that one of the two sets is empty?

If one of the two sets is empty, the expected result is the non-empty set. If both sets are empty, then the expected result is an empty set. This can be confirmed by writing useful unit tests.

# Union — Code implementation

🛈 Eclipse Collections has an existing API `withAll`

which allows you to add elements from one set to the other.

`default MutableIntSet union(IntSet set)`

{

if (this.size() > set.size())

{

return this.toSet().withAll(set);

}

else

{

return set.toSet().withAll(this);

}

}

Unit tests for union covering scenarios for equal-sized, unequal-sized, and empty sets.

private void assertUnion(MutableIntSet set1, MutableIntSet set2, MutableIntSet expected)

{

MutableIntSet actual = set1.union(set2);

Assert.assertEquals(expected, actual);

}@Test

public void union()

{

this.assertUnion(

this.newWith(1, 2, 3),

this.newWith(3, 4, 5),

this.newWith(1, 2, 3, 4, 5));

this.assertUnion(

this.newWith(1, 2, 3, 6),

this.newWith(3, 4, 5),

this.newWith(1, 2, 3, 4, 5, 6));

this.assertUnion(

this.newWith(1, 2, 3),

this.newWith(3, 4, 5, 6),

this.newWith(1, 2, 3, 4, 5, 6));this.assertUnion(

this.newWith(),

this.newWith(),

this.newWith());

this.assertUnion(

this.newWith(),

this.newWith(3, 4, 5),

this.newWith(3, 4, 5));

this.assertUnion(

this.newWith(1, 2, 3),

this.newWith(),

this.newWith(1, 2, 3));

}

# Intersect — What does this operation do?

Method signature: `setA.intersect(setB)`

Intersect takes two elements from two sets and only retains the common elements in the resulting set.

Set A — 1, 2, 3, 4.

Set B — 3, 4, 5, 6.

Intersect — 3, 4.

# Intersect — Design considerations

**Size**— This matters in the case of`intersect`

too, but it works differently than the case above. Since we are only interested in the common elements between the two sets, the smaller set between them will be our starting point. For the sake of this argument, consider a case of million items in`Set A`

and three items in`Set B`

and we are interested in the common elements. It would make sense to start with the smaller set as we know the result set will contain only three elements*at most*.**Empty set (s)**— In this case, if either or both sets are empty, the resulting set will be empty. This can be confirmed quickly by adding unit tests.

# Intersect — Code Implementation

🛈 Eclipse Collections has an existing API that allows us to `select`

all elements that evaluate true for the given `Predicate.`

`default MutableIntSet intersect(IntSet set)`

{

if (this.size() < set.size())

{

return this.select(set::contains);

}

else

{

return set.select(this::contains, this.newEmpty());

}

}

Unit tests for intersect covering scenarios for equal-sized, unequal-sized, and empty sets.

# Difference — What does this operation do?

Method signature: `setA.difference(setB)`

Difference takes elements that are unique to `Set A`

only and not `Set B`

.

Set A — 1, 2, 3, 4.

Set B — 3, 4, 5, 6.

Difference— 1, 2.

# Difference — Design considerations

As you may have guessed, the size of these two sets doesn't matter, as our starting point will always be `Set A`

. If` Set A`

is empty, then the resulting set will be empty, and if `Set B`

is empty, then the resulting set will be `Set A`

.

# Difference — Code implementation

🛈 Eclipse Collections has an existing API that allows us to `reject`

which returns all elements that evaluate false for the `Predicate.`

In other words, it is the opposite of `select`

.

`default MutableIntSet difference(IntSet set)`

{

return this.reject(set::contains);

}

Unit tests for difference covering scenarios for equal-sized, unequal-sized, and empty sets.

# Takeaways

**Naming convention —**Naming is of utmost significance when it comes to code readability. In Eclipse Collections, we have chosen to name these operations as`union`

,`intersect`

and`difference`

to effectively communicate the mathematical set operations.**Write Unit Tests**— This point is worth reiterating. I started writing unit tests to confirm my understanding of how these new APIs should behave. Once I had the first implementation ready, the next step was to verify that the unit tests were still passing after adding the API’s optimization logic.**Think Optimization**— In my first implementation, the new APIs worked as expected. To further improve the code, I added optimization improvements to the algorithm, such as a simple size check, and other minor code refactoring.

*I am a committer for the **Eclipse Collections** OSS project at Eclipse Foundation. **Eclipse Collections** is open for contributions.*