How to Remove Duplicates from a Collection in Julia
Overview
This section introduces how to remove and check for duplicates in collections in Julia. The unique() function, which eliminates duplicates, is algorithmically straightforward, but it is tedious to implement yourself, and a hand-written version may not be efficient. The allunique() function, which checks that a collection has no duplicate elements, is so easy to implement by hand that you may never have looked for it, so it is worth getting familiar with.
Code
unique()
julia> x = [3, 1, 4, 1, 5, 9, 2, 6, 5];
julia> y = unique(x)
7-element Vector{Int64}:
3
1
4
5
9
2
6
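To make concrete what "implementing it on your own" involves, here is a minimal sketch of an order-preserving de-duplication. The name my_unique is hypothetical and is not part of Base. The sketch assumes the elements can be hashed, and it is not meant to describe how unique() is actually implemented.

```julia
# Hypothetical helper (not part of Base): keep the first occurrence of each
# element, in the order the elements appear.
function my_unique(xs)
    seen = Set{eltype(xs)}()   # elements encountered so far
    out = eltype(xs)[]         # result, in first-occurrence order
    for x in xs
        if !(x in seen)
            push!(seen, x)
            push!(out, x)
        end
    end
    return out
end

x = [3, 1, 4, 1, 5, 9, 2, 6, 5]
my_unique(x) == unique(x)   # true for this input
```

Even this short version has to manage a set and an output vector, which is why the built-in unique() is the better choice in practice.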
allunique()
In fact, the purpose of this post is to introduce allunique(). A common-sense way to check whether a collection has duplicate elements is to apply unique() and see whether the number of elements has decreased, that is, to test length(unique(x)) == length(x).
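As a small illustration, the length comparison can be wrapped in a one-line function. The name no_duplicates_naive is hypothetical and is used here only for demonstration.

```julia
# Hypothetical helper: the collection has no duplicates exactly when
# unique() removes nothing from it.
no_duplicates_naive(xs) = length(unique(xs)) == length(xs)

x = [3, 1, 4, 1, 5, 9, 2, 6, 5]
no_duplicates_naive(x)           # false: unique(x) has 7 elements, x has 9
no_duplicates_naive(unique(x))   # true
```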
This method is simple, but its simplicity should not be mistaken for efficiency. The unique() function has to look at every element of an array of length $n$ at least once, so its time complexity is at least $O(n)$, and it also builds a new collection that is discarded as soon as its length has been read. This can become a noticeable burden when code checks for duplicates frequently. allunique() offers a clear performance advantage: its implementation may vary with the array's length, and it can stop as soon as it finds a duplicate and return false immediately.
julia> allunique(x)
false
julia> allunique(y)
true
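The early-exit idea described above can be sketched as follows. This is only an illustration under the assumption that the elements are hashable. The name my_allunique is hypothetical, and the actual implementation of allunique() in Base may differ.

```julia
# Hypothetical helper illustrating early exit (not the Base implementation).
function my_allunique(xs)
    seen = Set{eltype(xs)}()
    for x in xs
        x in seen && return false   # stop at the first duplicate found
        push!(seen, x)
    end
    return true                     # every element was seen exactly once
end

x = [3, 1, 4, 1, 5, 9, 2, 6, 5]
my_allunique(x)           # false, decided at the fourth element (the second 1)
my_allunique(unique(x))   # true
```

Unlike the length comparison, this loop never builds the de-duplicated array, and for inputs with an early duplicate it inspects only a prefix of the collection.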
Complete Code
x = [3, 1, 4, 1, 5, 9, 2, 6, 5];
y = unique(x)
allunique(x)
allunique(y)
Environment
- OS: Windows
- julia: v1.9.0