Unique! and count

Hi,
I'm trying to count how many times each unique value occurs in an array.
As an example, given the following array:
`data = ['a', 'b', 'a', 'c']`
I want to get `unique_array = ['a', 'b', 'c']` and `count_array = [2, 1, 1]`.
In Python I can do it like this: `unique_array, count_array = np.unique(data, return_counts=True)`
In Julia I can get the unique values like this: `unique_array = unique!(data)`
But to count, I use `count(i -> i == 'a', data)`. I wonder if there are other solutions for the case where I don't know the values in `data` ('a', 'b', 'c') in advance.

Something like `count_array = [count(==(x), data) for x in unique_array]` should work, though this loops over the data once per unique value, so if you have a lot of data to crunch it might be worth looking at something smarter.
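
For reference, here is that comprehension applied to the array from the question, as a minimal sketch. It uses the non-mutating `unique` rather than `unique!`, on the assumption that `data` should stay intact for counting (`unique!` removes the duplicates from `data` in place, so every count taken afterwards would be 1).

```julia
data = ['a', 'b', 'a', 'c']

# unique keeps the first occurrence of each value, in order of appearance
unique_array = unique(data)                               # ['a', 'b', 'c']

# one pass over data per unique value
count_array = [count(==(x), data) for x in unique_array]  # [2, 1, 1]
```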

Sounds like you want `countmap` in the StatsBase.jl package.
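
A minimal sketch of how that could look, assuming StatsBase.jl is installed. `countmap` returns a `Dict` mapping each value to its count, so the order of the keys is not guaranteed to match the order of first appearance in `data`.

```julia
using StatsBase  # assumed installed, e.g. via `] add StatsBase`

data = ['a', 'b', 'a', 'c']

cm = countmap(data)                 # Dict: value => number of occurrences

# keys and values of the same Dict iterate in matching order
unique_array = collect(keys(cm))
count_array  = collect(values(cm))
```

Looking up a single value is then just `cm['a']`.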

Or

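```
function uniquecount(data)
    unique_array = unique(data)
    # start every unique value at a count of zero
    counts = Dict(unique_array .=> 0)
    # single pass over the data
    for c in data
        counts[c] += 1
    end
    # keys and values come back in matching (Dict) order
    return collect(keys(counts)), collect(values(counts))
end
```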

For input data such as `data = rand('a':'z', 1000)`, StatsBase’s `countmap()` (including collecting keys and values) seems to be about 25% faster than the `count()` comprehension, and ~3x faster than `uniquecount()`.
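
A rough sketch of how such a comparison could be set up, assuming StatsBase.jl and BenchmarkTools.jl are installed. The helper names `via_countmap` and `via_comprehension` are made up for this example, and `uniquecount` is a copy of the function posted above with its results collected into arrays. Timings will vary with machine, Julia version, and input.

```julia
using StatsBase        # assumed installed: provides countmap
using BenchmarkTools   # assumed installed: provides @btime

# Copy of uniquecount from earlier in the thread
function uniquecount(data)
    unique_array = unique(data)
    counts = Dict(unique_array .=> 0)
    for c in data
        counts[c] += 1
    end
    return collect(keys(counts)), collect(values(counts))
end

# Hypothetical helper: countmap plus collecting keys and values
function via_countmap(data)
    cm = countmap(data)
    return collect(keys(cm)), collect(values(cm))
end

# Hypothetical helper: the count() comprehension
function via_comprehension(data)
    unique_array = unique(data)
    return unique_array, [count(==(x), data) for x in unique_array]
end

data = rand('a':'z', 1000)

# $data interpolates the array so global-variable access is not timed
@btime via_countmap($data);
@btime via_comprehension($data);
@btime uniquecount($data);
```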

Thanks, I got it