Yes, pretty much what @kevbonham wrote: some statistics, relying on the fact that it is highly unlikely to avoid clashes (birthday paradox), so the random IDs never use up all the available numbers.
I took N random UInt16s and looked at how many unique ones there are on average, plus the 5% to 95% quantile band (the central 90%). Then I calculated N back from the number of unique IDs the server would see.
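As a rough cross-check of the simulation, the same relationship can be written in closed form. This is only a sketch, assuming every ID is an independent, uniform draw from all 2^16 possible UInt16 values; the names `M`, `expected_unique` and `estimate_users` are made up for illustration and are not part of the code in this post.

```julia
# Assumption: IDs are independent uniform draws from M = 2^16 possible values.
const M = 2^16

# Expected number of distinct IDs after n draws:
# each value is missed with probability (1 - 1/M)^n.
expected_unique(n) = M * (1 - (1 - 1 / M)^n)

# Invert the expectation: estimate n from an observed count u < M of distinct IDs.
estimate_users(u) = log1p(-u / M) / log1p(-1 / M)
```

Plugging in the observed unique-ID counts, e.g. `estimate_users(65400)`, gives a point estimate only; the spread still has to come from the simulation.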
Code
using StatsBase
using Plots
uniquerands(i) = rand(UInt16, i) |> unique |> length
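# For each user count in rng, simulate 500 draws and summarize the number of unique IDs.
function collect_stats(rng)
    len = length(rng)
    means = zeros(len)
    low = zeros(len)
    up = zeros(len)
    Threads.@threads for i in 1:len
        ur = Int[]  # unique-ID counts for this user count, local to the iteration
        for _ in 1:500
            push!(ur, uniquerands(rng[i]))
        end
        means[i] = mean(ur)
        # 5% and 95% quantiles bound the central 90% band
        l, u = quantile(ur, (0.05, 0.95))
        low[i] = l
        up[i] = u
    end
    means, low, up
end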
i = 300_000:10_000:600_000 |> collect
s = collect_stats(i)
plot(i, s[1], ribbon = (s[1] .- s[2], s[3] .- s[1]), label = "mean with 90% quantile")
hline!([typemax(UInt16)], label = "typemax(UInt16)")
scatter!([405_000, 495_000], [65400, 65500], xerr = [10_000, 20_000], label = "estimate for unique users")
plot!(legend = :bottomright, ylims = (65_000, 65600), xlabel = "unique users", ylabel = "unique IDs")
