Are you measuring with something like:
julia> const v = rand(10_000);
julia> const p = zeros(Int, 10_000);
julia> using BenchmarkTools
julia> @btime sortperm!($p, $v);
666.553 μs (1 allocation: 16 bytes)
(Use a bigger vector if 10_000 elements is too small.)
?
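If it helps, here is a rough sketch of the same measurement repeated over a few vector lengths. The helper name `time_sortperm` and the chosen sizes are just my assumptions for illustration, not anything from your setup; `@belapsed` is the BenchmarkTools macro that returns the minimum time in seconds instead of printing it.

```julia
using BenchmarkTools

# Hypothetical helper (name is an assumption): time sortperm! for a vector of length n.
function time_sortperm(n)
    v = rand(n)          # random input data
    p = zeros(Int, n)    # preallocated permutation buffer
    # `$` interpolates the values into the benchmark, as in the @btime call above.
    return @belapsed sortperm!($p, $v)
end

# Sizes are arbitrary; adjust them to whatever matches your real workload.
for n in (10_000, 100_000, 1_000_000)
    println("n = ", n, ": ", time_sortperm(n), " s")
end
```

Since `sortperm!` only writes into `p` and leaves `v` untouched, every benchmark sample sorts the same unsorted input, so no per-sample setup should be needed.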