The slowest zipper

I have an array of 1000 points: an Array{Array{Float64,1}} of length 1000. To split the points into one vector of coordinates per dimension, my Python instinct is to unpack the zipped points. For example, in Julia,

points = [rand(2) for i=1:1000]
x, y = zip(points...)

but this takes forever. I have to close the terminal window to kill it. Am I doing something wrong?

For xy vectors I usually do xs = first.(points) and ys = last.(points). Or more generally getindex.(points, Ref(1)).
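Putting that suggestion together with the data from the question, a minimal sketch (the variable names are just for illustration):

```julia
# Sample data: 1000 random 2-D points, as in the original question.
points = [rand(2) for i in 1:1000]

# Broadcast `first`/`last` over the vector of points.
xs = first.(points)
ys = last.(points)

# Equivalently, broadcast `getindex`; wrapping the index in `Ref`
# marks it as a scalar so broadcasting iterates only over `points`.
xs2 = getindex.(points, Ref(1))
```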


Ok, that works quickly, thank you. So, for an n-dimensional point the line would be
X = [getindex.(points, Ref(n)) for n in 1:length(points[1])]
or something similar?

I haven’t used the getindex version much, but I think that would work. You could also do something like

X = [[points[i][j] for i in eachindex(points)] for j in eachindex(points[1])]

if it doesn’t work or is slow.
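As a quick sanity check, the two versions can be compared on sample data (a sketch, assuming 3-D points):

```julia
points = [rand(3) for i in 1:1000]

# getindex-broadcast version
X1 = [getindex.(points, Ref(j)) for j in 1:length(points[1])]

# nested-comprehension version
X2 = [[points[i][j] for i in eachindex(points)] for j in eachindex(points[1])]
```

Both produce one coordinate vector per dimension, in the same order.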

The problem here is that the ... (splatting) operator in zip(points...) means you are calling the zip function with 1000 separate arguments (one for each point). That’s not something the Julia compiler is well-optimized for, because it’s not something people usually want. It takes forever because the compiler is (unhelpfully) trying to generate a specialized implementation for zipping 1000 different arguments.
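To be clear, the splatted zip itself is semantically fine; at small arity it does exactly what the question wanted (a sketch with 3 points):

```julia
pts = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Splats only 3 arguments, so compilation is cheap at this size.
x, y = zip(pts...)
```

Here x collects the first coordinates and y the second; it is only at hundreds of arguments that specialization becomes pathologically slow.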

By the way, if your actual use case involves storing thousands of arrays of some small, fixed size, StaticArrays.jl (https://github.com/JuliaArrays/StaticArrays.jl) is a fantastic tool to make your code faster and your life easier.
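For instance, storing the points as SVector{2,Float64} keeps the same broadcasting style and, because the elements have a fixed bits layout, also allows a zero-copy reinterpretation into a matrix (a sketch, assuming the StaticArrays package is installed):

```julia
using StaticArrays

# 1000 statically sized 2-D points
points = [SVector(rand(), rand()) for i in 1:1000]

# Same broadcasting idiom as with plain vectors
xs = first.(points)
ys = last.(points)

# Fixed-size isbits elements can be reinterpreted into a 2×1000 matrix view
M = reshape(reinterpret(Float64, points), 2, :)
```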


Ok, thanks @rdeits, that makes sense :+1: