Is there a clean one-liner to subset a vector of vectors?

I have something like this:

julia> A = [[1,[2,3]], [4,[5,6]], [7,[8,9]]]
3-element Vector{Vector{Any}}:
 [1, [2, 3]]
 [4, [5, 6]]
 [7, [8, 9]]

julia> B = [A[i][2][1] for i in 1:length(A)]
3-element Vector{Int64}:
 2
 5
 8

julia> C = [A[i][2][2] for i in 1:length(A)]
3-element Vector{Int64}:
 3
 6
 9

Is there a cleaner way to extract the B and C vectors than a loop or writing each one out by hand? I.e., my code looks like this:

B = [A[i][2][1] for i in 1:length(A)]
C = [A[i][2][2] for i in 1:length(A)]
D = ...

I really did read a bunch of questions on similar problems but didn’t manage to understand them. I was about to put the comprehension in a function and then loop over that, but then thought there has to be a way I am missing.

Do you have a meaningful name for what that second element represents? Or better yet, what its first/last components are? Use those names as functions:

frombulator(x) = x[2][1]
proznicator(x) = x[2][2]

Then your B and C are broadcasts:

B = frombulator.(A)
C = proznicator.(A)

Of course this works best if you have good meaningful names you can give these operations, and that’s often the hardest part of most programming.


For this case you could try composing first and last then broadcasting.

julia> B = (first ∘ last).(A)
3-element Vector{Int64}:
 2
 5
 8

julia> C = (last ∘ last).(A)
3-element Vector{Int64}:
 3
 6
 9

To generalize we may need a helper.

julia> nth(n) = x -> x[n]
nth (generic function with 1 method)

julia> (nth(1) ∘ nth(2)).(A)
3-element Vector{Int64}:
 2
 5
 8

julia> (nth(2) ∘ nth(2)).(A)
3-element Vector{Int64}:
 3
 6
 9

Also, rather than 1:length(A), consider using eachindex(A).
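
A minimal sketch of that suggestion, reusing the A from the question. Iterating over the elements directly (the second comprehension) is my own addition rather than something proposed above, but it avoids indices altogether:

```julia
A = [[1, [2, 3]], [4, [5, 6]], [7, [8, 9]]]

# eachindex(A) yields the valid indices of A, whatever its indexing scheme is
B = [A[i][2][1] for i in eachindex(A)]

# Iterating over the elements themselves needs no index at all
C = [a[2][2] for a in A]
```

Both forms keep working if A is later swapped for an array type whose indices do not start at 1.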


Another option to add to the pile:

getij(i,j) = x -> getindex.(getindex.(x,i),j)
B = getij(2,1)(A)
C = getij(2,2)(A)
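
If the nesting gets deeper than two levels, the same idea might be generalized by folding getindex over a path of indices. This is only a sketch: getpath is a hypothetical helper name, not something from Base or from the posts above.

```julia
A = [[1, [2, 3]], [4, [5, 6]], [7, [8, 9]]]

# getpath (hypothetical name) returns a function that indexes step by step:
# getpath(2, 1)(x) is the same as x[2][1]
getpath(path...) = x -> foldl(getindex, path; init = x)

B = getpath(2, 1).(A)
C = getpath(2, 2).(A)

# Several columns can be pulled out at once by destructuring a generator
B2, C2 = (getpath(2, j).(A) for j in 1:2)
```

The broadcast dot applies the returned function to each element of A, so no explicit loop is needed.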

A neat way to refer to nested parts of a data structure in general is to use optics. Here they fit perfectly:

julia> using AccessorsExtra

julia> getall(A, @o _[∗][2][1])
3-element Vector{Int64}:
 2
 5
 8

The concept can take more effort to grasp at first, but it becomes very convenient after that.

Btw, if you care about performance, it could be useful to rethink the data structure. The inner vectors of your A array have element type Any, so the compiler cannot infer what indexing into them returns; notice the Any in 3-element Vector{Vector{Any}}.
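
As one possible illustration, assuming each element always has the shape "one integer plus a pair of integers", nested tuples give a fully concrete element type. The name A2 and the tuple layout are assumptions made for this sketch, not something from the question:

```julia
A = [[1, [2, 3]], [4, [5, 6]], [7, [8, 9]]]
eltype(eltype(A))            # Any: the inner vectors mix an Int and a Vector

# Hypothetical restructuring: every element is a tuple (x, (b, c))
A2 = [(1, (2, 3)), (4, (5, 6)), (7, (8, 9))]
isconcretetype(eltype(A2))   # true: the full shape is encoded in the type

# The same accessors still work, since first and last accept tuples
B = (first ∘ last).(A2)
C = (last ∘ last).(A2)
```

A small struct with named fields would serve the same purpose and also provides the meaningful names suggested earlier in the thread.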


Thanks all for the diversity of ideas, it has been very helpful.

Another option that is usually one of the most readable/logical:

using TensorCast
@cast B[i] := A[i][2][1]
@cast C[i] := A[i][2][2]