I have an array A of tuples, of type Array{Tuple{Float64,Float64},1}, but I am not able to make two arrays out of it like this:
I1 = A[:][1]
I2 = A[:][2]
even though I can access the individual elements with A[i][j].
Please give a minimal working example in the future (in this case, a small example of your array A). Also, please quote your code with backticks.
One solution is
I1 = first.(A)
I2 = last.(A)
Another is
I1 = [x[1] for x in A]
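For a concrete case, here is a minimal sketch of both approaches on a small array. The values in A are made up for illustration; any vector of two-element tuples should behave the same way.

```julia
# Hypothetical example data: a vector of (Float64, Float64) tuples.
A = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]

# Broadcast `first` and `last` over the elements of A.
I1 = first.(A)
I2 = last.(A)

# Equivalent comprehensions; x[2] generalizes to any position in the tuple.
J1 = [x[1] for x in A]
J2 = [x[2] for x in A]

println(I1 == J1)   # true
println(I2 == J2)   # true
```

The comprehension form is the one to reach for when the tuples have more than two entries, since first and last only reach the two ends.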
Dear David,
Thank you for providing a solution. I am an experienced C and C++ programmer but a newcomer to Julia. Is there a deeper reason why A[:][1] does not work to make an array with the first element of each tuple?
Kind regards,
Jacques
For a vector, A[:] has the same contents as A, regardless of what is inside: it simply indexes out all elements.
julia> A=[(1, 2), (2, 3), (3, 4)]
3-element Array{Tuple{Int64,Int64},1}:
(1, 2)
(2, 3)
(3, 4)
julia> A[:]
3-element Array{Tuple{Int64,Int64},1}:
(1, 2)
(2, 3)
(3, 4)
so A[:][1] is the same as A[1].
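A short check of this, as a sketch on the same example vector: the colon selects every element, so the second index is applied to a vector that has the same contents as A.

```julia
A = [(1, 2), (2, 3), (3, 4)]

B = A[:]            # indexing with a bare colon returns all elements of the vector

println(B == A)     # true: same contents
println(B === A)    # false: A[:] is a new array (a copy), not A itself
println(A[:][1])    # (1, 2), the first tuple, exactly like A[1]
```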
If you have a two-dimensional array you can extract columns with colon indexing though.
julia> A = [1 2;2 3;3 4]
3×2 Array{Int64,2}:
1 2
2 3
3 4
julia> A[:, 1]
3-element Array{Int64,1}:
1
2
3
Dear Gunnar,
Thanks for the additional info. What confused me is that, as you say, colon indexing can be used to extract columns from a two-dimensional array, A[:,1], so naively I thought/hoped that A[:][1] would do something similar for an array of tuples. I still have problems really grasping the interplay (or non-interplay) of tuples and arrays. The original cause of my problem is that a function returning multiple values automatically returns them in a tuple. If they were returned in an array, none of this would happen.
Thanks,
Jacques
A missing piece of information might be that repeated brackets aren’t a special syntax but just repeated indexing, as you can see by quoting the expression.
julia> :(A[i][j])
:((A[i])[j])
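Spelled out as function calls, the same thing looks like the sketch below. It only uses getindex, the function that bracket indexing calls.

```julia
A = [(1, 2), (2, 3), (3, 4)]

# A[3][1] is parsed as (A[3])[1], i.e. two nested getindex calls.
x = A[3][1]
y = getindex(getindex(A, 3), 1)

println(x)        # 3
println(x == y)   # true
```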
If you have control of the function you can choose to package your multiple values in a vector before returning them.
julia> f(i) = [i, i + 1]
f (generic function with 1 method)
julia> A = f.([1, 2, 3])
3-element Array{Array{Int64,1},1}:
[1, 2]
[2, 3]
[3, 4]
This doesn’t in itself make it any easier to slice out a part than with a vector of tuples, but it makes it a little more convenient to repackage the data into a matrix, which can be sliced.
julia> B = reduce(hcat, A)
2×3 Array{Int64,2}:
1 2 3
2 3 4
julia> B[1, :]
3-element Array{Int64,1}:
1
2
3
However, you are probably better off with some kind of dot vectorization or comprehension solution like David proposed.
Returning an array would be much more expensive than returning a tuple, as an array has a significant amount of overhead compared to a tuple (which can be exactly zero-overhead in many circumstances), which is why it’s not done. Also, I’m not convinced that it would help in your case anyway, as a vector-of-vectors is still a completely different structure from a 2D matrix, and something like A[:][1] would not work even if each element of A were a vector.
Maybe you can describe more about what the actual problem you’re trying to solve is?
Dear Robin,
Thanks for taking the time to reply to this. I understand your point about an array of arrays not being a matrix. The actual problem is that I have a function f(s) computing various measurements for a specific parameter value s. I am calling this function on an array of parameter values using f.(s_array). What I want to end up with is several arrays of measurements, which will effectively hold m1(s_array), m2(s_array), … where m1, m2 are my measurements.
What I could do is define a struct, I guess, but I wanted to avoid this complication. Or, as Gunnar just wrote, I could also put the measurements in an array in the function and then return the array. In the latter case I will end up with an array of arrays, while in the former case I would end up with an array of structs… not sure if that is any better. At the end of the day I want to plot each measurement versus the parameter values. Seems like a very standard problem.
Cheers,
Jacques
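One lightweight middle ground between a plain tuple and a full struct might be a named tuple. The following is only a sketch: the function body and the names m1 and m2 are made up for illustration, standing in for whatever the real f(s) computes.

```julia
# Hypothetical measurement function: returns a named tuple instead of a plain tuple.
# The names m1 and m2 and the formulas are invented for this example.
f(s) = (m1 = s^2, m2 = exp(im * s))

s_array = [0.0, 0.5, 1.0]
results = f.(s_array)          # a vector of named tuples

# Pull out one measurement per parameter value, by name.
m1 = [r.m1 for r in results]   # real-valued measurements
m2 = [r.m2 for r in results]   # complex-valued measurements

println(m1)   # [0.0, 0.25, 1.0]
```

Each resulting vector lines up with s_array, so it can be plotted against the parameter values directly.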
I see, thanks. One easy solution is what @dpsanders suggested above and using a comprehension:
julia> function f(x)
x, 2x
end
f (generic function with 1 method)
julia> xs = [1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> ys = f.(xs)
3-element Array{Tuple{Int64,Int64},1}:
(1, 2)
(2, 4)
(3, 6)
julia> [y[1] for y in ys]
3-element Array{Int64,1}:
1
2
3
julia> [y[2] for y in ys]
3-element Array{Int64,1}:
2
4
6
This should be pretty efficient and easy to generalize. If you don’t want to use the comprehension syntax, you can also broadcast the getindex function (that’s the function that’s called when you do foo[bar]):
julia> getindex.(ys, 1)
3-element Array{Int64,1}:
1
2
3
julia> getindex.(ys, 2)
3-element Array{Int64,1}:
2
4
6
If you want to get really fancy, there’s one more trick you can try, as long as all of your returned values are of the same type and that type is isbits (so, a primitive immutable type like Float64, or a struct or tuple made up of primitive immutables). If all of your parameter values are Float64 or Int, for example, then this will work. All we have to do is rely on the fact that isbits types are stored inline in arrays, so we can freely transform an array-of-tuples into a matrix with reinterpret():
julia> r = reinterpret(Int, ys, (2, 3))
2×3 Array{Int64,2}:
1 2 3
2 4 6
julia> r[1, :]
3-element Array{Int64,1}:
1
2
3
julia> r[2, :]
3-element Array{Int64,1}:
2
4
6
Edit: be warned that it’s easy to forget the column-major layout order and get the reinterpret size wrong. I had to edit this post because I got it wrong myself!
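To make the layout pitfall concrete, here is a small sketch. It assumes a Julia 1.x session, so it uses the two-argument reinterpret followed by reshape rather than the call shown above.

```julia
ys = [(1, 2), (2, 4), (3, 6)]

# The tuples are stored one after another, so the flat memory order is:
flat = collect(reinterpret(Int, ys))   # [1, 2, 2, 4, 3, 6]

# Julia fills matrices column by column, so each tuple must become a column.
right = reshape(flat, 2, 3)
println(right[1, :])   # [1, 2, 3], the first components
println(right[2, :])   # [2, 4, 6], the second components

# Asking for 3×2 instead silently scrambles the data.
wrong = reshape(flat, 3, 2)
println(wrong[:, 1])   # [1, 2, 2], not the first components
```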
Which of these is appropriate will depend on your particular needs, and I suggest using BenchmarkTools.jl (JuliaCI/BenchmarkTools.jl on GitHub) to determine which performs best for your case.
Dear Robin,
Thanks for the detailed explanation. To me it seems that the getindex function gets really close to what I was originally looking for!
The reinterpret won’t work in this case because the return values are of different types (floats and complex).
Cheers,
Jacques
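For return values of mixed types, broadcasting getindex can be wrapped in a small helper. This is a sketch only: unzip is a hypothetical name (it is not a Base function), and f below is a made-up stand-in that returns one float and one complex number.

```julia
# Hypothetical helper (not part of Base): split a vector of equal-length tuples
# into one vector per tuple position.
unzip(v) = ntuple(i -> getindex.(v, i), length(first(v)))

# Made-up measurement function returning a Float64 and a complex number.
f(s) = (s^2, exp(im * s))

s_array = [0.0, 0.5, 1.0]
m1, m2 = unzip(f.(s_array))

println(m1)           # [0.0, 0.25, 1.0]
println(eltype(m1))   # Float64
```

Each output vector keeps its own element type, so the floats and the complex values do not get mixed into a common container type.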
This is extremely helpful. Alternatively, is there an efficient way to do this using view? I put together the following benchmark for comprehension vs. view vs. getindex and was surprised by the results. I guess I expected view to be faster for large arrays. Am I missing something, or is there a good reason for view being slow? Why are there so many allocations?
using BenchmarkTools
N = 10
data = [rand(3) for i ∈ 1:N]
@btime [q[2] for q in $data]
@btime getindex.($data, 2)
@btime view.($data,2)
returns
57.476 ns (2 allocations: 176 bytes)
48.976 ns (1 allocation: 160 bytes)
108.226 ns (11 allocations: 640 bytes)
For N = 1000000, I get:
7.812 ms (3 allocations: 7.63 MiB)
7.815 ms (2 allocations: 7.63 MiB)
35.088 ms (1000002 allocations: 53.41 MiB)
Don’t worry about the vector-of-tuples part of this, just look at what each scalar operation you’re doing is. getindex.(data, 2) calls getindex(d, 2) for each d in data, while view.(data, 2) calls view(d, 2) for each d.
In your case, d is a 3-element vector, so getindex(d, 2) returns the second element, which is just a Float64. That’s an extremely cheap operation. On the other hand, view(d, 2) actually constructs a view object (a SubArray) representing the second element of d. While a view may be cheaper to allocate than a new Array (that’s the whole point, after all), it’s not cheaper to allocate than just a single Float64.
On the other hand, if you did something like:
data = [rand(10000) for i in 1:N]
@btime getindex.($data, Ref(1:1000))
@btime view.($data, Ref(1:1000))
then you might see view coming out ahead, since it will avoid creating a 1000-element copy of each element of data.
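The difference between the two is easiest to see on a single vector. A minimal sketch, with made-up values:

```julia
d = collect(1.0:10.0)

c = d[1:3]         # getindex with a range allocates a new 3-element array
v = view(d, 1:3)   # a view refers to the memory of d, no copy of the data

d[1] = 100.0       # modify the parent array

println(c[1])      # 1.0: the copy is unaffected
println(v[1])      # 100.0: the view sees the change
```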
This makes sense. Correct me if I’m wrong, but in my usage of view.() I was essentially creating an array of N views rather than a single view into a length-N array.
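For comparison, a single view over all second components is possible if the data is stored as a matrix rather than a vector of vectors. The sketch below assumes such a repacking is acceptable; note that reduce(hcat, data) itself makes a copy.

```julia
N = 5
data = [rand(3) for i in 1:N]

# Pack the N three-element vectors into a 3×N matrix (one vector per column).
M = reduce(hcat, data)

# One view of the second row, instead of N separate views.
second = view(M, 2, :)

println(size(M))                        # (3, 5)
println(second == getindex.(data, 2))   # true
```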
Also broadcasting works fine:
julia> A = [(1, 2), (2, 3), (3, 4)];
julia> I1 = (x->x[1]).(A)
3-element Array{Int64,1}:
1
2
3
julia> I2 = (x->x[2]).(A)
3-element Array{Int64,1}:
2
3
4
It seems the API has changed in v1.5 (or earlier versions). reinterpret no longer supports the third argument, so we have to combine reinterpret and reshape to achieve the same purpose.
julia> r = reshape(reinterpret(Int, ys), (2, 3))
2×3 reshape(reinterpret(Int64, ::Array{Tuple{Int64,Int64},1}), 2, 3) with eltype Int64:
1 2 3
2 4 6
Here, reinterpret(Int, ys) first yields a 1D array.
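On Julia 1.6 or later there is also a form of reinterpret that takes reshape as its first argument and does both steps at once. A minimal sketch, assuming such a version is in use:

```julia
ys = [(1, 2), (2, 4), (3, 6)]

# Julia 1.6+: reinterpret(reshape, T, A) adds a leading dimension for the tuple entries.
r = reinterpret(reshape, Int, ys)

println(size(r))   # (2, 3)
println(r[1, :])   # [1, 2, 3]
println(r[2, :])   # [2, 4, 6]
```

Because the new dimension is always the leading one, there is no size argument to get wrong.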
You’re right–this changed in v1.0.
Late to the party but I like this construction:
julia> A = [(1, 2), (2, 3), (3, 4)]
3-element Array{Tuple{Int64,Int64},1}:
(1, 2)
(2, 3)
(3, 4)
julia> tmp = map(x -> getindex.(A, x), 1:2)
2-element Array{Array{Int64,1},1}:
[1, 2, 3]
[2, 3, 4]
julia> out = reduce(hcat, tmp)
3×2 Array{Int64,2}:
1 2
2 3
3 4
An array of tuples can be initialized from arrays as follows:
function f(x,y,i)
x[i], y[i]
end
U1=[1,2,3,4]
U2=[5,6,7,8]
f.(U1,U2,1...)
julia> function f(x,y,i)
x[i], y[i]
end
f (generic function with 1 method)
julia> U1=[1,2,3,4]
4-element Vector{Int64}:
1
2
3
4
julia> U2=[5,6,7,8]
4-element Vector{Int64}:
5
6
7
8
julia> f.(U1,U2,1...)
4-element Vector{Tuple{Int64, Int64}}:
(1, 5)
(2, 6)
(3, 7)
(4, 8)
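For what it's worth, the same result can probably be had without a helper function, by broadcasting tuple or collecting a zip. A short sketch on the same inputs:

```julia
U1 = [1, 2, 3, 4]
U2 = [5, 6, 7, 8]

T1 = tuple.(U1, U2)          # broadcast the tuple constructor elementwise
T2 = collect(zip(U1, U2))    # zip pairs elements lazily; collect materializes them

println(T1 == T2)            # true
println(T1[1])               # (1, 5)
```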