Reshape Singleton Array to Scalar

Hey everyone,

I have a vector, x, whose elements are either vectors or scalars, and a flat vector of scalars, x_proposed. Both contain the same total number of scalar values:

x = [ [.3, .4, .3 ], [.5, .4, .1], 1., 2., [3., 4.] ]
x_proposed = [ 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.]

Now I would like to reshape x_proposed so that it has the same form as x. I managed to build a function that (almost) does the job:

function resize(θ, θ_proposed)
    result = eltype(θ)[]
    counter = 1
    len = length.(θ)
    for i in eachindex(len)
        # Range indexing: this always yields a vector, even when len[i] == 1
        push!(result, θ_proposed[counter:(counter + len[i] - 1)])
        counter += len[i]
    end
    result
end

# Works, but every element of the result is a vector
test = resize(x, x_proposed)

The problem is that all elements of test are now vectors themselves, even the singletons. Question: is there a fast way to do this? It sounds like a super trivial question, but I have struggled to get it working and have not found any answers on Google. The main problem is that

θ_proposed[(counter):(counter+len[i]-1)] 

always indexes with a range, so it always returns an array, even when the range covers a single element…
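A minimal sketch of the difference, using a made-up vector v (not part of the actual code), and assuming Julia 1.4 or later for only:

```julia
v = [5.0, 6.0, 7.0]

slice = v[2:2]         # range index: a 1-element Vector{Float64}
scalar = v[2]          # integer index: the Float64 itself
unwrapped = only(slice)  # extracts the single element, throws if there is not exactly one
```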

Edit:
I can make an intermediate function:

function make_scalar(x)
    length(x) > 1 ? x : x[1]    
end

which basically does what I want to do in the double loop section here:

make_scalar( invlink(distribution[2][1], θ_proposed_reshaped[4] ) )

However, I suspect there are faster, and definitely cleaner, ways to handle this?

First of all, you should realize that operating on an array like your x, in which some elements are vectors and some are scalars, is not going to be super efficient in any case because the elements of x are abstractly typed — that is, x is effectively an array of pointers to “boxes” with type tags, and the type of each element must be checked at runtime before dynamically dispatching to operations like length. So if you are in a performance-critical situation you should probably re-think your data structure.
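To make the point concrete, here is a small sketch that compares the element type of the mixed array with that of a hypothetical alternative, y, in which every entry is a vector (y is only an illustration, not a recommendation for this particular problem):

```julia
# Mixed vectors and scalars: the only common element type is Any
x = [[0.3, 0.4, 0.3], [0.5, 0.4, 0.1], 1.0, 2.0, [3.0, 4.0]]
mixed_eltype = eltype(x)
mixed_is_concrete = isconcretetype(mixed_eltype)

# Hypothetical alternative: scalars stored as 1-element vectors
y = [[0.3, 0.4, 0.3], [0.5, 0.4, 0.1], [1.0], [2.0], [3.0, 4.0]]
uniform_eltype = eltype(y)
uniform_is_concrete = isconcretetype(uniform_eltype)
```

With a concrete element type such as Vector{Float64}, calls like length can be resolved at compile time instead of being dispatched dynamically for each element.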

That being said, here is how I would tend to write the function you requested:

function rechunk(θ, θ_proposed)
    result = sizehint!(similar(θ, 0), length(θ))
    i = firstindex(θ_proposed)
    for x in θ
        len = length(x)
        push!(result, x isa AbstractVector ? θ_proposed[i:i+len-1] : θ_proposed[i])
        i += len
    end
    return result
end

Note that going in the other direction is much easier:

reduce(vcat, x)
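
As a quick round-trip check, the sketch below flattens x and then rebuilds its shape. The rechunk definition is repeated from above only so the snippet runs on its own; the names flat, roundtrip, and test are made up for the example.

```julia
function rechunk(θ, θ_proposed)
    result = sizehint!(similar(θ, 0), length(θ))
    i = firstindex(θ_proposed)
    for x in θ
        len = length(x)
        # Vectors get a slice, scalars get a single element
        push!(result, x isa AbstractVector ? θ_proposed[i:i+len-1] : θ_proposed[i])
        i += len
    end
    return result
end

x = [[0.3, 0.4, 0.3], [0.5, 0.4, 0.1], 1.0, 2.0, [3.0, 4.0]]
x_proposed = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14.0]

flat = reduce(vcat, x)          # flat vector of all 10 values
roundtrip = rechunk(x, flat)    # should reproduce the shape and values of x
test = rechunk(x, x_proposed)   # singletons come back as scalars, not 1-element vectors
```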

Echoing a bit of @stevengj’s comments, Julia is not a language where “all data has to be an array.” You might get better performance from something like this:

julia> using StaticArrays

julia> struct MyStruct{T}
           v1::SVector{3,T}
           v2::SVector{3,T}
           s1::T
           s2::T
           v3::SVector{2,T}
       end

julia> function MyStruct(v1::AbstractVector, v2::AbstractVector, s1::Number, s2::Number, v3::AbstractVector)
           T = promote_type(eltype(v1), eltype(v2), typeof(s1), typeof(s2), eltype(v3))
           MyStruct{T}(v1, v2, s1, s2, v3)
       end

I named the fields generically, but one of the advantages of building a struct is that you can use names that are meaningful. Also, if the lengths of the vectors might vary, you could use additional type parameters to encode the lengths, e.g.,

julia> struct MyStruct{T,N1,N2,N3}
           v1::SVector{N1,T}
           v2::SVector{N2,T}
           s1::T
           s2::T
           v3::SVector{N3,T}
       end

If you need to convert back and forth to a single-vector representation, you can create a custom reinterpret function for your type. For example (warning: untested):

function Base.reinterpret(::Type{MyStruct{T,N1,N2,N3}}, xv::SVector{N,T}) where {T,N1,N2,N3,N}
    N1+N2+N3+2 == N || throw(DimensionMismatch("dims don't add up"))
    MyStruct{T,N1,N2,N3}(xv[1:N1], xv[N1+1:N1+N2], xv[N1+N2+1], xv[N1+N2+2], xv[N1+N2+3:end])
end
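
For the opposite direction (struct to a single flat vector), something along these lines might work. This is only a sketch: flatten is a hypothetical helper name, not part of StaticArrays, and the struct is repeated from above so the snippet is self-contained.

```julia
using StaticArrays

struct MyStruct{T,N1,N2,N3}
    v1::SVector{N1,T}
    v2::SVector{N2,T}
    s1::T
    s2::T
    v3::SVector{N3,T}
end

# Hypothetical helper: concatenate all fields, in order, into one static vector
flatten(m::MyStruct) = vcat(m.v1, m.v2, SVector(m.s1, m.s2), m.v3)

m = MyStruct{Float64,3,3,2}(SVector(0.3, 0.4, 0.3), SVector(0.5, 0.4, 0.1),
                            1.0, 2.0, SVector(3.0, 4.0))
xv = flatten(m)
```

Because every length is encoded in the type, the result should itself be an SVector, so no heap allocation is needed for the flat representation.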

Thanks for your answer, that is what I was looking for! I agree that I have to think about the final form of the data structure.

Thanks also for your answer and approach! I will have a look at it 🙂