Pre-allocating an SVector and filling it with values?

Hey guys

I think I have misunderstood why SVectors are good to use, so I would like to be corrected if this is a bad use case. Suppose I have some code like this:

function readBi4Time(Bi4Files::Array{String,1}=_dirFiles())
    nBi4     = size(Bi4Files)[1]

    j  = Array{Float64,1}(undef,nBi4)

    ParticleString = ["TimeStep"]
    for i = 1:nBi4
        ft = open(Bi4Files[i],read=true)
        readuntil(ft,ParticleString[1])
        read(ft,Int32)
        j[i] = read(ft,Float64)
        close(ft)
    end
    return j
end

The only thing this code does is fill the array j with values read from some files. If I wanted to do this by pre-allocating an SVector, how would I go about it, and would it even make sense?

Kind regards

You want to pre-allocate a static vector in place of j? It wouldn’t make sense. If you don’t know how long your array is going to be and you need to mutate it, you’ve violated the two key assumptions of SVector. Besides, unless you’re heavily reusing the resulting vector, the cost of IO is going to overwhelmingly dominate any SVector speedup.

If you really need a static array, you could use something like this:

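# Read the time stamp from a single file
function readBi4Time(Bi4File)
    ParticleString = "TimeStep"
    ft = open(Bi4File,read=true)
    readuntil(ft,ParticleString)   # skip everything up to and including the marker
    read(ft,Int32)                 # skip the Int32 that follows the marker
    j_i = read(ft,Float64)
    close(ft)
    return j_i
end

# Bi4Files must already exist in global scope when this line is evaluated
j = @SVector [readBi4Time(filestr) for filestr in Bi4Files]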

SVector and MVector are good in very tight inner loops, where you know the size at compile time (i.e. it is part of a type). If you want to construct an SVector piecewise, then you need to create an MVector instead, and then build an SVector from it:

julia> using StaticArrays, BenchmarkTools

julia> function mv(arr::Vector{T}, ::Val{N}) where {N,T}
           m = MVector{N,T}(undef)
           for i = 1:N
               @inbounds m[i] = arr[i]
           end
           return SVector(m)
       end

julia> r=rand(8);

julia> @btime mv($r, Val(5))
  2.759 ns (0 allocations: 0 bytes)
5-element SArray{Tuple{5},Float64,1,5} with indices SOneTo(5):
 0.47602648124187863
[...]

Note that the allocation for the short-lived intermediate MVector was elided by the compiler. You need to write your code such that it is blindingly obvious that no reference to the MVector will escape under any circumstances.

Your code does not look like it meets the requirements for good static array performance.

Sorry, I must not have been clear enough. I do know the length of the vector, and I don’t need to mutate it since it is raw data I am reading.

Thanks for giving me something to continue with.

Kind regards

Okay, I see, thanks for pointing out the relation between M and S vectors. Could you try rephrasing “escape under any circumstances”? I am a bit new to this, but fascinated that it is possible to get 0 allocations in optimal cases.

Kind regards

The compiler tries to figure out how objects are used. If julia/llvm understands all uses between allocation and death of an object, then there is no need to allocate the object at all: Instead all uses can be unpacked.

If you e.g. put a reference to the object (the MVector) into an array, or return it, or maybe call a non-inlined function, or your code is too complicated for the rudimentary automated proof system, then the compiler takes the safe route of just doing what you wrote: Allocate the MVector. If your code is simple enough, then the allocation can get elided.

Maybe-calling a non-inlined function is easy to do by mistake: For example, if there are no @inbounds annotations and the automated proof system fails to prove inbounds-ness, then there is a possible error path. The error path generates an error message. The code that generates the error message might not be inlined, the compiler doesn’t understand what this code does, and hence cannot prove that this code won’t write a reference to the MVector into some other global object.
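To make the idea of escaping more concrete, here is a minimal sketch comparing two functions that fill an MVector in the same way. The names `local_copy` and `escaping` are hypothetical and only for illustration. In the first, the MVector never leaves the function; in the second, it is returned to the caller, so a reference to it escapes. Whether the allocation is actually elided depends on the Julia and StaticArrays versions, so measure it on your own setup rather than relying on a particular number.

```julia
using StaticArrays

# The MVector is only used locally; the caller receives an immutable SVector copy.
# This is the pattern where the compiler has a chance to elide the allocation.
function local_copy(x::Float64)
    m = MVector{4,Float64}(undef)
    for i in 1:4
        @inbounds m[i] = x * i
    end
    return SVector(m)
end

# The MVector itself is returned, so a reference to it escapes the function.
function escaping(x::Float64)
    m = MVector{4,Float64}(undef)
    for i in 1:4
        @inbounds m[i] = x * i
    end
    return m
end

# Call each function once first so that compilation is not measured.
local_copy(1.0); escaping(1.0)
println("local_copy: ", @allocated(local_copy(1.0)), " bytes")
println("escaping:   ", @allocated(escaping(1.0)), " bytes")
```

Both functions compute the same values; the only difference is whether the mutable object is visible outside the function.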

For static vectors, both the length of the vector and the values of all the elements must be known at compile time. While you may know the length of the vector, the compiler does not: you’re determining it at runtime with size(Bi4Files)[1]. If you’re preallocating and then assigning values to a vector, you’re mutating it. My workaround lets the compiler evaluate the length of the input array in global scope.

Okay, I see that point. Thanks for explaining, and for showing me another way to do a loop. I fixed some small mistakes with parentheses/indices in case someone wants to use it in the future:

function readBi4Time(Bi4Files)
   ParticleString = "TimeStep"
   ft = open(Bi4Files,read=true)
   readuntil(ft,ParticleString)
   read(ft,Int32)
   j_i = read(ft,Float64)
   close(ft)
   return j_i
end

j = @SVector [readBi4Time(filestr) for filestr in Bi4Files]

Thanks, now it makes much more sense! Will try to keep that in mind when I want to make more efficient code, very exciting to learn about.

That’s not true. The length of the vector must be known at compile time, but the values of the elements must only be known when the vector is constructed. For example:

f(x,y) = SVector(x,y,x+y)

constructs an SVector with values determined at runtime when f is invoked.
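Applied to the original question, the same idea might look like the sketch below: the length is passed as a `Val` so that it is part of a type, while the element values are read from files at runtime. The names `read_time` and `read_times` are hypothetical, and the temporary files only imitate the layout assumed in the thread (a "TimeStep" marker followed by an Int32 and a Float64); real Bi4 files may differ.

```julia
using StaticArrays

# Read the Float64 that follows the "TimeStep" marker and one Int32.
function read_time(path::AbstractString)
    open(path, "r") do io
        readuntil(io, "TimeStep")   # skip up to and including the marker
        read(io, Int32)             # skip the Int32 after the marker
        read(io, Float64)           # the value we want
    end
end

# N is known to the compiler through Val{N}; the values are only known at runtime.
read_times(files, ::Val{N}) where {N} =
    SVector{N,Float64}(ntuple(i -> read_time(files[i]), Val(N)))

# Create three small stand-in files for demonstration purposes.
files = map(1:3) do k
    path, io = mktemp()
    write(io, "header", "TimeStep", Int32(k), 0.5 * k)
    close(io)
    path
end

times = read_times(files, Val(3))
println(times)

foreach(rm, files)   # clean up the temporary files
```

As noted earlier in the thread, the file IO will most likely dominate the runtime, so this only pays off if the resulting SVector is reused heavily afterwards.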
