Better way to improve a function with splat operator used in the input

linwaytin · March 29, 2023, 2:31am

I have a multi-variate function like

function f(input::Vararg{Int, N}) where {N}
# some stuff
end

I need to call it using the form f(x1..., xi, xn...), where x1 and xn are vectors determined in the runtime.

The splat operator is convenient, but seems not performant. With some simple experiments, I noticed that it takes >100ns on my machine.

I’m thinking about writing a function like f(x1, xi, xn), but the lengths of x1 and xn have to be specialized in different situations, because they are not fixed in the problem. Is this possible? Or is there a better way to handle this issue?

Thanks.

tomerarnon · March 29, 2023, 3:21am

Why do you need to specialize on the lengths of x1 and xn?
I would reconsider whether there is a way that avoids specialization.

If it is truly unavoidable, but x1 and xn have only a small number of lengths that they can be, I would consider dispatching on a Tuple or a SVector (from StaticArrays) instead of (or in addition to) splatting.

linwaytin · March 29, 2023, 12:13pm

Maybe I misused that word. Do you mean I just need a function with signature like (Tuple, Number, Tuple)? Is the length of the tuples not important?

In addition, currently I use Vector to store the partial input. Why are Tuple and SVector better in this case?

Thank you.

stevengj · March 29, 2023, 12:28pm

The compiler knows the length of a Tuple or SVector, so it can compile specialized code — if x = (a,b,c), then f(x...) compiles to exactly the same thing as f(a,b,c).

The compiler doesn’t know the length of a Vector, so at runtime it needs to construct the function call and dynamically dispatch to the correct method. This is expensive.

If you always know the size of your container statically (at compile time), then use a Tuple, or a StaticArray if you want to perform vector operations. Otherwise, don’t splat — just pass your array directly, and declare your function as function f(input::AbstractVector{<:Integer}) or similar.

linwaytin · March 29, 2023, 8:46pm

Thank you for the explanation and suggestion.

I have a following struct which contains the sets of indices.

struct IndexSet
I::Vector{Vector{Vector{Int}}}
end 

I[1] == [[1], [2], [3]]
I[2] == [[1,2], [3,4]]
I[3] == [[1,2,3],[2,3,4],[3,4,5],[5,6,7]]

That is, the i-th element of I is a index set with each element of length i.
If I can rewrite this to something like Vector{Vector{NTuple{i, Int}}}, then I think the splat operator can be compiled to the most efficient code.

The problem is then if I can make I to contain Tuples with different lengths for different i. Could you please give some hints about if this is possible? Thanks.

linwaytin · March 29, 2023, 9:47pm

I just played with that idea. It seems Vector{Vector}} is a concrete type and I can put vectors of NTuple in it.

tomerarnon · March 31, 2023, 1:58am

It is a concrete type, but probably not in the way that you think. I.e. if v isa Vector{Vector}, then v[i][j] is not type stable, because the compiler doesn’t know what sort of things the internal vector can hold. So it’s concrete in the sense that it can exist, but it’s really no better than a Vector{Any}

What is IndexSet used for?

linwaytin · March 31, 2023, 6:14am

Thank you! So is Vector{Vector} like an array of pointers in c?

The IndexSet is used for storing a sequence of sets of indices. The element I[k] contains a bunch of indices of length k. You can check the example given in my previous post.

For a fixed k, I[k] has a stable type. Can I make the compiler use the information of k? Or is there a better way to store heterogeneous elements?

linwaytin · March 31, 2023, 6:25am

Maybe the simplest way is using Tuple for I, instead of Vector, so that I can have an I like Tuple{Vector{Tuple{Int64}}, Vector{Tuple{Int64, Int64}}}. Does this work?

The length of I can be determined in the compile time from some constant.

linwaytin · March 31, 2023, 2:11pm

I found Tuple{(Vector{NTuple{k, Int}} for k=1:N)...} is valid, for a given N. For example, if N=2 I get

Tuple{Vector{Tuple{Int64}}, Vector{Tuple{Int64, Int64}}}

However, if I put it in a struct, like

 struct IndexSetN{N}
     I::Tuple{(Vector{NTuple{k, Int}} for k=1:N)...}
     function IndexSetN{N}(I) where N
          new{N}(I)
     end
end

I got

ERROR: MethodError: no method matching (::Colon)(::Int64, ::TypeVar)
Closest candidates are:
  (::Colon)(::T, ::Any, ::T) where T<:Real at range.jl:41
  (::Colon)(::A, ::Any, ::C) where {A<:Real, C<:Real} at range.jl:10
  (::Colon)(::T, ::Any, ::T) where T at range.jl:40
  ...
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

Does anybody know how to make this work? Or is it just impossible to define a struct like that?

Thanks.

mikmoore · March 31, 2023, 2:52pm

This is not possible. But you could simply do

struct IndexSetN{T}
# or `IndexSet{N,T}` if you still want the `N` parameter for other purposes
  I::T
  # ...
end

and then pass your Tuple type directly as T. You can use the inner constructor to validate it, if necessary.

linwaytin · March 31, 2023, 9:03pm

Thank you. Eventually I have this

struct ISetTuple{IndexSet}
    I::IndexSet

    function ISetTuple{N}() where N
        IndexSet = Tuple{(Vector{NTuple{k, Int}} for k=1:N)...}
        I = Tuple(NTuple{k, Int}[] for k in 1:N)
        new{IndexSet}(I, N)
    end
end

Then I can call like ISetTuple{3}() to construct a object. Julia is very expressive!

Unfortunately, I just noticed that this is very unlikely to make the compiler know how to optimize the splat operator, which is my initial question, because which element of I is being used is only known in the runtime.
This problem is due to my poor design of the code.

Thanks for everyone here. I learned a lot!

uniment · April 1, 2023, 3:51am

Not quite. You can’t actually create an instance of a Vector{Vector}, even though isconcretetype. An array of pointer arrays is instead a Vector{Vector{Any}}. (Note that Vector{Vector} is equal to Vector{Vector{<:Any}}.)

Example:

julia> let x=[Any[]]
           @show typeof(x)
           @show isconcretetype(typeof(x))
           @show x isa Vector{Vector{Any}}
           @show x isa Vector{Vector}
           @show x isa Vector{<:Vector}
       end;
typeof(x) = Vector{Vector{Any}}
isconcretetype(typeof(x)) = true
x isa Vector{Vector{Any}} = true
x isa Vector{Vector} = false
x isa Vector{<:Vector} = true

Are the indices always sequential? in which case you could use unitranges 1:3, 2:4, etc.

linwaytin · April 1, 2023, 10:40am

Thanks for the example.

They are not sequential in general. Sorry my example is misleading.

stevengj · April 1, 2023, 2:50pm

Sure you can:

julia> v = Vector[ [1,2,3], ["a","b","c"] ]
2-element Vector{Vector}:
 [1, 2, 3]
 ["a", "b", "c"]

A Vector{Vector} is an array whose elements can be different types of Vector, e.g. Vector{Int} and Vector{String} in this example.

Under the hood, I think it’s stored in much the same way as Vector{Any} — an array of jl_value_t pointers, which point to a “boxed” value that includes both a type tag and the actual data (or a pointer thereto).

uniment · April 1, 2023, 8:34pm

Touché.

Topic		Replies	Views
Splatting, tuples, vectors General Usage question , splat	2	243	March 17, 2024
Evaluate function combining Float & Vector General Usage	8	453	August 8, 2021
Concerns about splatting General Usage	9	3495	October 25, 2017
Why is the splat operator slow? Performance	3	1358	March 15, 2022
Efficient eval, splat, eval? Performance question	2	356	April 30, 2021

Better way to improve a function with splat operator used in the input

Related topics