How to define a function with an argument-dependent return type

I need to define a function that returns an N-dimensional array, where N depends on the function argument. However, Julia cannot determine this before compilation. Here is my approach:

# Build a delta tensor of order `tensororder`: 1.0 wherever all indices
# coincide, 0.0 elsewhere. `tensororder` is a runtime value here.
function DeltaTensor(total_dimension::Int64; tensororder=total_dimension*2)::Array{Float64,tensororder}
    shape::NTuple{tensororder, Int64} = Tuple(fill(2, tensororder))
    out::Array{Float64,tensororder} = zeros(Float64, shape)

    @inbounds @simd for i in CartesianIndices(shape)
        if all(isequal(i[j], i[1]) for j in 1:length(shape))
            out[i] = 1.0
        end
    end

    return out
end

This function DeltaTensor takes the parameter total_dimension and outputs an array of order tensororder: a delta tensor whose elements are all 0 except on the diagonal. But @code_warntype shows that the return type cannot be inferred:

@code_warntype DeltaTensor(2)
MethodInstance for DeltaTensor(::Int64)
  from DeltaTensor(total_dimension::Int64; tensororder) @ Main In[20]:1
Arguments
  #self#::Core.Const(DeltaTensor)
  total_dimension::Int64
Body::Array{Float64}
1 ─ %1 = (total_dimension * 2)::Int64
β”‚   %2 = Main.:(var"#DeltaTensor#21")(%1, #self#, total_dimension)::Array{Float64}
└──      return %2

So, is it possible to give Julia more information from the function arguments, or is there some other way to solve this?

I think value types are what you’re looking for
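
For context, a minimal sketch of the pattern (the name dims is illustrative): a Val{N} carries the integer N in the type itself, so dispatch and inference can see it at compile time.

dims(::Val{N}) where {N} = ntuple(i -> 2, Val(N))  # NTuple{N, Int} is inferable

dims(Val(3))  # (2, 2, 2)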


As @gdalle pointed out, Val types do help. Also, your code can be simplified significantly:

# Wrapper: lift the runtime value 2 * total_dimension into the type
# domain, so the kernel below is compiled for a concrete tensor order.
function delta_tensor(total_dimension::Int64)
    return delta_tensor(total_dimension, Val(total_dimension * 2))
end

# Kernel: `tensororder` is a static type parameter here, so the return
# type Array{Float64,tensororder} is concrete.
function delta_tensor(total_dimension::Int64, ::Val{tensororder}
                    )::Array{Float64,tensororder} where {tensororder}
    shape = ntuple(Returns(2), tensororder)
    return [float(allequal(i.I)) for i in CartesianIndices(shape)]
end

julia> @code_warntype delta_tensor(2, Val(4))
MethodInstance for delta_tensor(::Int64, ::Val{4})
  from delta_tensor(total_dimension::Int64, ::Val{tensororder}) where tensororder in Main at /tmp/tensor.jl:20
Static Parameters
  tensororder = 4
Arguments
  #self#::Core.Const(delta_tensor)
  total_dimension::Int64
  _::Core.Const(Val{4}())
Locals
  #40::var"#40#41"
  shape::NTuple{4, Int64}
Body::Array{Float64, 4}
1 ─ %1  = Core.apply_type(Main.Array, Main.Float64, $(Expr(:static_parameter, 1)))::Core.Const(Array{Float64, 4})
β”‚   %2  = Main.Returns(2)::Core.Const(Returns{Int64}(2))
β”‚         (shape = Main.ntuple(%2, $(Expr(:static_parameter, 1))))
β”‚         (#40 = %new(Main.:(var"#40#41")))
β”‚   %5  = #40::Core.Const(var"#40#41"())
β”‚   %6  = Main.CartesianIndices(shape::Core.Const((2, 2, 2, 2)))::Core.Const(CartesianIndices((2, 2, 2, 2)))
β”‚   %7  = Base.Generator(%5, %6)::Core.Const(Base.Generator{CartesianIndices{4, NTuple{4, Base.OneTo{Int64}}}, var"#40#41"}(var"#40#41"(), CartesianIndices((2, 2, 2, 2))))
β”‚   %8  = Base.collect(%7)::Array{Float64, 4}
β”‚   %9  = Base.convert(%1, %8)::Array{Float64, 4}
β”‚   %10 = Core.typeassert(%9, %1)::Array{Float64, 4}
└──       return %10

Note that I renamed the function to delta_tensor to follow Julia conventions. Also, the suggested implementation is significantly more efficient:

(@v1.8) julia v1.8.5> @benchmark delta_tensor(2)
BenchmarkTools.Trial: 10000 samples with 962 evaluations.
 Range (min … max):  67.966 ns …  1.820 μs  ┊ GC (min … max): 0.00% … 95.41%
 Time  (median):     78.095 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   88.547 ns ± 99.345 ns  ┊ GC (mean ± σ):  8.08% ±  6.83%

  ▁▃▂▅▇█▆▅▅▄▄▃▂▂▁▁                                           ▂
  ██████████████████▇█▇▆▅▅▅▆▅▅▄▄▄▄▃▁▃▃▃▁▁▁▁▁▁▁▁▅▆▆▆▇█▇▆▅▅▅▄▆▆▅ █
  68 ns        Histogram: log(frequency) by time       177 ns <

 Memory estimate: 256 bytes, allocs estimate: 2.

(@v1.8) julia v1.8.5> @benchmark DeltaTensor(2)
BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min … max):  6.590 μs …  1.197 ms  ┊ GC (min … max): 0.00% … 98.65%
 Time  (median):     7.534 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.756 μs ± 22.636 μs  ┊ GC (mean ± σ):  5.68% ±  2.21%

  ▅▅▃▇█▇▆▅▅▆▄▄▃▂▁▁                      ▁▁▁▁                 ▂
  ██████████████████▇██▇▇▆▇▇▇▆▆▇█▇▇▇█▇████████▇▆█▇▇▇▅▇▆▆▆▅▇█ █
  6.59 μs      Histogram: log(frequency) by time     16.7 μs <

Off-topic:

It’s worth noting that it’s extremely easy to mis-use parametric “value” types, including Val; in unfavorable cases, you can easily end up making the performance of your code much worse. In particular, you would never want to write actual code as illustrated above. (source)

I have also done it in the past - and I know it is not simple - but shouldn’t we try to avoid offering examples of “proper usage” of some concept, only to mention right away that you would never want to write actual code as illustrated above?

It reads more like: here is a piece of code showing how to properly use concept X - by the way, you would never want to write such code, since it is not a proper use of X or violates best-practice rules.


Thanks for your response. Great approach.
However, if I just run @code_warntype DeltaTensor(2), Julia still can’t infer the full output type:

function DeltaTensor(total_dimension::Int64)
    return DeltaTensor(total_dimension, Val(total_dimension * 2))
end

function DeltaTensor(total_dimension::Int64,::Val{tensororder})::Array{Float64,tensororder} where {tensororder}
    shape = ntuple(Returns(2), tensororder)
    return [float(allequal(i.I)) for i in CartesianIndices(shape)]
end

@code_warntype DeltaTensor(2)
MethodInstance for DeltaTensor(::Int64)
  from DeltaTensor(total_dimension::Int64) @ Main In[7]:1
Arguments
  #self#::Core.Const(DeltaTensor)
  total_dimension::Int64
Body::Array{Float64}
1 ─ %1 = (total_dimension * 2)::Int64
β”‚   %2 = Main.Val(%1)::Val
β”‚   %3 = Main.DeltaTensor(total_dimension, %2)::Array{Float64}
└──      return %3

It is only inferred correctly when using @code_warntype DeltaTensor(2, Val(4)).

Yes, I think there is no way around that, since the parameter is a runtime value. But the crux of the function is behind the wrapper, and that compiles into efficient, type-stable code.
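
To illustrate the barrier (a sketch reusing the definitions above), the instability is paid once, at the dispatch into the kernel, not per element:

d = delta_tensor(2)  # one dynamic dispatch here, since Val(4) is only built at runtime
typeof(d)            # Array{Float64, 4} -- concrete, and the kernel ran type-stable code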

Also, @algunion is quite right, the performance of the function without Val types is equivalent (or even slightly better):

# Same computation without Val: the return annotation Array{Float64}
# is deliberately abstract, since the rank is unknown at compile time.
function delta_tensor_2(total_dimension::Int,
                        tensor_order::Integer = 2total_dimension
                       )::Array{Float64}
    shape = ntuple(Returns(2), tensor_order)
    return [float(allequal(i.I)) for i in CartesianIndices(shape)]
end

(@v1.8) julia v1.8.5> @benchmark delta_tensor_2(2)
BenchmarkTools.Trial: 10000 samples with 963 evaluations.
 Range (min … max):  72.840 ns …  1.847 μs  ┊ GC (min … max): 0.00% … 92.61%
 Time  (median):     78.427 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   88.471 ns ± 96.277 ns  ┊ GC (mean ± σ):  7.82% ±  6.81%

  ▂▆█▇▆▅▅▄▄▃▂▁                                                ▂
  █████████████▇▇▅▄▄▄▃▅▅▄▄▄▃▁▁▃▁▁▃▁▁▁▁▁▁▁▃▁▁▁▃▁▅▆▆█▇▇▆▆▅▅▆▅▅▅ █
  72.8 ns      Histogram: log(frequency) by time       176 ns <

 Memory estimate: 256 bytes, allocs estimate: 2.

Type stability is not everything.


There is a chance that constant propagation can resolve this in a “real use case.” For example, foo() = DeltaTensor(2) might infer properly since 2 is a compile-time constant. But if you mostly expect this to be done with compile-time constants, I would just use a Val from the start. Here’s my implementation:

function DeltaTensor(::Val{total_dimension}) where total_dimension
    isa(total_dimension,Integer) || error("total_dimension must be a Val{<:Integer}")
    tensororder = 2*total_dimension # or don't even pass `total_dimension` as an argument -- can just input `tensororder` directly
    shape = ntuple(Returns(2), Val(tensororder))
    
    out = map(CartesianIndices(shape)) do i
        Float64(all(Base.Fix2(isequal,i[1]), Base.tail(Tuple(i))))
    end
    return out
end
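
A quick usage check (shown for illustration; this reuses the definition above): with the dimension in the type domain, the concrete return type falls out of inference.

julia> d = DeltaTensor(Val(2));  # total_dimension = 2, tensor order 4

julia> typeof(d)
Array{Float64, 4}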

None of your type annotations do anything. It will be simpler to leave them out.

Also consider your computation. On your 2 × 2 × ⋯ × 2 array, the only index where all the coordinates are 1 is the first, and the only index where all coordinates are 2 is the last. So I would actually just use

function DeltaTensor(::Val{total_dimension}) where total_dimension
    isa(total_dimension,Integer) || error("total_dimension must be a Val{<:Integer}")
    tensororder = 2*total_dimension # or don't even pass `total_dimension` as an argument -- can just input `tensororder` directly
    shape = ntuple(Returns(2), Val(tensororder))
    
    out = zeros(Float64, shape)
    out[begin] = out[end] = 1
    return out
end

which avoids all the useless index calculations. If you need to generalize this to N_1 × N_2 × ⋯ × N_d arrays, use out[begin:sum(strides(out)):end] .= 1
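
As a sanity check of that strided-indexing trick, here is a sketch on a cube with equal extents, where the steps land exactly on the diagonal entries:

out = zeros(3, 3, 3)
out[begin:sum(strides(out)):end] .= 1  # strides are (1, 3, 9), so the step is 13
# 1.0 exactly where all coordinates agree, matching the comprehension above:
all(out[i] == float(allequal(i.I)) for i in CartesianIndices(out))  # true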

This is natural: how could the compiler predict, ahead of time, what the input value to the function is going to be? If all you know is that x is of type Int, can you tell what value x is?

If the compiler only knows the type, it cannot infer the value (unless it’s a Val type).

The “cheating” way around this is, as has been mentioned, constant propagation, but that only works in some cases, and only for actual compile-time constants.
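
To sketch where that line is drawn (reusing the delta_tensor definitions from earlier in the thread):

good() = delta_tensor(2)  # literal argument: constant propagation may let
                          # inference recover the concrete Array{Float64, 4}
bad(n) = delta_tensor(n)  # runtime n: inference only sees Array{Float64}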