# How to define a function with an argument-dependent return type

I need to define a function that returns an N-dimensional array, where N depends on the function's argument. However, Julia cannot determine N before compilation. Here is my approach:

function DeltaTensor(total_dimension::Int64; tensororder=total_dimension*2)::Array{Float64,tensororder}
    shape::NTuple{tensororder, Int64} = Tuple(fill(2, tensororder))
    out::Array{Float64,tensororder} = zeros(Float64, shape)

    @inbounds @simd for i in CartesianIndices(shape)
        # diagonal element: all coordinates of the index are equal
        if all(isequal(i[j], i[1]) for j in 1:length(shape))
            out[i] = 1.0
        end
    end

    return out
end


This function DeltaTensor takes the parameter total_dimension and outputs an array of order tensororder: a delta tensor, with all elements 0 except on the diagonal. But @code_warntype tells us that the return type cannot be inferred:

@code_warntype DeltaTensor(2)
MethodInstance for DeltaTensor(::Int64)
from DeltaTensor(total_dimension::Int64; tensororder) @ Main In:1
Arguments
#self#::Core.Const(DeltaTensor)
total_dimension::Int64
Body::Array{Float64}
1 ─ %1 = (total_dimension * 2)::Int64
│   %2 = Main.:(var"#DeltaTensor#21")(%1, #self#, total_dimension)::Array{Float64}
└──      return %2


So, is it possible to give Julia more information from the function arguments, or is there some other way to achieve this?

I think value types are what you’re looking for
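A minimal sketch of the idea (function name is mine, not from the thread): wrapping the integer in `Val` moves it from the value domain into the type domain, so the compiler can dispatch and specialize on it.

```julia
# Val{N} carries the integer N in its *type*, so the method instance
# compiled for Val{3} knows N == 3 at compile time.
make_cube(::Val{N}) where {N} = zeros(Float64, ntuple(_ -> 2, Val(N)))

A = make_cube(Val(3))   # a 2x2x2 array; Array{Float64, 3} is inferable here
```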


As @gdalle pointed out, Val types do help. Also, your code can be simplified significantly:

function delta_tensor(total_dimension::Int64)
    return delta_tensor(total_dimension, Val(total_dimension * 2))
end

function delta_tensor(total_dimension::Int64, ::Val{tensororder}
                      )::Array{Float64,tensororder} where {tensororder}
    shape = ntuple(Returns(2), tensororder)
    return [float(allequal(i.I)) for i in CartesianIndices(shape)]
end
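As a quick sanity check (the calls and expected values below are mine, not from the thread; the definitions are repeated so the snippet runs standalone): for `total_dimension = 1` the result is the order-2 delta, i.e. a 2×2 identity matrix.

```julia
# Definitions repeated from the post above so this snippet is self-contained.
function delta_tensor(total_dimension::Int64)
    return delta_tensor(total_dimension, Val(total_dimension * 2))
end

function delta_tensor(total_dimension::Int64, ::Val{tensororder}
                      )::Array{Float64,tensororder} where {tensororder}
    shape = ntuple(Returns(2), tensororder)
    return [float(allequal(i.I)) for i in CartesianIndices(shape)]
end

D = delta_tensor(1)   # 2x2 matrix: ones on the diagonal, zeros elsewhere
```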

julia> @code_warntype delta_tensor(2, Val(4))
MethodInstance for delta_tensor(::Int64, ::Val{4})
from delta_tensor(total_dimension::Int64, ::Val{tensororder}) where tensororder in Main at /tmp/tensor.jl:20
Static Parameters
tensororder = 4
Arguments
#self#::Core.Const(DeltaTensor)
total_dimension::Int64
_::Core.Const(Val{4}())
Locals
#40::var"#40#41"
shape::NTuple{4, Int64}
Body::Array{Float64, 4}
1 ─ %1  = Core.apply_type(Main.Array, Main.Float64, $(Expr(:static_parameter, 1)))::Core.Const(Array{Float64, 4})
│   %2  = Main.Returns(2)::Core.Const(Returns{Int64}(2))
│         (shape = Main.ntuple(%2, $(Expr(:static_parameter, 1))))
│         (#40 = %new(Main.:(var"#40#41")))
│   %5  = #40::Core.Const(var"#40#41"())
│   %6  = Main.CartesianIndices(shape::Core.Const((2, 2, 2, 2)))::Core.Const(CartesianIndices((2, 2, 2, 2)))
│   %7  = Base.Generator(%5, %6)::Core.Const(Base.Generator{CartesianIndices{4, NTuple{4, Base.OneTo{Int64}}}, var"#40#41"}(var"#40#41"(), CartesianIndices((2, 2, 2, 2))))
│   %8  = Base.collect(%7)::Array{Float64, 4}
│   %9  = Base.convert(%1, %8)::Array{Float64, 4}
│   %10 = Core.typeassert(%9, %1)::Array{Float64, 4}
└──       return %10


Note that I renamed the function to delta_tensor to follow Julia conventions. Also, the suggested implementation is significantly more efficient:

(@v1.8) julia v1.8.5> @benchmark delta_tensor(2)
BenchmarkTools.Trial: 10000 samples with 962 evaluations.
Range (min … max):  67.966 ns …  1.820 μs  ┊ GC (min … max): 0.00% … 95.41%
Time  (median):     78.095 ns              ┊ GC (median):    0.00%
Time  (mean ± σ):   88.547 ns ± 99.345 ns  ┊ GC (mean ± σ):  8.08% ±  6.83%

▁▃▂▅▇█▆▅▅▄▄▃▂▂▁▁▁                                           ▂
█████████████████▇█▇▆▅▅▅▆▅▅▄▄▄▄▃▁▃▃▃▁▁▁▁▁▁▁▁▅▆▆▆▇█▇▆▅▅▅▄▆▆▅ █
68 ns        Histogram: log(frequency) by time       177 ns <

Memory estimate: 256 bytes, allocs estimate: 2.

(@v1.8) julia v1.8.5> @benchmark DeltaTensor(2)
BenchmarkTools.Trial: 10000 samples with 5 evaluations.
Range (min … max):  6.590 μs …  1.197 ms  ┊ GC (min … max): 0.00% … 98.65%
Time  (median):     7.534 μs              ┊ GC (median):    0.00%
Time  (mean ± σ):   8.756 μs ± 22.636 μs  ┊ GC (mean ± σ):  5.68% ±  2.21%

▅▅▃▇█▇▆▅▅▆▄▄▃▂▁▁                      ▁▁▁▁                 ▂
██████████████████▇██▇▇▆▇▇▇▆▆▇█▇▇▇█▇████████▇▆█▇▇▇▅▇▆▆▆▅▇█ █
6.59 μs      Histogram: log(frequency) by time     16.7 μs <


Off-topic:

It’s worth noting that it’s extremely easy to mis-use parametric “value” types, including `Val`; in unfavorable cases, you can easily end up making the performance of your code much worse. In particular, you would never want to write actual code as illustrated above. (source)

I have also done this in the past - and I know it is not simple - but shouldn’t we try to avoid offering examples of “proper usage” of a concept, only to mention right away that “you would never want to write actual code as illustrated above”?

It amounts to: here is a piece of code showing how to properly use concept X - by the way, you would never want to write such code, since it is not a proper use of X or violates best-practice rules.


Thanks for your response - great approach!
However, if I just run `@code_warntype DeltaTensor(2)`, Julia still can’t infer the full output type:

function DeltaTensor(total_dimension::Int64)
    return DeltaTensor(total_dimension, Val(total_dimension * 2))
end

function DeltaTensor(total_dimension::Int64, ::Val{tensororder})::Array{Float64,tensororder} where {tensororder}
    shape = ntuple(Returns(2), tensororder)
    return [float(allequal(i.I)) for i in CartesianIndices(shape)]
end

@code_warntype DeltaTensor(2)
MethodInstance for DeltaTensor(::Int64)
from DeltaTensor(total_dimension::Int64) @ Main In:1
Arguments
#self#::Core.Const(DeltaTensor)
total_dimension::Int64
Body::Array{Float64}
1 ─ %1 = (total_dimension * 2)::Int64
│   %2 = Main.Val(%1)::Val
│   %3 = Main.DeltaTensor(total_dimension, %2)::Array{Float64}
└──      return %3


The type is only fully inferred when using `@code_warntype delta_tensor(2, Val(4))`.

Yes, I think there is no way around that, since the parameter is a runtime value. But the crux of the function is wrapped behind a function barrier, and it compiles into efficient, type-stable code.
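For readers unfamiliar with the pattern: this is the classic function barrier. The sketch below (function names are mine) shows the type-unstable outer call handing off to a fully inferable inner method.

```julia
# Outer function: type-unstable at the boundary, because Val(2n)
# depends on a run-time value.
outer(n::Int) = inner(Val(2n))

# Inner function: once Val{N} is a concrete type, everything here is
# type-stable and compiles to efficient code for each N.
inner(::Val{N}) where {N} = zeros(Float64, ntuple(_ -> 2, Val(N)))
```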

Also, @algunion is quite right, the performance of the function without Val types is equivalent (or even slightly better):

function delta_tensor_2(total_dimension::Int,
                        tensor_order::Integer = 2total_dimension
                        )::Array{Float64}
    shape = ntuple(Returns(2), tensor_order)
    return [float(allequal(i.I)) for i in CartesianIndices(shape)]
end

(@v1.8) julia v1.8.5> @benchmark delta_tensor_2(2)
BenchmarkTools.Trial: 10000 samples with 963 evaluations.
Range (min … max):  72.840 ns …  1.847 μs  ┊ GC (min … max): 0.00% … 92.61%
Time  (median):     78.427 ns              ┊ GC (median):    0.00%
Time  (mean ± σ):   88.471 ns ± 96.277 ns  ┊ GC (mean ± σ):  7.82% ±  6.81%

▂▆█▇▆▅▅▄▄▃▂▁                                                ▂
█████████████▇▇▅▄▄▄▃▅▅▄▄▄▃▁▁▃▁▁▃▁▁▁▁▁▁▁▃▁▁▁▃▁▅▆▆█▇▇▆▆▅▅▆▅▅▅ █
72.8 ns      Histogram: log(frequency) by time       176 ns <

Memory estimate: 256 bytes, allocs estimate: 2.


Type stability is not everything.


There is a chance that constant propagation can resolve this in a “real use case.” For example, foo() = DeltaTensor(2) might infer properly since 2 is a compile-time constant. But if you mostly expect this to be done with compile-time constants, I would just use a Val from the start. Here’s my implementation:

function DeltaTensor(::Val{total_dimension}) where total_dimension
    isa(total_dimension, Integer) || error("total_dimension must be a Val{<:Integer}")
    tensororder = 2 * total_dimension # or don't even pass total_dimension as an argument -- can just input tensororder directly
    shape = ntuple(Returns(2), Val(tensororder))

    out = map(CartesianIndices(shape)) do i
        # 1.0 on the diagonal (all coordinates equal to the first), 0.0 elsewhere
        Float64(all(Base.Fix2(isequal, i[1]), Base.tail(Tuple(i))))
    end
    return out
end
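A quick check of this version (calls and expected values are mine; the definition is repeated so the snippet runs standalone):

```julia
# Definition repeated from the post above so this snippet is self-contained.
function DeltaTensor(::Val{total_dimension}) where total_dimension
    isa(total_dimension, Integer) || error("total_dimension must be a Val{<:Integer}")
    tensororder = 2 * total_dimension
    shape = ntuple(Returns(2), Val(tensororder))

    out = map(CartesianIndices(shape)) do i
        # diagonal: every trailing coordinate equals the first one
        Float64(all(Base.Fix2(isequal, i[1]), Base.tail(Tuple(i))))
    end
    return out
end

D = DeltaTensor(Val(1))   # order-2 delta: a 2x2 identity matrix
```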


None of your type annotations do anything. It will be simpler to leave them out.

Also consider your computation. On your 2\times2\times\ldots\times2 array, the only index where all coordinates are 1 is the first, and the only index where all coordinates are 2 is the last. So I would actually just use

function DeltaTensor(::Val{total_dimension}) where total_dimension
    isa(total_dimension, Integer) || error("total_dimension must be a Val{<:Integer}")
    tensororder = 2 * total_dimension # or don't even pass total_dimension as an argument -- can just input tensororder directly
    shape = ntuple(Returns(2), Val(tensororder))

    out = zeros(Float64, shape)
    out[begin] = out[end] = 1
    return out
end


which avoids all the useless index calculations. If you need to generalize this to N_1 \times N_2 \times \ldots \times N_d arrays, use `out[begin:sum(strides(out)):end] .= 1`.
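To see why the strides trick works: in a dense column-major array, advancing one step along every axis simultaneously moves the linear index forward by `sum(strides(A))`, so the diagonal occupies an arithmetic progression of linear indices. A sketch for general side lengths (function name is mine; I additionally cap the range at `minimum(dims)` elements so the progression stops at the end of the diagonal even for non-square shapes):

```julia
# Delta tensor for arbitrary side lengths: 1.0 on the diagonal
# (all coordinates equal), 0.0 elsewhere.
function delta_general(dims::Integer...)
    out = zeros(Float64, dims)
    # consecutive diagonal elements are sum(strides(out)) apart in
    # linear indexing; the diagonal has minimum(dims) elements
    step = sum(strides(out))
    out[range(firstindex(out); step, length=minimum(dims))] .= 1
    return out
end
```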

This is natural. How could the compiler predict beforehand what the input value to the function is going to be? If all you know is that `x` is of type `Int`, can you tell what value `x` is?

If the compiler only knows the type, it cannot infer the value (unless it’s a `Val` type).

The ‘cheating’ way around this is, as has been mentioned, constant propagation, but that only works in some cases, and only for actual compile-time constants.
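Constant propagation can be observed directly (sketch with my own function names; whether propagation fires depends on the Julia version and optimizer heuristics):

```julia
delta(n::Int) = delta(Val(2n))
delta(::Val{N}) where {N} = zeros(Float64, ntuple(_ -> 2, Val(N)))

# The literal 2 is a compile-time constant inside const_delta, so the
# compiler *may* propagate it through Val() and infer Array{Float64, 4}.
const_delta() = delta(2)

# Inspect what inference concluded (result varies by Julia version):
Base.return_types(const_delta, Tuple{})
```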