Adding parameters to function definition avoids allocations

I’m trying to get a small piece of code that I’ll run millions of times to run without allocations. There is a small allocation that I cannot understand the cause of (for example, using Profile.Allocs cannot attribute the allocation to anything, even with a sample rate of 100%), but that I can avoid by adding parameters to a specific function definition. Unfortunately, in practice I can’t use the parameters because I need to support the inputs being of different types.

I’ve attempted to write an MWE to highlight the problem. Unfortunately, it doesn’t exhibit the same behaviour: the allocations happen even with the parameters. I don’t yet know why it’s different. Perhaps it’s because the code I’m using is in a package I’m writing; I have no idea. However, the M(non-)WE shows the syntax I mean, so perhaps someone will be able to explain the behaviour. Here’s the code and output:

using BenchmarkTools
using StaticArrays

struct Residual{T<:Number}
    meas::SVector{2, T}
end

struct Transform{T<:Number}
    rot::SMatrix{3, 3, T}
    trans::SVector{3, T}
end

struct Point{T<:Number}
    point::SVector{3, T}
end

function project(p::Point)
    return SVector{2}(p.point[1], p.point[2]) ./ p.point[3]
end

function transform(t::Transform, p::Point)
    return project(Point(t.rot * p.point + t.trans))
end

function computeresidual1(r::Residual, t::Transform, p::Point)
    return transform(t, p) - r.meas
end

function computeresidual2(r::Residual{T}, t::Transform{T}, p::Point{T})::SVector{2, T} where T <: Real
    return transform(t, p) - r.meas
end

r = Residual(randn(SVector{2, Float64}))
t = Transform(randn(SMatrix{3, 3, Float64}), randn(SVector{3, Float64}))
p = Point(randn(SVector{3, Float64}))

@btime computeresidual1($r, $t, $p)
@btime computeresidual2($r, $t, $p)

julia> 327.487 ns (8 allocations: 256 bytes)
julia> 446.551 ns (8 allocations: 256 bytes)

In my own case, the call to the function with parameters specified (computeresidual2) has zero allocations. Here, not only does it still allocate, it’s also slower; the speed becomes the same if I don’t specify the output type in the function declaration.
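One thing that may contribute to the speed difference (a guess, not something measured here): a return-type annotation makes Julia convert the returned value to the declared type, so the annotated method does slightly more than the plain one when the result type cannot be inferred. A minimal sketch with two hypothetical toy functions, `plain` and `annotated`, that are not part of the code above:

```julia
# Hypothetical toy functions, used only to illustrate return-type annotations.
plain(x) = x
annotated(x)::Float64 = x   # the returned value is converted to Float64

# The unannotated method returns its argument unchanged.
@assert plain(1) === 1

# The annotated method converts the Int to a Float64 before returning it.
@assert annotated(1) === 1.0
```

When the return type is already inferred as the declared type, the conversion is a no-op, so the annotation should only matter in cases like this MWE where inference fails.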

I’m running Julia 1.8.2.

The problem is that SMatrix{3, 3, T} is not a concrete type, since it is missing its fourth parameter L, which is the total length of the array. A field declared with that type is therefore abstractly typed. Replace it with SMatrix{3, 3, T, 9} and the issue should be gone.
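A quick way to check this is with isconcretetype. The sketch below uses a renamed copy of the struct, `ConcreteTransform`, and a small helper, `allfieldsconcrete`; both names are made up for illustration and are not part of StaticArrays or the original code:

```julia
using StaticArrays

# SMatrix{3, 3, T} leaves the length parameter L free, so it is not concrete.
@assert !isconcretetype(SMatrix{3, 3, Float64})
@assert isconcretetype(SMatrix{3, 3, Float64, 9})

# Hypothetical renamed copy of Transform with fully specified field types.
struct ConcreteTransform{T<:Number}
    rot::SMatrix{3, 3, T, 9}
    trans::SVector{3, T}
end

# Hypothetical helper: true if every field of the struct type S is concrete.
allfieldsconcrete(::Type{S}) where {S} = all(isconcretetype, fieldtypes(S))

@assert allfieldsconcrete(ConcreteTransform{Float64})

# Construction is unchanged from the MWE.
t = ConcreteTransform(randn(SMatrix{3, 3, Float64}), randn(SVector{3, Float64}))
```

Running the same helper on the original Transform{Float64} should return false, which is a cheap test to apply to each struct before reaching for a profiler.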

Ah, thanks. Yes, that’s the problem causing the allocations in the MWE. :smiley:
Sadly, I’ve realised that it’s not the actual problem in my own code, which is why the parameters make no difference in the MWE. I’ll fix the MWE and try asking again.

I’ve now created a working MWE and posted it here.