I have the following situation. I am iterating something (i.e., numerical integration) forward in time, and sometimes I want to return the full trajectory of values at the intermediate steps, while other times I want to only return the terminal value.
The naive solution is:
using Random
function update!(x)
x[1] = 0.5 * x[2] + randn();
x[2] = -0.5 * x[1] + randn();
x
end
function integrate_path(x0, n_iters; save_path=false)
x = copy(x0)
if(save_path)
path = [similar(x0) for j in 1:n_iters];
end
for j in 1:n_iters
update!(x);
if(save_path)
@. path[j] = x;
end
end
if(save_path)
return path
else
return x
end
end
X0 = [1.,-2.];
Random.seed!(100);
@code_warntype integrate_path(X0, 100)
Nothing sexy, but if the path variable is not too large and storing it does not significantly impact performance, then I would probably always just return a tuple (x,path). Then you can just decide when you call the function whether you want to receive one or both outputs.
Is this even really a type instability? The compiler correctly infers that the function will return one of two possible objects. I generally consider type instability to be when the compiler cannot infer the type at all and starts boxing things. Perhaps someone with more experience will chime in.
My solutions is to “tell” the compiler what I expect as the return type by making it part of the signature, e.g. in this case:
function integrate_path_typed(::Type{T}, x0, n_iters) where T
x = copy(x0)
if T <: Vector{Vector}
path = fill(similar(x0), n_iters)
end
for j in 1:n_iters
update!(x)
if T <: Vector{Vector}
@. path[j] = x
end
end
if T <: Vector{Vector}
return path
else
return x
end
end
The issue with this solution is of course that the keyword argument was much easier to understand. So I would probably wrap the path in a struct to make the intention clearer.
struct Path{T}
data::Vector{Vector{T}}
end
function integrate_path_struct(::Type{T}, x0, n_iters) where T
x = copy(x0)
if T <: Path
path = Path(fill(similar(x0), n_iters))
end
for j in 1:n_iters
update!(x)
if T <: Path
@. path.data[j] = x
end
end
if T <: Path
return path
else
return x
end
end
That being said, if you are worried about allocations I would restructure the code completely to enable more memory reuse and let the caller handle the allocations, e.g.
function integrate_path!(x, n_iters)
for _ in 1:n_iters
update!(x)
end
return x
end
function integrate_path!(path::Vector{Vector}, x)
for j in eachindex(path)
update!(x)
@. path[j] = x
end
return path
end
This way you get the cleanest code in my opinion and properly use multiple dispatch for the price of some minor repetition.
function integrate_path(x0::T, n_iters;
path::Union{Nothing, Vector{T}}=nothing) where T
x = copy(x0)
for j in 1:n_iters
update!(x);
if !isnothing(path)
push!(path, copy(x))
end
end
return x
end
Thus the user who wants the saved path would pass in an empty vector to place it in: