Just yesterday I had code that worked fine for several months:

```julia
v::Vector{Union{Float64, Missing}}
t = f.(v)
t .= ifelse.(ismissing.(t), NaN, t)
```

It crashed because `v` was all-missing on some new data, so `t::Vector{Missing}`, which cannot store `NaN`. Thankfully that didn't happen in production, but it's a serious concern whether we've tested all such cases.
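To make the failure mode concrete, here is a minimal sketch, with a hypothetical stand-in `f` (the real transformation is domain-specific), plus one possible fix via `coalesce`:

```julia
# Hypothetical stand-in for the real transformation.
f(x) = 2x

v = Vector{Union{Float64, Missing}}(missing, 3)  # all-missing, as on the new data
t = f.(v)                                        # broadcast narrows to Vector{Missing}
# t .= ifelse.(ismissing.(t), NaN, t)            # would throw: can't convert NaN to Missing

# One way out: replace missing while broadcasting, so the result type
# is widened to Float64 in the same pass instead of being fixed first.
t2 = coalesce.(f.(v), NaN)                       # Vector{Float64}, all NaN here
```

This avoids the two-step pattern where the narrowed container type is already locked in before the `NaN` substitution happens.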
I agree with @Tamas_Papp that we’re missing some building blocks for generic code. For instance, here’s a short version of the basic loop for generic Kalman filter code:
```julia
function kalman_filter(initial_state, observations, theta, N)
    filtered = []
    likelihoods = []
    state = initial_state
    for i in 1:N
        state, ll = kernel(state, observations[i], theta)
        push!(filtered, state)
        push!(likelihoods, ll)
    end
    return filtered, likelihoods
end
```
How should I pick the eltype of `filtered`? I could naively take the type of `initial_state`, but then if I use ForwardDiff to do gradient descent on `theta`, the `state` and `likelihoods` will contain `ForwardDiff.Dual` numbers, so that does not work. What works:
- Use `Base.return_type`. Oh, but maybe we need to call it several times until the type of `state` converges.
- Use `accumulate`: have the accumulator function return the tuple `(state, ll)`, then unzip the sequences afterward.
- Do one or two iterations of `state = kernel(...)` just to pick the eltype, and hope that `observations` doesn't have any `missing`s.
- Do my own type widening.
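The `accumulate` option can be sketched as follows, assuming a toy `kernel` (hypothetical, scalar, just so the sketch runs) with the same `(state, obs, theta) -> (new_state, loglik)` shape as above. The collection machinery picks the element type, so `Dual` states widen automatically:

```julia
# Toy kernel just for illustration; the real one is problem-specific.
kernel(state, obs, theta) = (theta * state + obs, -abs2(obs - state))

function kalman_filter_acc(initial_state, observations, theta)
    # Accumulator carries the (state, loglik) tuple through the sequence.
    steps = accumulate(observations; init = (initial_state, 0.0)) do (state, _), obs
        kernel(state, obs, theta)
    end
    # Unzip afterward: filtered states and log-likelihoods.
    return first.(steps), last.(steps)
end

filtered, lls = kalman_filter_acc(0.0, [1.0, 2.0, 3.0], 0.5)
# filtered == [1.0, 2.5, 4.25]
```

Because `accumulate` infers its output eltype from what the function actually returns, the same code works whether the state is a `Float64` or a `ForwardDiff.Dual`.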
I tried `accumulate`. The code was ugly, and performance was bad, but it's a solution in theory. Type widening doesn't sound like beautiful code either. As for `Base.return_type`, I had some bad experience using it in Julia 0.6, but if it's officially The Julian Way, I could give it another try. Right now we use option #3, and I'm not happy with it, but it generally works.
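For reference, a minimal sketch of the `Base.return_type` option, again with a hypothetical toy `kernel`. The usual caveats apply: `Base.return_type` is inference-dependent and not a stable API, and if `kernel` returns a different state type than it was given, you may need to re-query until the state type converges:

```julia
# Toy kernel just so the sketch runs; the real kernel's return type is
# what Base.return_type would have to infer.
kernel(state, obs, theta) = (theta * state + obs, -abs2(obs - state))

function kalman_filter_rt(initial_state, observations, theta, N)
    # Ask inference for the kernel's return type and pre-allocate.
    T = Base.return_type(kernel,
        Tuple{typeof(initial_state), eltype(observations), typeof(theta)})
    S, L = T.parameters              # state and log-likelihood types
    filtered = Vector{S}(undef, N)
    likelihoods = Vector{L}(undef, N)
    state = initial_state
    for i in 1:N
        state, ll = kernel(state, observations[i], theta)
        filtered[i] = state
        likelihoods[i] = ll
    end
    return filtered, likelihoods
end
```

This only works cleanly when inference returns a concrete `Tuple` type; if it widens to an abstract type, the pre-allocated vectors are abstract too, which is exactly the fragility mentioned above.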