The matrices written to d1 are converted into Matrix{Real}, while the original types are retained for the matrices written to d2. So there must be a difference between d1 and d2 resulting from the different type annotations. I couldn’t find much about this in the docs, but I might have overlooked something. Somewhere it said that <:Real is just shorthand for T where T <: Real, which doesn’t make much sense in this case.
I noticed that d2 |> valtype |> eltype gives Any. Is that annotation actually just equivalent to Any? Even if that were the case, why is there no conversion to Matrix{Any}? In the long run I’m actually trying to understand whether such a type annotation makes sense or not, because somewhere in the docs it also said that if you are forced to have containers with abstract types, you should consider annotating those with Any, as type checking could harm performance otherwise.
When you have Matrix{<:Real}, it means a value of this dictionary can be any Matrix{T} as long as T <: Real, so a Matrix{Float64} already “fits in” and needs no conversion.
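A minimal sketch of that point. It assumes d1 and d2 were created as Dict{String, Matrix{Real}} and Dict{String, Matrix{<:Real}}; their definitions are not shown in this part of the thread, and m is just an illustrative matrix:

```julia
# Assumed definitions of d1 and d2 (not shown in this part of the thread).
d1 = Dict{String, Matrix{Real}}()
d2 = Dict{String, Matrix{<:Real}}()

m = [1.0 2.0; 3.0 4.0]              # a Matrix{Float64}

# Type parameters are invariant, so only the <:Real form accepts m as it is.
fits_d1 = Matrix{Float64} <: Matrix{Real}      # false
fits_d2 = Matrix{Float64} <: Matrix{<:Real}    # true

d1["a"] = m    # converted: a new Matrix{Real} is stored
d2["a"] = m    # stored unchanged

t1 = typeof(d1["a"])       # Matrix{Real}
t2 = typeof(d2["a"])       # Matrix{Float64}
same = d2["a"] === m       # true, the very same array object
```

So the conversion in d1 happens because Matrix{Float64} is not a subtype of Matrix{Real}, while d2 has nothing to convert.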
Because that’s what it is: <:Real doesn’t give you a concrete set of types, since people can subtype Real at run time. And returning Real would be wrong, because Real is just one of many types that satisfy T <: Real.
Sure, both sets of types are limitless in a way, but certainly not equal. I just wondered where the constraint went. I guess in terms of performance Dict{String, Any} would still be preferable over Dict{String, Matrix{<:Real}}, right? After all, the type constraint introduces additional type checking at runtime.
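As a small check on where the constraint lives, here is a sketch using a fresh, illustrative dictionary (the name d and the keys are made up for this example). The constraint is still part of the value type; it is only eltype of that value type that falls back to Any:

```julia
d = Dict{String, Matrix{<:Real}}()

# The value type keeps the <:Real bound.
vt = valtype(d)
keeps_bound = vt == Matrix{<:Real}      # true
is_unionall = vt isa UnionAll           # true

# A matrix with real elements is accepted ...
d["ok"] = [1 2; 3 4]

# ... while a matrix of strings is rejected when it is stored.
rejected = try
    d["bad"] = ["x" "y"]
    false
catch
    true
end
```

So the annotation is not equivalent to Any as far as the dictionary itself is concerned: values are still checked against Matrix{<:Real} on insertion.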
Under Performance Tips, the docs say:
If you cannot avoid containers with abstract value types, it is sometimes better to parametrize with Any to avoid runtime type checking. E.g. IdDict{Any, Any} performs better than IdDict{Type, Vector}
julia> struct Foo{X} x::X end
julia> eltype(Foo{String})
Any
julia> struct Bar end
julia> eltype(Bar)
Any
julia> function baz end
baz (generic function with 0 methods)
julia> eltype(baz)
Any
julia> eltype(Any)
Any
Ah, I see, thank you. So both Anys came from two different implementations and there is probably something like eltype(t::Any) = Any somewhere in Core or Base.
julia> @which eltype(Vector{Any})
eltype(::Type{<:AbstractArray{E}}) where E
@ Base abstractarray.jl:236
julia> @which eltype(Vector{<:Real})
eltype(::Type)
@ Base abstractarray.jl:233
The fact that eltype(Vector{Real}) == Real, yet eltype(Vector{<:Real}) == Any, is probably a bug, I think. I would argue the latter should return Real too.
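For what it’s worth, here is a sketch of why the two calls hit different methods. This is my reading of the dispatch shown by @which, not something confirmed in this thread: Vector{<:Real} is a UnionAll, and there is no single E for which it is a subtype of AbstractArray{E}, so the AbstractArray method cannot match.

```julia
# Vector{Real} is a single concrete type; Vector{<:Real} is a UnionAll
# that stands for Vector{Int}, Vector{Float64}, Vector{Real}, ...
is_datatype = Vector{Real} isa DataType       # true
is_unionall = Vector{<:Real} isa UnionAll     # true

# Type parameters are invariant, so no fixed element type covers all members:
# Vector{Float64}, for example, is not an AbstractArray{Real}.
sub_fixed   = Vector{<:Real} <: AbstractArray{Real}      # false
sub_bounded = Vector{<:Real} <: AbstractArray{<:Real}    # true

# Only the first call can bind E in eltype(::Type{<:AbstractArray{E}}) where E.
e1 = eltype(Vector{Real})      # Real
e2 = eltype(Vector{<:Real})    # reported above as Any (generic fallback)
```

Whether the fallback should try to return the upper bound Real instead is exactly the open question here.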
To answer your actual question though: Dict{String, Matrix{<:Real}} is better. Although accessing a Matrix stored in the Dict is type-unstable, as soon as you pass it through a function barrier its concrete type is known, the correct specialization gets dispatched, and accessing all the elements of the Matrix is type-stable.
For example:
julia> using BenchmarkTools
julia> s1() = sum(d1["a"])
s1 (generic function with 1 method)
julia> s2() = sum(d2["a"])
s2 (generic function with 1 method)
julia> @btime s1();
303.571 ns (15 allocations: 240 bytes)
julia> @btime s2();
54.065 ns (1 allocation: 16 bytes)
Thanks again! I definitely agree with you. Maybe it actually is a bug. If it’s not, I would be very interested in an explanation of why it is the way it is.
A word of caution though, when working with type-unstable collections like this:
If you don’t have a function barrier, then using the better data structure can be slower. Take this for example:
julia> foo1(d) = let a=d["a"], s=zero(eltype(a))
for i=eachindex(a); s+=a[i] end
s
end
foo1 (generic function with 1 method)
julia> @btime foo1($d1);
@btime foo1($d2);
285.930 ns (16 allocations: 256 bytes)
1.350 μs (37 allocations: 864 bytes)
A function barrier creates an opportunity for the compiler to infer the concrete type, compile a specialization, and dynamically dispatch to it, even if it’s an inner function defined only locally:
julia> foo2(d) = let a=d["a"]
s(a) = let s=zero(eltype(a)); for i=eachindex(a); s+=a[i] end; s end
s(a)
end
foo2 (generic function with 1 method)
julia> @btime foo2($d1);
@btime foo2($d2);
291.829 ns (16 allocations: 256 bytes)
38.648 ns (1 allocation: 16 bytes)
julia> foo1(d1) ≡ foo2(d1) && foo1(d2) ≡ foo2(d2)
true
But what’s wrong with Real in this case? Giving up is one thing, but I understood the argument to be that Any is somehow ‘more correct’, which is confusing.