Hi @tim.holy; thanks for your question. Could you shed any light on when @generated functions are slower? I did try to avoid them because I remembered something about them being problematic for static compilation, whenever that becomes readily available, but I did not think they would ever be slower?
As for your question/suggestion: the TensorMap type actually has two integer type parameters that have to do with dimensionality: a TensorMap object represents a linear map from a tensor product of N2 vector spaces to a tensor product of N1 vector spaces. These spaces can be reshuffled between domain and codomain, but that yields a TensorMap with different N1′ and N2′.
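To make that concrete, here is a stripped-down sketch of such a two-parameter type; the names (ToyTensorMap, shift_to_domain) are purely illustrative and not TensorKit's actual definitions:

```julia
# Illustrative sketch only -- not TensorKit's actual definition.
# A map from a product of N2 spaces (domain) to N1 spaces (codomain).
struct ToyTensorMap{S, N1, N2}
    codomain::NTuple{N1, S}  # N1 factors in the codomain
    domain::NTuple{N2, S}    # N2 factors in the domain
end

# Moving the last codomain space into the domain yields a new
# ToyTensorMap with different N1' and N2'.
function shift_to_domain(t::ToyTensorMap)
    ToyTensorMap(Base.front(t.codomain),
                 (t.codomain[end], t.domain...))
end

t  = ToyTensorMap((2, 3), (4,))  # N1 = 2, N2 = 1
t2 = shift_to_domain(t)          # now N1' = 1, N2' = 2
```

The point is that N1 and N2 are part of the type itself, so reshuffling spaces produces a value of a *different* concrete type.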
Here is a whole bunch of timings (for which one has to restart Julia before each measurement, unless there is a mechanism to clear the method table?):
On Julia 1.5
julia> @time Base.return_types(f, (TensorMap, TensorMap, TensorMap))
209.706046 seconds (375.58 M allocations: 28.486 GiB, 4.23% gc time)
1-element Array{Any,1}:
TensorMap{S,N₁,N₂,G,A,F₁,F₂} where F₂ where F₁ where A<:(Union{TensorKit.SortedVectorDict{G,var"#s99"} where var"#s99"<:(DenseArray{T,2} where T), var"#s100"} where var"#s100"<:(DenseArray{T,2} where T)) where G<:Sector where N₂ where N₁ where S<:ElementarySpace
julia> S = Z2Space
ℤ₂Space
julia> @time Base.return_types(f, (TensorMap{S}, TensorMap{S}, TensorMap{S}))
51.242267 seconds (116.99 M allocations: 7.747 GiB, 5.93% gc time)
TensorMap{ℤ₂Space,_A,_B,ℤ₂,_C,_D,_E} where _E where _D where _C<:(Union{TensorKit.SortedVectorDict{ℤ₂,var"#s99"} where var"#s99"<:(DenseArray{T,2} where T), var"#s100"} where var"#s100"<:(DenseArray{T,2} where T)) where _B where _A
julia> @time Base.return_types(f, (TensorMap{<:IndexSpace,2,2}, TensorMap{<:IndexSpace,2,1}, TensorMap{<:IndexSpace,3,4}))
262.097488 seconds (432.13 M allocations: 32.961 GiB, 3.84% gc time)
1-element Array{Any,1}:
TensorMap{S,N₁,N₂,G,A,F₁,F₂} where F₂ where F₁ where A<:(Union{TensorKit.SortedVectorDict{G,var"#s99"} where var"#s99"<:(DenseArray{T,2} where T), var"#s100"} where var"#s100"<:(DenseArray{T,2} where T)) where G<:Sector where N₂ where N₁ where S<:ElementarySpace
julia> S = Z2Space
ℤ₂Space
julia> @time Base.return_types(f, (TensorMap{S,2,2}, TensorMap{S,2,1}, TensorMap{S,3,4}))
56.762282 seconds (120.42 M allocations: 8.163 GiB, 5.60% gc time)
TensorMap{ℤ₂Space,_A,_B,ℤ₂,_C,_D,_E} where _E where _D where _C<:(Union{TensorKit.SortedVectorDict{ℤ₂,var"#s99"} where var"#s99"<:(DenseArray{T,2} where T), var"#s100"} where var"#s100"<:(DenseArray{T,2} where T)) where _B where _A
julia> @time Base.return_types(f, (tensormaptype(S,2,2,Float64),tensormaptype(S,2,1,Float64),tensormaptype(S,3,4,Float64)))
25.287506 seconds (107.81 M allocations: 5.063 GiB, 11.22% gc time)
TensorMap{ℤ₂Space,3,4,ℤ₂,TensorKit.SortedVectorDict{ℤ₂,Array{Float64,2}},FusionTree{ℤ₂,3,1,2,Nothing},FusionTree{ℤ₂,4,2,3,Nothing}}
On Julia 1.6/master
julia> @time Base.return_types(f, (TensorMap, TensorMap, TensorMap))
46.389146 seconds (99.33 M allocations: 6.786 GiB, 4.18% gc time)
1-element Vector{Any}:
TensorMap
julia> S = Z2Space
julia> @time Base.return_types(f, (TensorMap{S}, TensorMap{S}, TensorMap{S}))
42.310939 seconds (87.23 M allocations: 6.132 GiB, 3.06% gc time)
1-element Vector{Any}:
TensorMap{ℤ₂Space,_A,_B,ℤ₂,_C,_D,_E} where _E where _D where _C<:(Union{TensorKit.SortedVectorDict{ℤ₂,var"#s88"} where var"#s88"<:(DenseMatrix{T} where T), var"#s89"} where var"#s89"<:(DenseMatrix{T} where T)) where _B where _A
julia> @time Base.return_types(f, (TensorMap{<:IndexSpace,2,2}, TensorMap{<:IndexSpace,2,1}, TensorMap{<:IndexSpace,3,4}))
48.019551 seconds (104.80 M allocations: 7.120 GiB, 4.45% gc time)
1-element Vector{Any}:
TensorMap
julia> @time Base.return_types(f, (TensorMap{S,2,2}, TensorMap{S,2,1}, TensorMap{S,3,4}))
41.746267 seconds (84.82 M allocations: 5.918 GiB, 3.65% gc time)
1-element Vector{Any}:
TensorMap{ℤ₂Space,_A,_B,ℤ₂,_C,_D,_E} where _E where _D where _C<:(Union{TensorKit.SortedVectorDict{ℤ₂,var"#s88"} where var"#s88"<:(DenseMatrix{T} where T), var"#s89"} where var"#s89"<:(DenseMatrix{T} where T)) where _B where _A
julia> @time Base.return_types(f, (tensormaptype(S,2,2,Float64),tensormaptype(S,2,1,Float64),tensormaptype(S,3,4,Float64)))
19.412151 seconds (95.24 M allocations: 4.940 GiB, 13.05% gc time)
1-element Vector{Any}:
TensorMap{ℤ₂Space,3,4,ℤ₂,TensorKit.SortedVectorDict{ℤ₂,Matrix{Float64}},FusionTree{ℤ₂,3,1,2,Nothing},FusionTree{ℤ₂,4,2,3,Nothing}}
So the largest reduction comes from the first type parameter of TensorMap, which encodes the type of vector space. This makes sense: several functions have an explicit branch depending on static properties of that parameter (not directly on the parameter itself, which is why I did not define separate methods but rather wrote explicit branches, relying on dead-branch elimination by the compiler). As soon as this parameter is known, the compiler can indeed exclude several paths, and compilation speeds up. In a practical simulation this first type parameter will be fixed and will typically not be lost, even in sloppy code. The only situation where it sometimes cannot be inferred is in typical test functions, where one varies over different values of that parameter and writes a test function that does not specialize on it. On Julia 1.5 this happened when the space type was an argument of the test function without an explicit type parameter, since by default functions are not specialized on arguments that are types.
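The pattern described above — an explicit branch on a static property of the space type, which the compiler eliminates once that type is known — can be sketched as follows (toy names, not TensorKit's code):

```julia
# Toy illustration of branching on a static property of a type
# parameter instead of defining separate methods.
abstract type ToySpace end
struct PlainSpace  <: ToySpace end
struct GradedSpace <: ToySpace end

# A static trait of the space type; resolved at compile time
# once S is a concrete type.
hassectors(::Type{PlainSpace})  = false
hassectors(::Type{GradedSpace}) = true

function process(x, S::Type{<:ToySpace})
    if hassectors(S)   # dead branch eliminated when S is concrete
        return 2x      # "sector-aware" path
    else
        return x       # trivial path
    end
end
```

When S is concrete, constant propagation of hassectors(S) lets the compiler keep only one branch; when S is abstract, inference must consider both paths, which matches the observed jump in inference/compilation time.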
Adding the dimensionality parameters N1 and N2 actually leads to a worse compilation time if the first parameter is not known. If it is known, they only lead to a modest improvement in compilation time. The fully concrete type (which for convenience can be constructed using a call to tensormaptype) furthermore includes information about the storage type, and thus also the eltype of the tensor data, and does of course lead to the fastest compilation time.
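The same abstract-to-concrete ladder can be reproduced on a toy parametric type (the names Wrap and g are hypothetical, chosen only to mirror the structure of the TensorMap benchmarks above):

```julia
# Hypothetical toy type mimicking the abstract -> concrete ladder:
# S plays the role of the space type, N the dimensionality,
# A the storage type (which also fixes the eltype).
struct Wrap{S, N, A<:AbstractMatrix}
    data::A
end

g(w::Wrap) = sum(w.data)

# Fully abstract: inference must reason about all Wrap subtypes.
@time Base.return_types(g, (Wrap,))

# Partially constrained: "space" and "rank" parameters fixed.
@time Base.return_types(g, (Wrap{Int, 2},))

# Fully concrete: storage (and hence eltype) known.
@time Base.return_types(g, (Wrap{Int, 2, Matrix{Float64}},))
```

Only the fully concrete signature lets inference produce a concrete return type; for this toy example the timing differences are of course tiny, but the inference pattern is the same.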