I’m trying to do lazy evaluation with broadcasting, as follows:
struct V
    data::Vector{Int}
end
f(v, x) = Base.broadcasted(+, x, v.data)
g(v, x) = Base.broadcasted(*, x, v.data)
const x = [1, 2]
const a = V([3, 4])
const b = V([5, 6])
julia> (x .+ a.data) .* b.data
2-element Array{Int64,1}:
20
36
julia> Base.materialize(g(b, f(a, x) ) )
2-element Array{Int64,1}:
20
36
julia> @btime ($x .+ $a.data) .* $b.data;
37.233 ns (1 allocation: 96 bytes)
julia> @btime Base.materialize(g($b, f($a, $x) ) );
37.323 ns (1 allocation: 96 bytes)
So far so good.
However, once I add two more operations (materialization and matrix multiplication), it gets slower and makes extra allocations that I can't account for:
struct M
    data::Matrix{Int}
end
h(m, x::Broadcast.Broadcasted) = h(m, Base.materialize(x) )
h(m, x) = m.data * x
const m = M([1 2; 3 4])
fun1() = m.data * ((x .+ a.data) .* b.data)
fun2() = h(m, g(b, f(a, x) ) )
julia> fun1()
2-element Array{Int64,1}:
92
204
julia> fun2()
2-element Array{Int64,1}:
92
204
julia> @btime fun1();
83.895 ns (2 allocations: 192 bytes)
julia> @btime fun2();
105.800 ns (6 allocations: 288 bytes)
Now fun2() is slower and has 4 mysterious additional allocations.
As far as I understand, the code of fun1() and fun2() should be identical:
julia> @code_lowered fun1()
CodeInfo(
1 ─ %1 = Base.getproperty(Main.m, :data)
│ %2 = Base.getproperty(Main.a, :data)
│ %3 = Base.broadcasted(Main.:+, Main.x, %2)
│ %4 = Base.getproperty(Main.b, :data)
│ %5 = Base.broadcasted(Main.:*, %3, %4)
│ %6 = Base.materialize(%5)
│ %7 = %1 * %6
└── return %7
)
### compared to:
julia> @code_lowered fun2()
CodeInfo(
1 ─ %1 = Main.f(Main.a, Main.x)
│ %2 = Main.g(Main.b, %1)
│ %3 = Main.h(Main.m, %2)
└── return %3
)
# which can be decomposed as:
julia> @code_lowered f(a, x)
CodeInfo(
1 ─ %1 = Base.broadcasted
│ %2 = Base.getproperty(v, :data)
│ %3 = (%1)(Main.:+, x, %2)
└── return %3
)
julia> @code_lowered g(b, f(a, x) )
CodeInfo(
1 ─ %1 = Base.broadcasted
│ %2 = Base.getproperty(v, :data)
│ %3 = (%1)(Main.:*, x, %2)
└── return %3
)
julia> @code_lowered h(m, g(b, f(a, x) ) )
CodeInfo(
1 ─ %1 = Base.materialize
│ %2 = (%1)(x)
│ %3 = Main.h(m, %2)
└── return %3
)
julia> @code_lowered h(m, Base.materialize(g(b, f(a, x) ) ) )
CodeInfo(
1 ─ %1 = Base.getproperty(m, :data)
│ %2 = %1 * x
└── return %2
)
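One caveat about the listings above: @code_lowered shows the code before type inference and inlining, so two functions can lower differently and still compile to similar code, or the other way round. A minimal sketch of the same comparison after inference and optimization, using code_typed from InteractiveUtils. It repeats the definitions so it runs on its own, and the names ci1, rt1, ci2, rt2 are only illustrative:

```julia
using InteractiveUtils  # provides code_typed outside the REPL

struct V
    data::Vector{Int}
end
struct M
    data::Matrix{Int}
end

f(v, x) = Base.broadcasted(+, x, v.data)
g(v, x) = Base.broadcasted(*, x, v.data)
h(m, x::Broadcast.Broadcasted) = h(m, Base.materialize(x))
h(m, x) = m.data * x

const x = [1, 2]
const a = V([3, 4])
const b = V([5, 6])
const m = M([1 2; 3 4])

fun1() = m.data * ((x .+ a.data) .* b.data)
fun2() = h(m, g(b, f(a, x)))

# code_typed returns one `CodeInfo => return type` pair per matching method.
ci1, rt1 = first(code_typed(fun1, ()))
ci2, rt2 = first(code_typed(fun2, ()))

# Calls that survive as `invoke` or dynamic calls in the printed code were not
# inlined; those are candidates for where intermediate objects get allocated.
println(ci1)
println(ci2)
println("inferred return types: ", rt1, " and ", rt2)
```

The printed code depends on the Julia version, so no output is shown here.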
Help please! Why is fun2() slower than fun1()? Where do those 4 additional allocations come from? Thanks.
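One way to narrow down where the allocations happen is to measure each stage of the pipeline separately with Base's @allocated. This is only a sketch under assumptions: the stage_* wrappers and the bytes helper are hypothetical names introduced here, and the definitions are repeated so the block runs on its own.

```julia
struct V
    data::Vector{Int}
end
struct M
    data::Matrix{Int}
end

f(v, x) = Base.broadcasted(+, x, v.data)
g(v, x) = Base.broadcasted(*, x, v.data)
h(m, x::Broadcast.Broadcasted) = h(m, Base.materialize(x))
h(m, x) = m.data * x

const x = [1, 2]
const a = V([3, 4])
const b = V([5, 6])
const m = M([1 2; 3 4])

# One zero-argument wrapper per stage (hypothetical helper names).
stage_f()   = f(a, x)                             # lazy x .+ a.data
stage_g()   = g(b, f(a, x))                       # lazy (x .+ a.data) .* b.data
stage_mat() = Base.materialize(g(b, f(a, x)))     # materialized vector
stage_h()   = h(m, g(b, f(a, x)))                 # full pipeline, same as fun2()

# Call once to compile, then report the bytes allocated by a second call.
function bytes(fn)
    fn()
    return @allocated fn()
end

for s in (stage_f, stage_g, stage_mat, stage_h)
    println(rpad(string(s), 10), " allocates ", bytes(s), " bytes")
end
```

If the lazy stages already report nonzero bytes, the extra allocations would come from building the Broadcasted wrappers rather than from h itself. The numbers depend on the Julia version, so none are shown here.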