In v0.6 the vector form for `*` is deprecated, so it will be more obvious, though. (I think v0.6 is a big enough change in this problem domain, and close enough to release, that everything should start happening in its language.)
You can look at what is happening using `expand`.
In your version there isn’t any syntax-level broadcast fusion happening, which means that every operator basically creates a new temporary array:
julia> expand(:(Delta_W .+= lr * ( x * ehp' - xneg * ehn')'))
:((Base.broadcast!)(+,Delta_W,Delta_W,A_mul_Bc(lr,A_mul_Bc(x,ehp) - A_mul_Bc(xneg,ehn))))
In the optimized version only broadcasted operators are used. These dots `.` serve as syntactic sugar for broadcast, and on master there is something called broadcast fusion. Basically, it merges all the dotted operators/functions into one inner function that is broadcast over the individual arrays. Thus no temporary memory needs to be allocated for the intermediate computations. Take a look:
julia> expand(:(Delta_W .+= lr .* (ehp .* x' .- ehn .* xneg')))
:($(Expr(:thunk, CodeInfo(:(begin
$(Expr(:thunk, CodeInfo(:(begin
global ##3#4
const ##3#4
$(Expr(:composite_type, Symbol("##3#4"), :((Core.svec)()), :((Core.svec)()), :(Core.Function), :((Core.svec)()), false, 0))
return
end))))
$(Expr(:method, false, :((Core.svec)((Core.svec)(##3#4,Any,Any,Any,Any,Any,Any),(Core.svec)())), CodeInfo(:(begin
#temp#@_9 = #temp#@_4 * #temp#@_5
#temp#@_8 = #temp#@_6 * #temp#@_7
#temp#@_10 = #temp#@_9 - #temp#@_8
#temp#@_11 = #temp#@_3 * #temp#@_10
return #temp#@_2 + #temp#@_11
end)), false))
#3 = $(Expr(:new, Symbol("##3#4")))
SSAValue(0) = #3
SSAValue(1) = ctranspose(x)
SSAValue(2) = ctranspose(xneg)
return (Base.broadcast!)(SSAValue(0),Delta_W,Delta_W,lr,ehp,SSAValue(1),ehn,SSAValue(2))
end)))))
EDIT: Don’t be fooled by the syntax highlighting; there are no comments here. Somehow I am unable to turn off syntax highlighting.
EDIT: Notice the two lines near the bottom that say `ctranspose`. This is where the new row-vector addition that was mentioned before comes into play.
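If you want to convince yourself that the two forms compute the same update, here is a minimal sketch. The sizes, the random data and the value of `lr` are made up for illustration and not taken from your problem; only the variable names match the expressions above. I wrote it against a recent Julia version, so treat it as a sketch rather than something tested on v0.6.

```julia
# Hypothetical sizes and data, chosen only for illustration.
nvis, nhid = 4, 3
x, xneg = rand(nvis), rand(nvis)     # "visible" vectors
ehp, ehn = rand(nhid), rand(nhid)    # "hidden" vectors
lr = 0.1                             # assumed learning rate

Delta_W_a = zeros(nhid, nvis)
Delta_W_b = zeros(nhid, nvis)

# Original form: the undotted * and - each build a temporary matrix
# before the result is added into Delta_W_a.
Delta_W_a .+= lr * (x * ehp' - xneg * ehn')'

# Fully dotted form: all dotted operators are fused into one broadcast
# that writes straight into Delta_W_b.
Delta_W_b .+= lr .* (ehp .* x' .- ehn .* xneg')

# Compare the two results up to floating-point rounding.
println(isapprox(Delta_W_a, Delta_W_b))
```

Wrapping each update in a function and calling it with `@time` or `@allocated` should then show the difference in allocations, though the exact numbers will depend on your Julia version.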