Memory allocation inconsistency (again...)

aaraujo71 · July 12, 2021, 10:09pm

When I run the following

struct Test
    x :: Array{Float64, 1}
    y :: Array{Float64, 1}
end

function test()
    var = Test([1, 2], [1, 2])
    @time @. var.x = var.y - var.x/var.y^2
    @time @. var.x = var.y - var.x/(var.y*var.y)
    return nothing
end

test()

I get

  0.000001 seconds (2 allocations: 16 bytes)
  0.000000 seconds

What’s the difference? Why do I get a memory allocation when using var.y^2 (first calculation) but not with (var.y*var.y) (second calculation)? I’m using Julia 1.6.1, and I don’t think I saw this difference in previous Julia versions (maybe 1.5). The code above looks pretty useless, but I’m seeing something similar in an important piece of code, and it’s annoying because I don’t understand what’s happening.

Please help.

gbaraldi · July 12, 2021, 11:20pm

Not sure if this helps, but it gets weirder:

julia> function test2()
           var = Test([1, 2], [1, 2])
           @time @. var.x = -var.x/var.y^2 + var.y
           @time @. var.x = var.y - var.x/(var.y*var.y)
           return nothing
       end
test2 (generic function with 1 method)

julia> test2()
  0.000000 seconds
  0.000000 seconds

Changing the order of the terms removes the allocation.

aaraujo71 · July 13, 2021, 10:20am

It just confirms the oddity. I’ve tried in version 1.7.0-beta3 and the result is the same. However, in version 1.5.3, I get no memory allocations.

julia> 

  0.000000 seconds
  0.000000 seconds

I think this is a bug, something related to the broadcasting function or the dot macro introduced in the 1.6-version update… It’s a pity because this kind of behavior makes the language feel unreliable.

lmiq · July 13, 2021, 10:59am

These are most certainly benchmarking artifacts. I would suggest putting each broadcast in its own function, returning a meaningful result, and using @btime.
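A minimal sketch of that suggestion, using only Base so that it runs without extra packages. The names `Vecs`, `pow_update!`, `mul_update!`, and `allocs_after_warmup` are hypothetical, introduced here for illustration; `Vecs` stands in for the `Test` struct from the first post. With BenchmarkTools installed, `@btime pow_update!($v)` would be the closer match to what is suggested above.

```julia
# Stand-in for the thread's `Test` struct (renamed for this sketch).
struct Vecs
    x::Vector{Float64}
    y::Vector{Float64}
end

# Each broadcast lives in its own function and returns a meaningful result.
function pow_update!(v)
    @. v.x = v.y - v.x / v.y^2
    return v.x
end

function mul_update!(v)
    @. v.x = v.y - v.x / (v.y * v.y)
    return v.x
end

# Hypothetical helper: call once so compilation is not measured,
# then report the bytes allocated by a second call.
function allocs_after_warmup(f, v)
    f(v)
    return @allocated f(v)
end

v = Vecs([1.0, 2.0], [1.0, 2.0])
println("y^2 version: ", allocs_after_warmup(pow_update!, v), " bytes")
println("y*y version: ", allocs_after_warmup(mul_update!, v), " bytes")
```

The numbers printed depend on the Julia version, so none are shown here.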

Skoffer · July 13, 2021, 11:42am

No, this is definitely some issue with broadcasting. Since 1.5 was not affected, I would recommend git bisect to find the culprit.

anon56330260 · July 13, 2021, 11:48am

The allocation is caused by the RefValue in the lowered code.
I’m using Julia 1.6.1.
You can try the following code:

struct Test
    x :: Array{Float64, 1}
    y :: Array{Float64, 1}
end

function test1(var)
    @. var.x = var.y - var.x/var.y^2
    return nothing
end
function test2(var)
    @. var.x = var.y - var.x/(var.y*var.y)
    return nothing
end

var = Test([1, 2], [1, 2])
# Run each of the following lines twice so the reported number excludes compilation allocations.
@allocated test1(var)
@allocated test2(var)

test1 allocates 16 bytes and test2 doesn’t allocate.
Check the typed code:

@code_typed test1(var)

│     %240 = Base.getfield(%239, 1, false)::Base.RefValue{typeof(^)}
│            Base.getfield(%240, :x)::typeof(^)
│     %242 = Core.getfield(%239, 2)::Base.Broadcast.Extruded{Vector{Float64}, Tuple{Bool}, Tuple{Int64}}
│     %243 = Core.getfield(%239, 3)::Base.RefValue{Val{2}}

You can see the RefValue instances here. They account for exactly two allocations of 8 bytes each, whereas

@code_typed test2(var)

has no RefValue and only creates immutable values.
So I wonder what is happening here.
Edit: the LLVM IR of test1 has two additional jl_gc_pool_alloc calls, while test2 has none (except on the error branch, but in this case no error is thrown).
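The check done by eye above can be roughly automated by searching the printed typed IR for the string "RefValue". This is only a sketch: `Vecs`, `pow_update!`, `mul_update!`, `typed_ir`, and `mentions_refvalue` are hypothetical names (`Vecs` stands in for `Test`), and a RefValue appearing in the IR does not by itself prove that an allocation happens at run time.

```julia
# Stand-in for the thread's `Test` struct (renamed for this sketch).
struct Vecs
    x::Vector{Float64}
    y::Vector{Float64}
end

function pow_update!(v)
    @. v.x = v.y - v.x / v.y^2
    return nothing
end

function mul_update!(v)
    @. v.x = v.y - v.x / (v.y * v.y)
    return nothing
end

# Base.code_typed returns a vector of `CodeInfo => return type` pairs,
# one per matching method; take the CodeInfo of the first and print it to a string.
typed_ir(f) = string(first(first(Base.code_typed(f, (Vecs,)))))

# Hypothetical helper: does the printed typed IR mention RefValue anywhere?
mentions_refvalue(f) = occursin("RefValue", typed_ir(f))

println("y^2 version mentions RefValue: ", mentions_refvalue(pow_update!))
println("y*y version mentions RefValue: ", mentions_refvalue(mul_update!))
```

The answers depend on the Julia version and its optimizer, so no expected output is given.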

aaraujo71 · July 13, 2021, 11:58am

I am just a Julia user and I like the way Julia handles broadcasting. I’ve posted an issue at https://github.com/JuliaLang/julia/issues/. I hope it gets solved. In the meantime I went back to 1.5.3…

anon56330260 · July 13, 2021, 12:05pm

Some even more interesting observations:
If I expand the @. manually, then the allocation is gone:

function test3(var)
    copyto!(var.x,Broadcasted(-,(var.y,Broadcasted(/,(var.x,Broadcasted(^,(var.y,2)))))))
    return nothing
end

@allocated test3(var) is zero…
Edit: the manual expansion above is not exactly what @. lowers to. See my next post for the correct lowering.

aaraujo71 · July 13, 2021, 12:24pm

So, the problem must be with the macro.

anon56330260 · July 13, 2021, 12:28pm

Nope, I got the lowered form of the code wrong. It should actually be:

Meta.@lower var.x .= var.y .- var.x./var.y.^2

:($(Expr(:thunk, CodeInfo(
    @ none within `top-level scope'
1 ─ %1  = Base.getproperty(var, :x)
│   %2  = Base.getproperty(var, :y)
│   %3  = Base.getproperty(var, :x)
│   %4  = Base.getproperty(var, :y)
│   %5  = Core.apply_type(Base.Val, 2)
│   %6  = (%5)()
│   %7  = Base.broadcasted(Base.literal_pow, ^, %4, %6)
│   %8  = Base.broadcasted(/, %3, %7)
│   %9  = Base.broadcasted(-, %2, %8)
│   %10 = Base.materialize!(%1, %9)
└──       return %10
))))

Copying the lowered code into a function gives:

function test3(var)
    v7 = Base.Broadcast.broadcasted(Base.literal_pow,^,var.y,Base.Val(2))
    v8 = Base.Broadcast.broadcasted(/,var.x,v7)
    v9 = Base.Broadcast.broadcasted(-,var.y,v8)
    Base.Broadcast.materialize!(var.x,v9)
    return nothing
end

It still allocates 16 bytes…
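The lowering above also shows why y^2 and y*y take different paths: a literal integer exponent is rewritten to a `Base.literal_pow` call with the exponent wrapped in a `Val`, while an exponent held in a variable is not. A small sketch of that difference (the symbols `y` and `p` are placeholders and are never evaluated):

```julia
# Lowering is a syntactic step, so `y` and `p` do not need to be defined.
literal_form  = string(Meta.lower(Main, :(y .^ 2)))   # literal exponent
variable_form = string(Meta.lower(Main, :(y .^ p)))   # exponent in a variable

println("literal exponent uses literal_pow:  ", occursin("literal_pow", literal_form))
println("variable exponent uses literal_pow: ", occursin("literal_pow", variable_form))

# For a Float64 and the literal exponent 2, literal_pow gives the same value as x * x.
x = 3.0
println(Base.literal_pow(^, x, Val(2)) == x * x)
```

So the two expressions in the original post compute the same values but hand different argument tuples to the broadcast machinery.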

anon56330260 · July 13, 2021, 1:07pm

OK, I think I found the reason: there is a call to preprocess_args that is not inlined.
Redefining the function with @inline fixes the problem:

import Base.Broadcast.preprocess_args
import Base.Broadcast.preprocess
@inline preprocess_args(dest, args::Tuple) = (Base.Broadcast.preprocess(dest, args[1]), Base.Broadcast.preprocess_args(dest, Base.tail(args))...)
@inline preprocess_args(dest, args::Tuple{Any}) = (Base.Broadcast.preprocess(dest, args[1]),)
@inline preprocess_args(dest, args::Tuple{}) = ()
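The three methods above follow a common pattern for processing a tuple recursively: handle the first element, recurse on `Base.tail`, and stop at the one-element and empty tuples. A toy version of the same pattern, where `toy_preprocess` and `toy_args` are hypothetical names and `toy_preprocess` simply doubles its argument rather than doing what `Base.Broadcast.preprocess` does:

```julia
# Hypothetical stand-in for the per-argument work; here it just doubles a number.
toy_preprocess(x) = 2x

# Same recursion shape as the preprocess_args methods, minus the `dest` argument.
@inline toy_args(args::Tuple) = (toy_preprocess(args[1]), toy_args(Base.tail(args))...)
@inline toy_args(args::Tuple{Any}) = (toy_preprocess(args[1]),)
@inline toy_args(args::Tuple{}) = ()

println(toy_args((1, 2.5, 3)))   # element types are preserved: Int, Float64, Int
```

Because each method dispatches on the tuple's length and element types, the compiler can unroll the recursion when the calls are inlined, which is presumably why forcing `@inline` matters in the broadcast case.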

@aaraujo71 Can you try the following code?
On my computer with Julia 1.6.1:

struct Test
    x :: Array{Float64, 1}
    y :: Array{Float64, 1}
end

function test1(var)
    @. var.x = var.y - var.x/var.y^2
    return nothing
end

var = Test([1, 2], [1, 2])
#compile
@allocated test1(var)
@assert var.x == [0,1.5]

var = Test([1, 2], [1, 2])
@allocated test1(var)
@assert var.x == [0,1.5]

import Base.Broadcast.preprocess_args
import Base.Broadcast.preprocess
@inline preprocess_args(dest, args::Tuple) = (Base.Broadcast.preprocess(dest, args[1]), Base.Broadcast.preprocess_args(dest, Base.tail(args))...)
@inline preprocess_args(dest, args::Tuple{Any}) = (Base.Broadcast.preprocess(dest, args[1]),)
@inline preprocess_args(dest, args::Tuple{}) = ()
var = Test([1, 2], [1, 2])
#compile
@allocated test1(var)
@assert var.x == [0,1.5]

var = Test([1, 2], [1, 2])
@allocated test1(var)
@assert var.x == [0,1.5]

You should see four allocation numbers. The first and third are large because they include allocations made during compilation; the second and fourth should be 16 (with the uninlined function) and 0 (after the fix).

gbaraldi · July 13, 2021, 2:23pm

GitHub issue, for those who want to follow it: https://github.com/JuliaLang/julia/issues/41565

aaraujo71 · July 13, 2021, 2:26pm

I am using 1.5.3 now. However, when I run your code I get nothing. I must be doing something wrong.

anon56330260 · July 13, 2021, 2:29pm

My mistake… You need to wrap each @allocated in a println. I was using the REPL, where println is not needed. The code would be:

struct Test
    x :: Array{Float64, 1}
    y :: Array{Float64, 1}
end

function test1(var)
    @. var.x = var.y - var.x/var.y^2
    return nothing
end

# compile
var = Test([1, 2], [1, 2])
println(@allocated test1(var))
@assert var.x == [0,1.5]

# before fix
var = Test([1, 2], [1, 2])
println(@allocated test1(var))
@assert var.x == [0,1.5]

import Base.Broadcast.preprocess_args
import Base.Broadcast.preprocess
@inline preprocess_args(dest, args::Tuple) = (Base.Broadcast.preprocess(dest, args[1]), Base.Broadcast.preprocess_args(dest, Base.tail(args))...)
@inline preprocess_args(dest, args::Tuple{Any}) = (Base.Broadcast.preprocess(dest, args[1]),)
@inline preprocess_args(dest, args::Tuple{}) = ()

# compile
var = Test([1, 2], [1, 2])
println(@allocated test1(var))
@assert var.x == [0,1.5]

# after fix
var = Test([1, 2], [1, 2])
println(@allocated test1(var))
@assert var.x == [0,1.5]

Simply save the code in a file (alloc.jl here) and run it with Julia. You should get:

$ julia alloc.jl 
23433466
16
10964697
0

@aaraujo71

aaraujo71 · July 13, 2021, 3:02pm

I get the following:

(version 1.6.1, first run)

(version 1.6.1, second run)

(version 1.5.3, first run)

(version 1.5.3, second run)
