Hi all,
I stumbled upon a combination of factors that leads Julia to allocate memory where I don't think it should. I guess it's an odd case, and maybe there is a reason for it. If so, I would be happy if someone could explain it to me.
To briefly summarize: if a struct contains a Vector{String} and a field whose type is given by a type parameter T, and the Vector{String} is accessed through another struct that references the first one, inside a generator expression passed as a function argument (comprehension-style syntax), Julia allocates memory for the access. No other combination of these factors seems to do this, so I'm curious why it happens for this one.
edit: I understand the answer of @gdalle and it solves the issue for me. But I would still be interested in why case 3 does not allocate when case 5 does.
Here is the code:
using BenchmarkTools

struct VariableType{T<:UInt}
    var::T
    s::Vector{String}
end

struct VariableTypeReferer
    vt::VariableType
end

struct FixedType
    var::UInt
    s::Vector{String}
end

struct FixedTypeReferer
    ft::FixedType
end

function test_variable_type()
    l = 1000
    rs = [10*i+1:10*i+10 for i in 0:(floor(Int, l/10-1))] # ranges needed for the comprehension syntax later
    vt = VariableType(UInt(1), fill("test", l))
    ft = FixedType(UInt(1), fill("test", l))
    check_strings = ["test1", "test"]

    # case 1: For a VariableType: direct access to string array of VariableType
    c = 0
    @btime for i in 1:length($vt.s)
        $c += $vt.s[i] in $check_strings
    end

    # case 2: For a VariableType: direct access without the referencing struct, but with intermediate iterator
    c = 0
    @btime for i in 1:length($rs)
        $c += sum($vt.s[ii] in $check_strings for ii in $rs[i])
    end

    # case 3: For a VariableType: access through referencing struct, but without intermediate iterator
    c = 0
    @btime for i in 1:length($rs)
        vtr = VariableTypeReferer($vt)
        for ii in $rs[i]
            $c += vtr.vt.s[ii] in $check_strings
        end
    end

    # case 4: For FixedType: access through referencing struct and with intermediate iterator
    c = 0
    @btime for i in 1:length($rs)
        ftr = FixedTypeReferer($ft)
        $c += sum(ftr.ft.s[ii] in $check_strings for ii in $rs[i])
    end

    # case 5: For a VariableType: access through referencing struct and with intermediate iterator
    c = 0
    @btime for i in 1:length($rs)
        vtr = VariableTypeReferer($vt)
        $c += sum(vtr.vt.s[ii] in $check_strings for ii in $rs[i])
    end
end

test_variable_type()
For me, this results in:
2.973 μs (0 allocations: 0 bytes)
3.802 μs (0 allocations: 0 bytes)
3.474 μs (0 allocations: 0 bytes)
3.658 μs (0 allocations: 0 bytes)
13.306 μs (100 allocations: 3.12 KiB)
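One difference between case 4 and case 5 that can be checked directly is the declared type of the referer's field. The sketch below repeats the two struct definitions from the code above and compares them with a parametric variant. `ParametricReferer` is a hypothetical name introduced only for this illustration and is not part of the original code; nothing here is benchmarked.

```julia
# Same definitions as in the code above.
struct VariableType{T<:UInt}
    var::T
    s::Vector{String}
end

struct VariableTypeReferer
    vt::VariableType   # no type parameter given: a UnionAll, not a concrete type
end

# Hypothetical variant (not in the original code): the parameter is carried through,
# so the field type is concrete once T is known.
struct ParametricReferer{T<:UInt}
    vt::VariableType{T}
end

original_is_concrete   = isconcretetype(fieldtype(VariableTypeReferer, :vt))
parametric_is_concrete = isconcretetype(fieldtype(ParametricReferer{UInt}, :vt))

println("VariableTypeReferer.vt concrete?     ", original_is_concrete)
println("ParametricReferer{UInt}.vt concrete? ", parametric_is_concrete)
```

Whether the non-concrete field is the whole story behind the 100 allocations in case 5 is an assumption on my part, and it does not by itself explain why case 3 avoids them.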
edit: I changed the code a little so that it contains only the necessary parts.
Thanks already,
Malte