How to access all field values for one field of a composite type array

LotteVictor · January 11, 2018, 11:31am

Hi there! I’ve created a composite type with:

mutable struct TAdult
    id::Int64
    strain::String
    hind::Float64
    gind::Float64
    tradeoff::Float64
end

Then I create an array of that type with:

using Distributions  # provides Normal and LogNormal

function InitAdults(ft::Int64, fhind_mean::Float64, fhind_std::Float64, fgind_mean::Float64, fgind_std::Float64, tcost::Float64)
    Adults = TAdult[]  # empty vector of TAdult
    for i in 1:ft
        id = i
        strain = string(id)
        hind = rand(Normal(fhind_mean, fhind_std))
        gind = rand(LogNormal(fgind_mean, fgind_std))
        tradeoff = exp((-gind^2) / (2 * tcost^2))
        nAdult = TAdult(id, strain, hind, gind, tradeoff)
        push!(Adults, nAdult)
    end
    return Adults
end

In a new function, I need to access certain fields of all TAdult in that array. I can achieve this by looping over the array (e.g. in a function):

function idvector(fAdults::Array{TAdult,1})
    ids = Int64[]  # id is an Int64, so collect into an Int64 vector
    for i in 1:length(fAdults)
        push!(ids, fAdults[i].id)
    end
    return ids
end

However, I think this takes too long and I want to improve performance. So I was wondering whether there is a way to do something similar to
idvector= Adults[1:end].id
I thought about using a tuple type, but if I understand correctly, tuples are immutable? They would also only be accessible by index, not by field name. Does anybody have an idea how to solve this?
Thanks a lot!

kristoffer.carlsson · January 11, 2018, 11:48am

Did you measure and find that the loop was slow? It looks perfectly fine to me, and any alternative way of writing this will, in the end, have to do the same as what you have written: loop over all elements and extract the id field.

GunnarFarneback · January 11, 2018, 12:05pm

Try a comprehension.

idvector = [x.id for x in Adults]
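
For reference, here is a minimal sketch of a few equivalent ways to collect one field, assuming the TAdult definition from the first post (the sample values are made up for illustration). All of them still loop over the array internally, so none should be expected to be fundamentally faster than the explicit loop.

```julia
# TAdult as defined in the first post
mutable struct TAdult
    id::Int64
    strain::String
    hind::Float64
    gind::Float64
    tradeoff::Float64
end

# Hypothetical sample data, for illustration only
adults = [TAdult(i, string(i), 0.5, 1.0, 0.9) for i in 1:5]

ids_comprehension = [a.id for a in adults]   # comprehension
ids_map = map(a -> a.id, adults)             # map with an anonymous function
ids_broadcast = getfield.(adults, :id)       # broadcast getfield over the array
```

Each variant allocates a new Vector{Int64}; later changes to an individual's id are not reflected in the collected vector.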

LotteVictor · January 11, 2018, 2:02pm

The loop itself isn’t slow, but I need to access different fields of several thousand instances repeatedly, and my whole simulation is going to take ten hours if I do not find a faster way. So there is definitely a need for performance improvement. I was just wondering whether there is a fast track to access an indexed value of an indexed array in an array… Apparently there is none (?), so I’ll probably have to change the simulation at a more basal level. Thanks for your input!

LotteVictor · January 11, 2018, 2:03pm

@GunnarFarneback I’ll try this and see if it’s faster. Thank you

kristoffer.carlsson · January 11, 2018, 2:04pm

Perhaps using a “Struct of Arrays” would work better, e.g.:

struct TAdults
    ids::Vector{Int64}
    strains::Vector{String}
    hinds::Vector{Float64}
    ginds::Vector{Float64}
    tradeoffs::Vector{Float64}
end
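
A minimal sketch of how individuals could still be added, changed, and replicated with this layout. The helper names `add_adult!` and `replicate!`, the zero-argument constructor, and the sample values are hypothetical and not part of the suggestion above.

```julia
# Same layout as the TAdults struct above
struct TAdults
    ids::Vector{Int64}
    strains::Vector{String}
    hinds::Vector{Float64}
    ginds::Vector{Float64}
    tradeoffs::Vector{Float64}
end

# Hypothetical convenience constructor: an empty population
TAdults() = TAdults(Int64[], String[], Float64[], Float64[], Float64[])

# Hypothetical helper: append one individual
function add_adult!(p::TAdults, id, strain, hind, gind, tradeoff)
    push!(p.ids, id)
    push!(p.strains, strain)
    push!(p.hinds, hind)
    push!(p.ginds, gind)
    push!(p.tradeoffs, tradeoff)
    return p
end

# Hypothetical helper: copy individual i as a new individual with id newid
function replicate!(p::TAdults, i::Integer, newid::Integer)
    return add_adult!(p, newid, p.strains[i], p.hinds[i], p.ginds[i], p.tradeoffs[i])
end

population = TAdults()
add_adult!(population, 1, "1", 0.2, 1.1, 0.8)
add_adult!(population, 2, "2", 0.4, 0.9, 0.7)

population.hinds[2] = 0.45      # the struct is immutable, but its vectors can be modified
replicate!(population, 2, 3)    # individual 3 is a copy of individual 2

ids = population.ids            # all ids at once, without a loop or a copy
```

With this layout an individual is identified by its position `i` in the vectors rather than by an object of its own.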

Tamas_Papp · January 11, 2018, 2:06pm

Did you in fact profile your code and ascertain that this is the bottleneck? When the types are known to the compiler, value.field accessors are very fast.

LotteVictor · January 11, 2018, 2:08pm

That’s a good idea, but it would not represent what I want to achieve. Every instance of TAdult needs to be accessed again later on, replicated, and sometimes even changed.

LotteVictor · January 11, 2018, 2:13pm

No, I haven’t found the actual bottleneck yet.
But I have an idea about what could take so long (not the access to the field, but changing it later on), which I thought I could speed up with a different way of accessing the fields. I’m rather new to all this (performance improvement in particular hasn’t been an issue yet), but before I expand my simulation, I want to make sure it runs as fast as possible… It’s good to know that value.field is usually fast… Thank you so much!

sdanisch · January 11, 2018, 2:13pm

You can automate that process with: https://github.com/simonster/StructsOfArrays.jl

Btw, iterating Vector{<: mutable struct} is quite a bit slower than Vector{<:struct}, since the array only stores pointers to separately allocated objects, so you end up chasing pointers much like when iterating a linked list. So trying out the struct approach and, if it’s a lot faster, figuring out how to work with the immutability could be the best way to deal with this.

LotteVictor · January 11, 2018, 2:20pm

Thanks! I know that it would be faster to have an immutable struct, but I think that’s not really a possibility here. I need to be able to change everything rather often (with a probability of 0.03% of ~10 000 * 100 instances). Or do you have a good idea how I could deal with immutability in this case?
I’ll have a look into StructsOfArrays. This sounds interesting. Thank you!

LotteVictor · January 11, 2018, 2:23pm

So far, this didn’t improve the performance. If I call the above-mentioned function I get:

@time fooidvector(Testpop)
0.000002 seconds (8 allocations: 512 bytes)

and if I use the comprehension it’s

@time idvector = [x.id for x in Testpop]
0.039866 seconds (8.72 k allocations: 488.114 KiB)

Is this something you’d expect? Thanks a lot!
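
One thing to keep in mind when reading such numbers: the first run of any piece of Julia code includes compilation time, and a comprehension typed at the REPL is compiled when it first runs, which likely accounts for much of the difference here. A minimal sketch, assuming the TAdult definition from the first post; the wrapper `ids_by_comprehension` and the test population are made up for illustration.

```julia
# TAdult as defined in the first post
mutable struct TAdult
    id::Int64
    strain::String
    hind::Float64
    gind::Float64
    tradeoff::Float64
end

# Hypothetical wrapper so the comprehension can be called (and timed) repeatedly
ids_by_comprehension(adults) = [a.id for a in adults]

# Hypothetical test population
testpop = [TAdult(i, string(i), 0.5, 1.0, 0.9) for i in 1:10_000]

first_call = @elapsed ids_by_comprehension(testpop)    # includes compilation
second_call = @elapsed ids_by_comprehension(testpop)   # already compiled

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

Comparing the second calls of both variants, or using BenchmarkTools.jl, gives a fairer picture than a single @time.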

sdanisch · January 11, 2018, 2:25pm

The compiler is very good at optimizing immutables. Try it out and benchmark. If the approach works, you might want to have some helpers to “modify” the immutables. Hopefully, this should be a lot nicer on 0.7!

LotteVictor · January 11, 2018, 2:30pm

Thanks a lot. I’ll try to figure out if there is a way for me to use immutables. By helpers, do you mean finding/creating functions to change the values? I need them permanently changed (because of evolution) and can’t have them show up changed for certain parts of the simulation and then not be changed in the initial instance.
Or did you already have something in mind to help achieve this? Thank you so much!

kristoffer.carlsson · January 11, 2018, 2:45pm

Iterating an isbits is faster. Even with immutable this wouldn’t be isbits due to the string field.

Finding the actual bottleneck is the first thing to do, though. How can you try to optimize code if you don’t know what is slow?

sdanisch · January 11, 2018, 2:47pm

I would have thought the array is the shared state.
It won’t work if you pass around the instances outside the array, mutate them at multiple stages, and expect everything to update.
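
A minimal sketch of what such a helper could look like when the array is the shared state. The immutable `Adult` type and the helper `with_hind` are hypothetical names chosen for illustration: the individual is "modified" by building an updated copy and storing it back at the same index.

```julia
# Hypothetical immutable variant of TAdult
struct Adult
    id::Int64
    strain::String
    hind::Float64
    gind::Float64
    tradeoff::Float64
end

# Hypothetical helper: return a copy of `a` with a new hind value
with_hind(a::Adult, hind::Float64) = Adult(a.id, a.strain, hind, a.gind, a.tradeoff)

# Made-up population, for illustration only
adults = [Adult(i, string(i), 0.5, 1.0, 0.9) for i in 1:3]

# "Modify" individual 2 by writing the updated copy back into the array
adults[2] = with_hind(adults[2], 0.75)
```

The change is permanent for everything that reads from `adults`, but a copy of `adults[2]` taken before the update keeps the old value.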

sdanisch · January 11, 2018, 2:48pm

kristoffer.carlsson: “Iterating an isbits is faster. Even with immutable this wouldn’t be isbits due to the string field.”

Good point, I overlooked the String… It would be interesting to know whether that really needs to be a string, though.

LotteVictor · January 11, 2018, 2:56pm

I think the string really needs to be a string. It contains a mixture of characters and numbers and can be rather long (like 1h23m45h68h79m50 or something similar). Is there a possibility to do this without a string?

LotteVictor · January 11, 2018, 2:58pm

True, I should have done that first. Do you have any tips on doing this correctly? Or a nice link? There is the BenchmarkTools.jl package, for benchmarking. What exactly did you mean before with profiling? Can you recommend a homepage to read? Thanks again!

sdanisch · January 11, 2018, 3:01pm

Well, @kristoffer.carlsson is right: you should first find your actual bottleneck before you dive into further optimizations.
But if the string is fixed-size, you could use something like NTuple{N, UInt8} or NTuple{N, Char} or FixedSizeStrings.jl.
Or just save an index into a string array - but that will just take you further down the premature optimization rabbit hole, and you should really first pin down the actual bottlenecks.
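
A minimal sketch of the index-into-a-string-array idea, assuming Julia 0.7 or later for `isbitstype`. The type names, the `strain_idx` field, and the strain strings are made up for illustration.

```julia
# Immutable struct that holds the strain as a String: not isbits
struct AdultWithString
    id::Int64
    strain::String
end

# Immutable struct that stores only an index into a separate strain table
struct AdultWithIndex
    id::Int64
    strain_idx::Int64
end

# Hypothetical strain table shared by the whole population
strain_names = ["1h23m45", "2h10m05"]

a = AdultWithIndex(1, 2)
strain_of_a = strain_names[a.strain_idx]   # look the strain up when it is needed

isbitstype(AdultWithString)   # false: String is a heap-allocated reference
isbitstype(AdultWithIndex)    # true: only plain Int64 fields
```

Whether the extra indirection pays off depends on how often the strain string is actually read, which is again a question for the profiler.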

About the profiling: https://docs.julialang.org/en/stable/manual/profile/
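
A minimal sketch of a profiling session, assuming Julia 0.7 or later (where Profile is a standard library that has to be loaded explicitly). The function `simulate_step` is a hypothetical stand-in for the real simulation code.

```julia
using Profile   # standard library on Julia 0.7 and later

# Hypothetical stand-in for one step of the simulation
function simulate_step(n)
    total = 0.0
    for i in 1:n
        total += exp(-(i % 7)^2 / 2)
    end
    return total
end

simulate_step(10)                    # run once so compilation is not profiled
Profile.clear()                      # discard any samples collected earlier
@profile simulate_step(10_000_000)   # collect samples while the function runs
Profile.print()                      # show where the samples were taken
```

Lines with many samples are where the time goes; those are the places worth optimizing first.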
