Array containing references?

nmatzke · October 20, 2019, 12:24am

Hi all,

Is there a way to store references in an array?

Basically, I would like to have a huge array of event rates (array A). However, the rates are determined by just a few parameters (e.g. a, b, c, d, e), which the array entries would refer to by reference.

This way, I would be able to e.g. change rate “a” during an optimization search, and array A would automatically refer to the new “a” wherever required. This seems far superior to having to re-iterate through the whole array A to update “a” whenever parameter “a” is changed, because A could easily have a length of millions.

I suspect this is very easy in Julia, but I don’t see an obvious solution on Google. Any help is much appreciated.

Cheers!
Nick

rdeits · October 20, 2019, 1:28am

Sure, you can create a Ref, and then store that reference in an array, or you can store a mutable struct in an array and mutate the fields of that struct. But it’s hard to recommend an exact solution because I’m not totally sure what the desired behavior is. If you’re creating an array in which each element is a reference to the same set of parameters, then why do you even need an array?
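As a minimal sketch of the mutable-struct option mentioned above: the `RateParam` type and the variable names are hypothetical, made up for illustration, and this is only one way the idea could look.

```julia
# Hypothetical wrapper type: one mutable object per shared rate parameter.
mutable struct RateParam
    value::Float64
end

a = RateParam(4.1)
b = RateParam(2.5)

# Every slot holds a reference to one of the two objects, not a copy of the number.
A = [a, a, a, b, b, b]

# Mutating the field is visible through every slot that points at `a`;
# no loop over A is needed to propagate the change.
a.value = 1.5

# Read the current values back out when they are needed.
current = [p.value for p in A]
```

Note that rebinding the name (`a = RateParam(1.5)`) would not have this effect; only mutating the existing object does.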

nmatzke · October 20, 2019, 2:02am

Hi! Thanks for your help – here’s a clearer explanation I hope:

# My model describes a bifurcating process where
# you have a starting state and 2 descendant states:
# start_state -> desc1_state, desc2_state

# There can be numstates^3 of these kinds of 
# transitions, which explodes quickly

# However, many rates are shared.

# I would like this to happen:

using DataFrames
start_state = [1,1,1,2,2,2,3,3,3]
desc1_state = [1,2,3,1,2,3,1,2,3]
desc2_state = [1,1,1,2,2,2,3,3,3]

# Specify the rates
a = 4.1
b = 2.5
rates = [a,a,a,b,b,b,b,b,b]

rates_df = DataFrame(start_state=start_state, desc1_state=desc1_state, desc2_state=desc2_state, rates=rates)

# 9×4 DataFrame
# │ Row │ start_state │ desc1_state │ desc2_state │ rates   │
# │     │ Int64       │ Int64       │ Int64       │ Float64 │
# ├─────┼─────────────┼─────────────┼─────────────┼─────────┤
# │ 1   │ 1           │ 1           │ 1           │ 4.1     │
# │ 2   │ 1           │ 2           │ 1           │ 4.1     │
# │ 3   │ 1           │ 3           │ 1           │ 4.1     │
# │ 4   │ 2           │ 1           │ 2           │ 2.5     │
# │ 5   │ 2           │ 2           │ 2           │ 2.5     │
# │ 6   │ 2           │ 3           │ 2           │ 2.5     │
# │ 7   │ 3           │ 1           │ 3           │ 2.5     │
# │ 8   │ 3           │ 2           │ 3           │ 2.5     │
# │ 9   │ 3           │ 3           │ 3           │ 2.5     │


# Basically, I'd like to be able to change "a" to 1.5, and have this 
# appear in the "rates" column of rates_df when I access it the next time
# (without having to iterate through the whole "rates" column)
# (The DataFrame part is not essential, I could just use a separate Array for each column)
a = 1.5
rates_df

dpsanders · October 20, 2019, 2:07am

Instead of storing the rates, you could store an index into the rates array, which avoids the problem.

nmatzke · October 20, 2019, 2:16am

Thanks for the suggestion, that’s a good one, but unfortunately I have to do lots of operations like summing “rates” where desc1_state==2, so having a reference that auto-updates seems like it would be faster once the columns have length of 10^5 or higher…

Cheers!
Nick

dpsanders · October 20, 2019, 2:35am

I meant that you replace your rates array with

reduced_rates = [a, b]

in your example, and index into that.
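A minimal sketch of that suggestion, reusing the columns from the example above. The names `rate_index` and `sum_rates_where` are hypothetical, and the summation is just one way to handle the "sum rates where desc1_state == 2" case.

```julia
desc1_state = [1, 2, 3, 1, 2, 3, 1, 2, 3]

# The few actual parameter values live in one small array: [a, b].
reduced_rates = [4.1, 2.5]

# Hypothetical column: which entry of reduced_rates each row uses.
rate_index = [1, 1, 1, 2, 2, 2, 2, 2, 2]

# Sum the rates of all rows whose state equals s, looking each rate up on demand.
function sum_rates_where(reduced_rates, rate_index, states, s)
    total = 0.0
    for i in eachindex(rate_index)
        if states[i] == s
            total += reduced_rates[rate_index[i]]
        end
    end
    return total
end

total_before = sum_rates_where(reduced_rates, rate_index, desc1_state, 2)

# Changing "a" is a single assignment; rate_index is untouched.
reduced_rates[1] = 1.5
total_after = sum_rates_where(reduced_rates, rate_index, desc1_state, 2)
```

If a full rates column is needed, `reduced_rates[rate_index]` materializes it, at the cost of one pass over the rows.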

rdeits · October 20, 2019, 4:08am

Try this:

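a = Ref(1.5)   # a Ref wrapping a Float64
A = [a, a, a]  # every element is the same Ref
a[] = 2        # update the wrapped value; read it back with A[1][]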

But also note that it sounds like you’re making a design choice based on hypothetical performance without actually measuring anything. In particular, there’s no reason a priori to expect an array of Refs to perform much differently than an array of indices into some small array of values. I’d encourage you to use BenchmarkTools.jl to actually measure some different ideas before you go too far down a particular performance-driven design path.
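A rough sketch of what such a measurement could look like. To stay dependency-free it uses `@elapsed` from Base as a stand-in; BenchmarkTools' `@btime` would give more reliable numbers. The array size, the names `sum_refs` and `sum_indexed`, and the random assignment of parameters are all assumptions made for illustration.

```julia
n = 100_000
idx = rand(1:2, n)               # which parameter each entry uses (random, for illustration)

# Layout 1: an array of Refs, many slots sharing the same two Ref objects.
a = Ref(4.1)
b = Ref(2.5)
params = [a, b]
A = [params[i] for i in idx]

# Layout 2: a small array of values plus an index per entry.
reduced_rates = [4.1, 2.5]

sum_refs(A) = sum(r[] for r in A)
sum_indexed(vals, idx) = sum(vals[i] for i in idx)

# Call each function once first so compilation time is not measured.
sum_refs(A)
sum_indexed(reduced_rates, idx)

t_refs = @elapsed sum_refs(A)
t_indexed = @elapsed sum_indexed(reduced_rates, idx)
println("array of Refs: ", t_refs, " s; indexed lookup: ", t_indexed, " s")
```

A single `@elapsed` call is noisy, so the printed times are only indicative; the point is to compare both layouts on the real workload before committing to one.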

nmatzke · October 20, 2019, 8:39pm

Thanks for this! It’s true I haven’t measured this; my motivation is mostly that I have this weird problem where

(a) it could easily expand to 100,000+ rate entries
(b) I am going to have to repeatedly sum over subsets of the rates (every step of an ODE integrator)
(c) the rate parameters will be updated regularly
(d) I will likely have enough rate parameters it would be a minor pain to keep track of the indices for a reduced_rates array

It occurs to me just now that I could use the parameter names as dictionary keys; that might be the easiest thing.
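A minimal sketch of the dictionary idea, assuming Symbols as the parameter names. The names `params`, `rate_keys`, and `sum_rates` are hypothetical.

```julia
desc1_state = [1, 2, 3, 1, 2, 3, 1, 2, 3]

# Parameter values keyed by name.
params = Dict(:a => 4.1, :b => 2.5)

# Each row stores the name of the parameter it uses instead of the value.
rate_keys = [:a, :a, :a, :b, :b, :b, :b, :b, :b]

# Sum the rates of all rows whose state equals s, looking values up by name.
sum_rates(params, rate_keys, states, s) =
    sum(params[k] for (k, st) in zip(rate_keys, states) if st == s)

total_before = sum_rates(params, rate_keys, desc1_state, 2)

# Updating a parameter is one dictionary assignment.
params[:a] = 1.5
total_after = sum_rates(params, rate_keys, desc1_state, 2)
```

This is the same design as indexing into a small array of values, with names in place of integer positions; a dictionary lookup per row may cost more than an array index, so it is worth measuring.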

Cheers!
Nick
