Array containing references?

Hi all,

Is there a way to store references in an array?

Basically, I would like to have a huge array of event rates (array A), but the rates are determined by just a few parameters (e.g. a, b, c, d, e), which the array entries would hold by reference.

This way, I would be able to e.g. change rate “a” during an optimization search, and array A would automatically refer to the new “a” wherever required. This seems far superior to having to re-iterate through the whole array A to update “a” whenever parameter “a” is changed, because A could easily have a length of millions.

I suspect this is very easy in Julia, but I don’t see an obvious solution on Google. Any help would be much appreciated!

Cheers!
Nick

Sure, you can create a Ref, and then store that reference in an array, or you can store a mutable struct in an array and mutate the fields of that struct. But it’s hard to recommend an exact solution because I’m not totally sure what the desired behavior is. If you’re creating an array in which each element is a reference to the same set of parameters, then why do you even need an array?
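A minimal sketch of both options, assuming the rates are plain Float64 values. The struct name `Rate` and the variable names are made up for illustration:

```julia
# Option 1: share one Ref among many array slots.
a = Ref(4.1)
b = Ref(2.5)
rates = [a, a, a, b, b, b, b, b, b]      # every slot points at a or b

a[] = 1.5                                 # update the parameter once
first_three = [r[] for r in rates[1:3]]   # dereference with r[]

# Option 2: a mutable struct holding the value (hypothetical type name).
mutable struct Rate
    value::Float64
end

pa = Rate(4.1)
pb = Rate(2.5)
struct_rates = [pa, pa, pa, pb, pb, pb, pb, pb, pb]
pa.value = 1.5                            # mutate the shared object in place
total = sum(r.value for r in struct_rates)
```

In both cases the array stores the same object several times, so changing the value through one handle is visible through every slot.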

Hi! Thanks for your help – here’s a clearer explanation I hope:

# My model is 
# describing a bifurcating process where
# you have a starting state and 2 descendant states
# start_state -> desc1_state, desc2_state

# There can be numstates^3 of these kinds of 
# transitions, which explodes quickly

# However, many rates are shared.

# I would like this to happen:

using DataFrames
start_state = [1,1,1,2,2,2,3,3,3]
desc1_state = [1,2,3,1,2,3,1,2,3]
desc2_state = [1,1,1,2,2,2,3,3,3]

# Specify the rates
a = 4.1
b = 2.5
rates = [a,a,a,b,b,b,b,b,b]

rates_df = DataFrame(start_state=start_state, desc1_state=desc1_state, desc2_state=desc2_state, rates=rates)

# 9×4 DataFrame
# │ Row │ start_state │ desc1_state │ desc2_state │ rates   │
# │     │ Int64       │ Int64       │ Int64       │ Float64 │
# ├─────┼─────────────┼─────────────┼─────────────┼─────────┤
# │ 1   │ 1           │ 1           │ 1           │ 4.1     │
# │ 2   │ 1           │ 2           │ 1           │ 4.1     │
# │ 3   │ 1           │ 3           │ 1           │ 4.1     │
# │ 4   │ 2           │ 1           │ 2           │ 2.5     │
# │ 5   │ 2           │ 2           │ 2           │ 2.5     │
# │ 6   │ 2           │ 3           │ 2           │ 2.5     │
# │ 7   │ 3           │ 1           │ 3           │ 2.5     │
# │ 8   │ 3           │ 2           │ 3           │ 2.5     │
# │ 9   │ 3           │ 3           │ 3           │ 2.5     │


# Basically, I'd like to be able to change "a" to 1.5, and have this 
# appear in the "rates" column of rates_df when I access it the next time
# (without having to iterate through the whole "rates" column)
# (The DataFrame part is not essential, I could just use a separate Array for each column)
a = 1.5
rates_df

Instead of storing the rates, you could store an index into the rates array, which avoids the problem.

Thanks for the suggestion, that’s a good one, but unfortunately I have to do lots of operations like summing “rates” where desc1_state==2, so having a reference that auto-updates seems like it would be faster once the columns have length of 10^5 or higher…

Cheers!
Nick

I meant that you replace your rates array with

reduced_rates = [a, b]

in your example, and index into that.
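A rough sketch of that idea, applied to the 9-row example from above. The names `rate_idx` and `sum_rates_where` are hypothetical, picked only for this illustration:

```julia
reduced_rates = [4.1, 2.5]                    # [a, b]
rate_idx      = [1, 1, 1, 2, 2, 2, 2, 2, 2]   # which parameter each row uses
desc1_state   = [1, 2, 3, 1, 2, 3, 1, 2, 3]

# Sum the rates of all rows whose state equals `target` (hypothetical helper).
function sum_rates_where(reduced_rates, rate_idx, states, target)
    total = 0.0
    for i in eachindex(rate_idx, states)
        if states[i] == target
            total += reduced_rates[rate_idx[i]]
        end
    end
    return total
end

before = sum_rates_where(reduced_rates, rate_idx, desc1_state, 2)  # rows 2, 5, 8
reduced_rates[1] = 1.5     # change "a" once; the long columns are untouched
after = sum_rates_where(reduced_rates, rate_idx, desc1_state, 2)
```

The long `rate_idx` column never changes during the optimization; only the two-element `reduced_rates` vector does.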

Try this:

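a = Ref(1.5)   # a mutable reference to a Float64
A = [a, a, a]  # all three elements are the same Ref
a[] = 2        # now A[1][], A[2][] and A[3][] all return 2.0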

But also note that it sounds like you’re making a design choice based on hypothetical performance without actually measuring anything. In particular, there’s no reason a priori to expect an array of Refs to perform much differently than an array of indices into some small array of values. I’d encourage you to use BenchmarkTools.jl to actually measure some different ideas before you go too far down a particular performance-driven design path.
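As a starting point, here is a rough timing sketch that compares summing over an array of Refs with summing through an index array. It uses `@elapsed` from Base so that it runs without extra packages; for numbers you can trust, load BenchmarkTools.jl and use `@btime` with `$`-interpolated arguments instead. The array size and the helper names `sum_refs` and `sum_indexed` are assumptions for illustration:

```julia
n = 100_000
a, b = Ref(4.1), Ref(2.5)
ref_rates = [isodd(i) ? a : b for i in 1:n]   # array of shared Refs
reduced_rates = [4.1, 2.5]
rate_idx = [isodd(i) ? 1 : 2 for i in 1:n]    # indices into reduced_rates

sum_refs(rs) = sum(r[] for r in rs)
sum_indexed(vals, idx) = sum(vals[i] for i in idx)

# Call each function once first so compilation time is not measured.
sum_refs(ref_rates)
sum_indexed(reduced_rates, rate_idx)

t_refs = @elapsed sum_refs(ref_rates)
t_indexed = @elapsed sum_indexed(reduced_rates, rate_idx)
println("Refs: ", t_refs, " s   indexed: ", t_indexed, " s")
```

A single `@elapsed` call is noisy, so treat the printed numbers as a rough hint only.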

Thanks for this! It’s true I haven’t measured this, mostly my motivation is that I have this weird problem where

(a) it could easily expand to 100,000+ rate entries
(b) I am going to have to repeatedly sum over subsets of the rates (every step of an ODE integrator)
(c) the rate parameters will be updated regularly
(d) I will likely have enough rate parameters it would be a minor pain to keep track of the indices for a reduced_rates array

It occurs to me just now that I could use the parameter names as dictionary keys. That might be the easiest thing.
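Something like this sketch, perhaps, assuming Symbol keys and the small example from earlier (the names `params` and `rate_names` are made up here):

```julia
params = Dict(:a => 4.1, :b => 2.5)               # parameter name => current value
rate_names  = [:a, :a, :a, :b, :b, :b, :b, :b, :b] # which parameter each row uses
desc1_state = [1, 2, 3, 1, 2, 3, 1, 2, 3]

params[:a] = 1.5   # update "a" once; rate_names is left alone

# Sum the rates of the rows where desc1_state == 2 (rows 2, 5, 8).
total = sum(params[rate_names[i]] for i in eachindex(rate_names) if desc1_state[i] == 2)
```

This is the same idea as indexing into reduced_rates, only with names instead of integer positions. Each lookup goes through a hash, so it is worth timing before committing to it.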

Cheers!
Nick