Idiomatic storage for intermediate calculations

I’m writing a module that’s numerically intensive (aren’t we all?), and I’m running into an issue where I seem to be fighting the language… which makes me suspect I’m doing it wrong.

Specifically, I want to have a struct that contains some data determined by the user. The user can sporadically add or update the data. But eventually there are going to be some intensive calculations done on it, and there’s a significant setup step that needs to be done first. I only want to do the setup once.

So what I have now is:

mutable struct Map   # capitalized; a lowercase `map` would clash with Base.map
    data::Vector{Float64}    # the user gets to append data to this
    data2::Vector{Float64}   # this gets calculated once using the data array
    # some constructors go here
end

function setup(m::Map)
    # calculate m.data2 from m.data
end

function execute(m::Map)
    # do millions of calculations depending on m.data2
end

But there are a couple of issues: one, data2 wants to be completely uninitialized until the user calls a setup() routine, and in point of fact I won’t know how big the array will need to be until setup() is called. And two, the idea of having this data being changed inside a struct seems to go against the grain of how Julia wants to do things.

Can anyone suggest a more Julia-esque way of doing this? It can’t be uncommon.

You can create a zero-length array for data2, which would indicate that it is not initialized, can’t you?

Or split data and data2 into two data structures?

You can create a zero-length array for data2, which would indicate that it is not initialized, can’t you?

I can. Just doesn’t seem like the cleanest way to do it.
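
For reference, the zero-length-array idea might look roughly like the sketch below. Everything here is an assumption made for illustration: the names LazyMap, is_setup, setup! and add! are hypothetical, and sorting stands in for the real setup work.

```julia
# Hypothetical names throughout; sorting stands in for the real setup work.
mutable struct LazyMap
    data::Vector{Float64}    # user-supplied values
    data2::Vector{Float64}   # derived values; empty means "not set up yet"
end

# Start with an empty data2 so the struct is valid before setup runs.
LazyMap(data::Vector{Float64}) = LazyMap(data, Float64[])

is_setup(m::LazyMap) = !isempty(m.data2)

function setup!(m::LazyMap)
    m.data2 = sort(m.data)
    return m
end

function add!(m::LazyMap, x::Real)
    push!(m.data, x)
    empty!(m.data2)   # new data invalidates the earlier setup
    return m
end

function execute(m::LazyMap, x::Real)
    is_setup(m) || setup!(m)   # run the setup lazily if it has not been done
    return searchsortedfirst(m.data2, x)
end
```

One wrinkle with this design: an empty data2 cannot be told apart from a setup that legitimately produced no values, which may be part of why it does not feel clean.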

Or split data and data2 into two data structures?

I could do that… but there are going to be multiple instances of the struct, and I don’t want to put the responsibility of matching up the right “internal” calculated data structure with the user-facing one on the programmer.

Hmm: is there a reason why the setup is separate from the calculation (execute)? data2 is just scratch space, if I understand your intent? In that case, wouldn’t it make sense to calculate it inside execute?

It’s not scratch space. It’s a set of preliminary calculations that are time consuming but only need to be executed once, after all of the data has been added. For instance, it could be a sorted version of data that’s added randomly; the million subsequent calculations are searches. It doesn’t make sense to sort the same data prior to every search.
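
A minimal sketch of that sort-once, search-many pattern, assuming the setup really is a sort and each calculation is a lookup. The helper count_below and the sample values are made up for illustration.

```julia
# Assumption: sorting is the expensive setup and each calculation is a lookup.
data = [0.7, 0.1, 0.9, 0.4]

sorted = sort(data)   # O(n log n), done once after all data has arrived

# Each query is then an O(log n) binary search on the sorted copy.
# searchsortedfirst returns the first index whose value is >= x,
# so everything before that index is strictly less than x.
count_below(sorted, x) = searchsortedfirst(sorted, x) - 1

queries = [0.5, 0.05, 1.0]
counts = [count_below(sorted, q) for q in queries]
```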

I see. In that case I would make this a series of steps: setup transforms the data into another form, which is then passed to execute. That seems cleanest to me.

I’m not sure it’s properly idiomatic, but I have a similar situation where I do something like this:

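mutable struct UserData
    data::Vector{Float64}
end

function setup(user::UserData)
    # compute data2 from user.data; the copy is only a placeholder
    data2 = copy(user.data)

    function execute()
        # act on data2, which this closure captures
    end
    return execute
end

execute = setup(UserData([3.0, 1.0, 2.0]))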

The preprocessed data gets incorporated into the inner execute function via closure. This way, the consumer can’t muck with data2 and you can’t accidentally run execute without doing setup first.

mutable struct PreMap
    data::Array{Float64, 1}
end

struct Map
    data1::Array{Float64, 1}
    data2::Array{Float64, 1}
end

function Map(p::PreMap)
    ...
    return Map(...)
end

function execute(m::Map)
    ...
end

execute(p::PreMap) = execute(Map(p))
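
To make the outline above concrete, here is one way the elided parts could be filled in. This is only a sketch under assumptions: sorting stands in for whatever the real setup computes, and the extra query argument x on execute is hypothetical.

```julia
# Sketch only: sorting stands in for whatever the real setup computes.
mutable struct PreMap
    data::Vector{Float64}   # the user appends here
end

struct Map
    data1::Vector{Float64}  # snapshot of the user data
    data2::Vector{Float64}  # derived once from data1
end

# The conversion constructor is the setup step.
function Map(p::PreMap)
    data1 = copy(p.data)    # copy so later push! calls on p cannot desynchronize data2
    return Map(data1, sort(data1))
end

# Hypothetical query: insertion position of x in the sorted data.
execute(m::Map, x::Real) = searchsortedfirst(m.data2, x)

# Convenience method: correct, but repeats the setup on every call.
execute(p::PreMap, x::Real) = execute(Map(p), x)

p = PreMap(Float64[])
append!(p.data, [3.0, 1.0, 2.0])
m = Map(p)                          # setup runs once
hits = [execute(m, x) for x in (0.5, 2.0, 9.0)]
```

Building m once and reusing it is what avoids repeating the setup; the PreMap method is there for convenience and pays the setup cost each time it is called.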

Very nice…