How Can I Create Permanently-Stored Lookup Tables?

Hello, everyone. How can I store computed values permanently, i.e. so they never have to be recomputed?

Specifically, I have generated 2 lookup tables via lengthy Monte Carlo simulations and stored them in CSV files. I would like to import them only once, and then store their values permanently in the Julia function that will use them. The tables both have 300 entries, so typing them in by hand is not an option.

Alternatively, I can run the Monte Carlo simulations again, if that leads to a way to store their contents permanently.

Thanks in advance.

Use a functor:

PS: of course you can read the tables into arrays using CSV.jl.

3 Likes

Thanks, I’ll look into it.

Also note there are the Memoization and Memoize packages (and I think there is even a third or fourth package on this same subject) that allow you to simply annotate a function so it is never recomputed for the same arguments within the same session. Unfortunately, I think they do not support persistent storage (i.e., saving the cache to a file), but if you can relax your constraints they are good options (and maybe they would be interested in extending their use cases to support persistent storage after hearing from you).
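For illustration, a minimal sketch with Memoize.jl (here slow_lookup is a hypothetical stand-in for the expensive computation; the cache lives only in memory for the current session):

using Memoize

@memoize function slow_lookup(x)
    sleep(2)          # stand-in for the lengthy Monte Carlo simulation
    return x^2
end

slow_lookup(3)   # first call with this argument: actually computed
slow_lookup(3)   # same argument: returned from the in-memory cache, not recomputed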

3 Likes

Example using DelimitedFiles:

julia> using DelimitedFiles

julia> struct MyFunc
         data::Matrix{Float64}
       end

julia> (f::MyFunc)(x) = f.data*x

julia> data, header = readdlm("csv.dat",',',header=true)
([1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0], AbstractString["A" " B" " C"])

julia> f = MyFunc(data)
MyFunc([1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0])

julia> x = rand(3)
3-element Vector{Float64}:
 0.07056682527318503
 0.7581530064054021
 0.4889526606997521

julia> f(x)
3-element Vector{Float64}:
 0.07056682527318503
 0.7581530064054021
 0.4889526606997521

where csv.dat is:

A, B, C
1, 0, 0
0, 1, 0
0, 0, 1

2 Likes

Thanks guys, but I’m looking for persistence across sessions, not just within a single session. In other words, I can close Julia, and when I reopen it the values will still be there, without the CSV files needing to be read in again or the simulations rerun.

The idea is that I can eventually share this program with people who do not have access to the CSV files or the file that runs the Monte Carlo simulation.

Then you have to write the data directly into the function, as an array. Just copy and paste it into an array constructor. But the code won’t be pretty.
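One way to produce that literal without typing it by hand (a sketch; "table1.csv", "table1_literal.jl" and TABLE1 are placeholder names): read the CSV once and let repr print a pasteable array constructor.

using DelimitedFiles

data, _ = readdlm("table1.csv", ',', header=true)

# Writes something like `const TABLE1 = [0.12 0.34; ...]` to a file.
# That literal can then be pasted into (or `include`d from) the source
# file that defines the lookup function, so the CSV is no longer needed.
open("table1_literal.jl", "w") do io
    print(io, "const TABLE1 = ", repr(data))
end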

3 Likes

That seems to work, thank you very much.

1 Like

By the way, if the arrays were really huge and it wasn’t practical to copy them into the source (or you needed some other kind of data that wasn’t amenable to that) you could use Pkg’s artifacts system. It would be a bit more setup, but the end result would be that the artifacts could be automatically downloaded when your package is installed, and you could read in the arrays from the artifacts inside an __init__ function which would run at module-load time.
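Roughly, that could look like the sketch below (the artifact name "lookup_tables" and the file name are made up here, and the artifact would first have to be bound in the package's Artifacts.toml):

module LookupTables

using Artifacts, DelimitedFiles

# Filled in when the module is loaded.
const TABLE = Ref{Matrix{Float64}}()

function __init__()
    # artifact"lookup_tables" resolves the artifact of that name from this
    # package's Artifacts.toml, downloading it on first use if necessary.
    dir = artifact"lookup_tables"
    TABLE[] = readdlm(joinpath(dir, "table1.csv"), ',', Float64)
end

end # module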

4 Likes

Thanks Eric. That sounds great, I will definitely look into it eventually.

The artifacts system does seem right for this. It doesn’t solve the problem of where to put the file, but it uses libcurl to get the file, so it can be on a website, in a web service, on FTP, or in a local file. Heck, it does GOPHER protocol. I wonder whether figshare or Zenodo would be appropriate storage places? If it were OK to generate the files once, then Scratch.jl would be a good option, but it seems you want to make the file available beforehand.
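For completeness, a rough sketch of the Scratch.jl variant (run_monte_carlo and the file name are placeholders): generate the table once per machine, cache it in the package's scratch space, and only rerun the simulation when the cached file is missing.

using Scratch, DelimitedFiles

function load_table()
    dir = @get_scratch!("lookup_tables")   # per-package scratch directory
    file = joinpath(dir, "table1.csv")
    if !isfile(file)
        table = run_monte_carlo()          # placeholder for the real simulation
        writedlm(file, table, ',')
    end
    return readdlm(file, ',', Float64)
end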

I took a cursory look at the page you linked, and it says the data containers are mutable. I don’t think that is a good idea for my application, but thanks anyway, I appreciate the info.

This is turning out to be quite a learning experience.

1 Like