Examples of a database to manage simulation result files

Hello everyone,

I use DrWatson.jl to manage my scientific projects.
The default way to manage files containing simulation results is to put the list of parameters in the filename. This works well for small projects, but as the number and variety of parameters grow, the approach quickly becomes very difficult to manage.

In this issue of the DrWatson.jl package, a user faced similar difficulties.
One of the contributors suggested using a database to manage simulation results.
I would be very interested in concrete examples of such a workflow. If anyone has done this, I would be very grateful if they could point me towards the relevant resources.

I’m posting this question here rather than in the GitHub issue to avoid derailing the discussion there.

Thanks in advance,
Olivier

I’m not sure my “database” is the same as your “database”. When you say database I think MySQL, PostgreSQL, Oracle, etc… In that case I don’t think this works, because if you lose the database you lose the record of which parameters were associated with which run.

Probably the approach I would take is to create an “index.txt” file, where each line holds the key/value pairs of one run. Something like:

a=1 b=6.3 c=curved
a=3 b=0.2 c=straight

The actual filename would then be something like “run_##.ext”, where ## is the line number in index.txt that contains the parameters for that run. When generating the filename, index.txt would have to be consulted to see whether that combination of parameters has been used before; if so, that line is reused. If the parameters haven’t been seen before, a new line is added.
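The lookup described above could be sketched roughly as follows. This is only a minimal illustration in Python (the same logic should port to Julia without much trouble); the names `format_params` and `run_filename` are hypothetical, and the sketch assumes that index.txt is only ever written by this function and that values contain no spaces or equal signs.

```python
from pathlib import Path


def format_params(params):
    """Serialize parameters as 'key=value' pairs, sorted by key so that
    the same combination always produces the same line."""
    return " ".join(f"{key}={params[key]}" for key in sorted(params))


def run_filename(params, index_path="index.txt", ext="ext"):
    """Return 'run_<n>.<ext>', where n is the 1-based line number of
    `params` in the index file. A new line is appended when the
    combination has not been seen before."""
    index = Path(index_path)
    line = format_params(params)
    lines = index.read_text().splitlines() if index.exists() else []
    if line in lines:
        # Known combination: reuse the existing line number.
        n = lines.index(line) + 1
    else:
        # New combination: append it and use the new line number.
        with index.open("a") as f:
            f.write(line + "\n")
        n = len(lines) + 1
    return f"run_{n}.{ext}"
```

Sorting the keys means that `{"a": 1, "b": 6.3}` and `{"b": 6.3, "a": 1}` map to the same line, so the order in which parameters are passed does not matter.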

It might make sense to make index.txt a JSON file or something more structured; then you wouldn’t need to escape equal signs or spaces. Or maybe you could use pipes or commas as delimiters instead of spaces.
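As a rough sketch of the structured variant, each line of the index could hold one JSON object, so values containing spaces or equal signs need no special handling. The file name “index.jsonl” and the function `run_id` are assumptions made for illustration, written in Python.

```python
import json
from pathlib import Path


def run_id(params, index_path="index.jsonl"):
    """Return the 1-based line number of `params` in a JSON-lines index,
    appending a new record when the combination is not yet present."""
    index = Path(index_path)
    # sort_keys gives a canonical form, so key order does not matter.
    record = json.dumps(params, sort_keys=True)
    lines = index.read_text().splitlines() if index.exists() else []
    if record in lines:
        return lines.index(record) + 1
    with index.open("a") as f:
        f.write(record + "\n")
    return len(lines) + 1
```

Because `json.dumps` escapes control characters, each record stays on a single line even if a value contains a newline.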

thanks!

I’m not sure my “database” is the same as your “database”. When you say database I think MySQL, PostgreSQL, Oracle, etc…

Yes, indeed: by database I mean some file that keeps a record of the metadata of the runs.

Probably the approach I would take is to create an “index.txt” file, where each line holds the key/value pairs of one run.

I had something similar in mind, but I was looking more for a concrete implementation that makes these tasks more or less automatic.
Do you happen to have something implemented?

Sorry, I have nothing implemented. I was thinking that most of the contents of savename() could be reused to do this (it appends the parameters together); code would just need to be written to use the index file.

That might be a feature to merge into DrWatson…but I don’t know how a human would go from the file name back to view the parameters. That seems like a standalone script that would have to sit in your repository.
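Such a standalone script could look roughly like this. It is a hedged sketch in Python, assuming the “run_##.ext” naming and the space-separated key=value lines of index.txt described earlier in the thread; the function name `params_for` is hypothetical.

```python
import re
from pathlib import Path


def params_for(filename, index_path="index.txt"):
    """Map a file name such as 'run_2.ext' back to the parameters stored
    on the corresponding line of the index file. Values are returned as
    strings, exactly as they appear in the index."""
    match = re.fullmatch(r"run_(\d+)\..+", Path(filename).name)
    if match is None:
        raise ValueError(f"not a run file name: {filename!r}")
    n = int(match.group(1))
    lines = Path(index_path).read_text().splitlines()
    if not 1 <= n <= len(lines):
        raise IndexError(f"no line {n} in {index_path}")
    # Split 'a=3 b=0.2' into pairs, then each pair at the first '='.
    return dict(pair.split("=", 1) for pair in lines[n - 1].split())
```

A small command-line wrapper around this function would let a human look up the parameters of any result file from the repository root.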

Hi,
Sorry for not replying earlier.

That might be a feature to merge into DrWatson…but I don’t know how a human would go from the file name back to view the parameters. That seems like a standalone script that would have to sit in your repository.

OK, I’ll try that when I have some time and post it here.
Thanks again for the suggestions!