I plan to run a sequence of many (on the order of 100) simulations, each of which outputs an array of floating-point numbers (on the order of 10k values each). I want to store these arrays together with metadata about the parameters used in the simulation that produced them.
I know this is a simple question, but I just want to ask for a good way to organize this data.
One thought I had was for each simulation to correspond to a row in a DataFrame, with columns describing the simulation parameters and an additional column holding the actual simulation data. But I am not sure it is “correct” for one column to be a large array of numbers while the others are simple strings and integers (maybe this is standard usage; I am really new to DataFrames).
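To make this concrete, here is the kind of thing I had in mind, sketched with pandas and NumPy (the parameter names `seed` and `dt` are just placeholders, and the random arrays stand in for real simulation output):

```python
import numpy as np
import pandas as pd

# Placeholder parameter sweep: each tuple is one simulation's parameters
rows = []
for seed, dt in [(0, 0.1), (1, 0.1), (2, 0.05)]:
    # Stand-in for the real simulation output (~10k floats)
    result = np.random.default_rng(seed).normal(size=10_000)
    rows.append({"seed": seed, "dt": dt, "result": result})

# The "result" column ends up with object dtype, holding one array per row
df = pd.DataFrame(rows)
print(df.dtypes)
```

This works mechanically, but the array column is object-dtype, which is part of why I am unsure whether it counts as idiomatic DataFrame usage.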
Something else I thought about was to have a “main” DataFrame of just the metadata, stored as one CSV, with a “data” column containing filenames (one file per simulation) where the arrays are stored.
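Sketching that second idea too (again with placeholder parameter names, a made-up `runs/` directory, and `.npy` files as one possible per-simulation format):

```python
import numpy as np
import pandas as pd
from pathlib import Path

outdir = Path("runs")  # hypothetical output directory
outdir.mkdir(exist_ok=True)

meta = []
for seed, dt in [(0, 0.1), (1, 0.05)]:
    # Stand-in for the real simulation output
    result = np.random.default_rng(seed).normal(size=10_000)
    fname = f"seed{seed}_dt{dt}.npy"
    np.save(outdir / fname, result)
    meta.append({"seed": seed, "dt": dt, "file": fname})

# One human-readable index of all runs
pd.DataFrame(meta).to_csv(outdir / "index.csv", index=False)

# Weeks later: look up a run by its parameters and reload its array
index = pd.read_csv(outdir / "index.csv")
arr = np.load(outdir / index.loc[0, "file"])
```

The appeal for me is that `index.csv` stays small and readable in any text editor, while the arrays live in their own files.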
Any help / advice would be appreciated. Thanks so much!
By the way, in case it wasn’t clear, my objective function here is based on aesthetics. I am not dealing with that much data, so I am not worried about I/O performance or anything like that. I just want to organize things in such a way that it will be easy for me to come back after a few weeks and remember how things are arranged and what all the numbers mean. For reference, my current system is output files with strange names and numbers that I can’t parse, so anything better than that is an improvement!