Save DataFrame containing arrays


#1

I want to manipulate data from performance experiments and store it in / load it from a file.

I am trying DataFrames.jl, which looks fine. However, writing my data to a CSV file fails when a column contains arrays of floats.

What format should I use for this purpose?

using DataFrames
using CSV

mm = DataFrame(action = ["toto", "tutu"], gflops = [1.2, 5.6])
CSV.write("./mm1.csv", mm)  # OK: every column holds scalars

measurements = DataFrame(action = ["toto", "tutu"], gflops = [[1.2, 3.4, 5.6], [5.6, 6.7, 7.8]])
CSV.write("./mm.csv", measurements)  # fails: the gflops column holds arrays

Laurent


#2

JLD.jl, JLD2.jl or https://github.com/MikeInnes/BSON.jl can store (almost) arbitrary Julia objects.
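
For illustration, here is a minimal sketch of a round trip with JLD2.jl, assuming the package is installed. The file location is hypothetical, and files written this way may not load across incompatible DataFrames.jl versions, so treat it as a starting point rather than a recommended setup.

```julia
using DataFrames
using JLD2  # assumption: JLD2.jl is installed alongside DataFrames.jl

# Same table as in the question: one column holds arrays of floats.
measurements = DataFrame(action = ["toto", "tutu"],
                         gflops = [[1.2, 3.4, 5.6], [5.6, 6.7, 7.8]])

# Hypothetical location; a temporary directory keeps the sketch self-contained.
path = joinpath(mktempdir(), "measurements.jld2")

# Store the whole DataFrame as a single object, then read it back.
save_object(path, measurements)
restored = load_object(path)
```

The array column should come back as a `Vector{Vector{Float64}}` without any manual parsing, which is the main advantage over a CSV file here.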


#3

Thanks!
I have an additional question. I wanted to store the file on a git server, because I thought that git could handle a textual format efficiently (storing only diffs). I guess that this basic approach will not work with JLD or BSON.
What should I do?


#4

Git is not really made to store data, even in text format, but it is probably OK as long as your data is small (irrespective of whether it is text or binary). If your data really changes only a little between commits and it is essential that git can take advantage of storing diffs, then you could consider other text-based formats such as JSON, YAML or XML, for which there are probably packages around.
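
As one possible sketch of the JSON route, assuming JSON.jl is installed: the code below writes one JSON object per row, so that changing a single measurement touches a single line in a diff. The one-object-per-line layout and the file name are my own choices for illustration, not something the packages require.

```julia
using DataFrames
using JSON  # assumption: JSON.jl is installed

measurements = DataFrame(action = ["toto", "tutu"],
                         gflops = [[1.2, 3.4, 5.6], [5.6, 6.7, 7.8]])

# Hypothetical location; a temporary directory keeps the sketch self-contained.
path = joinpath(mktempdir(), "measurements.jsonl")

# Write one JSON object per row, one row per line.
open(path, "w") do io
    for row in eachrow(measurements)
        println(io, JSON.json(Dict("action" => row.action, "gflops" => row.gflops)))
    end
end

# Read the lines back and rebuild typed columns, since the parser returns untyped values.
records = [JSON.parse(line) for line in eachline(path)]
restored = DataFrame(action = [String(r["action"]) for r in records],
                     gflops = [Vector{Float64}(r["gflops"]) for r in records])
```

The explicit conversions on the way back are the price of a text format: JSON knows nothing about Julia element types, so the column types have to be restored by hand.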


#5

Thank you very much!