I was wondering what the suggested formats for saving and serializing are these days. I am really confused about the landscape of DataTables, DataFrames, databases, etc. I was hoping to write a few functions for the DiffEq solution type to save to some common data formats, but am not sure what I should be targeting. It should be something that would work well with statistics, plotting, and machine learning libraries. Or maybe the approach is just to go generic: I know there are things like DataStreams which are “independent readers”. Is there something like that in reverse that I can target, so the user can choose which data type they want out? Would that even be necessary? I am hoping someone could guide me in the right direction here. Thanks!
For completeness, I opened an issue on DifferentialEquations.jl related to this topic (and it shows how idea-less I am, except I have had requests for something of this nature):
Also, is there by any chance a form of serialization for types which hold functions? I know JLD hasn’t worked since v0.5, and am wondering if there’s anything along these lines.
DataStreams are not just for reading, they are also for writing. I would suggest implementing a DataStream Source and letting the user choose what output format they want to use (possibly with a default format if you want).
Are there any examples of how to set up a DataStream Source anywhere? Are there examples of how to take an arbitrary source and write it to a DataFrame?
You wouldn’t deal with DataFrame at all. You would just implement the Source interface, and the code living in DataFrames would take care of creating the object. For an example, you can have a look at the DataFrames code implementing a Source: https://github.com/JuliaStats/DataFrames.jl/pull/1174. CSV.jl is another possibly useful example.
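To make the source/sink split concrete, here is a minimal sketch in plain Julia. It does not use the actual DataStreams Source interface; `ToySolution`, `rows`, and `to_columns` are hypothetical names invented for illustration, and the column names `:t`, `:u1`, `:u2` are an arbitrary choice. The point is only that the solution type exposes its data as rows, and a separate piece of code decides what container to build from them.

```julia
# Hypothetical stand-in for a DESolution: time points plus one state vector per time point.
struct ToySolution
    t::Vector{Float64}
    u::Vector{Vector{Float64}}
end

# "Source" side: expose the solution as a lazy iterator of rows,
# one NamedTuple per time point. Column names are an illustrative choice.
function rows(sol::ToySolution)
    n = length(first(sol.u))
    names = (:t, ntuple(i -> Symbol("u", i), n)...)
    return (NamedTuple{names}((sol.t[k], sol.u[k]...)) for k in eachindex(sol.t))
end

# "Sink" side: a trivial consumer that knows nothing about ToySolution
# and only collects whatever rows it is given into named columns.
function to_columns(itr)
    cols = Dict{Symbol,Vector{Float64}}()
    for row in itr
        for (name, val) in pairs(row)
            push!(get!(cols, name, Float64[]), val)
        end
    end
    return cols
end

sol = ToySolution([0.0, 0.5, 1.0], [[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]])
cols = to_columns(rows(sol))
```

In a real integration the sink would be code owned by the table package (as described above for DataFrames), so the solution type would only need to provide the equivalent of `rows`.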
I just put together a quick-and-dirty integration with IterableTables in this PR: https://github.com/davidanthoff/IterableTables.jl/pull/22. With that you can easily convert a DESolution into any of the supported table sink types, e.g. DataTable(sol) will create a DataTable from a DESolution instance. Essentially you get support for all the sinks that are listed in the README, plus of course full Query integration, i.e. one can easily run queries against a DESolution instance. You also get integration with DataStreams from that “for free”: you can use IterableTables.get_datastreams_source(sol) to create a DataStreams.Source from your solution (I’m still trying to figure out an easier way to handle that particular integration from a user point of view).