DataViewer.jl supported data file formats

I was interested in the recent post on DataViewer.jl and wondered whether it can be used with CSV files, or whether it requires HDF5/JLD2/JSON data files.

1 Like

Why not give it a try?

An alternative would be Queryverse, although I am not sure how up to date it is:
Queryverse | Queryverse, Julia packages for data science.

1 Like

Try it out, and if it doesn’t work create an issue at Issues · triscale-innov/DataViewer.jl · GitHub

See also the package section on extending DataViewer to support more data formats.

3 Likes

NetCDF support would be very nice for the geoscience community.

1 Like

It should all work. The underlying JavaScript Data Voyager hasn’t seen any updates in a long time though, and it is not clear to me whether that particular project is still alive or not… But the existing functionality should just work.

Thanks for your interest! For now, DataViewer only supports HDF5, JLD2 and JSON data files. But more fundamentally, it was designed with tree-like data structures in mind (think things like dictionaries of dictionaries of arrays of dictionaries).

It would probably not be difficult to add support for tabular/columnar file formats (like CSV, or Arrow which was mentioned in the other thread), but I’m not sure how we’d want to display them:

  • like a dictionary of array-like columns? → this would be already supported, probably the best option for now
  • like a vector of dict-like rows? → this would also already be possible, but would it be useful?
  • like a spreadsheet? → this does not currently exist in DataViewer and would probably be a bit more work to implement
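
To make the first two options concrete, here is a minimal sketch in plain Julia that turns the same small table into a dictionary of columns and into a vector of dict-like rows. The helper names (`parse_simple_csv`, `as_columns`, `as_rows`) and the sample data are made up for illustration and are not part of DataViewer; the parser assumes unquoted, comma-separated values and keeps every value as a string.

```julia
# Hypothetical helper: split a tiny, unquoted CSV string into header and rows.
# All values are kept as strings; no type inference is attempted.
function parse_simple_csv(text::AbstractString)
    lines = filter(!isempty, split(strip(text), '\n'))
    header = String.(split(lines[1], ','))
    rows = [String.(split(line, ',')) for line in lines[2:end]]
    return header, rows
end

# Option 1: a dictionary of array-like columns.
function as_columns(header, rows)
    return Dict(name => [row[i] for row in rows] for (i, name) in enumerate(header))
end

# Option 2: a vector of dict-like rows.
function as_rows(header, rows)
    return [Dict(zip(header, row)) for row in rows]
end

# Made-up sample data, for illustration only.
csv = """
name,speed,gain
a,7.0,0.87
b,6.5,0.91
"""

header, rows = parse_simple_csv(csv)
cols = as_columns(header, rows)   # one entry per column
recs = as_rows(header, rows)      # one entry per row
```

Both results are ordinary nested containers, so they fit the tree-like model described above; the difference is only whether the top level is indexed by column name or by row number.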

But (and here I might very well be wrong because I almost never work with such data), I’m under the impression that there already exist lots of tools which would be better suited to flat, tabular data. For example, in the Queryverse (which has already been mentioned in this thread) I would expect the DataVoyager UI or the ElectronDisplay “table display” feature to be particularly useful with columnar data coming from CSV files.

1 Like

I would like to be able to browse a dictionary of transfer functions, displaying them as Bode plots…


The function is stored as:

TransferFunction{Continuous, ControlSystemsBase.SisoRational{Float64}}
   1.0s^3 + 0.24915934813931118s^2 + 0.01701570159733241s + 0.00018625726584153255
--------------------------------------------------------------------------------------
372.27171456943216s^3 + 92.88568061036781s^2 + 6.366403832359201s + 0.0695505145474942

One slightly off-topic response from me: I am surprised that the .jld2 and .hdf5 file extensions are being looked for.
Surely the file type can be found in the header?

Sorry if I am exposing my ignorance here…

Sure enough, FileTypes.jl does not cover these file types
GitHub - JuliaIO/FileTypes.jl: Small and dependency-free Julia package to infer file and MIME type checking the magic numbers signature.

1 Like

Thanks for all the responses, and for pointing me to Queryverse, which may suit me better.

That’s a very good point!

Lack of time was the main reason I did not do this, but I did consider it at one point, which led me to the following remarks:

  • I’m not sure whether it would be possible to reliably auto-detect text-based formats like JSON (or CSV, for that matter), so a file-extension-based mechanism might be needed anyway.
  • Since the JLD2 format is itself based on HDF5, I’m not sure a magic-number-based approach à la FileTypes.jl could reliably distinguish between the two. It might be possible to implement a two-stage approach, though: determine that the file is an HDF5 container first, then look in it for specific meta-data signalling a JLD2 file.
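
The first stage and the extension fallback could look roughly like the sketch below. The 8-byte signature and the list of allowed superblock offsets (0, 512, 1024, 2048, …) come from the HDF5 file format specification; the function names and the returned symbols are hypothetical and do not reflect how DataViewer or FileIO do it.

```julia
# The 8-byte HDF5 format signature ("\x89HDF\r\n\x1a\n").
const HDF5_SIGNATURE = UInt8[0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a]

# Return the byte offset of the HDF5 signature, or `nothing` if none is found.
# The signature may sit at offset 0, 512, 1024, 2048, ... (doubling each time).
function hdf5_superblock_offset(bytes::AbstractVector{UInt8})
    n = length(HDF5_SIGNATURE)
    offset = 0
    while offset + n <= length(bytes)
        if bytes[offset+1:offset+n] == HDF5_SIGNATURE
            return offset
        end
        offset = offset == 0 ? 512 : 2 * offset
    end
    return nothing
end

# Hypothetical dispatcher: trust the binary signature first, and fall back
# to the file extension for text-based formats that have no magic number.
function guess_format(filename::AbstractString, bytes::AbstractVector{UInt8})
    hdf5_superblock_offset(bytes) !== nothing && return :HDF5_container
    ext = lowercase(splitext(filename)[2])
    ext == ".json" && return :JSON
    ext == ".csv" && return :CSV
    return :UNKNOWN
end
```

For a real file, the bytes could come from something like `open(io -> read(io, 4096), path)`, which covers the first few candidate offsets. Telling JLD2 apart from plain HDF5 would then be the second stage, applied only when the result is `:HDF5_container`.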

Is that what your file contains: TransferFunction instances? (In what type of file?)

One problem I see is that, in order to handle those, DataViewer would have to know about the TransferFunction type. That could probably be a nice extension that depends both on ControlSystemsBase and DataViewer.

Yes, indeed. Well, can be StateSpace or TransferFunction, they can easily be converted into each other. I have lots of these files (simple example):

julia> load("data/lin_turbine_0.95.jld2")
Dict{String, Any} with 9 entries:
  "omega_max"   => 0.11643
  "Uest"        => 6.88889
  "ω"           => 0.87478
  "U0"          => 7.0
  "Pgc"         => 1089328
  "Γ"           => 0.95
  "linsys"      => StateSpace{Continuous, Float64}…
  "max_mag_db"  => -43.3495
  "phase_shift" => -238.909

JLD2 files have custom magic bytes at the beginning.
(Here’s JLD2’s own header verification)

If required, I can help with JLD2 specific things.

EDIT: also, FileIO can already correctly identify JLD2 / HDF5
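
As a rough illustration of what such a check involves, here is a minimal sketch that tests whether a file's first bytes start with a given text prefix. The prefix string used here is an assumption about JLD2's header and should be verified against the header verification code linked above; the function name is hypothetical.

```julia
# ASSUMPTION: the leading text of a JLD2 file header. Verify this string
# against JLD2's own source before relying on it.
const JLD2_HEADER_PREFIX = "HDF5-based Julia Data Format, version "

# Hypothetical check: do the first bytes of the file match the prefix?
function looks_like_jld2(bytes::AbstractVector{UInt8})
    prefix = codeunits(JLD2_HEADER_PREFIX)
    length(bytes) >= length(prefix) || return false
    return view(bytes, 1:length(prefix)) == prefix
end
```

Because the check only needs the first few dozen bytes, it is cheap to run before deciding which loader to use.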

1 Like

Note that NetCDF files (at least in the NetCDF-4 format) are also just HDF5 with extra metadata on top.

JLD2 can already read NetCDF files, with the caveat that using the metadata is not implemented.

Very good to know, thanks!

julia> using FileIO

julia> query("sample.h5")
File{DataFormat{:HDF5}, String}("/home/francois/projets/DataViewer.jl/app/sample.h5")

julia> query("sample.jld2")
File{DataFormat{:JLD2}, String}("/home/francois/projets/DataViewer.jl/app/sample.jld2")

julia> query("sample.json")
File{DataFormat{:UNKNOWN}, String}("/home/francois/projets/DataViewer.jl/app/sample.json")

I guess I still need the extension-based mechanism for things like JSON, but this is still a very nice improvement for everything else! I probably won’t have any time to work on this soon, but I filed an issue so that I don’t forget about it.

4 Likes