DataViewer.jl supported data file formats

rajmac · October 30, 2023, 1:42pm

I was interested in the recent post on DataViewer.jl. Can it be used with CSV files, or does it require HDF5/JLD2/JSON data files?

johnh · October 30, 2023, 2:00pm

Why not give it a try?

An alternative would be Queryverse, although I am not sure how up to date it is:
Queryverse: Julia packages for data science.

ufechner7 · October 30, 2023, 2:06pm

Try it out, and if it doesn’t work create an issue at Issues · triscale-innov/DataViewer.jl · GitHub …

rafael.guerra · October 30, 2023, 2:08pm

See also the package section on extending DataViewer to support more data formats.

MJulia · October 30, 2023, 2:12pm

NetCDF support would be very nice for the geoscience community.

davidanthoff · October 30, 2023, 4:35pm

It should all work. The underlying JavaScript Data Voyager hasn’t seen any updates in a long time though, and it is not clear to me whether that particular project is still alive… But the existing functionality should just work.

ffevotte · October 31, 2023, 8:15am

Thanks for your interest! For now, DataViewer only supports HDF5, JLD2 and JSON data files. But more fundamentally, it was designed with tree-like data structures in mind (think things like dictionaries of dictionaries of arrays of dictionaries).

It would probably not be difficult to add support for tabular/columnar file formats (like CSV, or Arrow which was mentioned in the other thread), but I’m not sure how we’d want to display them:

- like a dictionary of array-like columns? → this would already be supported, and is probably the best option for now
- like a vector of dict-like rows? → this would also already be possible, but would it be useful?
- like a spreadsheet? → this does not currently exist in DataViewer and would probably be a bit more work to implement
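
As an illustration of the first option, here is a minimal sketch that reads a CSV file into a dictionary of columns using only the DelimitedFiles standard library. The helper name `csv_to_column_dict` is hypothetical and not part of DataViewer, and a real implementation would more likely rely on CSV.jl for quoting and type inference.

```julia
using DelimitedFiles  # standard library; CSV.jl would be the more robust choice

# Hypothetical helper (not part of DataViewer): read a CSV file that has a
# header row and return a Dict mapping each column name to its values.
function csv_to_column_dict(path::AbstractString)
    cells, header = readdlm(path, ',', Any; header=true)
    return Dict(String(strip(name)) => cells[:, j]
                for (j, name) in enumerate(vec(header)))
end

# Small demonstration on a temporary file.
path, io = mktemp()
write(io, "time,speed\n0,1.5\n1,2.5\n2,4.0\n")
close(io)

columns = csv_to_column_dict(path)
for (name, values) in columns
    println(name, " => ", values)
end
```

The resulting `Dict` has the tree-like shape (a dictionary of arrays) that the post describes as already supported, so it could in principle be saved to JSON or JLD2 and browsed from there.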

But (and here I might very well be wrong because I almost never work with such data), I’m under the impression that there already exist lots of tools which would be more suited to flat, tabular data. For example, in the QueryVerse (which has already been mentioned in this thread) I would expect the DataVoyager UI or the ElectronDisplay “table display” feature to be particularly useful with columnar data coming from CSV files.

ufechner7 · October 31, 2023, 9:27am

I would like to be able to browse a dictionary of transfer functions, displaying them as Bode plots…

The function is stored as:

TransferFunction{Continuous, ControlSystemsBase.SisoRational{Float64}}
   1.0s^3 + 0.24915934813931118s^2 + 0.01701570159733241s + 0.00018625726584153255
--------------------------------------------------------------------------------------
372.27171456943216s^3 + 92.88568061036781s^2 + 6.366403832359201s + 0.0695505145474942
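
For reference, the two curves of a Bode plot can be computed from the coefficients printed above with plain Julia. This is only a sketch: the function names `freq_response`, `magnitude_db` and `phase_deg` are made up for illustration, and in practice a package such as ControlSystemsBase would be the natural tool for this.

```julia
# Coefficients copied from the printed transfer function, lowest power first,
# because Base.evalpoly expects coefficients in ascending order.
const NUM = [0.00018625726584153255, 0.01701570159733241, 0.24915934813931118, 1.0]
const DEN = [0.0695505145474942, 6.366403832359201, 92.88568061036781, 372.27171456943216]

# Evaluate H(s) = NUM(s) / DEN(s) on the imaginary axis, s = iω.
freq_response(ω) = evalpoly(im * ω, NUM) / evalpoly(im * ω, DEN)

# Magnitude in dB and phase in degrees: the two curves of a Bode plot.
magnitude_db(ω) = 20 * log10(abs(freq_response(ω)))
phase_deg(ω) = rad2deg(angle(freq_response(ω)))

# Log-spaced frequencies in rad/s (the range is an arbitrary choice).
ωs = 10 .^ range(-4, 1; length=6)
for ω in ωs
    println("ω = ", ω, "  magnitude = ", magnitude_db(ω), " dB  phase = ", phase_deg(ω), "°")
end
```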

johnh · October 31, 2023, 9:29am

One slightly off-topic response from me. I am surprised that the .jld2 and .hdf5 file extensions are being looked for.
Surely the file type can be determined from the header?

Sorry if I am exposing my ignorance here…

Sure enough, FileTypes.jl does not cover these file types
GitHub - JuliaIO/FileTypes.jl: Small and dependency-free Julia package to infer file and MIME type checking the magic numbers signature.

rajmac · October 31, 2023, 9:51am

Thanks for all the responses, and for pointing me to Queryverse, which may suit me better.

ffevotte · October 31, 2023, 11:06am

That’s a very good point!

Lack of time was the main reason I did not do this, but I did consider it at one point, which led me to the following remarks:

- I’m not sure whether it would be possible to reliably auto-detect text-based formats like JSON (or CSV, for that matter), so a file-extension-based mechanism might be needed anyway.
- Since the JLD2 format is itself based on HDF5, I’m not sure a magic-number-based approach à la FileTypes.jl could reliably distinguish between the two. It might be possible to implement a two-stage approach, though: determine that the file is an HDF5 container first, then look in it for specific meta-data signalling a JLD2 file.

ffevotte · October 31, 2023, 11:11am

ufechner7:

The function is stored as:

TransferFunction{Continuous, ControlSystemsBase.SisoRational{Float64}}
   1.0s^3 + 0.24915934813931118s^2 + 0.01701570159733241s + 0.00018625726584153255
--------------------------------------------------------------------------------------
372.27171456943216s^3 + 92.88568061036781s^2 + 6.366403832359201s + 0.0695505145474942

Is that what your file contains: TransferFunction instances? (In what type of file?)

One problem I see is that, in order to handle those, DataViewer would have to know about the TransferFunction type. That could probably be a nice extension that depends both on ControlSystemsBase and DataViewer.

ufechner7 · October 31, 2023, 12:03pm

Yes, indeed. Well, can be StateSpace or TransferFunction, they can easily be converted into each other. I have lots of these files (simple example):

julia> load("data/lin_turbine_0.95.jld2")
Dict{String, Any} with 9 entries:
  "omega_max"   => 0.11643
  "Uest"        => 6.88889
  "ω"           => 0.87478
  "U0"          => 7.0
  "Pgc"         => 1089328
  "Γ"           => 0.95
  "linsys"      => StateSpace{Continuous, Float64}…
  "max_mag_db"  => -43.3495
  "phase_shift" => -238.909

JonasIsensee · October 31, 2023, 3:24pm

JLD2 files have custom magic bytes at the beginning.
(Here’s JLD2’s own header verification)

If required, I can help with JLD2 specific things.

EDIT: also, FileIO can already correctly identify JLD2 / HDF5

JonasIsensee · October 31, 2023, 3:30pm

Note that NetCDF-4 files are also just HDF5 with extra metadata strapped on.

JLD2 can already read such NetCDF files, with the caveat that using the metadata is not implemented.

ffevotte · October 31, 2023, 6:00pm

Very good to know, thanks!

julia> using FileIO

julia> query("sample.h5")
File{DataFormat{:HDF5}, String}("/home/francois/projets/DataViewer.jl/app/sample.h5")

julia> query("sample.jld2")
File{DataFormat{:JLD2}, String}("/home/francois/projets/DataViewer.jl/app/sample.jld2")

julia> query("sample.json")
File{DataFormat{:UNKNOWN}, String}("/home/francois/projets/DataViewer.jl/app/sample.json")

I guess I still need the extension-based mechanism for things like JSON, but this is still a very nice improvement for everything else! I probably won’t have any time to work on this soon, but I filed an issue so I don’t forget about it.
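
The combination discussed in this thread (magic bytes first, file extension as a fallback) could look roughly like the sketch below. It uses only Base Julia rather than FileIO. The names `guess_format` and `EXTENSION_FORMATS` are hypothetical. The HDF5 signature comes from the HDF5 file format specification, while the JLD2 header string is an assumption about what JLD2 writes at the start of its files and should be checked against JLD2's own header verification.

```julia
# HDF5 signature per the HDF5 specification: \x89 H D F \r \n \x1a \n
const HDF5_SIGNATURE = UInt8[0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a]
# Assumption: text that JLD2 is understood to write at the start of its files.
const JLD2_HEADER = Vector{UInt8}("HDF5-based Julia Data Format, version ")

# Extension fallback, needed for text formats that carry no magic bytes.
const EXTENSION_FORMATS = Dict(".json" => :JSON, ".h5" => :HDF5,
                               ".hdf5" => :HDF5, ".jld2" => :JLD2)

startswith_bytes(bytes, prefix) =
    length(bytes) >= length(prefix) && view(bytes, 1:length(prefix)) == prefix

# Hypothetical helper: sniff the file content first, then use the extension.
# Only offset 0 is checked; HDF5 files with a user block are not recognized.
function guess_format(path::AbstractString)
    head = open(io -> read(io, 64), path)
    startswith_bytes(head, JLD2_HEADER) && return :JLD2
    startswith_bytes(head, HDF5_SIGNATURE) && return :HDF5
    ext = lowercase(splitext(path)[2])
    return get(EXTENSION_FORMATS, ext, :UNKNOWN)
end
```

Checking the JLD2 header before the plain HDF5 signature matters in this sketch, because the JLD2 text sits at the very start of the file and would otherwise never be examined.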
