I’ve recently been working on DataViewer.jl, a GUI that helps explore and visualize the structure of data contained in datafiles (such as HDF5, JLD2 or JSON files).
This tool focuses on understanding how data files are structured, and therefore provides by default a tree-like view of the data. When a “leaf data” is of a supported type[1], basic Makie-based visualizations are provided, to further understand what the data file contains.
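To make the "tree-like view" idea concrete, here is a minimal sketch in plain Julia of how nested data can be walked and its plottable leaves flagged. This is not how DataViewer.jl is implemented: `is_plottable` and `tree_lines` are hypothetical names made up for illustration, and the rule for what counts as plottable is only an assumption based on the footnote (a 1D, 2D or 3D array of numbers, or a dictionary of numbers).

```julia
# Hypothetical helper (not part of DataViewer.jl): is this leaf of a type that
# could get a basic plot? Assumed rule: 1D/2D/3D numeric array or Dict of numbers.
is_plottable(x) = false
is_plottable(x::AbstractArray{<:Number,N}) where {N} = 1 <= N <= 3
is_plottable(x::AbstractDict{<:Any,<:Number}) = true

# Recursively build one line of text per node of the tree.
# Dictionaries that are not plottable themselves are treated as inner nodes.
function tree_lines(data, name="root", depth=0)
    indent = "  "^depth
    if data isa AbstractDict && !is_plottable(data)
        lines = ["$(indent)$(name)/"]
        for key in sort!(collect(keys(data)); by=string)
            append!(lines, tree_lines(data[key], string(key), depth + 1))
        end
        return lines
    end
    tag = is_plottable(data) ? " [plottable]" : ""
    return ["$(indent)$(name): $(summary(data))$(tag)"]
end

# Example data, standing in for the contents of a JLD2/HDF5/JSON file
data = Dict(
    "params"  => Dict("alpha" => 0.1, "beta" => 2.0),
    "results" => Dict("field" => zeros(4, 3), "label" => "run 1"),
)

foreach(println, tree_lines(data))
```

Here `params` is a dictionary of numbers, so it is shown as a plottable leaf, whereas `results` mixes an array and a string and is therefore expanded as a subtree.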
DataViewer.jl can be used as a normal Julia package, whose main API is the DataViewer.view function:
julia> using DataViewer
julia> using JLD2
julia> DataViewer.view("my_file.jld2")
It can also be “installed” in the system, i.e. the DataViewer.install function creates a launcher script[2] that can be called from the command line. (This is what is demoed in the screencast above.)
I arguably lacked inspiration for the package name: although it is free in the General registry, a cursory look on GitHub revealed at least 3 other Julia projects by that name. So please feel free to bikeshed on the name (or anything else!)
Implementation-wise, this is based on JSServe.jl and WGLMakie, with everything rendered in an Electron.jl window. All my gratitude goes to @sdanisch and @jules, who helped me a lot!
[1] Currently: a 1D, 2D or 3D array of numbers, or a dictionary of numbers
[2] By default, this also compiles a system image so that everything feels more snappy
These aren’t implemented yet, but should be doable and in the spirit of the package
Currently, DataViewer is meant more as a tool to quickly understand/check what’s in a data file, i.e. to answer questions like “did I correctly put all results in my output file?” or “does this user-provided input file contain the data I need, with the right structure?”.
It’s not really meant as a plotting tool, and I feel like adding UI to transform axes would go too far in that direction (and where do we stop? do we also want some UI to label axes? or add a title?)
Saving the output graph could be a nice, generic feature, though.
AFAIU, Arrow files store flat tabular/columnar data à la DataFrames? For now, DataViewer has been designed to work with tree-like data structures (like Dicts of Dicts of Arrays of Dicts), but I guess we could also support DataFrame-like data structures by looking at them either as a Dictionary of columns, or an Array of rows. Once this is done, reading Arrow files would be very natural.
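As a rough illustration of those two views, the sketch below uses a plain NamedTuple of column vectors as a stand-in for a DataFrame or an Arrow table. The helpers `as_columns` and `as_rows` are hypothetical names, not part of DataViewer.jl, Arrow.jl or DataFrames.jl.

```julia
# A small table stored column-wise (stand-in for a DataFrame / Arrow table)
table = (id = [1, 2, 3], score = [0.5, 0.7, 0.9])

# View 1: a dictionary of columns, i.e. column name => vector of values
as_columns(tbl) = Dict(string(name) => getfield(tbl, name) for name in keys(tbl))

# View 2: an array of rows, where each row is a dictionary column name => value
function as_rows(tbl)
    colnames = keys(tbl)
    nrows = length(first(tbl))   # assumes all columns have the same length
    return [Dict(string(n) => getfield(tbl, n)[i] for n in colnames) for i in 1:nrows]
end

cols = as_columns(table)
rows = as_rows(table)
```

Either result is an ordinary nested structure (a Dict of Arrays, or an Array of Dicts), which is the kind of tree-like data the viewer already targets.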
Good question, I’d like to know that as well! AFAIU, this is a limitation of WGLMakie (or maybe it’s simply that I didn’t manage to resize the WGLMakie figure?)
Yes, good idea! I like that it conveys the idea of working with datafiles rather than raw data. Maybe the command-line tool could be called dfv then?
The short answer is that it’s a perfectly sensible thing to do, but I didn’t manage to do it.
I originally thought it would be as simple as asking the JSServe app to display in VSCode, but this only partly works: all features implemented with Makie Observables work fine in this context, but navigation in the data fields doesn’t work. It might be because this is coded like a plain old web app: in the Electron window, clicking on a link asks the JSServe server for the document associated with a new URL; in a VSCode pane, clicking on a link doesn’t seem to do anything…
If someone knowledgeable wants to chime in about this, I’d really appreciate it!
I’m not at all familiar with the DICOM format, but a cursory look at DICOM.jl seems to indicate that all keywords are there: tree-like data structure, Dict-like access…
It might be really easy to at least try adding support for DICOM in DataViewer.
Not sure I understand: are you asking whether 3D slices are implemented in Makie? If so, there’s for example this snippet in the WGLMakie documentation, which shows how to do it:
A DICOM file consists of two parts: a header and data. The header contains information on the patient, the equipment used, image parameters and other metadata. To visualize metadata, you can use (self-promotion here) my package DICOMTree.
Data can take many forms, but the most common (a DICOM file from a CT scanner) is a 2D matrix of grayscale pixels, corresponding to a slice of the 3D image. To make up the complete image, there are several separate DICOM files, each containing a slice. If the image has 256 slices, there will be 256 DICOM files. Note that this may vary for some DICOM files, for instance in radiotherapy. For example, the RT Dose (patient irradiation dose map) contains the 3D matrix directly in a single file. The RT Structure, on the other hand, contains points (coordinates in physical space) that describe the contours of certain organs/regions of interest. But the most common case is the CT scanner.
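For the common CT case, assembling the per-file slices into one 3D array might look like the sketch below. It deliberately does not use DICOM.jl: the `files` vector is made-up data, with a hypothetical `position` field standing in for whatever slice-location information the header provides, and `pixels` standing in for the 2D pixel matrix read from each file.

```julia
# Made-up stand-ins for three slice files, listed in arbitrary order.
# `position` and `pixels` are hypothetical field names chosen for this example.
files = [(position = z, pixels = fill(Float32(z), 8, 6)) for z in (30.0, 10.0, 20.0)]

# Order the slices along the scan axis, then stack them along a third dimension
ordered = sort(files; by = f -> f.position)
volume = cat((f.pixels for f in ordered)...; dims = 3)

# The result is an ordinary 3D array, from which orthogonal slices can be taken
slice_along_dim1 = volume[3, :, :]
```

The resulting `volume` is a plain 3D array of numbers, so it would fall within the "1D, 2D or 3D array" leaf types mentioned earlier in the thread.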
Seeing the video above, I imagined VSCode Julia could someday support a user-customizable interactive GUI system, which would be very powerful, something like https://gtoolkit.com/. What stands in the way of this?