Reading hdf5 file saved with nested dict from Python

Hello,
I’m having problems using the ‘HDF5’ package to read HDF5 files that contain nested dictionaries stored from Python. The data consists of a key (string) and its value (a dict containing two keys, each for a different array).

I saved it using deepdish, and I can read it fine in Python. However, when I open it in Julia using matches = h5open(file), the file looks like this:

🗂️ HDF5.File: (read-only) Photo_SG_matches.h5
├─ 🏷️ CLASS
├─ 🏷️ DEEPDISH_IO_DEEPDISH_IO_UNPACK
├─ 🏷️ DEEPDISH_IO_VERSION
├─ 🏷️ PYTABLES_FORMAT_VERSION
├─ 🏷️ TITLE
├─ 🏷️ VERSION
└─ 🔢 data
   ├─ 🏷️ CLASS
   ├─ 🏷️ PSEUDOATOM
   ├─ 🏷️ TITLE
   └─ 🏷️ VERSION

And if I run read(matches, "data"), it returns a 1-element vector of byte vectors:

 1-element Vector{Vector{UInt8}}:
 [0x80, 0x04, 0x95, 0x43, 0x12, 0x01, 0x00, 0x00, 0x00, 0x00  …  0xd8, 0x23, 0x3f, 0x94, 0x74, 0x94, 0x62, 0x75, 0x75, 0x2e]

Am I missing something, or does Julia not support nested dicts?

Not an HDF5 expert, but it looks like the Python package employs a custom serialization of Python types, probably because HDF5 does not have a dictionary type.
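To make the serialization point concrete: the bytes shown in the question start with 0x80 0x04 and end with 0x2e, which matches the header of a Python pickle stream (protocol 4) and pickle's STOP opcode. That suggests the data field holds a pickled Python object rather than native HDF5 datasets. The sketch below uses only Python's standard pickle module; the dictionary contents and the helper name looks_like_pickle are made up for illustration and are not taken from deepdish.

```python
import pickle

# Hypothetical stand-in for the saved structure: a string key mapping to a
# dict that holds two arrays (plain lists here to stay dependency-free).
nested = {"pair_0": {"points_a": [0.1, 0.2, 0.3], "points_b": [1.0, 2.0, 3.0]}}

# Serialize with pickle protocol 4, the protocol the bytes in the question hint at.
payload = pickle.dumps(nested, protocol=4)


def looks_like_pickle(raw: bytes) -> bool:
    """Heuristic only: PROTO opcode (0x80) first, STOP opcode (0x2e) last."""
    return len(raw) >= 3 and raw[0] == 0x80 and raw[-1] == 0x2E


print([hex(b) for b in payload[:2]])  # the two protocol header bytes
print(looks_like_pickle(payload))

# Unpickling restores the nested dict, but this step needs a Python process.
restored = pickle.loads(payload)
```

If the check holds for the real file, the bytes can only be decoded by something that understands pickle, which in practice means Python itself.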

Unless you find a file format that you already know can also be read by a Julia package, you will have to check the Python library’s docs and figure out how to make sense of their data field.

Julia supports nested Dicts:

julia> Dict(:a => Dict(:b => Dict(:c => "hello")))
Dict{Symbol, Dict{Symbol, Dict{Symbol, String}}} with 1 entry:
  :a => Dict(:b=>Dict(:c=>"hello"))

Ah sorry, I meant to ask whether the Julia HDF5 library supports nested dicts, not the language itself.

According to the docs of HDF5.jl it does not: Home · HDF5.jl

If you need a package to export Julia data types to HDF5 format, you could try JLD2.jl: GitHub - JuliaIO/JLD2.jl: HDF5-compatible file format in pure Julia
But again, this package uses a custom serialization because (I think) there is no dict type in the HDF5 specification.
The supported Julia types for export are explained here: HDF5 Compatibility · Julia Data Format

Either way, I think there is not (yet) a packaged solution that lets you write nested dicts in Python and then import them in Julia, or vice versa.
I guess you will have to write your own helper functions to get this done.
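One possible shape for such helpers, as a rough sketch on the Python side: flatten the nested dict into HDF5-style paths so that each leaf array could be stored as its own dataset, which any HDF5 reader would then see as ordinary groups and datasets. The function names flatten and unflatten and the sample data are hypothetical. The actual writing step (for example with h5py) is left out and is an assumption about how the paths would be used.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def flatten(tree: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, leaf) pairs, joining nested keys with '/' like HDF5 paths.

    Caveats: keys must not contain '/', and empty dicts are dropped.
    """
    for key, value in tree.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def unflatten(pairs: Iterable[Tuple[str, Any]]) -> Dict[str, Any]:
    """Rebuild the nested dict from (path, leaf) pairs."""
    tree: Dict[str, Any] = {}
    for path, value in pairs:
        *parents, leaf = path.split("/")
        node = tree
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return tree


# Hypothetical sample data: one string key holding a dict with two arrays.
nested = {"pair_0": {"points_a": [0.1, 0.2, 0.3], "points_b": [1.0, 2.0, 3.0]}}
pairs = list(flatten(nested))
for path, _ in pairs:
    print(path)
```

Each printed path would correspond to one dataset in the file, and the reading side would need the inverse step to rebuild the nesting.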

Also: if it is only about nested dicts that contain strings as values, then your best bet might be to just use a .json file.
For Python you could use the json module, and for Julia you could use JSON.jl.
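A minimal sketch of the Python half of the JSON route, using only the standard library. The sample data and the file name matches.json are made up for illustration. Plain lists stand in for the arrays here; NumPy arrays would first need converting (for example with .tolist()), since the json module does not serialize them directly.

```python
import json
import os
import tempfile

# Hypothetical sample data: one string key holding a dict with two arrays,
# represented as plain lists so the json module can serialize them.
nested = {"pair_0": {"points_a": [0.1, 0.2, 0.3], "points_b": [1.0, 2.0, 3.0]}}

# Write to a temporary directory so the example leaves the working dir alone.
path = os.path.join(tempfile.mkdtemp(), "matches.json")
with open(path, "w", encoding="utf-8") as fh:
    json.dump(nested, fh)

# Read it back to confirm the nesting survives the round trip.
with open(path, encoding="utf-8") as fh:
    restored = json.load(fh)

print(restored == nested)
```

On the Julia side, the resulting file should be readable with JSON.jl, which returns nested dictionaries and vectors; array element types may need converting afterwards.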

Thank you for the information. For now, I will use PyCall to read the HDF5 file with the same library I used to save it (deepdish). I tried that, and it seems to work.