HDF5: compound data types


Using Python, I’ve created an HDF5 file that contains a dataset with a compound data type. I can read the data in Julia, but I have not managed to create such a file from Julia.
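For context, a file with this layout can be produced on the Python side roughly as follows. This is only a minimal sketch, assuming h5py and NumPy: the field names and types are the ones Julia reports for the dataset, but the three rows are placeholders, the temporary path is made up for illustration, and the second dataset in the group is left out.

```python
import os
import tempfile

import h5py
import numpy as np

# Structured dtype with the fields Julia reports:
# chromosome::UInt8, position::UInt32, index::UInt32
record = np.dtype([
    ("chromosome", np.uint8),
    ("position", np.uint32),
    ("index", np.uint32),
])

# Placeholder rows only; the real dataset holds 10000 records.
positions = np.array(
    [(0, 0x002DD549, 0), (0, 0x002E1CC1, 1), (0, 0x002E2987, 2)],
    dtype=record,
)

# Hypothetical location: a temporary directory instead of the working directory.
path = os.path.join(tempfile.mkdtemp(), "test.h5")

with h5py.File(path, "w") as f:
    g = f.create_group("g")
    # h5py stores a NumPy structured dtype as an HDF5 compound type.
    g.create_dataset("positions", data=positions)

# Read the dataset back to confirm the compound fields survive the round trip.
with h5py.File(path, "r") as f:
    stored = f["g/positions"][:]
```

Here `stored.dtype.names` gives the three field names in order, which corresponds to the NamedTuple fields that HDF5.jl returns when reading.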

Here, I read the file I’ve created in Python:

julia> using HDF5

julia> h1 = h5open( "test.h5" )
🗂️ HDF5.File: (read-only) test.h5
└─ 📂 g
   ├─ 🔢 data
   └─ 🔢 positions

julia> h1["g"]["positions"]
🔢 HDF5.Dataset: /g/positions (file: test.h5 xfer_mode: 0)

The contents of that dataset look like this:

julia> a = read( h1["g"]["positions"] )
10000-element Array{NamedTuple{(:chromosome, :position, :index),Tuple{UInt8,UInt32,UInt32}},1}:
 (chromosome = 0x00, position = 0x002dd549, index = 0x00000000)
 (chromosome = 0x00, position = 0x002e1cc1, index = 0x00000001)
 (chromosome = 0x00, position = 0x002e2987, index = 0x00000002)
 ⋮
 (chromosome = 0x00, position = 0x0242d473, index = 0x00002798)
 (chromosome = 0x00, position = 0x02437b22, index = 0x00002799)
 (chromosome = 0x00, position = 0x0243c3fe, index = 0x0000279a)

Now, I try to write it back into a new HDF5 file:

julia> h2 = h5open( "test2.h5", "w" )
🗂️ HDF5.File: (read-write) test2.h5

julia> write( h2, "p", a )
ERROR: MethodError: no method matching datatype(::Array{NamedTuple{(:chromosome, :position, :index),Tuple{UInt8,UInt32,UInt32}},1})
Closest candidates are:
  datatype(::HDF5.Attribute) at /home/anders/.julia/packages/HDF5/cDXRT/src/HDF5.jl:991
  datatype(::HDF5.Dataset) at /home/anders/.julia/packages/HDF5/cDXRT/src/HDF5.jl:989
  datatype(::Union{Bool, Float32, Float64, Int16, Int32, Int64, Int8, UInt16, UInt32, UInt64, UInt8, HDF5.Reference}) at /home/anders/.julia/packages/HDF5/cDXRT/src/HDF5.jl:994

As you can see, I cannot write an array of such a compound type without somehow specifying the corresponding HDF5 datatype. I’ve noticed that there is a function HDF5.create_datatype, but I couldn’t find any documentation for it.

Can somebody give me a hint?




Does this help?