Using HDF5.jl and reading h5 data to Julia?

Hi all!
I am new to .h5 data, so maybe I can get some help.
I got some data from my research group. They work in Python, and apparently something like this works for them:

import h5py
data, time, delta, name = read_hdf5(fname)

But I can't read the file. If I run:

using HDF5
data = h5read("STROMBOLI.h5")

I get

HDF5-DIAG: Error detected in HDF5 (1.12.1) thread 1:
#000: H5O.c line 128 in H5Oopen(): unable to open object
major: Object header
minor: Can’t open object
#001: H5VLcallback.c line 5386 in H5VL_object_open(): object open failed
major: Virtual Object Layer
minor: Can’t open object
#002: H5VLcallback.c line 5353 in H5VL__object_open(): object open failed
major: Virtual Object Layer
minor: Can’t open object
#003: H5VLnative_object.c line 58 in H5VL__native_object_open(): unable to open object by name
major: Object header
minor: Can’t open object
#004: H5Oint.c line 625 in H5O_open_name(): object not found
major: Object header
minor: Object not found
#005: H5Gloc.c line 442 in H5G_loc_find(): can’t find object
major: Symbol table
minor: Object not found
#006: H5Gtraverse.c line 837 in H5G_traverse(): internal path traversal failed
major: Symbol table
minor: Object not found
#007: H5Gtraverse.c line 613 in H5G__traverse_real(): traversal operator failed
major: Symbol table
minor: Callback failed
#008: H5Gloc.c line 399 in H5G__loc_find_cb(): object ‘data’ doesn’t exist
major: Symbol table
minor: Object not found
ERROR: Error opening object //data
Stacktrace:
[1] error(::String, ::String, ::String, ::String)
@ Base ./error.jl:42
[2] h5o_open
@ ~/.julia/packages/HDF5/T1b9x/src/HDF5.jl:2334 [inlined]
[3] h5o_open
@ ~/.julia/packages/HDF5/T1b9x/src/HDF5.jl:2114 [inlined]
[4] o_open(parent::HDF5File, path::String)
@ HDF5 ~/.julia/packages/HDF5/T1b9x/src/HDF5.jl:876
[5] getindex
@ ~/.julia/packages/HDF5/T1b9x/src/HDF5.jl:887 [inlined]
[6] h5read(::String, ::String)
@ HDF5 ~/.julia/packages/HDF5/T1b9x/src/HDF5.jl:729
[7] top-level scope
@ none:1

Help, please!

Can you check the current working folder with pwd() and make sure STROMBOLI.h5 is in that folder?

Yep, I am in the right folder with the right file.

What dataset are you trying to read within the HDF5 file? This is supposed to be specified by the second argument to h5read.

Otherwise use h5open.

See the documentation for further details:
https://juliaio.github.io/HDF5.jl/stable/
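
As a rough illustration of the second argument, here is a minimal sketch. It writes a small throwaway file first so that it runs on its own; the file name demo.h5 and the dataset names "data" and "time" are placeholders, not the actual layout of STROMBOLI.h5. If a dataset sits inside a group, the second argument would need the full path, e.g. "group/data".

```julia
using HDF5

# Hypothetical demo file standing in for the real one
fname = joinpath(mktempdir(), "demo.h5")
h5write(fname, "data", [1.0 2.0; 3.0 4.0])
h5write(fname, "time", [0.0, 0.5])

# The second argument to h5read names the dataset inside the file
data = h5read(fname, "data")
t = h5read(fname, "time")

println(size(data))
println(t)
```

Calling h5read with only the file name gives it no dataset to open, which is why a dataset name is needed.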

I do not see a read_hdf5 function in the h5py documentation. This must be a custom function.

There is one called data and one called time. How can I access those?

I use h5open and read. See DataDrop.jl.
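
A minimal sketch of the h5open and read approach, assuming the two datasets are called "data" and "time" and sit at the top level of the file. The demo file and its contents are made up so the snippet is self-contained.

```julia
using HDF5

# Hypothetical demo file with two top-level datasets
fname = joinpath(mktempdir(), "demo.h5")
h5open(fname, "w") do file
    write(file, "data", collect(1.0:5.0))
    write(file, "time", collect(0.0:0.1:0.4))
end

# Open read-only; the do-block closes the file when it finishes
dataset_names, data, t = h5open(fname, "r") do file
    # keys(file) lists the objects at the top level of the file
    (keys(file), read(file, "data"), read(file["time"]))
end

println(dataset_names)
println(length(data), " ", length(t))
```

Both read(file, "data") and read(file["time"]) load the whole dataset into memory; file["time"] on its own only returns a handle.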

I don’t have the STROMBOLI.h5 file, but an example for exploration of another file might be:

Find out what is in the file:

julia> h5file = h5open("data/eiscat/MAD6400_2022-12-12_manda_60@uhf.hdf5", "r")
🗂️ HDF5.File: (read-only) data/eiscat/MAD6400_2022-12-12_manda_60@uhf.hdf5
├─ 📂 Data
│  └─ 🔢 Table Layout
└─ 📂 Metadata
   ├─ 🔢 Data Parameters
   ├─ 🔢 Experiment Notes
   ├─ 🔢 Experiment Parameters
   ... more metadata

Get all datasets without loading them (in case they are big):

julia> datasets = HDF5.get_datasets(h5file)
6-element Vector{HDF5.Dataset}:
 🔢 HDF5.Dataset: /Data/Table Layout (file: data/eiscat/MAD6400_2022-12-12_manda_60@uhf.hdf5 xfer_mode: 0)
 🔢 HDF5.Dataset: /Metadata/Data Parameters ...
... more metadata sets

The actual data is the 1st set, the others are metadata (which are probably important as well).
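
To try the get_datasets step without the EISCAT file, here is a small sketch on a made-up file. The group and dataset names are placeholders. HDF5.get_datasets is not exported, hence the HDF5. prefix, and HDF5.name is used to get each dataset's full path.

```julia
using HDF5

# Hypothetical file with one dataset in each of two groups
fname = joinpath(mktempdir(), "demo_groups.h5")
h5open(fname, "w") do file
    g = create_group(file, "Data")
    write(g, "values", [1, 2, 3])
    m = create_group(file, "Metadata")
    write(m, "notes", "demo")
end

# Collect dataset handles recursively, then keep only their paths,
# since the handles become invalid once the file is closed
paths = h5open(fname, "r") do file
    [HDF5.name(d) for d in HDF5.get_datasets(file)]
end

println(paths)
```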

julia> h5data=datasets[1]
julia> size(h5data)
(94776,)

So I get a vector of named tuples:

julia> keys(h5data[1])
(:year, :month, :day, :hour, :min, :sec, :recno, :kindat, ....

Print the parameters gdalt and ne for the first three vector elements:

julia> foreach(x -> println("height = $(x[:gdalt]), ne = $(x[:ne])"), h5data[1:3])
height = 20.169572265625, ne = 2.282413824e9
height = 20.5092578125, ne = 1.0e6
height = 20.86676953125, ne = 2.51701408e8

Hey thanks @stephancb
This really helps :).

Could you tell me how to access the none databases? This is a picture of the structure of my file and I need the info saved in the yellow tags too:

The things in the yellow tags are attributes.

There are currently two interfaces for getting to this information: the attrs and attributes methods. attrs is meant to emulate the h5py interface more closely. See below for more information.

https://juliaio.github.io/HDF5.jl/stable/interface/attributes/
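
A minimal sketch of reading attributes, assuming a reasonably recent HDF5.jl. The attribute names "units" and "station" are made up, and the demo file is written first so the snippet runs on its own.

```julia
using HDF5

# Hypothetical file with one attribute on a dataset and one on the file root
fname = joinpath(mktempdir(), "demo_attrs.h5")
h5open(fname, "w") do file
    write(file, "data", collect(1.0:3.0))
    write_attribute(file["data"], "units", "counts")
    write_attribute(file, "station", "demo")
end

units, station, attr_names = h5open(fname, "r") do file
    dset = file["data"]
    # read_attribute returns the attribute's value directly
    u = read_attribute(dset, "units")
    # attributes(obj)["name"] returns a handle that still has to be read
    s = read(attributes(file)["station"])
    (u, s, keys(attributes(dset)))
end

println(units, " ", station, " ", attr_names)
```

According to the linked page, attrs(dset)["units"] should give the value without the extra read, but that interface is only available in newer HDF5.jl releases.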

What do you mean by the “none databases”?

Oh! They are attributes. I did not know that. There is info stored there and I did not know how to access it. Thanks!