dataset.value has been deprecated. Use dataset[()] instead.
I guess if you look really closely you can see what they mean, but the two syntaxes I immediately tried were h5f.dataset[('kVals')] and h5py.dataset(...). It turns out the correct one is:
h5f['kVals'][()]
I couldn't wrap my head around how that is supposed to be better than .value.
It sounds like they wanted to make reading the whole dataset less convenient so new users would use slicing more often. Maybe someone can extract a lesson from this?
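For reference, a minimal sketch of the migration (the kVals name comes from the thread; the in-memory file and its shape are made up for illustration):

```python
import numpy as np
import h5py

# Build a throwaway in-memory HDF5 file (no disk I/O) with a "kVals" dataset.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as h5f:
    h5f.create_dataset("kVals", data=np.arange(12).reshape(3, 4))

    # Deprecated: h5f["kVals"].value
    # Replacement: an empty-tuple index reads the entire dataset into memory.
    whole = h5f["kVals"][()]
    print(whole.shape)  # (3, 4)
```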
Every other email I see from new h5py users, somebody's recommending use of dataset.value, which is horrible because it dumps the entire dataset to an array. Then people complain that h5py is slow. We can't get rid of this for backwards compatibility, but I'm removing it from the documentation and having it raise a warning.
Perhaps I'm somewhat emotional over this because I see posts from people who stumble upon it and don't realize that datasets support slicing operations. "dataset.value" is exactly equivalent to "dataset[...]", but people do things like "dataset.value[10:20]" and don't understand why it takes forever (or takes 8 GB of memory and hangs Python).
read_direct is a little different because you supply an existing array which h5py "fills in" with the requested data; with both dataset.value and dataset[...], h5py creates a brand new array and returns it.
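A small sketch of that difference, using an in-memory file made up for demonstration:

```python
import numpy as np
import h5py

with h5py.File("demo.h5", "w", driver="core", backing_store=False) as h5f:
    ds = h5f.create_dataset("data", data=np.arange(100.0))

    # dataset[...] (like the old dataset.value) allocates and returns a new array.
    fresh = ds[...]

    # read_direct fills an array you already own -- useful to avoid repeated
    # allocations when reading into a reusable buffer.
    buf = np.empty(20)
    ds.read_direct(buf, source_sel=np.s_[0:20], dest_sel=np.s_[0:20])
```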
I resisted removing this for 2.0 because of backwards compatibility concerns.
Also, I feel like you can't even easily slice the dataset, because h5py.File isn't a subscriptable object. Say I have a long list x and a long list y; I can't just tell h5py to extract the first 20 elements of each (a common use case). Instead, I have to materialize two datasets (x and y) and then slice them.
I haven't used Julia's HDF5 packages; I hope they have a better way of doing this.
I think the opposite is true. According to the linked issue, accessing .value would read the whole dataset, which is why it was deprecated. Instead, you are supposed to use h5f['kVals'][0:20,0:20], which accesses only the first elements of x and y and avoids the unnecessary reads and allocations.
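A sketch of that partial-read pattern (the x and y dataset names follow the example above; the in-memory file is just for demonstration):

```python
import numpy as np
import h5py

with h5py.File("xy.h5", "w", driver="core", backing_store=False) as h5f:
    h5f.create_dataset("x", data=np.arange(1000.0))
    h5f.create_dataset("y", data=np.arange(1000.0) ** 2)

    # File objects ARE subscriptable: h5f[name] returns a Dataset handle
    # without reading it, and slicing the handle reads only that region.
    x20 = h5f["x"][:20]
    y20 = h5f["y"][:20]
```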
For a 3-D array x, x[:, :, :], x[:, :], and x[:] are all equivalent. Extending the last set of statements (dropping one : at a time), it's pretty natural that x[()] selects the whole 3-D array. If Python had x[] like Julia, I think they would've used it.
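The same chain holds for a plain NumPy array, which is presumably where the h5py notation comes from (x here is just an illustrative 3-D array):

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

# These all select the whole 3-D array:
assert np.array_equal(x[:, :, :], x)
assert np.array_equal(x[:, :], x)
assert np.array_equal(x[:], x)
assert np.array_equal(x[...], x)
assert np.array_equal(x[()], x)  # empty tuple: zero indices -> everything
```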
By the way, you don't have a similar equivalence like x[0, :, :] == x[0] in Julia; you need to explicitly specify the exact number of :s you need. Although this leads to stricter code (which I like), as with many API decisions it comes with a trade-off. It would be nice to have a syntax to avoid hard-coding repeated : (see https://github.com/JuliaLang/julia/issues/5405). An interface like Array(::HDF5Dataset) or collect(::HDF5Dataset) is more idiomatic in Julia for "materializing" a dataset as an array. But I think we need a solution to #5405 for a notation as "handy" as x[()], e.g., x[...].