A better way to assign to [:, :, ... k]?

hdf5

#1

I’m writing a utility that collects a user’s n-dimensional matrices over time and stores them in an n+1 dimensional array. Something like this:

storage[:, :, k] = input

But of course I can’t literally type those colons, because the input could be anything (as long as it’s consistent from call to call). So I’ve written this to store the input matrix:

num_input_dims = length(size(input))
storage[repeat([:], num_input_dims), k] = input

For what it’s worth, in my case storage is actually part of an HDF5 file with a space preallocated for all of the data the user will pass in. The syntax for [:, :, :, k] is what the HDF5 package uses for input, and I don’t think strides will work for it.

My syntax above just feels a little silly, and I’m wondering if there’s a better way.


#2

I think slicedim() does what you’re looking for.


#3

EllipsisNotation.jl at least gives nice syntax: storage[..,k]. In was fine for performance, but when we made it more generic it doesn’t infer properly anymore so you may want to use an older version, or help us finish it up :stuck_out_tongue:.


#4

Lispy tuple programming!

replace_last(t, i) = first(t), replace_last(Base.tail(t), i)...
replace_last(t::Tuple{T}, i) where T = (i,)

test = rand(5, 5, 5)
test_indices = indices(test)
test[replace_last(test_indices, 1)...]

Should be super-performant.


#5

Why can’t you use a (pre-allocated) vector of n-dimensional matrices?


#6

Thanks, but if I understand correctly, this would be for reading, and I need to write with setindex! (it’s the method the HDF5 package has writing to data in files). Have I missed something?

This is more elegant than what I had. My code now looks like (taken from EllipsisNotation.jl):

colons = (Colon() for i in 1:length(size(input)))
storage[colons..., k] = input

Nice. I’m going with @ChrisRackauckas’s suggestion for clarity, but I wonder if this recursive implementation is any faster?

One could; it just won’t work for my application, where storage is (must be) a multidimensional array.

Thanks all. You helped me retire a TODO in my code, find a bunch of new functions, and figure out why @bramtayl’s second replace_last method would be interpreted as the base case.


#7

Only if you hit the splatting penalty, i.e. have more than 16 dimensions. The issue with EllipsisNotation.jl is a little bit different and has to do with inferrability when it generalized. If it infers correctly, any implementation should be about the same speed.


#8

FWIW assigning to the last index is slow(er) since Julia is column major.

Edit. This is nonsense.


#9

By assigning to

storage[:, :, k]

Isn’t he effectively assigning through the first two axis? That is, I thought the above would basically sort of be equivalent to:

for in 1:size(storage,2)
  for j in 1:size(storage,1)
    storage[j, i, k]
  end
end

in terms of memory access pattern, and that the whole block from both cases should be contiguous.
Am I mistaken, or did I misinterpret something?


#10

Yes, sorry. I don’t know what I was thinking…