Accessing vector fields

Hi everyone, hope you are all healthy.

I’m a mechanical engineering student and I come from MATLAB, so I am sorry for my heresies.

I am currently working on a NASTRAN .hdf5 output file. There is a package to open such files, which stores the information in a vector whose elements have several fields (I hope that is the correct word!), as shown in the following output:

GRIDcard
111403-element Array{HDF5.HDF5Compound{7},1}:
 HDF5.HDF5Compound{7}((567377, 0, [-76.9957, -13.1615, -82.5], 0, 0, 0, 1), ("ID", "CP", "X", "CD", "PS", "SEID", "DOMAIN_ID"), (Int64, Int64, HDF5.FixedArray{Float64,(3,)}, Int64, Int64, Int64, Int64))
 HDF5.HDF5Compound{7}((567378, 0, [-74.1493, -12.1665, -82.5], 0, 0, 0, 1), ("ID", "CP", "X", "CD", "PS", "SEID", "DOMAIN_ID"), (Int64, Int64, HDF5.FixedArray{Float64,(3,)}, Int64, Int64, Int64, Int64))

I would like to save all X entries in a single matrix defined as:

GRID = [
X[1]'
X[2]'
...
X[n]'
]

Is it possible to do this without a for loop that reads every entry, like this one?

GRID = zeros(n,3)
for i = 1:n
    GRID[i,:] = GRIDcard[i].data[3]
end

Thank you all in advance, and I hope I haven’t broken too many rules!

Welcome to the Julia discourse!

Loops are generally fast in Julia, but if you want to do it without a loop, you could use a comprehension:

GRID = vcat([GRIDcard[i].data[3] for i = 1:n]...)

Note that using vcat in this way is unnecessarily slow both at compile time (because you end up calling a function with n arguments, and n can be very large) and at run time (because Julia isn’t well optimized to handle function calls with huge numbers of arguments).

(I’m demonstrating here with hcat instead, since that seems closer to the loop result in the original code, but the result applies to both vcat and hcat.)

For example:

julia> data = [rand(3) for _ in 1:1000000];

julia> @btime hcat($data...);
  38.808 ms (7 allocations: 38.15 MiB)

You’ll get the same result with better performance from reduce(hcat, ...), like this:

julia> @btime reduce(hcat, $data);
  13.763 ms (2 allocations: 22.89 MiB)
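
Applied to the original question, this could look roughly like the sketch below. MockCard is a made-up stand-in for the records the HDF5 package returns (only the layout of the data tuple is copied from the output in the first post), and it stores X as a plain Vector{Float64}, which is an assumption: the real HDF5.FixedArray may need converting first.

```julia
# Hypothetical stand-in for the package's records; not the real HDF5 type.
struct MockCard
    data::Tuple{Int,Int,Vector{Float64},Int,Int,Int,Int}
end

cards = [
    MockCard((567377, 0, [-76.9957, -13.1615, -82.5], 0, 0, 0, 1)),
    MockCard((567378, 0, [-74.1493, -12.1665, -82.5], 0, 0, 0, 1)),
]

# Collect the coordinate vectors, then concatenate them as columns.
xs = [card.data[3] for card in cards]   # vector of 3-element vectors
GRID_cols = reduce(hcat, xs)            # 3×n matrix, one node per column

# Same layout as the loop in the question (n×3), if rows are really needed.
GRID_rows = permutedims(GRID_cols)
```

permutedims makes a copy, so if the 3×n layout works for the rest of the code it is probably simpler to stop at GRID_cols.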

Thanks a lot!

Isn’t it possible to do something like GRIDcard[:].data[3]?

Feels like I learn a new trick every day on this site 😄

Or would that be just as slow as a for loop or hcat anyway?

I can see how that would be useful, but no, it’s not a supported syntax in Julia. It is the kind of thing that could probably be implemented in a macro (kind of like how mcabbott/Tullio.jl implements Einstein summation), but I don’t know of an existing implementation.
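
The closest thing in plain Julia is probably dot broadcasting, or map. Here is a minimal sketch, again using a made-up MockCard type in place of the package's records (the type and its Vector{Float64} field are assumptions for illustration). Note that this yields a vector of coordinate vectors, not a matrix.

```julia
# Hypothetical stand-in for the package's records; not the real HDF5 type.
struct MockCard
    data::Tuple{Int,Int,Vector{Float64},Int,Int,Int,Int}
end

cards = [
    MockCard((567377, 0, [-76.9957, -13.1615, -82.5], 0, 0, 0, 1)),
    MockCard((567378, 0, [-74.1493, -12.1665, -82.5], 0, 0, 0, 1)),
]

# getfield.(cards, :data) takes the `data` tuple of every record;
# getindex.(…, 3) then takes the third entry of each tuple.
xs = getindex.(getfield.(cards, :data), 3)

# The same thing written with map and an anonymous function.
xs_map = map(card -> card.data[3], cards)
```

Under the hood both are still loops, so there is no reason to expect them to beat the explicit for loop.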

I think the primary thing to take away here is not the particular trick of how to use hcat most efficiently. The most important thing is what @jlchan said at the very beginning: loops are generally fast in Julia.

It’s hard to overstate how important this is. Your initial post included a loop which was easy to write and easy to understand, and simple loops like that are often the fastest way to do things in Julia. If you can describe what you want in a loop, then you can often just declare victory and move on to the next problem.

By the way, if you do want to improve performance a bit, you might try transposing the way you store your matrix. Julia matrices are column-major (like Fortran and MATLAB, and unlike NumPy and C), so it’s usually better to store data that are accessed together as columns instead of rows. That would mean making GRID a 3xn matrix instead of nx3 and accessing it as GRID[:, i].
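
A small illustration of what column-major means in practice. The coordinates are made up; the point is only the memory order.

```julia
A = [1 2 3;
     4 5 6]                    # 2×3 matrix

# vec(A) lists the elements in memory order: down each column first.
memory_order = vec(A)          # [1, 4, 2, 5, 3, 6]

# Storing one node per column keeps its three coordinates next to each other.
n = 4
GRID = zeros(3, n)
for i in 1:n
    GRID[:, i] = [i, 10i, 100i]   # made-up coordinates for node i
end

node2 = @view GRID[:, 2]       # a contiguous slice, no copy
```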


Didn’t know that, thanks a lot! Now I see why the output of NASTRAN (which is written in Fortran) is nx3!

I guess the best way would be to modify the package to get exactly what I want from the .h5 file, right?

If you think the modification would be of use to other users of the package, then sure! But there’s no performance difference between package code and user code in Julia, so you might find it easier to just write a function to do whatever data manipulation you need and keep that function in your own code.
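
Such a function could be as simple as the sketch below. The name node_coordinates and the MockCard type are made up for illustration, and the code assumes the third entry of data holds the three coordinates, as in the output shown in the first post.

```julia
# Hypothetical stand-in for the package's records; not the real HDF5 type.
struct MockCard
    data::Tuple{Int,Int,Vector{Float64},Int,Int,Int,Int}
end

# Return a 3×n matrix whose i-th column holds the coordinates of the i-th card.
function node_coordinates(cards)
    n = length(cards)
    X = Matrix{Float64}(undef, 3, n)
    for (i, card) in enumerate(cards)
        X[:, i] = card.data[3]    # assumes data[3] is the coordinate triple
    end
    return X
end

cards = [
    MockCard((567377, 0, [-76.9957, -13.1615, -82.5], 0, 0, 0, 1)),
    MockCard((567378, 0, [-74.1493, -12.1665, -82.5], 0, 0, 0, 1)),
]

GRID = node_coordinates(cards)
```

Because the loop lives inside a function, it gets compiled once for the concrete argument type, which is what makes plain loops fast in Julia.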


Okay, thank you very much, have a nice day!
