Creating multidimensional arrays

Hey everyone, I am fairly new to Julia and I feel like I am taking crazy pills. I am coming from numpy/tensorflow, and all I want to do is create a test array of size (1,3) and tile it along the first dimension 300 times, so that I have an array of size (300,3).
In numpy I would do it somewhat like this:

out = np.tile( [[1,2,3]], (300,1))

If I try to do the same in Julia, my brain melts and nothing makes sense. I cannot even seem to create a multidimensional array with these lines:

a = [ [ 1 2 3] ] 
b = Array( [ [ 1, 2, 3 ] ])

because for some reason size(a) gives (1,). Why doesn’t it look recursively into its elements and put together a multidimensional array automatically? It doesn’t even do this when I specify its type as Array, as in example b. So the first question is: what is the most consistent/nicest way of putting together multidimensional arrays explicitly?

The tiling also makes my brain hurt:

out = repeat([1;2;3], 1,3)
out2 = repeat([1,2,3], 1,3)

Both give the SAME result, but maybe I just misunderstood the ; operator. I eventually got it to work with

permutedims(repeat([1 ,2 , 3], outer = (1,300)), (2,1))

But that does look pretty hacky. Is there another way that does not need permutedims, especially since it seemed to crash my kernel a few times when I was low on RAM? All of this seems kind of weird and not really made for multidimensional arrays the way numpy or tensorflow are. Am I simply missing a big package that handles this more like numpy, or is this just the way it is in Julia? I know that I can use numpy in Julia via Python, but I want to write GPU kernels eventually, and I do not think parallelization will work seamlessly with Python numpy arrays. I hope you can clear up some of my misconceptions.

I think you are just trying to apply the numpy syntax to Julia, and they are a bit different. Multidimensional arrays in numpy are generally written as nested lists (lists of lists), while Julia’s are more like Matlab’s/Fortran’s.

The docs are quite comprehensive on the different ways to instantiate multidimensional arrays: Multi-dimensional Arrays · The Julia Language

Take a look at the docs that were linked. But for a short answer, [1 2 3] and [1;;2;;3] are both valid ways of creating a 1x3 array. The semicolon approach is more general in that it can create arrays of arbitrary dimensions, including 1x1 arrays ([x] or [x;] will create a 1-element vector, but [x;;] will create a 1x1 matrix). To elaborate, a ; indicates that the contents should be concatenated along the first dimension, ;; along the second, ;;; along the third, etc.
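
As a rough sketch of the semicolon rules described above (this assumes Julia 1.7 or newer, since the multiple-semicolon syntax does not parse on older versions; all variable names are made up for illustration):

```julia
# Requires Julia 1.7+ for the ;; and ;;; syntax.
row_spaces = [1 2 3]        # spaces concatenate along dimension 2 -> 1×3 Matrix
row_semis  = [1;; 2;; 3]    # ;; also concatenates along dimension 2 -> 1×3 Matrix
col        = [1; 2; 3]      # ;  concatenates along dimension 1 -> 3-element Vector

x = 5
vec_a = [x]                 # 1-element Vector
vec_b = [x;]                # still a 1-element Vector
mat   = [x;;]               # 1×1 Matrix

# ;;; concatenates along dimension 3, giving a 2×2×2 array here
cube = [1; 2;; 3; 4;;; 5; 6;; 7; 8]

println(size(row_spaces), " ", size(row_semis), " ", size(col))
println(size(vec_a), " ", size(vec_b), " ", size(mat), " ", size(cube))
```

Checking size(...) on each literal is a quick way to see which dimension a given separator acts on.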

Be very careful when using repeat, it only creates one array and copies the same one to multiple locations. Editing one of the values will affect them all.

Go far back enough in my activity and you’ll find that this caused a huge bug in my code.

Edit: It’s a nuanced case. I think it’s fine here because the elements of the array are bits types, but arrays of arrays will cause issues when used with repeat.
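
The original example from this post is not preserved here, so the following is only a minimal sketch of the kind of aliasing being described, with made-up variable names. It contrasts repeat on plain integers, repeat on an array of arrays, and a comprehension that builds independent inner arrays:

```julia
# Repeating immutable elements (Int) is safe: each slot holds its own value.
flat = repeat([1 2 3], 2)          # 2×3 Matrix{Int}
flat[1, 1] = 99                    # only this one entry changes

# Repeating an array OF arrays copies references to the same inner array.
inner  = [1, 2]
shared = repeat([inner], 3)        # 3-element Vector{Vector{Int}}
shared[1][1] = 99                  # mutates the single shared inner array

# A comprehension evaluates [1, 2] once per iteration, so each entry is distinct.
independent = [[1, 2] for _ in 1:3]
independent[1][1] = 99

println(flat)
println(shared)        # every entry shows the mutation
println(independent)   # only the first entry shows the mutation
```

Note that assigning a whole new element (shared[1] = [7, 8]) would not affect the other slots; the surprise only appears when an inner array is mutated in place.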

Because then how would one be able to distinguish between multi-dimensional arrays and actual arrays of arrays? They are really not the same thing.

As for what you are trying, you can do this:

julia> repeat([1 2 3], 5)
5×3 Matrix{Int64}:
 1  2  3
 1  2  3
 1  2  3
 1  2  3
 1  2  3

But I suspect that there is something wasteful going on here. By creating an array with many equal rows, you are using up memory without actually storing more information. Do you really need this?
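
For the 300-row case from the original question, a small sketch (variable names are illustrative) comparing the direct repeat call with the permutedims workaround:

```julia
# Tile a 1×3 row 300 times along the first dimension.
tiled = repeat([1 2 3], 300)                    # 300×3 Matrix{Int}

# Equivalent keyword form: 300 copies along dim 1, 1 copy along dim 2.
tiled_kw = repeat([1 2 3], outer = (300, 1))

# The workaround from the question: tile a vector into 3×300, then transpose.
workaround = permutedims(repeat([1, 2, 3], outer = (1, 300)), (2, 1))

println(size(tiled))
println(tiled == tiled_kw == workaround)
```

All three build the same matrix; starting from a 1×3 row instead of a 3-element vector is what makes the permutedims step unnecessary.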

Julia has far simpler and more natural syntax and handling of arrays of all dimensions than numpy, available in the Base language, no need for any packages. The arrays also work for absolutely all and any types, not limited to some built-in types.

But you need to learn how that syntax works, and that is actually quite close to Matlab (arrays are one of the relatively few areas where Matlab is superior to Python). I suggest that you start by looking at the manual on arrays here: Single- and multi-dimensional Arrays · The Julia Language

Do remember a few key things: Array is the general, overarching definition. A Vector is an Array (a one-dimensional one), a Matrix is also an Array (a two-dimensional one), etc. It therefore does not make sense trying to convert a vector to an array, it already is an array.

Also, you don’t need or want type definitions when creating arrays, just use the literal syntaxes: [1,2,3] for Vector, [1 2 3] or [1 2 3; 4 5 6] for Matrix. And then some new syntax for higher dimensions (see manual).
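
A quick sketch of these points using only Base (names are illustrative). The last part shows one possible way to turn a vector of vectors into a matrix; it is an option, not necessarily the recommended one:

```julia
v = [1, 2, 3]              # Vector literal
m = [1 2 3; 4 5 6]         # Matrix literal, 2×3

# Vector and Matrix are just aliases for 1- and 2-dimensional Arrays.
println(Vector{Int} === Array{Int, 1})
println(Matrix{Int} === Array{Int, 2})
println(v isa Array, " ", m isa Array)

# A nested literal stays an array of arrays; it is never flattened automatically.
nested = [[1, 2, 3], [4, 5, 6]]      # 2-element Vector{Vector{Int}}
println(size(nested))

# One way to build a matrix from it: put the inner vectors side by side as
# columns, then swap the dimensions so each inner vector becomes a row.
as_matrix = permutedims(reduce(hcat, nested))
println(as_matrix == m)
```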

Ah ok. That clears up my confusion with the ; operator and makes a lot of sense. I have to relearn a lot, especially since most ML frameworks work just like numpy in that regard.

Oh that is also quite good to know. So how did you solve it? Is there a hardcopy method that resolves the references, or did you do without it?

That is the best solution for my problem so far. Thanks. I guess I could go without the tiling for now, but I want to have a bundle of rays with the same origin for a simple path tracer that I eventually want to run on the GPU as a Julia test project. For now it’s just simpler to have the origins and the directions share the same dimensionality. Also, even without the project, I want to know how to handle multidimensional arrays as well as possible.

As far as I know, your example is fine because the eltype you’re repeating, Int, is immutable. The problems come when you repeat a mutable type, at which point the elements are copied by reference in order to avoid allocations (and possibly for semantic reasons).

See my comment’s edit to see where it goes wrong and the suggested fix of using comprehensions.

A bit hard to say, but it seems a bit like a situation where you would tend to use broadcasting in Julia:

julia> origin = [1, 2, 3];

julia> directions = rand(3, 5)
3×5 Matrix{Float64}:
 0.648981   0.0739386  0.459989   0.287787  0.14477
 0.4276     0.684642   0.784215   0.213416  0.859163
 0.0444743  0.743287   0.0516741  0.385912  0.231109

julia> endpoints = origin .+ directions
3×5 Matrix{Float64}:
 1.64898  1.07394  1.45999  1.28779  1.14477
 2.4276   2.68464  2.78422  2.21342  2.85916
 3.04447  3.74329  3.05167  3.38591  3.23111

Here you avoid having to store repeated versions of the origin coordinates.
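
The same idea works in the (300, 3) layout from the original question. This is a sketch with made-up data, where the origin is stored as a 1×3 row so that broadcasting stretches it along the first dimension:

```julia
origin_row = [1.0 2.0 3.0]                              # 1×3 Matrix
directions = reshape(collect(1.0:900.0), 300, 3)        # 300×3 Matrix of dummy values

# Broadcasting expands the size-1 first dimension of origin_row to 300
# without ever materializing a 300×3 copy of the origin.
endpoints = origin_row .+ directions

# The explicitly tiled version gives the same result but allocates the copies.
endpoints_tiled = repeat(origin_row, 300) .+ directions

println(size(endpoints))
println(endpoints == endpoints_tiled)
```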

It also sounds like a job for https://github.com/JuliaArrays/StaticArrays.jl, but maybe that’s for later.

Julia has a lot of great storage containers besides multidimensional arrays. Adding more information does not always mean you should add a dimension. (This is something I had to unlearn from Matlab.)

Let’s do exactly that:

julia> struct Ray
           u
           v
           w
       end

julia> bundle = [Ray(1, 2, 3) for i in 1:300]
300-element Vector{Ray}:
 Ray(1, 2, 3)
 Ray(1, 2, 3)
 Ray(1, 2, 3)
 ⋮

julia> bundle[2] = Ray(4, 5, 6)
Ray(4, 5, 6)

julia> bundle
300-element Vector{Ray}:
 Ray(1, 2, 3)
 Ray(4, 5, 6)
 Ray(1, 2, 3)
 ⋮

julia> bundle[2].v
5

You can also consider changing the definition of Ray by specifying concrete parametric types for your fields and subtyping FieldVector for some extra speed and functionality:

julia> using StaticArrays

julia> struct Ray{T<:Number} <: FieldVector{3,T}
           u::T
           v::T
           w::T
       end
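
To see what concretely typed fields change, here is a Base-only sketch that leaves StaticArrays out entirely. LooseRay and TypedRay are hypothetical names used only for this comparison:

```julia
# Untyped fields: each field is of type Any, so values are stored as references.
struct LooseRay
    u
    v
    w
end

# Parametric, concretely typed fields: the layout is known once T is known.
struct TypedRay{T<:Number}
    u::T
    v::T
    w::T
end

println(fieldtype(LooseRay, :u))               # field type of the untyped struct
println(fieldtype(TypedRay{Float64}, :u))      # field type once T is fixed

# A bits type can be stored inline in an array, with no pointer per element.
println(isbitstype(LooseRay))
println(isbitstype(TypedRay{Float64}))

typed_bundle = [TypedRay(1.0, 2.0, 3.0) for _ in 1:300]
println(eltype(typed_bundle))
```

Being a bits type also matters for the GPU plans mentioned earlier in the thread, since GPU array packages generally expect such element types.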

Slight update: the solutions work in Julia 1.7, to which I have now upgraded, but array creation with multiple semicolons failed in version 1.4.1, which was shipped with my OS.

That is pretty much how I have done it so far, but every ray needs not only a direction but also an origin point, which can eventually differ between rays (scattering in the scene).

struct Ray{T}
    origin::Vector{T}
    direction::Vector{T}
end

function get_ray_position(origin, direction, t)
    return origin + t*direction
end

function get_ray_position(ray::Ray, t)
    return ray.origin + t*ray.direction
end
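
A short usage sketch of the definitions above, with the struct and the Ray method repeated so the snippet runs on its own. The sample values and the names ray, p, ts and points are made up:

```julia
struct Ray{T}
    origin::Vector{T}
    direction::Vector{T}
end

# Position along the ray at parameter t.
get_ray_position(ray::Ray, t) = ray.origin + t * ray.direction

ray = Ray([0.0, 0.0, 0.0], [1.0, 2.0, 3.0])

# A single point along the ray.
p = get_ray_position(ray, 2.0)

# Several parameter values at once, collected with a comprehension.
ts = [0.0, 0.5, 1.0]
points = [get_ray_position(ray, t) for t in ts]

println(p)
println(points)
```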

It’s a pretty recent addition. Also, the Julia versions that distros ship usually have some issues, so the official binaries tend to work best.

You may be interested in this previous thread, where some of us have implemented the “Raytracing in a Weekend” book in julia.

I want to advise you though that naively porting numpy code 1:1 is probably going to leave performance on the table.
