Converting a Python NumPy program to Julia

question

#1

Hello,

I am trying to convert Python NumPy code to Julia. It has 5 datasets, each with 10000 rows and 3072 columns.

Each dataset is a 10000x3072 NumPy array of uint8s. Each row of the array stores a 32x32 colour image. The first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The image is stored in row-major order, so the first 32 entries of a row are the red channel values of the first row of the image.
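As a minimal sketch of the layout described above, the offset of a single value inside one 3072-entry row can be written out in plain Python. The helper name `pixel_offset` and the constants are hypothetical, introduced only for illustration.

```python
# Hypothetical helper: 0-based offset of one value inside a 3072-entry row,
# following the layout described above (colour channel, then image row, then column).
IMAGE_SIDE = 32
CHANNEL_SIZE = IMAGE_SIDE * IMAGE_SIDE  # 1024 values per colour channel


def pixel_offset(channel, row, col):
    """channel: 0 = red, 1 = green, 2 = blue; row and col are 0-based."""
    return channel * CHANNEL_SIZE + row * IMAGE_SIDE + col


print(pixel_offset(0, 0, 0))    # first red value -> 0
print(pixel_offset(0, 0, 31))   # last red value of the first image row -> 31
print(pixel_offset(1, 0, 0))    # first green value -> 1024
print(pixel_offset(2, 31, 31))  # last blue value -> 3071
```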

Python Numpy Code

#1st function, which sends one file name at a time to the 2nd function to load data
  xs = []

  for b in range(1, 6):
    xs.append(X)  # the array described above is appended 5 times

  Xtr = np.concatenate(xs)
  return Xtr

#2nd function, which loads a file from disk
  X = X.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype("float")
  return X

#Finally, once Xtr is returned, Xtr_rows becomes 50000 x 3072
Xtr_rows = Xtr.reshape(Xtr.shape[0], 32 * 32 * 3)
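To make the target of the conversion explicit, here is a small plain-Python sketch (no NumPy) of what the reshape, transpose, reshape sequence above does to one row: it moves the value at a given source column to a new column. The function name `destination_column` is hypothetical and the arithmetic assumes the row-major semantics of the NumPy calls above.

```python
def destination_column(source_column):
    """Column in Xtr_rows that receives the value from `source_column` of the raw row."""
    # reshape(10000, 3, 32, 32): split the offset into (channel, row, col), row-major.
    channel, rest = divmod(source_column, 32 * 32)
    row, col = divmod(rest, 32)
    # transpose(0, 2, 3, 1) followed by the final reshape: the flat order
    # becomes (row, col, channel), again row-major.
    return (row * 32 + col) * 3 + channel


print(destination_column(0))     # 0: the first column stays in place
print(destination_column(1))     # 3: second red value moves to column 3
print(destination_column(1024))  # 1: first green value moves to column 1
print(destination_column(3071))  # 3071: the last column stays in place
```

Only the first and last columns map to themselves, so any Julia translation has to reproduce this exact permutation of the 3072 columns.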

Julia Code

#1st function, which sends one file name at a time to the 2nd function to load data
  xs = []
  for b = 1:5
    push!(xs, X)
    push!(ys, Y)
  end
  Xtr = convert.(Float64, vcat(xs...))
  return Xtr

#2nd function, which loads a file from disk
  X = permutedims(reshape(X, 10000, 3, 32, 32), [1, 3, 4, 2])
  return X

#Finally, once Xtr is returned, Xtr_rows becomes 50000 x 3072
Xtr_rows = reshape(Xtr, size(Xtr, 1), 32*32*3)

The results I get are different. The first and last columns of Xtr_rows are the same, but the rest of the columns differ. I am sure this is because Python is row-major and Julia is column-major, which affects how the array is reshaped. I have tried different combinations while reshaping, but I am not getting the desired result.
Kindly let me know how to solve this issue.

Thank You


#2

I would guess (without fully comprehending your data layout) that you need to reverse the sizes that you pass to reshape(). Let’s pick a waaaaay simpler example:

Python

In [2]: data = np.array([1,2,3,4,5,6])

In [3]: data.reshape(3, 2)
Out[3]:
array([[1, 2],
       [3, 4],
       [5, 6]])

Julia

julia> data = [1,2,3,4,5,6]
6-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6

# This doesn't match numpy, because Julia Arrays are column-major
julia> reshape(data, 3, 2)
3×2 Array{Int64,2}:
 1  4
 2  5
 3  6

# This does match
julia> reshape(data, 2, 3)'
3×2 Array{Int64,2}:
 1  2
 3  4
 5  6

You’ll also need to call permutedims(), but only after you do the correct reshape.


#3

Julia is column major, right?
https://docs.julialang.org/en/stable/manual/performance-tips/#Access-arrays-in-memory-order,-along-columns-1


#4

Yes


#5

Derpty derp, thanks. I’ll edit the post.


#6

It is important to understand how reshape works conceptually in order to move your code from Python to Julia.

In Python, reshape iterates over the indices from first to last (imagine a nested for loop), so the last index changes most rapidly. In Julia it is the opposite: reshape iterates over the indices from last to first, so the first index changes most rapidly. This generalizes row-major order (Python) and column-major order (Julia) to multidimensional arrays.
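The two iteration orders can be sketched in plain Python (no NumPy), under the assumption that a reshape simply places the k-th element of the flat data at the k-th index tuple in the given order. The function names below are hypothetical.

```python
def unravel_row_major(k, shape):
    """Index tuple of flat position k when the LAST index changes fastest (NumPy default)."""
    index = []
    for n in reversed(shape):
        k, r = divmod(k, n)
        index.append(r)
    return tuple(reversed(index))


def unravel_col_major(k, shape):
    """Index tuple of flat position k when the FIRST index changes fastest (Julia)."""
    index = []
    for n in shape:
        k, r = divmod(k, n)
        index.append(r)
    return tuple(index)


def reshape_2d(flat, shape, unravel):
    """Fill a rows x cols nested list by placing flat[k] at unravel(k, shape)."""
    rows, cols = shape
    out = [[None] * cols for _ in range(rows)]
    for k, value in enumerate(flat):
        i, j = unravel(k, shape)
        out[i][j] = value
    return out


data = [1, 2, 3, 4, 5, 6]
print(reshape_2d(data, (3, 2), unravel_row_major))  # [[1, 2], [3, 4], [5, 6]]
print(reshape_2d(data, (3, 2), unravel_col_major))  # [[1, 4], [2, 5], [3, 6]]
```

The two printed results correspond to the NumPy and Julia outputs of the 3 by 2 example earlier in the thread.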

This rule is followed both when scanning the source matrix and when ‘filling’ the new one (in Julia only the internal representation is modified, so no real copying happens; I am not sure about Python).

Given that, if you want reshape to work in Julia as it would in Python, you should transpose the source matrix, reverse the order of the sizes passed to reshape, and then use permutedims to clean up, like so:

X=permutedims(reshape(X',32,32,3,10000), [4,1,2,3])

Notice I replaced every index i in Python’s transpose/permutedims with 4+1-i in Julia, because that’s the corresponding index now after flipping the reshape indices.

Disclaimer: I didn’t test this, so it may not work!
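The reasoning behind reversing the sizes can be checked with a short plain-Python sketch: a row-major offset for an index tuple equals the column-major offset for the reversed tuple in the reversed shape. The function names and the small stand-in shape are assumptions made for illustration; this checks the index arithmetic only, not the Julia one-liner itself.

```python
from itertools import product


def offset_row_major(index, shape):
    """Flat offset when the last index changes fastest."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def offset_col_major(index, shape):
    """Flat offset when the first index changes fastest."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset


shape = (2, 3, 4, 4)  # small stand-in for (10000, 3, 32, 32)
reversed_shape = tuple(reversed(shape))

# Every element sits at the same flat offset once both the shape and
# the index tuple are reversed.
all_match = all(
    offset_row_major(idx, shape)
    == offset_col_major(tuple(reversed(idx)), reversed_shape)
    for idx in product(*(range(n) for n in shape))
)
print(all_match)  # True
```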

Edited to add: by the way, transpose and permutedims are expensive operations, so try to avoid them by using the right shape from the beginning, or by wrapping your array in a special getter function that flips the indices for you without copying the whole matrix.
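A rough sketch of the wrapper idea, in plain Python: a small read-only class that presents a flat row-major buffer with its axes reversed, translating indices on access instead of copying. The class name `ReversedIndexView` is hypothetical and this is only one possible way to do it.

```python
class ReversedIndexView:
    """Read-only view of a flat row-major buffer with the axis order reversed.

    No data is copied; each lookup flips the index tuple and computes the offset.
    """

    def __init__(self, flat, shape):
        self._flat = flat
        self._shape = tuple(shape)  # row-major shape of the underlying data

    @property
    def shape(self):
        return tuple(reversed(self._shape))

    def __getitem__(self, index):
        original = tuple(reversed(index))  # flip back to the stored axis order
        if len(original) != len(self._shape):
            raise IndexError("wrong number of indices")
        offset = 0
        for i, n in zip(original, self._shape):
            if not 0 <= i < n:
                raise IndexError("index out of range")
            offset = offset * n + i
        return self._flat[offset]


flat = [1, 2, 3, 4, 5, 6]            # row-major 3 x 2: [[1, 2], [3, 4], [5, 6]]
view = ReversedIndexView(flat, (3, 2))
print(view.shape)   # (2, 3)
print(view[1, 0])   # 2, the element stored at [0][1]
print(view[0, 2])   # 5, the element stored at [2][0]
```

The trade-off is a little arithmetic on every access in exchange for not materializing a permuted copy.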


#7

@mohamed82008, @rdeits Thank you very much for helping me understand reshape and permutedims().

Below is the code which resolved the issue.

X=permutedims(reshape(X',32,32,3,10000), [4,3,2,1])


#8

Not immediately relevant to your question, but note that Julia supports an RGB type and that can eliminate one index’s worth of complexity in your data structures.
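To illustrate what removing one index means, here is a plain-Python sketch that groups the three channel values of each pixel into a single element. The `RGB` named tuple and `row_to_image` are hypothetical stand-ins, not the Julia RGB type mentioned above, and the input row is synthetic placeholder data.

```python
from typing import NamedTuple


class RGB(NamedTuple):
    r: int
    g: int
    b: int


SIDE = 32
PLANE = SIDE * SIDE  # 1024 values per colour channel


def row_to_image(row):
    """Turn one 3072-entry row (red, green, blue planes) into a SIDE x SIDE grid of RGB."""
    return [
        [
            RGB(
                row[h * SIDE + w],
                row[PLANE + h * SIDE + w],
                row[2 * PLANE + h * SIDE + w],
            )
            for w in range(SIDE)
        ]
        for h in range(SIDE)
    ]


# Synthetic row whose values equal their own offsets, so the grouping is easy to trace
# (real data would hold uint8 values in 0..255).
row = list(range(3 * PLANE))
image = row_to_image(row)
print(image[0][1])  # RGB(r=1, g=1025, b=2049)
```

Each image is then indexed by row and column only, with the channel carried inside the element.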


#9

@tim.holy Thank You very much. Will certainly try that.


#10

@tim.holy Kindly add documentation. The package does seem interesting.


#11

Try http://juliaimages.github.io/latest/


#12

@tim.holy Thank You :slight_smile: