I'm converting a Python/NumPy program into Julia

JuliaCaesar · February 26, 2017, 10:47am

Hi,

First post here, hello everybody.

I would like to convert a few bits of Python code into Julia.

Python:

import numpy as np
import matplotlib.pyplot as plt

rot = np.array([[0.90, -0.30], [0.30, 0.90]])
sca = np.array([[3.4, 0], [0, 2]])

np.random.seed(150)
cd = (np.random.randn(100,2)).dot(sca).dot(rot)

This is what I have so far:

rot = [[0.90, -0.30], [0.30, 0.90]]
sca = [[3.4, 0], [0, 2]]

srand(150)
cd = (randn(100,2))

But I’m unable to apply the rotation and scaling to cd because I can’t apply dot() to cd.

Any hint is welcome.

dfdx · February 26, 2017, 11:36am

Try this:

# In Julia, the syntax for matrix literals is closer to Matlab's
# than to NumPy's:
rot = [0.90 -0.30; 0.30 0.90]
# 2×2 Array{Float64,2}:
#  0.9  -0.3
#  0.3   0.9

sca = [3.4 0; 0 2]
# 2×2 Array{Float64,2}:
#  3.4  0.0
#  0.0  2.0

srand(150)

# `dot` for matrix arguments in NumPy is just matrix multiplication in Julia.
# Use `randn` (normally distributed samples) to match `np.random.randn`:
cd = randn(100, 2) * sca * rot
# 100×2 Array{Float64,2}
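
To spell out why `*` is the right replacement: for 2-D arrays, NumPy's `dot` computes the ordinary row-by-column matrix product. Below is a minimal pure-Python sketch of that product, applied to the `sca` and `rot` matrices from the question. The helper name `matmul` is made up for illustration and is not part of NumPy or Julia.

```python
def matmul(a, b):
    """Row-by-column product of two matrices stored as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


sca = [[3.4, 0.0], [0.0, 2.0]]
rot = [[0.90, -0.30], [0.30, 0.90]]

# Same operation as np.array(sca).dot(np.array(rot)) in NumPy,
# or sca * rot in Julia.
combined = matmul(sca, rot)

for row in combined:
    print(row)
```

Each entry of `combined` is the sum of products of one row of `sca` with one column of `rot`, which is exactly what Julia's `*` does for matrices.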

JuliaCaesar · February 26, 2017, 3:41pm

Worked fine, thanks.

ChrisRackauckas · February 26, 2017, 4:00pm

http://docs.julialang.org/en/stable/manual/noteworthy-differences/#noteworthy-differences-from-python

JuliaCaesar · February 26, 2017, 6:51pm

Thanks for the link.

I’m now stuck at plotting.
Python follow-up:

cd1 = np.random.randn(25,2)+[-10, 2]
cd2 = np.random.randn(25,2)+[-7, -2]
data = np.concatenate((cd, cd1, cd2,))
l1c = np.ones(100, dtype=int)
l2c = np.zeros(100, dtype=int)
labels = np.concatenate((l1c, l2c))
cm = np.array(['r','g'])
plt.scatter(data[:,0],data[:,1],c=cm[labels],s=50,edgecolors='none')
plt.show()

Julia follow-up:

...
cd = randn(100,2) * sca * rot
cd1 = randn(25,2) .+ [-10 2]
cd2 = randn(25,2) .+ [-7 -2]
data = cat(1, cd, cd1, cd2)
l1c = ones(Int, 100)
l2c = zeros(Int, 100)
labels = cat(1, l1c, l2c)
cm =['r';'g']

I tried various approaches without success, including PyPlot, but I’d like to stay away from Python technology.

ChrisRackauckas · February 26, 2017, 7:52pm

Give Plots.jl a try.

JuliaCaesar · February 26, 2017, 8:08pm

I did, but I’m unable to reproduce what the Python code does.

mkborregaard · February 26, 2017, 8:45pm

What does the Python code do?

JuliaCaesar · February 26, 2017, 9:23pm

It plots this:

mkborregaard · February 26, 2017, 9:24pm

You might try

using Plots
scatter(data[:,0], data[:,1], color =  [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)

(or replace color with c, markersize with ms and markerstrokewidth with msw, doesn’t matter).

JuliaCaesar · February 26, 2017, 10:23pm

I get an error :

julia> scatter(data[:,0], data[:,1], color =  [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)
ERROR: BoundsError: attempt to access 150x2 Array{Float64,2}:

ChrisRackauckas · February 26, 2017, 10:33pm

Change the indexing to 1-based, i.e. 1 and 2.

JuliaCaesar · February 26, 2017, 10:34pm

I thought of that, but I get the same error:

julia> scatter(data[:,1], data[:,2], color =  [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)

[Plots.jl] Initializing backend: pyplot
INFO: Precompiling module PyPlot.
/usr/lib64/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
ERROR: BoundsError: attempt to access 150-element Array{Float64,1}:

mkborregaard · February 26, 2017, 10:37pm

That’s because your labels vector is 200 long, data is only 150.
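
A quick way to catch this kind of mismatch before plotting is to compare the number of points with the number of labels up front. Here is a small sketch on the Python side using the sizes from this thread. The helper `check_lengths` is hypothetical, and the "fixed" labels are only one possible correction (you could equally change the data sizes instead). In Julia the same comparison would be `size(data, 1) == length(labels)`.

```python
def check_lengths(n_points, labels):
    """Return True when there is exactly one label per data point."""
    return n_points == len(labels)


# Rows of cd, cd1 and cd2 stacked together: 100 + 25 + 25.
n_points = 100 + 25 + 25

# Labels as posted: 100 ones followed by 100 zeros.
labels_as_posted = [1] * 100 + [0] * 100

# One possible fix (an assumption): one label per row of data.
labels_one_fix = [1] * 100 + [0] * 50

print(check_lengths(n_points, labels_as_posted))
print(check_lengths(n_points, labels_one_fix))
```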

JuliaCaesar · February 26, 2017, 10:47pm

Oh I see, I corrected it and it’s now working!
Thanks, this is helpful.

dpsanders · February 26, 2017, 11:28pm

Please report the whole error message in future.

JuliaCaesar · February 27, 2017, 12:30am

Ok.

Now I’m trying to split the data.
Python:

X_train1, X_test1, y_train1, y_test1 = train_test_split(data, labels, test_size=0.33)

Julia:

(X_train1, X_test1), (y_train1, y_test1) = splitdata(data, labels; at = 0.33)
ERROR: AssertionError: nobs(X) == nobs(y)
 in splitdata at /home/me/.julia/v0.4/MLDataUtils/src/datasplits/splitdata.jl:34

I don’t get it, as the data types and dimensions seem consistent:

julia> typeof(data)
Array{Float64,2}
julia> data
200x2 Array{Float64,2}:

julia> typeof(labels)
Array{Int64,1}
julia> labels
200-element Array{Int64,1}:

JuliaCaesar · February 27, 2017, 3:33am

Looks like I had to transpose the data matrix, as this now works:
(X_train1, X_test1), (y_train1, y_test1) = splitdata(transpose(data), labels; at = 0.33)
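
The `nobs(X) == nobs(y)` assertion, together with the fact that transposing fixes it, suggests that `splitdata` counts observations along the columns of the matrix, whereas the NumPy code stores one observation per row. This is my reading of the error, not something I checked in the MLDataUtils source. A tiny pure-Python sketch of the two conventions, with made-up data and hypothetical helper names:

```python
def transpose(rows):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*rows)]


def nobs_columns(matrix):
    """Number of observations when each column is one observation."""
    return len(matrix[0])


# Three observations with two features each, one observation per row
# (the NumPy / scikit-learn layout). The values are illustrative only.
data = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
labels = [1, 0, 1]

# Counting along columns sees only 2 "observations" for 3 labels.
print(nobs_columns(data) == len(labels))

# After transposing, each column is one observation and the counts agree.
print(nobs_columns(transpose(data)) == len(labels))
```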
