I'm converting a Python/NumPy program into Julia

Hi,

This is my first post here. Hello, everybody.

I would like to convert a few bits of Python code to Julia.

Python:

import numpy as np
import matplotlib.pyplot as plt

rot = np.array([[0.90, -0.30], [0.30, 0.90]])
sca = np.array([[3.4, 0], [0, 2]])

np.random.seed(150)
cd = (np.random.randn(100,2)).dot(sca).dot(rot)

Here is how far I got:

rot = [[0.90, -0.30], [0.30, 0.90]]
sca = [[3.4, 0], [0, 2]]

srand(150)
cd = (randn(100,2))

But I’m unable to apply the rotation and scaling to cd because I can’t call dot() on it.

Any hint is welcome.

Try this:

# in Julia, the syntax for matrix literals is closer to MATLAB's
# than to NumPy's:
rot = [0.90 -0.30; 0.30 0.90]
# 2×2 Array{Float64,2}:
#  0.9  -0.3
#  0.3   0.9

sca = [3.4 0; 0 2]
# 2×2 Array{Float64,2}:
#  3.4  0.0
#  0.0  2.0

srand(150)

# `dot` for matrix arguments in NumPy is just matrix multiplication in Julia
# (`randn`, not `rand`, is the counterpart of `np.random.randn`):
cd = randn(100, 2) * sca * rot
# 100×2 Array{Float64,2}
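
For reference, here is a small NumPy check of the claim above, using the matrices from the original post: for 2-D arrays, chained `.dot` calls give the same result as the `@` matrix-multiplication operator, which corresponds to `*` on matrices in Julia. It is only a sanity-check sketch, not part of the conversion itself.

```python
import numpy as np

rot = np.array([[0.90, -0.30], [0.30, 0.90]])
sca = np.array([[3.4, 0.0], [0.0, 2.0]])

rng = np.random.RandomState(150)   # local generator, same seed as in the post
x = rng.randn(100, 2)

cd_dot = x.dot(sca).dot(rot)       # the form used in the original post
cd_matmul = x @ sca @ rot          # plain matrix multiplication

same_result = np.allclose(cd_dot, cd_matmul)
print(cd_dot.shape, same_result)
```

Note that using the same seed in Julia will not reproduce NumPy's numbers, since the two use different random number streams; only the distribution of the points will match.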

Worked fine, thanks.

http://docs.julialang.org/en/stable/manual/noteworthy-differences/#noteworthy-differences-from-python

Thanks for the link.

I’m now stuck at plotting.
Python follow-up:

cd1 = np.random.randn(25,2)+[-10, 2]
cd2 = np.random.randn(25,2)+[-7, -2]
data = np.concatenate((cd, cd1, cd2,))
l1c = np.ones(100, dtype=int)
l2c = np.zeros(100, dtype=int)
labels = np.concatenate((l1c, l2c))
cm = np.array(['r','g'])
plt.scatter(data[:,0],data[:,1],c=cm[labels],s=50,edgecolors='none')
plt.show()

Julia follow-up:

...
cd = randn(100,2) * sca * rot
cd1 = randn(25,2) .+ [-10 2]
cd2 = randn(25,2) .+ [-7 -2]
data = cat(1, cd, cd1, cd2)
l1c = ones(Int, 100)
l2c = zeros(Int, 100)
labels = cat(1, l1c, l2c)
cm = ['r'; 'g']

I tried various approaches without success, including PyPlot, but I’d like to stay away from Python-based tooling.

Give Plots.jl a try.

I did, but I’m unable to reproduce what the Python code does.

What does the Python code do?

It plots this:

You might try

using Plots
scatter(data[:,0], data[:,1], color =  [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)

(or replace color with c, markersize with ms and markerstrokewidth with msw, doesn’t matter).

I get an error:

julia> scatter(data[:,0], data[:,1], color =  [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)
ERROR: BoundsError: attempt to access 150x2 Array{Float64,2}:

Change the indexing to 1-based, i.e. use 1 and 2.

I thought of that, but I get the same kind of error:

julia> scatter(data[:,1], data[:,2], color =  [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)

[Plots.jl] Initializing backend: pyplot
INFO: Precompiling module PyPlot.
/usr/lib64/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
ERROR: BoundsError: attempt to access 150-element Array{Float64,1}:

That’s because your labels vector is 200 long, data is only 150.
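
One way to avoid this kind of mismatch is to derive the label vector from the sizes of the pieces being concatenated instead of hard-coding the lengths. A minimal NumPy sketch, assuming the large cluster should get label 1 and the two small clusters label 0 (that grouping is an assumption on my part, it is not stated above):

```python
import numpy as np

rng = np.random.RandomState(150)
cd = rng.randn(100, 2)
cd1 = rng.randn(25, 2) + [-10, 2]
cd2 = rng.randn(25, 2) + [-7, -2]
data = np.concatenate((cd, cd1, cd2))

# Build the labels from the actual block sizes so they cannot drift apart.
labels = np.concatenate((
    np.ones(len(cd), dtype=int),                # assumed: large cluster -> 1
    np.zeros(len(cd1) + len(cd2), dtype=int),   # assumed: small clusters -> 0
))

print(data.shape, labels.shape)
```

The same idea carries over to Julia by using the row counts of the blocks in place of the literal 100.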

Oh, I see. I corrected it and it’s now working!
Thanks, this is helpful.

Please post the whole error message in the future.

Ok.

Now I’m trying to split the data.
Python:

X_train1, X_test1, y_train1, y_test1 = train_test_split(data, labels, test_size=0.33)

Julia:

(X_train1, X_test1), (y_train1, y_test1) = splitdata(data, labels; at = 0.33)
ERROR: AssertionError: nobs(X) == nobs(y)
 in splitdata at /home/me/.julia/v0.4/MLDataUtils/src/datasplits/splitdata.jl:34

I don’t understand why, since the data types and dimensions seem consistent:

julia> typeof(data)
Array{Float64,2}
julia> data
200x2 Array{Float64,2}:

julia> typeof(labels)
Array{Int64,1}
julia> labels
200-element Array{Int64,1}:

It looks like I had to transpose the data matrix, since this now works:
(X_train1, X_test1), (y_train1, y_test1) = splitdata(transpose(data), labels; at = 0.33)
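
The assertion compares the number of observations in the two arguments. As far as I can tell, MLDataUtils counts observations along the last dimension (the columns), whereas NumPy and scikit-learn code usually puts them in the rows, which would explain why the transpose is needed. The sketch below shows the same kind of check with the row convention; `split_rows` is a hypothetical NumPy-only helper written for illustration, not a library function.

```python
import numpy as np

def split_rows(X, y, test_size=0.33, seed=0):
    """Shuffle and split X (observations in rows) and y together."""
    n = X.shape[0]
    if n != y.shape[0]:
        raise ValueError("X has %d rows but y has %d entries" % (n, y.shape[0]))
    order = np.random.RandomState(seed).permutation(n)
    n_test = int(round(n * test_size))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

data = np.zeros((200, 2))    # placeholder data, 200 observations in rows
labels = np.arange(200)

X_train, X_test, y_train, y_test = split_rows(data, labels)
print(X_train.shape, X_test.shape)

# With the transposed matrix the 2 features end up in the rows,
# so the observation counts no longer agree.
try:
    split_rows(data.T, labels)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)
```

It may also be worth checking what `at = 0.33` means: `test_size=0.33` in Python is the test fraction, while `at` might be the fraction that goes into the first part, in which case the proportions would be swapped. I have not verified this.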