Hi,
This is my first post here; hello, everybody.
I would like to convert some bits of Python code into Julia.
Python:
import numpy as np
import matplotlib.pyplot as plt
rot = np.array([[0.90, -0.30], [0.30, 0.90]])
sca = np.array([[3.4, 0], [0, 2]])
np.random.seed(150)
cd = (np.random.randn(100,2)).dot(sca).dot(rot)
Here is how far I got in Julia:
rot = [[0.90, -0.30], [0.30, 0.90]]
sca = [[3.4, 0], [0, 2]]
srand(150)
cd = (randn(100,2))
But I’m unable to apply the rotation and scaling to cd because I can’t apply dot() to cd.
Any hint is welcome.
dfdx
February 26, 2017, 11:36am
Try this:
# In Julia, the syntax for matrix literals is closer to Matlab's
# than to NumPy's:
rot = [0.90 -0.30; 0.30 0.90]
# 2×2 Array{Float64,2}:
# 0.9 -0.3
# 0.3 0.9
sca = [3.4 0; 0 2]
# 2×2 Array{Float64,2}:
# 3.4 0.0
# 0.0 2.0
srand(150)
# NumPy's `dot` on matrix arguments is just matrix multiplication (`*`) in Julia.
# Use `randn` (standard normal) to match NumPy's `np.random.randn`; even with the
# same seed, the values will generally not match NumPy's.
cd = randn(100, 2) * sca * rot
# 100×2 Array{Float64,2}
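As a quick check of that claim on the Python side, here is a minimal sketch (not from the original posts) comparing chained .dot() calls with the @ matrix-multiplication operator on 2-D arrays. rot and sca are the matrices from the question; using np.random.RandomState(150) instead of the global seed is my own choice for the example.

```python
import numpy as np

# Matrices from the question
rot = np.array([[0.90, -0.30], [0.30, 0.90]])
sca = np.array([[3.4, 0], [0, 2]])

# Local generator so the example does not touch the global RNG state
rng = np.random.RandomState(150)
x = rng.randn(100, 2)

# For 2-D arrays, .dot() is ordinary matrix multiplication
chained_dot = x.dot(sca).dot(rot)
matmul = x @ sca @ rot

same_result = np.allclose(chained_dot, matmul)
result_shape = chained_dot.shape
print(same_result, result_shape)
```

So whatever is written as a.dot(b) for matrices in NumPy can be written as a * b in Julia.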
Thanks for the link.
I’m now stuck at plotting.
Python follow-up:
cd1 = np.random.randn(25,2)+[-10, 2]
cd2 = np.random.randn(25,2)+[-7, -2]
data = np.concatenate((cd, cd1, cd2,))
l1c = np.ones(100, dtype=int)
l2c = np.zeros(100, dtype=int)
labels = np.concatenate((l1c, l2c))
cm = np.array(['r','g'])
plt.scatter(data[:,0],data[:,1],c=cm[labels],s=50,edgecolors='none')
plt.show()
Julia follow-up:
...
cd = randn(100,2) * sca * rot
cd1 = randn(25,2) .+ [-10 2]
cd2 = randn(25,2) .+ [-7 -2]
data = cat(1, cd, cd1, cd2)
l1c = ones(Int, 100)
l2c = zeros(Int, 100)
labels = cat(1, l1c, l2c)
cm =['r';'g']
I tried various approaches without success, including PyPlot, but I’d like to stay away from Python technology.
I did, but I’m unable to reproduce what the Python code does.
What does the Python code do?
You might try
using Plots
scatter(data[:,0], data[:,1], color = [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)
(or replace color with c, markersize with ms, and markerstrokewidth with msw; it doesn’t matter).
I get an error:
julia> scatter(data[:,0], data[:,1], color = [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)
ERROR: BoundsError: attempt to access 150x2 Array{Float64,2}:
Change the indexing to 1-based, i.e. use columns 1 and 2.
I thought of that, but I get the same kind of error:
julia> scatter(data[:,1], data[:,2], color = [:red :green], groups = labels, markersize=5, markerstrokewidth = 0)
[Plots.jl] Initializing backend: pyplot
INFO: Precompiling module PyPlot.
/usr/lib64/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
ERROR: BoundsError: attempt to access 150-element Array{Float64,1}:
That’s because your labels vector has 200 elements, while data has only 150 rows.
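A quick way to spot this kind of mismatch before plotting is to compare the number of rows in data with the length of labels. The sketch below is in Python and mirrors the arrays from the question; the names n_points, n_labels and lengths_match are made up for the illustration.

```python
import numpy as np

# Same shapes as in the question: 100 + 25 + 25 = 150 points
cd = np.random.randn(100, 2)
cd1 = np.random.randn(25, 2) + [-10, 2]
cd2 = np.random.randn(25, 2) + [-7, -2]
data = np.concatenate((cd, cd1, cd2))

# Labels as built in the question: 100 ones followed by 100 zeros
labels = np.concatenate((np.ones(100, dtype=int), np.zeros(100, dtype=int)))

n_points = data.shape[0]    # one row per point
n_labels = labels.shape[0]  # one label per point is expected
lengths_match = n_points == n_labels
print(n_points, n_labels, lengths_match)
```

The same check in Julia would compare size(data, 1) with length(labels).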
Oh, I see. I corrected it and it’s now working!
Thanks, this is helpful.
Please report the whole error message in future.
Ok.
Now I’m trying to split the data.
Python:
X_train1, X_test1, y_train1, y_test1 = train_test_split(data, labels, test_size=0.33)
Julia:
(X_train1, X_test1), (y_train1, y_test1) = splitdata(data, labels; at = 0.33)
ERROR: AssertionError: nobs(X) == nobs(y)
in splitdata at /home/me/.julia/v0.4/MLDataUtils/src/datasplits/splitdata.jl:34
I don’t get it, as the data types and dimensions seem consistent:
julia> typeof(data)
Array{Float64,2}
julia> data
200x2 Array{Float64,2}:
julia> typeof(labels)
Array{Int64,1}
julia> labels
200-element Array{Int64,1}:
It looks like I had to transpose the data matrix, as this now works:
(X_train1, X_test1), (y_train1, y_test1) = splitdata(transpose(data), labels; at = 0.33)
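For comparison, NumPy and scikit-learn treat rows as observations, whereas the assertion above suggests that splitdata counts observations along the last dimension. That reading is an assumption based on the error and on the fix, not something checked against the MLDataUtils documentation. The sketch below uses only NumPy: it shows why a 200x2 matrix looks like 2 observations under that convention, and does a simple shuffled split by rows. It is a hand-rolled stand-in, not scikit-learn's train_test_split, and the seed and variable names are made up for the example.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed for the illustration
data = rng.randn(200, 2)        # rows are observations (NumPy layout)
labels = np.concatenate((np.ones(100, dtype=int), np.zeros(100, dtype=int)))

# Counting observations along the last dimension:
last_dim_before = data.shape[-1]   # 2, does not match the 200 labels
last_dim_after = data.T.shape[-1]  # 200 after transposing

# Simple shuffled split by rows, holding out about 33% for testing
n_obs = data.shape[0]
n_test = int(round(0.33 * n_obs))
perm = rng.permutation(n_obs)
test_idx, train_idx = perm[:n_test], perm[n_test:]

X_train, X_test = data[train_idx], data[test_idx]
y_train, y_test = labels[train_idx], labels[test_idx]
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
```

Note that test_size=0.33 in the Python call is the fraction held out for testing; whether at = 0.33 means the same thing in splitdata is worth checking, since it may instead be the fraction kept for training.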