Arrays, Matrix, append!

Hey all,

Maybe you can give me some insight into arrays in Julia and their dimensions. I come from Python 3.7 and want to translate a Python script to Julia.

In Python, I create a list X_l and iterate through a set of vectors, appending one vector to X_l in each iteration. So far so good. I then convert the list with np.array(X_l), and .shape gives me (4932, 300).

In Julia, I use unique, which in my case should be equivalent to Python's set.
After the loop, I want to convert my nested list into an array of shape (4932, 300), but when I try the basic functions from https://docs.julialang.org/en/v1/manual/arrays/ (for example size), I do not get the correct shape. Maybe you can help me and suggest how to solve this? A little hint would be appreciated. Thank you.

kind regards,
Lucas

It would be much easier to help if you posted an MWE.

Here is the MWE:

# Python 3.7

import numpy as np

# Variables

en_fr = {
    'the': 'la',
    'and': 'et',
    'was': 'était',
    'for': 'pour'
}
english_vecs = {
    'the': np.asarray([0.08007812, 0.10498047, 0.04980469]),
    'and': np.asarray([7.12890625e-02, -2.58789062e-02, 1.81884766e-02]),
    'was': np.asarray([6.44531250e-02, 8.64257812e-02, -1.69921875e-01]),
    'for': np.asarray([-1.17797852e-02, -4.73632812e-02, 4.46777344e-02])
}

X_l = list()
english_set = set(english_vecs.keys())  # dictionary keys
for en_word, fr_word in en_fr.items():  # en_fr is a dictionary
    if en_word in english_set:
        en_vec = english_vecs[en_word]
        X_l.append(en_vec)

X = np.array(X_l)
X.shape  # (4, 3)

# Julia 1.5.1

# Variables

#en_fr = Dict()
en_fr = Dict([
    "the" => "la",
    "and" => "et",
    "was" => "était",
    "for" => "pour"
])

english_vecs = Dict([
    "the" => [0.08007812, 0.10498047, 0.04980469],
    "and" => [7.12890625e-02, -2.58789062e-02, 1.81884766e-02],
    "was" => [6.44531250e-02, 8.64257812e-02, -1.69921875e-01],
    "for" => [-1.17797852e-02, -4.73632812e-02, 4.46777344e-02]
])

X_l = []

english_set = unique(keys(english_vecs))

for (en_word, fr_word) in en_fr
    println(en_word, fr_word)
    if en_word in english_set
        en_vec = english_vecs[en_word]
        append!(X_l, en_vec)
    end
end
X = X_l
size(X)  # (12,) instead of the (4, 3) I want

I know that I am missing the equivalent of np.array when I translate from Python to Julia, but I could not find a good suggestion so far. Maybe someone is able to help me? Thanks in advance!

Your MWE didn’t work for me:

>>> X_l = list()
>>> english_set = set(english_vecs.keys()) # Dictionary keys
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>

for en_word, fr_word in en_fr.items(): #en_fr is a Dictionary
#if condition
en_vec = english_vecs[en_word]
X_l.append(en_vec)NameError: name 'english_vecs' is not defined
>>>
>>> for en_word, fr_word in en_fr.items(): #en_fr is a Dictionary
... #if condition
... en_vec = english_vecs[en_word]
  File "<stdin>", line 3
    en_vec = english_vecs[en_word]
      X_l.append(en_vec)   ^
IndentationError: expected an indented block
>>> X_l.append(en_vec)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'en_vec' is not defined

did I do something wrong?

Maybe it was not a clear MWE; I posted it too fast. I will write a better one.

Please quote your code with backticks.

updated my post, thank you.

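You could use push! instead of append!. append! adds the elements of en_vec to X_l one by one, which produces a flat vector, whereas push! adds each vector as a single element, so the result is a vector of vectors. Concatenate it into a Matrix with X_l2 = reduce(hcat, X_l) |> permutedims to get shape (4, 3).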
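
A minimal sketch of the difference, using the four small vectors from the MWE. The names vecs, flat and nested are only illustrative and do not appear in the original post.

```julia
vecs = [[0.08007812, 0.10498047, 0.04980469],
        [7.12890625e-02, -2.58789062e-02, 1.81884766e-02],
        [6.44531250e-02, 8.64257812e-02, -1.69921875e-01],
        [-1.17797852e-02, -4.73632812e-02, 4.46777344e-02]]

flat = Float64[]             # append! splices the elements in one by one
nested = Vector{Float64}[]   # push! adds each vector as a single element
for v in vecs
    append!(flat, v)
    push!(nested, v)
end

size(flat)    # (12,)
size(nested)  # (4,)

# hcat places each vector in a column (3×4); permutedims swaps the two axes.
X = reduce(hcat, nested) |> permutedims
size(X)       # (4, 3)
```

Giving the container a concrete element type (Vector{Float64}[] rather than []) is optional here, but it keeps the resulting matrix a Matrix{Float64}.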

that worked, thank you AndiMD!

One thing I think is worth mentioning is that, in Julia, you can write out explicitly what you want using simple definitions and loops, and still get efficient code. In this case, you could have done, for example (after using push!):

X = zeros(length(X_l),length(X_l[1]))
for i in 1:length(X_l)
  for j in 1:length(X_l[i])
    X[i,j] = X_l[i][j]
  end
end
println(size(X))
@show X

This is less compact, but it is often easier to write your own small function than to search for a library function that does what you want (this code can be made shorter in several ways, depending on how familiar you are with Julia syntax).
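
As a sketch of the "small function" idea, the loop above could be wrapped as follows. rows_to_matrix is a hypothetical name, not a library function, and the code assumes a non-empty vector of equal-length vectors.

```julia
# Hypothetical helper: copy a vector of equal-length vectors into a matrix,
# one inner vector per row.
function rows_to_matrix(rows)
    nrows, ncols = length(rows), length(first(rows))
    X = Matrix{eltype(first(rows))}(undef, nrows, ncols)
    for (i, row) in enumerate(rows)
        # Fail early if the input is ragged instead of reading out of bounds.
        length(row) == ncols ||
            throw(DimensionMismatch("row $i has length $(length(row)), expected $ncols"))
        for j in 1:ncols
            X[i, j] = row[j]
        end
    end
    return X
end

X_l = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
X = rows_to_matrix(X_l)
size(X)  # (2, 3)
```

Putting the loop in a function also lets the compiler specialize it on the argument types, which is generally what makes this kind of explicit loop fast in Julia.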

A comprehension X = [X_l[i][j] for i in eachindex(X_l), j in eachindex(X_l[1])] is also a good option for constructing arrays via loops.
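
A short illustration on a small made-up input (the values in X_l here are placeholders, not the data from the question). Note that the comma between the two iteration variables is what makes the result two-dimensional.

```julia
X_l = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# Comma-separated iterators produce a 2-dimensional array:
# i indexes rows (the outer vectors), j indexes columns (entries of each vector).
X = [X_l[i][j] for i in eachindex(X_l), j in eachindex(X_l[1])]
size(X)  # (2, 3)

# Two separate `for` clauses instead flatten everything into one vector.
flat = [X_l[i][j] for i in eachindex(X_l) for j in eachindex(X_l[1])]
size(flat)  # (6,)

# The 2-dimensional comprehension agrees with the hcat-based approach.
X == permutedims(reduce(hcat, X_l))  # true
```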
