Sorry if the title is not quite appropriate, but here is the problem. I have some arrays of categorical or boolean values, and I want to form every combination of their elements and collect the combinations into an array. Each combination may also have its own unique value (in the example below, these values are in the z vector).
I tried the following example and it worked, but honestly I didn’t like the way I did it. Is there a simpler way to do this?
Suppose I have
x = [false, true];
y = ['A', 'B'];
z = rand(4);
And I want to produce a two-dimensional array like the following:
4×3 Array{Any,2}:
false 'A' 0.961038
true 'A' 0.518147
false 'B' 0.210022
true 'B' 0.543537
Here is what I tried
A = [(i, j) for i in x, j in y];
B = [collect(zip(A[i]..., z[i])) for i in 1:length(A)];
C = reshape([i for i in (B[1]...)], 1, 3);
for j in 2:length(B)
C = vcat(C, reshape([i for i in (B[j]...)], 1, 3))
end
You are looking for Iterators.product.
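using DataFrames
x = [false, true];
y = ['A', 'B'];
z = rand(4);
# Declare the column types up front so that plain Tuples can be pushed as rows.
df = DataFrame(a = Bool[], b = Char[], c = Float64[])
# Note: product(x, y, z) iterates over all 2 * 2 * 4 = 16 combinations.
for p in Iterators.product(x, y, z)
    push!(df, p)
end
Matrix(df)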
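To see what Iterators.product yields on its own, here is a minimal sketch using only Base. The variable names grid, combos, and n_with_z are just for illustration. It shows that the first argument varies fastest, which matches the row order in the question, and that adding z as a third argument multiplies the number of combinations rather than pairing one z value with each (x, y) pair.

```julia
x = [false, true]
y = ['A', 'B']

# Iterators.product is lazy; collect materializes it as a 2×2 Matrix of tuples.
grid = collect(Iterators.product(x, y))

# vec flattens in column-major order, so the first argument (x) varies fastest.
combos = vec(grid)
# combos == [(false, 'A'), (true, 'A'), (false, 'B'), (true, 'B')]

# A third collection multiplies the count: 2 * 2 * 4 = 16 tuples.
n_with_z = length(Iterators.product(x, y, rand(4)))
```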
Hi @pdeffebach, thanks a lot for the reply and the solution. This looks way simpler than mine. So I need to create a DataFrame? Could it be done independently of DataFrames? Also, I just tried the code and it didn’t work. Maybe I don’t know the details of Iterators.product. I can check.
Sorry, edited the above so it works. It looks like you have to declare the columns first if you are pushing just Tuples instead of NamedTuples.
You don’t have to use DataFrames, but it’s a convenient thing to do, plus you said in your post you are interested in “creating a dataset”.
You can do the same process by initializing an array of Any with 3 columns and 0 rows.
julia> df = Array{Any}(undef, 0, 3)
julia> for p in Iterators.product(x, y, z)
global df # I'm working in the REPL, so the loop runs at global scope and df must be declared global (not required in the REPL since Julia 1.5)
df = vcat(df, permutedims(collect(p)))
end
Hi Peter, thanks, this actually works. This is what I wanted:
df = Array{Any}(undef, 0, 2)
for p in Iterators.product(x, y)
df = vcat(df, permutedims(collect(p)))
end
hcat(df, z)
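For reference, the same result can probably be built without growing the array row by row. The sketch below is one possible way, not the only one. It assumes z holds exactly one value per (x, y) combination, and combo_table is a hypothetical helper name, not something from Base or DataFrames.

```julia
# combo_table is a hypothetical helper name used only for this sketch.
function combo_table(x, y, z)
    # All (x, y) pairs in column-major order: x varies fastest.
    combos = vec(collect(Iterators.product(x, y)))
    length(combos) == length(z) ||
        throw(ArgumentError("z needs one value per (x, y) combination"))
    # Any[...] keeps the mixed Bool/Char/Float64 columns in a single Matrix{Any}.
    return hcat(Any[c[1] for c in combos], Any[c[2] for c in combos], z)
end

x = [false, true]
y = ['A', 'B']
z = rand(4)
M = combo_table(x, y, z)  # 4×3 Matrix{Any}, rows in the same order as in the question
```

Building the columns once and calling hcat a single time avoids reallocating the matrix on every vcat, which matters more as the number of combinations grows.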