Function like expand.grid in R

I think the simplest approach is to iterate over Iterators.product, which yields plain Tuples (that is, it loses the name information). You can then convert each tuple to a named tuple on the fly and pass the resulting iterator to DataFrame. For example:

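using DataFrames

# Build a data frame with one row per combination of the keyword arguments.
function expand_grid(; kws...)
    names, vals = keys(kws), values(kws)
    # Iterators.product yields plain tuples; NamedTuple{names} restores the column names.
    return DataFrame(NamedTuple{names}(t) for t in Iterators.product(vals...))
end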

It would be nice to see this feature added to a “base” package like StatsModels.jl!


There is a DataFrames.jl function called allcombinations, added in the past year. See the DataFrames.jl documentation.
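
A minimal sketch of how that might look, assuming a DataFrames.jl release recent enough to include allcombinations and its keyword-argument form. The column names a, b and c are simply the ones used elsewhere in this thread:

```julia
using DataFrames

# One column per keyword argument, one row per combination of values.
df = allcombinations(DataFrame, a = 1:2, b = 'a':'c', c = [1, 2, 3, 4])

# 2 * 3 * 4 = 24 combinations, in three columns
size(df)
```

This avoids writing a helper by hand, at the cost of requiring a newer DataFrames.jl.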


Here are two other ways, with and without zip:


julia> a=1:2
1:2

julia> b='a':'c'
'a':1:'c'

julia> c=[1,2,3,4]
4-element Vector{Int64}:
 1
 2
 3
 4

julia> r(v,n)= hvncat((ones(Int,n-1)...,size(v)...), true,v...)
r (generic function with 1 method)

julia> DataFrame(NamedTuple.(zip.([[:a,:b,:c]],tuple.(r(a,1),r(b,2),r(c,3))[:])))
24×3 DataFrame
 Row │ a      b     c     
     │ Int64  Char  Int64
─────┼────────────────────
   1 │     1  a         1
   2 │     2  a         1
   3 │     1  b         1
   4 │     2  b         1
  ⋮  │   ⋮     ⋮      ⋮
  22 │     2  b         4
  23 │     1  c         4
  24 │     2  c         4
           17 rows omitted

julia> DataFrame(tuple.(r(a,1),r(b,2),r(c,3))[:])
24×3 DataFrame
 Row │ 1      2     3     
     │ Int64  Char  Int64
─────┼────────────────────
   1 │     1  a         1
   2 │     2  a         1
   3 │     1  b         1
   4 │     2  b         1
  ⋮  │   ⋮     ⋮      ⋮
  22 │     2  b         4
  23 │     1  c         4
  24 │     2  c         4
           17 rows omitted

julia> rename(DataFrame(tuple.(r(a,1),r(b,2),r(c,3))[:]),[:a,:b,:c])
24×3 DataFrame
 Row │ a      b     c     
     │ Int64  Char  Int64
─────┼────────────────────
   1 │     1  a         1
   2 │     2  a         1
   3 │     1  b         1
   4 │     2  b         1
  ⋮  │   ⋮     ⋮      ⋮
  22 │     2  b         4
  23 │     1  c         4
  24 │     2  c         4
           17 rows omitted

Or, better, this way:

r(v,n)= hvncat(n,v...)
DataFrame(NamedTuple{(:a,:b,:c)}.(tuple.(r.([a,b,c],1:3)...)[:]))

Another method is using crossjoin from DataFrames:

julia> (Dict(x) for x in pairs((;a = 1:2, b = 'a':'c', c=[1,2,3,4]))) .|>
         splat(DataFrame) |> splat(crossjoin)
24×3 DataFrame
 Row │ a      b     c     
     │ Int64  Char  Int64 
─────┼────────────────────
   1 │     1  a         1
   2 │     1  a         2
   3 │     1  a         3
   4 │     1  a         4
   ⋮ │     ⋮  ⋮         ⋮

Note the easy no-nonsense syntax specifying a,b,c.
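
One thing the printed outputs in this thread suggest is that the approaches do not return rows in the same order: with Iterators.product the first column varies fastest, while with crossjoin the last one does. A small sketch to check this, using only two of the inputs to keep it short (the variable names df_product and df_cross are just for illustration):

```julia
using DataFrames

a, b = 1:2, 'a':'c'

# Iterators.product: the first argument varies fastest.
df_product = DataFrame(NamedTuple{(:a, :b)}(t) for t in Iterators.product(a, b))

# crossjoin: the last data frame varies fastest.
df_cross = crossjoin(DataFrame(a = a), DataFrame(b = b))

# Same set of rows, different order; sorting both makes them comparable.
same_rows = sort(df_product, [:a, :b]) == sort(df_cross, [:a, :b])
```

If the row order matters downstream, sorting explicitly is probably safer than relying on either convention.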

I am reproducing it here to show that I understand how your elegant solution works.

crossjoin(DataFrame.((pairs((;a,b,c))...,))...)

However, I did not understand the meaning of the note (please bear in mind that my English is very poor; I write using Google Translate).
