Flux model with categorical and continuous covariates/features

johnbb · September 30, 2020, 10:49pm

Using Flux, I would like to train a regression model with both continuous and categorical features. Since the categorical variable(s) may have many levels, I would prefer to map each of them into a few new continuous variables (an embedding) before concatenating them with the remaining continuous features. Are there better/simpler ways to achieve this than the code example below? Any comments on the code are welcome.

using Flux
n = 10_000
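x = vcat(rand(1:10, 1, n), rand(Float32, 5, n)) # 1st row categorical with 10 levels (promoted to Float32 by vcat)
y = rand(Float32, 1, n) # 1×n, to match the 1×batchsize output of the model
# the inputs (one-hot block, continuous block) and the target are passed as one nested tuple
trdata = Flux.Data.DataLoader(((Flux.onehotbatch(x[1,:], 1:10), x[2:end,:]), y),
                              batchsize = 100)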

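# Returns the model as a closure together with its trainable parameters.
# The input x is a tuple: (one-hot categorical block, continuous block).
function create_model(embedding, main_model)
    model = function(x)
        x1 = embedding(x[1])        # map the one-hot block to a few continuous variables
        x2 = cat(x1, x[2], dims=1)  # stack them on top of the continuous features
        return main_model(x2)
    end
    return model, params(embedding, main_model)
end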
m, prm = create_model(Dense(10,3), Chain(Dense(8,5), Dense(5,1)))
loss(x, y) = Flux.mse(m(x), y)
@time Flux.train!(loss, prm, collect(trdata), ADAM())
## 2nd time:  0.028417 seconds (52.83 k allocations: 24.259 MiB, 33.36% gc time)
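
For reference, here is what the embedding step above computes, as far as I can tell. Assuming the embedding is a dense layer with identity activation (as with Dense(10,3)), applying it to a one-hot matrix should be the same as picking columns of its weight matrix and adding the bias. The sketch below checks this in plain Julia without Flux; W, b and levels are made-up stand-ins for illustration, not Flux internals.

```julia
# Plain-Julia sketch; all names here are illustrative assumptions.
nlevels, nembed, batch = 10, 3, 4
W = rand(Float32, nembed, nlevels)   # stands in for the weight matrix of Dense(10, 3)
b = rand(Float32, nembed)            # stands in for its bias
levels = [3, 1, 10, 3]               # categorical values for a batch of 4 observations

# One-hot encode by hand: column j has a single 1 in row levels[j].
onehot = zeros(Float32, nlevels, batch)
for (j, k) in enumerate(levels)
    onehot[k, j] = 1
end

dense_out  = W * onehot .+ b         # dense layer with identity activation
lookup_out = W[:, levels] .+ b       # same values via column indexing

@assert dense_out ≈ lookup_out
@assert size(lookup_out) == (nembed, batch)
```

So the one-hot plus dense combination acts as a lookup table with one learned column per level, which is why the number of levels only affects the size of W and not the width of the concatenated input.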
