There are many ways to do one-hot encoding in Julia. Here are the ones I know of:

Method | Comment |
---|---|
Roll your own | It’s not too hard |
Flux.onehot & Flux.onehotbatch | Flux is a pretty heavy dependency, so this is normally avoided unless you are using Flux anyway |
DataConvenience.onehot | Can be applied directly to DataFrames |
FeatureTransform.OneHotEncoder | The syntax for FeatureTransform is still on the verbose side for my liking |
MLJ.OneHotEncoder() | I really want to like MLJ, but it’s quite heavy, which makes me want to avoid it. What’s with this machine? Why do I need a machine to do one-hot encoding? |
ScikitLearn.jl | Hmm, need to install a bunch of Python stuff. Thanks, but no thanks, for such a simple thing |
Keep your data in categorical format | You need to find libraries that accept that |
unique(x) .== permutedims(x) | Due to @Mattriks. Great if you don’t need a sparse representation |
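
To make the "roll your own" and broadcasting rows concrete, here is a minimal sketch using only Base Julia. The helper name `onehot_rows` is made up for illustration and does not come from any package, and sorting the levels is an assumption of this sketch rather than something the one-liner does.

```julia
# Hypothetical helper (not from any package): one-hot encode a vector by hand.
# Returns the sorted levels and a Bool matrix with one row per level
# and one column per observation.
function onehot_rows(x::AbstractVector)
    levels = sort(unique(x))  # assumption: sorted levels give a stable row order
    # Broadcasting a column of levels against a row of observations
    # yields a dense length(levels) × length(x) BitMatrix.
    return levels, levels .== permutedims(x)
end

x = ["a", "b", "a", "c"]
levels, encoded = onehot_rows(x)

# The one-liner from the table. Its rows follow the order of first
# appearance in x rather than sorted order.
encoded_oneliner = unique(x) .== permutedims(x)
```

Both versions build a dense matrix, so they suit a modest number of levels; each column contains exactly one `true`, marking the level of that observation.
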
That’s it. Anything I’ve missed?