All the ways to do one-hot encoding

There are many ways to do one-hot encoding in Julia. I want to list some of them here.

| Method | Comment |
|---|---|
| Roll your own | It’s not too hard |
| `Flux.onehot` & `Flux.onehotbatch` | Flux is a pretty heavy dependency, so normally avoided unless you are using Flux anyway |
| `DataConvenience.onehot` | Can be applied directly to dataframes |
| `FeatureTransform.OneHotEncoder` | The syntax for FeatureTransform is still on the verbose side for my liking |
| `MLJ.OneHotEncoder()` | I really want to like MLJ, but it’s quite heavy and that makes me want to avoid it. What’s with this machine? Why do I need a machine to do one-hot encoding? |
| ScikitLearn.jl | Hmm, need to install a bunch of Python stuff. Thanks, but no thanks, for such a simple thing |
| Keep your data in categorical format | You need to find libraries that accept that |
| `unique(x) .== permutedims(x)` | Credit to @Mattriks. Great if you don’t need a sparse representation |
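
To make the first and last entries concrete, here is a minimal sketch using only Base Julia. The sample vector `x` and the function name `onehot_rolled` are made up for illustration, and laying the result out as one row per level and one column per observation is just my choice here, not something any of the packages above prescribe.

```julia
# Hypothetical sample data, purely for illustration.
x = ["a", "b", "a", "c"]

# The broadcasting trick: compare a column of unique levels against a
# 1×n row of observations. The result is a dense levels × observations BitMatrix.
levels = unique(x)
dense = levels .== permutedims(x)

# A hand-rolled version (the name `onehot_rolled` is an assumption of this sketch).
function onehot_rolled(x, levels = unique(x))
    out = falses(length(levels), length(x))
    for (j, v) in enumerate(x)
        i = findfirst(==(v), levels)
        i === nothing && throw(ArgumentError("value $v is not among the levels"))
        out[i, j] = true
    end
    return out
end

# Both approaches should agree on the same input.
@assert onehot_rolled(x) == dense
```

Note `permutedims` rather than `transpose` or `'`: those are recursive and will not work on a vector of strings. Both versions here are dense, so for a large number of levels you would probably want a sparse representation instead.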

That’s it. Anything I’ve missed?
