Dummy Encoding (One-Hot Encoding) from PooledDataArray

Just do this:

for c in unique(df[:Country])
    df[Symbol(c)] = UInt.(df[:Country] .== c)
end

or, if you prefer plain Int values of 1 and 0:

for c in unique(df[:Country])
    df[Symbol(c)] = ifelse.(df[:Country] .== c, 1, 0)
end

The dot (broadcast) syntax fuses the comparison and the conversion into a single loop, so no temporary vector is created for the intermediate result. You can also keep the column as Bool, since in many operations it behaves as expected: false * 2 == 0.
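Here is a minimal sketch of both points. It uses a plain vector in place of df[:Country] so that it runs without DataFrames; the name `countries` and its values are made up for illustration.

```julia
# Hypothetical sample data standing in for the Country column
countries = ["US", "FR", "US", "DE"]

# Fused broadcast: the comparison and the UInt conversion run in one pass
as_uint = UInt.(countries .== "US")

# Alternative: keep the indicator as Bool
as_bool = countries .== "US"

# Bool values act like 0 and 1 in arithmetic
n_matches = sum(as_bool)   # counts the rows equal to "US"
doubled   = as_bool .* 2   # elementwise: true * 2 == 2, false * 2 == 0
```

If later code only sums or multiplies the indicator, the Bool version should be enough and avoids the extra conversion.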
