Are there alternatives to doing
df[df[:A].isnull, :A] = newValue
for replacing NULLs in column :A of the DataTable df with “newValue”? From an earlier question I think that using .values and .isnull is not encouraged.
Maybe an “impute()” function would be useful?
Thanks
Andre
You can write df[isnull.(df[:A]), :A] = newValue
to avoid accessing private fields.
But since there won’t be any missing values left in the column, you can also convert it to a standard Array. The conversion method accepts a second argument giving the value to replace nulls with: df[:A] = convert(Array, df[:A], newValue).
We could provide a shorter, more discoverable function for that. dplyr uses coalesce (inspired by SQL), which can be passed either arrays or scalars. In the present case, passing a vector and a scalar would replace the nulls with the scalar.
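To make the coalesce semantics concrete, here is a minimal sketch. It does not use the DataTables or NullableArrays API discussed above; it assumes a Julia version (0.7 or later) where missing values are represented by `missing` and `coalesce` is available in Base. The helper name `fill_missing` is hypothetical and chosen only for illustration.

```julia
# Hypothetical helper: replace every `missing` entry of `v` with `replacement`.
# Assumes Julia 0.7+ where `coalesce` and `missing` are part of Base.
fill_missing(v::AbstractVector, replacement) = coalesce.(v, replacement)

# Vector and scalar: each missing entry is replaced by the scalar.
a = [1, missing, 3]
filled = fill_missing(a, 0)

# Several arrays plus a scalar fallback: for each position,
# the first non-missing value (left to right) is kept.
b = [10, 20, missing]
merged = coalesce.([1, missing, missing], b, 0)
```

Because no missing values remain, the broadcast result has a plain element type, which matches the point above about ending up with a standard Array.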