There has been a bit of discussion on here about ‘nothing’ vs ‘missing’ and why both are needed. I don’t really have an opinion on this as long as I can easily convert, which I can’t.
I tried the suggestion on GitHub [missing vs nothing #1854], which was marked as the solution:
replace( df.A , nothing => missing)
and it didn’t complain but didn’t work (‘nothings’ still there).
I wondered why the ! wasn’t needed and also why I didn’t need to broadcast, but attempting either of those led to an error.
Does anyone have any ideas?
Thanks
replace sans ! returns a copy of the collection with the given replacement, but doesn’t modify in place. To modify in place, use replace!. So if you do e.g.
using DataFrames
df = DataFrame(x = [1, nothing, 3])
replace(df.x, nothing => missing)
Then the last line will just return [1, missing, 3] without modifying your DataFrame. Note that replacing nothing with missing will tend to require a new array type, and so replace! will throw an error. (E.g. an array with element type Union{Int64, Nothing} can’t hold values of type Missing.) Therefore, you’ll probably want to change dataframe columns by df.A = replace(df.A, nothing => missing).
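To make the difference concrete, here is a small sketch using a plain vector in place of a DataFrame column (the names x, y and err are just for illustration). It should behave the same way for df.x, since a column is an ordinary vector:

```julia
# Stand-in for a DataFrame column; the element type is Union{Nothing, Int}.
x = [1, nothing, 3]

# replace (no !) allocates a new vector whose element type can hold missing.
y = replace(x, nothing => missing)

# replace! has to write into x itself, whose element type excludes Missing,
# so the conversion fails. The error is captured here instead of aborting.
err = try
    replace!(x, nothing => missing)
    nothing
catch e
    e
end

println(eltype(x), " -> ", eltype(y))
println(typeof(err))
```

Assigning y back to the column (df.x = y) is what actually changes the DataFrame.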
Hi, thanks. That is what I thought. But when I added the ! it threw an error:
julia> replace(pt[:Age], nothing => missing);
julia> replace!(pt[:Age], nothing => missing);
Warning: getindex(df::DataFrame, col_ind::ColumnIndex) is deprecated, use df[!, col_ind] instead.
caller = top-level scope at REPL[722]:1
@ Core REPL[722]:1
ERROR: MethodError: Cannot convert an object of type Missing to an object of type Int64
Thanks again
PS: I edited out the first warning about using pt[:Age], which is not relevant.
The array element type cannot accommodate missing. You can use replace, replacing the entire column, e.g.
df.col = replace(df.col , nothing => missing)
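If you really do want an in-place replace!, one option is to widen the element type first. The sketch below uses a plain vector as a stand-in for the column, and the names col, a, b and wide are made up for the example. It also shows broadcasting Base's something, which should give the same result as replace here:

```julia
# Stand-in for df.col; the element type is Union{Nothing, Int}.
col = [1, nothing, 3]

# Option 1: the approach from this thread, building a new vector.
a = replace(col, nothing => missing)

# Option 2: something returns its first argument that is not nothing,
# so broadcasting it with missing as the fallback swaps each nothing.
b = something.(col, missing)

# Option 3: widen the element type first; replace! then works in place,
# although Nothing stays in the element type of the widened vector.
wide = convert(Vector{Union{Int, Nothing, Missing}}, col)
replace!(wide, nothing => missing)
```

Either way the result still has to be assigned back to the column, as in df.col = a.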
I see. Great. Thanks [worked]