Hey everyone,
I would like to remove values that are equal to datetime2unix(DateTime(0)) from my column:
replace!(df[:,:col2], datetime2unix(DateTime(0)) => nothing)
OR
df[:,:col2] = replace(df[:,:col2], datetime2unix(DateTime(0)) => nothing)
I get this error:
MethodError: Cannot `convert` an object of type Nothing to an object of type Float64
Do you have any ideas?
Thank you
I can think of a few different things here:
Firstly, it sounds like you are reading in data where you know that DateTime(0) isn't actually a valid observation in the dataset. If that is indeed the case, your best bet is probably to handle this when you read in the dataset. Most packages have this capability as a keyword argument. For example:
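using CSV, DataFrames
# CSV.read needs a sink type; recent CSV.jl versions name the keyword `missingstring`
CSV.read("file.csv", DataFrame; missingstring = "00:00:00")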
If that isn't the issue, you could try the below:
Try doing this with an operation that is not in place. My guess is that if you ran typeof(df[:, :col2]) you would see Vector{Float64}, since datetime2unix returns a Float64 and that is the type named in the error. A Vector{Float64} cannot hold objects of type Nothing, so the error makes sense. Creating a new object would solve your problem:
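col3 = replace(df[:, :col2], datetime2unix(DateTime(0)) => nothing)
typeof(col3)  # element type is now widened to allow `nothing`
df2 = hcat(df, DataFrame(col3 = col3))  # wrap the vector; hcat with a bare vector is deprecated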
or
df = transform(df, :col2 => ByRow(
    x -> if x == datetime2unix(DateTime(0))
        nothing
    else
        x
    end
) => :col2)
If you don't need the whole row, then you could use an in-place operation to drop the bad observations. Note that filter! keeps the rows for which the predicate is true, so the comparison has to be !=:
filter!(:col2 => c -> c != datetime2unix(DateTime(0)), df)
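The element-type explanation above can be checked without a data frame. This is only a minimal sketch using Base and Dates; the vector `col` is a hypothetical stand-in for df.col2 and is assumed to hold Float64 Unix timestamps, as the error message suggests.

```julia
using Dates

# Hypothetical stand-in for df.col2: Unix timestamps stored as Float64
sentinel = datetime2unix(DateTime(0))
col = [datetime2unix(DateTime(2021, 11, 8)), sentinel]

# In-place replacement must store `nothing` in a Vector{Float64}, which fails
inplace_failed = try
    replace!(col, sentinel => nothing)
    false
catch err
    err isa MethodError
end

# Non-mutating replace allocates a new vector with a widened element type
widened = replace(col, sentinel => nothing)

println(eltype(col))      # element type of the original vector
println(inplace_failed)   # whether replace! threw a MethodError
println(eltype(widened))  # element type of the new vector
```

Assigning `widened` back into the existing column with df[:, :col2] = ... would hit the same conversion problem, because that form also writes into the existing Float64 vector.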
nilshg
November 8, 2021, 5:21pm
Your issue comes from using df[:, :col2] indexing, which attempts to write into the existing column, which as Derek says can't hold values of type Nothing.
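You can do the following, using ! instead of : on the left-hand side so that the column is replaced rather than written into:
df[!, :col2] = replace(df[:, :col2], datetime2unix(DateTime(0)) => nothing)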
or
df.col2 = replace(df[:,:col2], datetime2unix(DateTime(0)) => nothing)
If you create a new vector you are however re-allocating everything, which will not be the most performant way of going about this.
Note also that nothing is not usually meant to signify missing data; for this there is missing.
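To illustrate the difference, here is a small sketch using only Base and Dates. The vector `col` and its values are made up for the example; it is not the poster's data.

```julia
using Dates

sentinel = datetime2unix(DateTime(0))
col = [1.0, sentinel, 3.0]  # hypothetical values, with one sentinel entry

# Replacing the sentinel with `missing` widens the element type
with_missing = replace(col, sentinel => missing)

# `missing` propagates through arithmetic instead of throwing
propagated = with_missing[1] + with_missing[2]

# skipmissing lets reductions ignore the missing entries
total = sum(skipmissing(with_missing))
n_missing = count(ismissing, with_missing)

# `nothing` has no arithmetic defined, so the same operation errors
nothing_errors = try
    1.0 + nothing
    false
catch err
    err isa MethodError
end

println(eltype(with_missing))
println((propagated, total, n_missing, nothing_errors))
```

Most statistics and data-frame functions understand `missing` in this way, which is why it tends to be the more convenient marker for absent observations.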
In summary, I would do the following:
julia> df = DataFrame(x = [now(), DateTime(0)]) # example data
2×1 DataFrame
 Row │ x
     │ DateTime
─────┼─────────────────────────
   1 │ 2021-11-08T17:17:52.754
   2 │ 0000-01-01T00:00:00
julia> allowmissing!(df, :x) # Change type of column x to allow missing values
2×1 DataFrame
 Row │ x
     │ DateTime?
─────┼─────────────────────────
   1 │ 2021-11-08T17:17:52.754
   2 │ 0000-01-01T00:00:00
julia> replace!(df.x, DateTime(0) => missing); df # use mutating version of replace
2×1 DataFrame
 Row │ x
     │ DateTime?
─────┼─────────────────────────
   1 │ 2021-11-08T17:17:52.754
   2 │ missing
Thank you both for these answers. I replaced my DateTime(0) values with missing as you said, and it worked perfectly!