Change values to missing in a column

So I have been trying to wrap my head around how to set a value to missing with an IF ELSE statement… It seems like this should be an easy way to do it:

df = @chain df begin
    @transform :x = begin
        if :x != 0
            :x
        else
            missing
        end
    end
end

but the results are unchanged:


168779-element Vector{Int64}:
 20220810
 20220810
 20220810
 20220810
 20220810
 20220810
 20220810
 20220810
 20220810
 20220810
 20220810
 20220810
 20220810
        ⋮
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
        0
 20220705

Ultimately I need those zeros to be missing so I can run:

df.x= Dates.Date.(string.(df.x), dateformat"yyyymmdd")

but that fails on the value 0. With a missing, however, I can just add a passmissing() around the conversion in addition to the string().

OR is there an easier way altogether that I just haven’t seen yet?

Why do you need all that machinery? Could you not just do

ifelse.(df.x .== 0, missing, Date.(string.(df.x), dateformat"yyyymmdd"))

(btw if performance matters you can probably do better by using digits of your Int to directly construct the date rather than allocating a string and then parsing that)

hmm this gave me:

LoadError: ArgumentError: Unable to parse date time. Expected directive DatePart(yyyy) at char 1
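One way to sidestep this parse error is to decide per element before parsing, so the string "0" never reaches the date parser. The following is only a minimal sketch: the vector `x` is a made-up stand-in for `df.x`, and only the `Dates` standard library is assumed.

```julia
using Dates

# Hypothetical sample data standing in for df.x
x = [20220810, 0, 20220705]

# Check each value first; only non-zero values are converted and parsed
dates = map(x) do v
    v == 0 ? missing : Date(string(v), dateformat"yyyymmdd")
end
```

The result allows both `Date` and `missing`, so it can be assigned back to a column with something like `df.x = dates`.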

You should use the @byrow macro:

df = @chain df begin
    @transform @byrow :x = begin
        if :x != 0
            :x
        else
            missing
        end
    end
end
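
A likely reason the original attempt left the column unchanged: without @byrow, `:x` stands for the whole column vector, and comparing a vector with a number using `!=` gives a single `true`, so the `if` always returns the column as it was. The sketch below uses plain Julia and a made-up vector `x` in place of the column to show the difference.

```julia
x = [20220810, 0, 20220705]   # hypothetical stand-in for the column

whole = (x != 0)    # compares the vector as a whole with 0: a single Bool
each  = (x .!= 0)   # broadcasting compares element by element

# Row-wise logic, written here as a comprehension
cleaned = [v != 0 ? v : missing for v in x]
```

Here `whole` is `true`, while `each` marks only the non-zero entries, which is the per-row behaviour that @byrow provides.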

The easy way to do this is

@transform df @byrow :x = ifelse(:x == 0, missing, :x)

Or, with @rtransform, the @byrow part is not needed.

Great. Thanks.

@rtransform! df :x = ifelse(:a == 0, missing, :x)

Do you mean something like this or is there some function that does the transformation directly?


# Requires `using Dates`. Splits the integer yyyymmdd into year, month, and day.
function datefromint(d::Int)
    if d != 0
        Date(div(d, 10^4), div(rem(d, 10^4), 10^2), rem(d, 10^2))
    else
        missing
    end
end

df.date = datefromint.(df.x)

But the main reason I chimed in is to ask for clarification on how the ifelse function works:


df.date= map(d->ifelse(d!=0, datefromint(d),missing), df.x)

datefromint(d::Int)=Date(div(d,10^4),div(rem(d,10^4),10^2),rem(d,10^2))

It seems that ifelse evaluates the datefromint function even if d == 0.
I wonder if this is the case in general, that is, whether ifelse evaluates both the “then” and “else” branches before returning?
But it is more likely that there is something (or more than one thing) that I have not understood.
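
For what it is worth, `ifelse` is an ordinary function, so Julia evaluates all of its arguments before the call; only `if`/`else` and the ternary operator skip the branch that is not taken. A small sketch that makes this visible, using a hypothetical helper `noisy` that records every call:

```julia
# Hypothetical helper that records each argument it is called with
calls = Int[]
function noisy(d)
    push!(calls, d)
    return d * 2
end

a = ifelse(false, noisy(1), 0)   # noisy(1) runs even though its result is discarded
b = false ? noisy(2) : 0         # noisy(2) is never called
```

After running this, `calls` contains only `1`, which is why `ifelse(d != 0, datefromint(d), missing)` still tries to build a date from `0`.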

You can still use the ternary notation.

@rtransform! df :x = :a == 0 ? missing : :x

Thanks, it should be

@rtransform! df :x = :a == 0 ? missing : :x

You can also use replace!, just like in Stata. I like this approach.

replace!(df.x, 0=>missing)
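
One caveat, assuming the column is a plain Vector{Int64} as in the question: the in-place `replace!` cannot store `missing` in such a vector, whereas the non-mutating `replace` returns a new vector with a widened element type. A minimal sketch on a made-up vector `x`:

```julia
x = [20220810, 0, 20220705]   # hypothetical Vector{Int64}

# Non-mutating replace returns a new vector that can hold missing
y = replace(x, 0 => missing)

# In-place replace! works once the vector already allows missing
z = Vector{Union{Missing, Int}}(x)
replace!(z, 0 => missing)
```

For a data frame column, this would correspond to something like `df.x = replace(df.x, 0 => missing)`.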