Change column to row using conditions

Hi,
i have a dataframe like this

df = DataFrame()

df.id = sort!(repeat(1:3,5))
df.time = repeat(1:5,3)
df.dv1 = repeat(['.',".",".",".",1],3)
df.dv2 = repeat(['.',".",".",".",10],3)
df.cmt = repeat([1,1,1,1,2],3)

i just want to delete dv2 column, and insert non missing dv2 values in dv1 one column wherever time and id are matching. and change the df.cmt to 3 wherever df2 values were inserted. Here is the DataFrame i want.

df2 = DataFrame()
df2.id = sort!(repeat(1:3,6))
df2.time = repeat([1,2,3,4,5,5],3)
df2.dv = repeat(['.',".",".",".",1,10],3)
df2.cmt = repeat([1,1,1,1,2,3],3)

Hope i am clear. Thanks in advance.

This gives you a different row order, but I assume this is acceptable (you can always re-order later):

dv2 = df.dv2;
df2 = select(df, Not(:dv2));
for (i, v) in enumerate(dv2)
    if v isa Number
        push!(df2, (df2.id[i], df2.time[i], v, 3))
    end
end

Note that your data does not have missing values but a mixture of '.' and "." for non-numeric data so I did v isa Number check. You might want to use some other test in your production code.

1 Like

OP, it seems you may be coming from Stata, where missing values appear as . . Julia does not work this way. We have a type missing which is used for missing values. I encourage you to read that instead.

One way to do this easily with DataFramesMeta.jl is with the @eachrow macro

julia>  @eachrow df begin 
           if :time == :id
               if !ismissing(:dv2)
                   :dv1 = :dv2
                   :cmt = 3
               end
           end
       end

Although I followed your instructions in your post and no row matches the criteria for changes.

2 Likes

Sorry, I meant id and time values are same for dv1 and dv2

@pdeffebach - OP wanted to duplicate rows in which :dv2 was non-missing.

1 Like