Hi, I have a DataFrame that looks like this:
df1 = DataFrame()
df1.id = sort!(repeat(1:3,5))
df1.a = [1,missing,2,3,missing,missing,2,3,4,5, 1,2,3,missing,5]
I want to fill the missing values in column a with the previous value from the same id.
I want a DataFrame like this:
df2 = DataFrame()
df2.id = sort!(repeat(1:3,5))
df2.a = [1,1,2,3,3,missing,2,3,4,5, 1,2,3,3,5]
Can somebody help me do this?
Here is a quite verbose way of doing it. 
for gdf in groupby(df1, :id)
    for row_idx in 2:nrow(gdf)
        if ismissing(gdf.a[row_idx])
            gdf.a[row_idx] = gdf.a[row_idx-1]
        end
    end
end
What kind of modification should I make to fill the value with the next value of the same id?
Is this fine?
for gdf in groupby(df1, :id)
    for row_idx in 1:nrow(gdf)-1
        if ismissing(gdf.a[row_idx])
            gdf.a[row_idx] = gdf.a[row_idx + 1]
        end
    end
end
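One caveat with this loop: because it walks forward, a run of two or more consecutive missing values is only partly filled, since the earlier entry copies a neighbour that is itself still missing. Walking backwards avoids that. Here is a minimal sketch on a plain vector, using only Base; `fill_from_next!` is a made-up name for illustration, not a function from any package:

```julia
# Hypothetical helper: replace each missing with the next value in the vector.
function fill_from_next!(v::AbstractVector)
    # Walk from the second-to-last element back to the first, so that a value
    # copied into position i+1 is already in place when position i is visited.
    for i in (lastindex(v) - 1):-1:firstindex(v)
        if ismissing(v[i])
            v[i] = v[i + 1]
        end
    end
    return v
end

a = [1, missing, missing, 4, missing]
fill_from_next!(a)
# a is now [1, 4, 4, 4, missing]; the last entry has no next value to copy
```

Since gdf.a in the loop above is a view into df1, calling `fill_from_next!(gdf.a)` inside `for gdf in groupby(df1, :id)` should update df1 per group in the same way.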
Thanks, is there any other way of doing it?
Here is a one-liner, but I find it hard to comprehend.
combine(groupby(df1,:id),:a=>(x->[x[1],coalesce.(x[2:end],x[1:end-1])...])=>:a)
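Note that this one-liner looks only one position back, so in a run of consecutive missing values only the first one gets filled. It gives the desired result for the example data, which has no such runs. A minimal sketch that carries the last seen value forward instead, using only Base; `carry_forward` is a name invented for this illustration:

```julia
# Hypothetical helper: each missing takes the most recent value seen so far.
# `accumulate` passes the previous result (prev) along with the current
# element (cur); `coalesce` keeps cur unless it is missing.
carry_forward(x) = accumulate((prev, cur) -> coalesce(cur, prev), x)

filled = carry_forward([1, missing, missing, 4])
# filled holds 1, 1, 1, 4

leading = carry_forward([missing, 2, missing])
# a leading missing stays missing because nothing precedes it
```

It could be passed in place of the anonymous function, for example `combine(groupby(df1, :id), :a => carry_forward => :a)`.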
bkamins
If you want to update df1 in place, do:
using Impute
for sdf in groupby(df1,:id)
sdf.a .= Impute.locf(sdf.a)
end
or
transform!(groupby(df1, :id), :a => Impute.locf => :a)
If you want a new data frame:
transform(groupby(df1, :id), :a => Impute.locf => :a)
The InMemoryDatasets package has ffill and bfill, similar to the pandas functions.
using InMemoryDatasets
ds=Dataset(df1)
modify(IMD.groupby(ds,:id),:a=>ffill!)