Hi,
Say I have a large DataFrame with thousands of rows. For some of those rows, I need to extract a substring and put it in another column. In my case, these rows contain “Register(nnnn)”, and I need to extract “nnnn”.
Here is an MWE with just 4 rows:
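using DataFrames

dframe = DataFrame(Item = ["Register(1234)", "Flow", "Register(6789)", "Temp"], b = 1:4)
insertcols!(dframe, 2, :RegisterID => "")  # new String column, initially empty
# indices of the rows whose Item starts with "Register"
rrows = findall(x -> split(x, "(")[1] == "Register", dframe[:, 1])
for i in rrows
    dframe[i, 2] = SubString(dframe[i, 1], 10:13)  # characters 10-13 hold the "nnnn" digits
end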
This works fine. However, I suspect there is some clever way to broadcast that SubString() function over the subset of rows and put the result in column 2, but I can’t figure out the syntax.
DataFramesMeta.jl provides a simpler syntax than DataFrames.jl for many data cleaning operations. There is no need to learn all of DataFrames.jl before beginning to learn DataFramesMeta.jl.
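For the specific case in the question, plain DataFrames.jl broadcasting may already be enough. The following is only a minimal sketch, assuming every register ID occupies characters 10 to 13 as in the MWE. The names `mask` and `register_id` are made up for illustration and are not part of any package. The second variant uses a regular expression, so it should also cope with IDs of other lengths.

```julia
using DataFrames

dframe = DataFrame(Item = ["Register(1234)", "Flow", "Register(6789)", "Temp"], b = 1:4)
insertcols!(dframe, 2, :RegisterID => "")

# Boolean mask (illustrative name) of the rows whose Item starts with "Register("
mask = startswith.(dframe.Item, "Register(")

# Broadcast SubString over the selected rows only and assign in place;
# assumes the ID always sits at characters 10 to 13
dframe[mask, :RegisterID] .= SubString.(dframe[mask, :Item], 10, 13)

# Hypothetical helper: returns the digits inside "Register(...)", or "" if the
# string does not match, so it does not rely on fixed character positions
function register_id(s::AbstractString)
    m = match(r"^Register\((\d+)\)$", s)
    return m === nothing ? "" : String(m[1])
end

# Broadcast the helper over the whole column and store the result in a new column
dframe.RegisterID2 = register_id.(dframe.Item)
```

Both columns end up holding "1234" and "6789" in rows 1 and 3 and an empty string elsewhere. The masked version stays closest to the original loop, while the helper version trades the row subset for a function that handles non-matching rows itself.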