I have a DataFrame with a column of Strings and a separate, standalone Vector of Strings. I want a resulting DataFrame holding only the rows whose text contains at least one of the strings in that vector; basically, check the text column against each entry in the vector.
I did this, but it seems overly convoluted for the task. Is there a simpler or more efficient way?
Thanks!
julia> df = DataFrame(:number => [1, 2, 3, 4], :text => ["Green Car", "Purple Grape", "Yellow Banana", "Purple Bruise"])
4×2 DataFrame
 Row │ number  text
     │ Int64   String
─────┼───────────────────────
   1 │      1  Green Car
   2 │      2  Purple Grape
   3 │      3  Yellow Banana
   4 │      4  Purple Bruise
julia> keyword = ["Green", "Brown", "Black", "Blue"]
4-element Vector{String}:
 "Green"
 "Brown"
 "Black"
 "Blue"
julia> result = DataFrame(:number => Integer[], :text => String[])
0×2 DataFrame
julia> for color in keyword
           append!(result, df[contains.(df.text, color), :])
       end
julia> result
1×2 DataFrame
 Row │ number   text
     │ Integer  String
─────┼────────────────────
   1 │       1  Green Car
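
One possible simplification, offered as a sketch rather than the definitive answer: do the whole thing in a single pass with a row predicate that asks whether any keyword occurs in the text. It reuses the `df` and `keyword` defined above; the names `matches_any`, `result` and `result2` are only illustrative.

```julia
using DataFrames

df = DataFrame(:number => [1, 2, 3, 4],
               :text => ["Green Car", "Purple Grape", "Yellow Banana", "Purple Bruise"])
keyword = ["Green", "Brown", "Black", "Blue"]

# Hypothetical helper: true when the text contains at least one keyword.
matches_any(t) = any(k -> contains(t, k), keyword)

# `filter` with a `column => predicate` pair applies the predicate row by row
# and returns a new DataFrame holding only the matching rows.
result = filter(:text => matches_any, df)

# The same selection written with `subset` and `ByRow`.
result2 = subset(df, :text => ByRow(matches_any))
```

Two behaviours differ from the `append!` loop: a row that matches several keywords is kept once instead of once per keyword, and the `number` column keeps its concrete `Int64` element type because there is no empty `Integer[]` column to pre-allocate.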