I am looking for a more elegant way to do this:
data[(data.ID .== 1) .| (data.ID .== 4) .| (data.ID .== 7) .| (data.ID .== 10),: ]
where data is a DataFrame. I simply want to extract rows where the ID value is one of the four specified. I was trying stuff like data.ID .== [1,4,7,10]. But this breaks because of the broadcasting.
One way to do it is by using Ref:
julia> df = DataFrame(:ID => collect(1:10), :x => 'a':'j')
julia> df[in.(df.ID, Ref((1, 4, 7, 10))), :]
4×2 DataFrame
│ Row │ ID │ x │
│ │ Int64 │ Char │
├─────┼───────┼──────┤
│ 1 │ 1 │ 'a' │
│ 2 │ 4 │ 'd' │
│ 3 │ 7 │ 'g' │
│ 4 │ 10 │ 'j' │
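For anyone wondering why Ref is needed here, a small sketch on plain vectors (no DataFrames required) may help. The names ids, wanted, mismatch and mask are made up for illustration. Broadcasting tries to pair the elements of both arguments, so two collections of different lengths fail; wrapping one of them in Ref makes broadcasting treat it as a single value.

```julia
ids = collect(1:10)
wanted = (1, 4, 7, 10)

# Broadcasting == over two collections of different lengths fails,
# because Julia tries to match the elements up one by one.
mismatch = try
    ids .== collect(wanted)
    false
catch err
    err isa DimensionMismatch
end

# Ref(wanted) is treated as a scalar by broadcasting,
# so every ID is tested against the whole tuple.
mask = in.(ids, Ref(wanted))

ids[mask]
```

The resulting mask is a Boolean vector with one entry per row, which is why it can be used directly as the row index of the DataFrame.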
It is easy with filter:
filter(row -> row.ID in [1,4,7,10], data)
I also suggest reading the DataFrame Tutorial; it is very informative about DataFrames.
julia> data = DataFrame(:ID=>[1, 4, 7, 10, 20])
5×1 DataFrame
│ Row │ ID │
│ │ Int64 │
├─────┼───────┤
│ 1 │ 1 │
│ 2 │ 4 │
│ 3 │ 7 │
│ 4 │ 10 │
│ 5 │ 20 │
julia> filter(row->row.ID in [1,4,7], data)
3×1 DataFrame
│ Row │ ID │
│ │ Int64 │
├─────┼───────┤
│ 1 │ 1 │
│ 2 │ 4 │
│ 3 │ 7 │
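As a possible variation, newer DataFrames.jl releases also accept a column => function pair in filter, which passes only the ID column to the predicate instead of a whole row. This is a sketch that assumes such a release is installed (it is not the version that printed the output above); the name result is just for illustration.

```julia
using DataFrames

data = DataFrame(:ID => [1, 4, 7, 10, 20])

# in([1, 4, 7]) is a one-argument function, equivalent to x -> x in [1, 4, 7].
# The pair :ID => f tells filter to apply f to each value of the ID column.
result = filter(:ID => in([1, 4, 7]), data)
```

This form tends to be faster on wide tables because it avoids constructing a row object for every row, although for small data the difference hardly matters.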
Okay thank you both, I’ll check out the tutorial
nilshg
March 12, 2020, 4:36pm
Skoffer:
df[in.(df.ID, Ref((1, 4, 7, 10))), :]
I also like the Fix2 version of in, which I find more readable:
df[in([1, 4, 7, 10]).(df.ID), :]
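To spell out what the one-argument in does, here is a minimal sketch on a plain vector. The names wanted, is_wanted, ids and mask are hypothetical, and the use of a Set is my own addition rather than something from the posts above: it is optional, but it keeps membership tests fast if the list of IDs gets long.

```julia
# A Set gives constant-time membership tests; a vector or tuple also works.
wanted = Set([1, 4, 7, 10])

# in(wanted) returns a Base.Fix2 object that behaves like x -> x in wanted.
is_wanted = in(wanted)

ids = collect(1:10)

# Broadcasting the function over ids needs no Ref,
# because the collection is already captured inside is_wanted.
mask = is_wanted.(ids)

ids[mask]
```

Since the collection is stored inside the function object, only the ID column takes part in the broadcast, which is what makes the Ref wrapper unnecessary in this version.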
Since nobody else has mentioned it, you might also find DataFramesMeta to be useful.
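A rough idea of what that could look like, assuming a recent DataFramesMeta release that provides the row-wise @rsubset macro (older versions used different macro names, so check the documentation for the version you have installed). The name selected is just for illustration.

```julia
using DataFrames, DataFramesMeta

df = DataFrame(ID = 1:10, x = 'a':'j')

# @rsubset evaluates the condition once per row,
# so :ID refers to a single value and plain `in` works without broadcasting.
selected = @rsubset(df, :ID in (1, 4, 7, 10))
```

The appeal is that the condition reads much like the original intent ("ID is one of these four") without Ref or dot syntax.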