What is Julia's equivalent of R's %in%

Luigi_Marongiu · September 13, 2019, 2:21pm

Hello,
I have two dataframes with different columns and headers. The columns Start (in the first) and Start_site (in the second) are related: I want to select the rows of the first dataframe whose Start value appears in the second dataframe's Start_site column.
Essentially, I am looking for the Julia equivalent of R’s Start %in% Start_site.
I tried join, but it requires the same dataframe structure…
Thanks

Luigi_Marongiu · September 13, 2019, 2:25pm

Never mind, I found it:

join(df1, df2; on = :Start, kind = :semi, makeunique = false,
                   indicator = nothing, validate = (false, false))

I had to give the columns the same name in both dataframes.
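
For readers on more recent DataFrames.jl releases, where `join(...; kind = :semi)` has been replaced by a dedicated `semijoin` function, a sketch along these lines may avoid the renaming step. The sample data (including the `Gene` column) is made up for illustration, and the `on = left => right` pair form is assumed to be available in the installed version.

```julia
using DataFrames

# Hypothetical sample data standing in for the two dataframes
df1 = DataFrame(Start = [1, 3, 4, 6], Gene = ["g1", "g2", "g3", "g4"])
df2 = DataFrame(Start_site = 1:5)

# Keep the rows of df1 whose Start value occurs in df2.Start_site;
# the pair maps the left column name to the right column name.
matched = semijoin(df1, df2, on = :Start => :Start_site)
```

Only the columns of `df1` are returned, which is the behaviour of R's `df1[df1$Start %in% df2$Start_site, ]`.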

ElOceanografo · September 13, 2019, 4:12pm

There is also the in function:

start = [1, 3, 4, 6]
start_site = 1:5 
in_start_site = in(start_site) # in(...) returns a function
in_start_site.(start) #  [true, true, true, false]

in(collection) returns a function that tests if something is in collection. You can also write this in one line:

in(start_site).(start)
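
A minimal sketch of what the curried form does, reusing the vectors from this post: `in(collection)` behaves like `x -> x in collection`, so the collection goes inside `in(...)` and the values to test go in the broadcast call. Swapping the two answers a different question.

```julia
start = [1, 3, 4, 6]
start_site = 1:5

# in(start_site) behaves like x -> x in start_site
is_site = in(start_site)
curried = is_site.(start)                      # true, true, true, false
explicit = [s in start_site for s in start]    # same result, written out

# Swapped roles: which elements of start_site occur in start?
swapped = in(start).(start_site)               # true, false, true, true, false
```

The result has one entry per element of whatever is broadcast over, which is why the two orderings give masks of different lengths here.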

Tamas_Papp · September 14, 2019, 6:02pm

cf

TL;DR: you are probably looking for

x .∈ Ref(Set(y))
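
To illustrate why the `Ref` and `Set` wrappers are there, here is a small sketch with made-up vectors `x` and `y`. `Ref` tells broadcasting to treat the collection as a single value instead of iterating over it, and `Set` swaps the linear scan of a vector for a hashed lookup, which mostly matters when `y` is large.

```julia
x = [1, 3, 4, 6]
y = [1, 2, 3, 4, 5]

# Each element of x is tested against the whole of y
mask = x .∈ Ref(Set(y))

# Same answer without the Set, but every test scans y from the start
plain = x .∈ Ref(y)

# The mask can be used directly for indexing
selected = x[mask]
```

Without `Ref`, `x .∈ y` would try to pair `x[i]` with `y[i]` elementwise, which is not the `%in%` behaviour.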

Luigi_Marongiu · October 8, 2019, 2:08pm

Hello, I have a slight variation of this problem. Instead of testing arrays, I would like to select the rows of a dataframe whose column X matches any of the elements of an array. In other words:
if I have an array x=["a", "b", "c"] and a dataframe df with unique(df[:X]) = "a", "c", "d", "g", can I make a selection? df[df[:X] .== x, :], df[df[:X] .∈ x, :] and df[occursin.(x, df.X), :] did not work…

nilshg · October 8, 2019, 3:25pm

df[in(x).(df.X), :]

Luigi_Marongiu · November 21, 2019, 1:26pm

What would be the negation of this? That is, how do I select the rows of the dataframe whose field X IS NOT in x?

MrUrq · November 21, 2019, 1:42pm

julia> using DataFrames
julia> x=["a","b","c"]
julia> df = DataFrame(X = ["a","c","d","g"])

julia> df[.!in(x).(df.X),:]

2×1 DataFrame
│ Row │ X      │
│     │ String │
├─────┼────────┤
│ 1   │ d      │
│ 2   │ g      │
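
The negated mask can also be spelled with the `∉` operator. A small sketch on a plain vector, where `col` is a stand-in for `df.X` so that it runs without DataFrames:

```julia
x = ["a", "b", "c"]
col = ["a", "c", "d", "g"]    # stand-in for df.X

# Broadcast negation of the curried in(x)
not_in_x = .!in(x).(col)

# Equivalent spelling; Ref keeps x from being broadcast over
same_mask = col .∉ Ref(x)

selected = col[not_in_x]      # the values "d" and "g"
```

Either mask can be used as the row index, as in `df[mask, :]`.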

Mattriks · November 21, 2019, 1:50pm

or:

join(df1, df2, on=:X, kind=:anti)

?join
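
In more recent DataFrames.jl releases the `kind` keyword is gone and each join type has its own function, so the anti join would be written with `antijoin`. A sketch using the data from the previous reply, assuming the array `x` is first wrapped in a one-column DataFrame:

```julia
using DataFrames

x = ["a", "b", "c"]
df = DataFrame(X = ["a", "c", "d", "g"])

# Keep the rows of df whose X value has no match in x
not_in_x = antijoin(df, DataFrame(X = x), on = :X)
```

This should leave the rows with `X` equal to "d" and "g".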

pdeffebach · November 21, 2019, 2:16pm

filter(df) do row
    row.X in x
end
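
The do-block form passes an anonymous function as the first argument of `filter`, so the negation only needs a `!` inside the block. A sketch of both directions, using a plain vector of named tuples as a hypothetical stand-in for the dataframe rows (the `n` field is made up):

```julia
x = ["a", "b", "c"]
rows = [(X = "a", n = 1), (X = "c", n = 2), (X = "d", n = 3), (X = "g", n = 4)]

# Rows whose X is in x
kept = filter(rows) do row
    row.X in x
end

# Rows whose X is NOT in x
dropped = filter(rows) do row
    !(row.X in x)
end
```

With a DataFrame in place of `rows`, the body of the block stays the same.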
