Replace all values greater than 1 in a DataFrame with 1

What’s the most idiomatic way of achieving this in Julia? Can this be done as a one-liner?

Something like this:

df[df .> 1] .= 1

I would probably do it column by column.

julia> for col in eachcol(df)
           col[col .> 1] .= 1
       end

You will get bitten by these approaches if you do not filter out non-numeric columns. Use the loop, but add an element type check before trying to replace anything.
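
A minimal sketch of what that check could look like, assuming a small made-up data frame (the column names and values here are purely illustrative):

```julia
using DataFrames

# Hypothetical example data: two numeric columns and one string column.
df = DataFrame(a = [0.5, 1.5, 2.0], b = ["x", "y", "z"], c = [0, 3, 1])

for col in eachcol(df)
    # Skip any column whose element type is not a real number.
    eltype(col) <: Real || continue
    # Overwrite values greater than 1 in place.
    col[col .> 1] .= 1
end
```

Note that a column containing missing values has an element type such as Union{Missing, Float64}, which is not a subtype of Real, so this check would skip it.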

One solution is to use names to select real-valued columns.

julia> for col in eachcol(df[!, names(df, Real)])
           col[col .> 1] .= 1
       end

Or if you desperately wanted a one-liner:

mapcols!(x -> ifelse.(x .> 1, 1.0, x), df)

df = DataFrame(rand(10, 4) .* 2, :auto)

DataFrame(min.(Matrix(df), 1), names(df))

select(df, names(df, Real) .=> (x -> min.(x, 1)); renamecols=false)

Do these have any benefit over existing solutions?

  1. The first solution allocates a matrix, which is highly inefficient.
  2. The second one does not modify the data frame in place, as the OP's question implied.
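
For comparison, here is a sketch of a variant that updates the data frame itself and avoids the intermediate matrix, assuming a small made-up data frame (the column names are hypothetical). It uses transform! with ByRow, and renamecols = false to keep the original column names:

```julia
using DataFrames

# Hypothetical example data: two numeric columns and one string column.
df = DataFrame(x1 = [0.2, 1.7, 1.0], x2 = [3, 0, 1], label = ["a", "b", "c"])

# Cap every real-valued column at 1, row by row; other columns are left untouched.
transform!(df, names(df, Real) .=> ByRow(v -> min(v, 1)); renamecols = false)
```

As far as I know, transform! still allocates a new vector for each transformed column, so it is "in place" only in the sense that df itself is updated rather than a new data frame being returned.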

Do you have any explanation for why you included the solutions you did?

It’s fine, they’re just suggestions.

Good point about how they compare to the other solutions, though.

Same style as @nilshg's one-liner, but with column filtering:

mapcols!(x -> eltype(x)<:Real ? 
  ifelse.(x .> one(eltype(x)), one(eltype(x)), x) : x, df)

No, except that I had some free time.

This version does not unnecessarily allocate a matrix:

min.(df[:, names(df, Real)], 1)