Announcement: DataFrames Future Plans

The DataTables manual is online now.

The documentation is now available at Introduction · DataTables.jl

Very nice!

Having played around with DataTables I have a question about modifying columns. Say I want to subtract the mean of every column. I came up with the following rather verbose code:

function centering!(dt)
  for (name, col) in eachcol(dt)
    x0 = mean(dropnull(col))
    tmp = NullableArray(Float64, length(col))
    for i in eachindex(col)
      tmp.values[i] = col.values[i] - x0
      tmp.isnull[i] = col.isnull[i]
    end
    dt[name] = tmp
  end
end

Here I create a new NullableArray, fill in its values and “isnull” flags, and finally replace the original column in the DataTable with the new one. Is there a less verbose way of doing this, maybe with built-in methods I overlooked?

Cheers,
Andre

You should never need to access the values field. Something like map!(x -> x - x0, col) is enough. On Julia 0.6, col .- Nullable(x0) also works, though in the present case it’s not a clear improvement.

You can also use Query.jl which will unwrap (“lift”) nullables automatically.
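For readers on current Julia (1.x), where `missing` has replaced `Nullable`, the same centering idea can be sketched with Base and the Statistics standard library alone. This is only an illustration, not DataTables API: the helper name `center_columns!` and the Dict-of-vectors stand-in for a table are assumptions made for the example.

```julia
using Statistics  # provides `mean` on Julia 1.x

# Hypothetical helper (not part of DataTables): subtract each column's mean,
# ignoring missing entries when computing the mean.
# `cols` maps column names to vectors and stands in for a table.
function center_columns!(cols::Dict)
    for name in collect(keys(cols))   # copy the keys so we can reassign safely
        col = cols[name]
        x0 = mean(skipmissing(col))   # mean over the non-missing entries
        cols[name] = col .- x0        # broadcasting keeps missing as missing
    end
    return cols
end

# Usage: two small columns, each with one missing entry.
cols = Dict(:a => [1.0, missing, 3.0], :b => [2.0, 4.0, missing])
center_columns!(cols)
```

Broadcasting propagates `missing` on its own, so no separate bookkeeping of the null mask is needed.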

Nice. Sorry if this is off-topic here, but I first expected something like this to be possible with colwise (and I guess it works in 0.6). Why is there no colwise! function?

Yes, I guess colwise! would be a useful addition. Could you file an issue?

Happy to 🙂

I’ve just posted an update on the DataFrames plans in a new post.