This post summarizes the current roadmap for DataFrames.jl and the data ecosystem in general for the Julia 1.0 milestone. Plans have changed significantly since the previous post. Indeed, it has turned out that representing missing values via the special Nullable type is not the only solution to attai…

NM, I think I have one:

```julia
f(x::Int) = x + 1
f(::Void) = nothing
g(x) = x > 0 ? f(x) : f(nothing)
@code_llvm g(0)
```

```
define { i8**, i8 } @julia_g_61178([8 x i8]* noalias nocapture, i64) #0 !dbg !5 {
top:
  %2 = icmp slt i64 %1, 1
  br i1 %2, label %L3, label %if

if:
…
```
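For readers on Julia 0.7 or later, where `Void` was renamed `Nothing`, the same example can be sketched as below. This is only an illustration of how to check what inference makes of such a function; the use of `Base.return_types` is my own addition and not part of the original post.

```julia
# Assumes Julia 0.7+, where the singleton type is called `Nothing`.
f(x::Int) = x + 1
f(::Nothing) = nothing

# Returns an Int for positive input and `nothing` otherwise,
# so the return type is a small Union rather than a single concrete type.
g(x) = x > 0 ? f(x) : f(nothing)

# Ask the compiler which return type it infers for an Int argument.
inferred = Base.return_types(g, (Int,))
```

Calling `g(1)` gives `2` and `g(0)` gives `nothing`; inspecting `inferred` shows whether the compiler kept the result as a small `Union` instead of widening it to `Any`.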

@nalimilan, is there an update on this? I’ve noticed that all of the DataTables commits have been merged into DataFrames and that DataFrames now uses Nulls instead of DataArrays. This is definitely a welcome update; I’d like to start migrating my packages from DataTables to DataFrames in anticipat…

Yes, I wanted to post a short update but you beat me to it. You summarized the situation pretty well. Users and package authors are encouraged to test the master branch of DataFrames, which is now based on Nulls.jl and reflects the new API. The branch should be fully usable, but it will not be as f…

Great. Do we know yet whether the promised improvements to union types have already been implemented in 0.7? I haven’t seen anything about it in NEWS.

Yes, the Union struct/array optimizations have landed in Base. There is still an open issue to improve inference (currently planned and in progress), as well as a SIMD isbits Union array issue that will hopefully be helped by the first issue and some other planned codegen cleanups. Both are on track…
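A minimal sketch of what this looks like from the user side, assuming Julia 0.7 or later, where `missing` is provided by Base (on 0.6 it would come from a package such as Missings.jl). The variable names are illustrative only, and the sketch says nothing about the performance details discussed above.

```julia
# Assumes Julia 0.7+; `missing` and `Missing` are defined in Base there.
x = Union{Int, Missing}[1, missing, 3]

# The element type is a small Union, not `Any`.
T = eltype(x)

# Arithmetic propagates missing values element by element.
y = x .+ 1

# `skipmissing` drops missing entries; `coalesce` replaces them with a default.
total = sum(skipmissing(x))
filled = coalesce.(x, 0)
```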

nalimilan: "The main limitation right now is that most packages depending on DataFrames have not yet been updated (e.g. Query and RCall; there’s an open pull request for StatsModels)."

I just added this PR: Add support for Nulls.jl based DataFrames.jl by davidanthoff · Pull Request #60 · …

davidanthoff: "The whole Union{T,Null} design has some fundamental issues that make it more or less unusable within Query.jl :frowning:"

Do you have a writeup explaining the issues?

I would imagine the discussion here: https://github.com/JuliaData/Nulls.jl/issues/6

gabrielgellner: "I would imagine the discussion here: Union{T,Null} inference in structs · Issue #6 · JuliaData/Missings.jl · GitHub"

That is the one.

I think I more or less understand why this won’t work for Query.jl’s design: to produce columns efficiently out of an iterator of named tuples, you’d need to infer the return type, which is harder with Union{T, Null}. However, I’m still a bit confused as to what the policy is when collecting into a data s…
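To make the question concrete, here is a small sketch of collecting one field from an iterator of named tuples, assuming Julia 0.7 or later (Base `missing` and `NamedTuple`). It relies on Base's `collect`, which starts from the type of the first value and widens the element type as values of new types are encountered. The names `rows` and `b` are made up for illustration, and this is not a description of how Query.jl or any table package actually builds its columns.

```julia
# Assumes Julia 0.7+ (Base `missing`, NamedTuple syntax).
# An iterator of named tuples where field `b` is sometimes missing.
rows = ((a = i, b = isodd(i) ? i / 2 : missing) for i in 1:4)

# Collect a single field into a column. The element type is widened from
# the values actually seen, so the column ends up with a Union element type.
b = collect(r.b for r in rows)

# Inspect the resulting column type.
column_type = eltype(b)
```

The design trade-off is that widening based on observed values may require copying the partially filled column whenever a new type shows up, whereas an inference-based approach would know the element type up front.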

Announcement: An Update on DataFrames Future Plans

shashi December 27, 2017, 5:03pm 44

If the maintainers of IndexedTables.jl at some point pick a standard way to represent missing data

I think we’ll be going with missing on Julia 0.7+
