I’m a little hesitant (all right, mildly terrified) to ask, but has any thought been given to writing a dataframe implementation using union types? We’ve been told that upcoming changes to how union types are handled will make them efficient enough to be appropriate for use in data.
I am aware of Nulls.jl and that @quinnj has been experimenting with using it for DataStreams.jl, but I’m not aware of any actual dataframe implementation built on it.
As far as I know, using union types even in their current state wouldn’t, in most cases, be any less efficient than what is already being done in DataFrames. The Nulls approach also seems, at least superficially, closer to the approach of DataFrames than to that of DataTables. I think the changes needed to make DataFrames use Nulls would be relatively minor. I suppose it would be a bit foolhardy to start on this before seeing that union types really will become as efficient as has been suggested, but it’s tempting to look forward to the ultimate solution.
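To make concrete what I mean by a union-typed column, here is a minimal sketch. Note the assumptions: the `Null` type below is a local stand-in I define myself so the snippet runs without any package (it is not the real Nulls.jl export), and the `Dict`-based table is purely illustrative, not the DataFrames API.

```julia
# Hypothetical stand-in for a null sentinel type, defined locally
# so that this sketch does not depend on any package.
struct Null end
const null = Null()

# A "column" is an ordinary Vector whose element type is a small Union.
ages  = Union{Int, Null}[31, null, 47]
names = Union{String, Null}["Ann", "Bob", null]

# A toy table: column names mapped to union-typed vectors.
# (Illustrative only; this is not how DataFrames stores its columns.)
table = Dict{Symbol, Vector}(:age => ages, :name => names)

# Skipping nulls is plain filtering on the element type.
known_ages = [a for a in table[:age] if !(a isa Null)]
mean_age = sum(known_ages) / length(known_ages)
println(mean_age)
```

The point is only that the columns stay plain `Vector`s, which is why I suspect the storage side of such a change would be small; whether it performs well depends on the union-type work mentioned above.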
I was just wondering what the thinking was among the data people.