[ANN] A new lightning fast package for data manipulation in pure Julia

My understanding is that the package:

  1. was a fresh rewrite, so it does not carry the burden of backward compatibility that we have in DataFrames.jl (EDIT: after reading the package's source code, it seems it kept the parts of the DataFrames.jl sources that the creator liked and dropped the parts that were baggage).
  2. currently makes more assumptions about what data it can store and process, and uses these assumptions in its algorithms (DataFrames.jl is designed to store anything that is valid Julia “as is”). Of course, these restrictions may be lifted in the future.

An example of the second point:

julia> name = Dataset(ID = vcat.([1, 2, 3]), Name = ["John Doe", "Jane Doe", "Joe Blogs"])
3×2 Dataset
 Row │ ID        Name
     │ identity  identity
     │ Array…?   String?
─────┼─────────────────────
   1 │ [1]       John Doe
   2 │ [2]       Jane Doe
   3 │ [3]       Joe Blogs

julia> job = Dataset(ID = vcat.([1, 2, 2, 4]), Job = ["Lawyer", "Doctor", "Florist", "Farmer"])
4×2 Dataset
 Row │ ID        Job
     │ identity  identity
     │ Array…?   String?
─────┼────────────────────
   1 │ [1]       Lawyer
   2 │ [2]       Doctor
   3 │ [2]       Florist
   4 │ [4]       Farmer

julia> leftjoin(name, job, on = :ID)
ERROR: MethodError: Cannot `convert` an object of type Vector{Int64} to an object of type Integer

julia> leftjoin(DataFrame(name), DataFrame(job), on = :ID)
4×3 DataFrame
 Row │ ID      Name       Job
     │ Array…  String     String?
─────┼────────────────────────────
   1 │ [1]     John Doe   Lawyer
   2 │ [2]     Jane Doe   Doctor
   3 │ [2]     Jane Doe   Florist
   4 │ [3]     Joe Blogs  missing