Hello everyone!
I am really happy to finally announce the next JuliaML package reaching a stable state: MLDataPattern.jl
GitHub: https://github.com/JuliaML/MLDataPattern.jl
With it comes the long-overdue update of MLDataUtils.jl, which now uses MLDataPattern as one of its back-ends and thus serves as a meta-package. It took us months to get here. The last tag of MLDataUtils was right around the 0.5 release. Since then we have completely redesigned the data munging functionality and, just recently, because of code complexity, moved it into its own package, MLDataPattern. With this change, the original package MLDataUtils now serves as a convenient end-user-facing package that re-exports all data-related functionality of JuliaML.
GitHub: https://github.com/JuliaML/MLDataUtils.jl
Description
MLDataPattern is a long-running effort by a few of us to design and implement a package for common ML data access patterns in a Julian manner. As such, you may find it a bit unintuitive at first if you are used to frameworks from other languages. Yet we think the benefits are worth it. Most notably, the package provides a number of patterns for lazily shuffling, partitioning, and resampling data sets of various types and origins. At its core, the package was designed around the key requirement of allowing any user-defined type to serve as a custom data source and/or access pattern in a first-class manner. We tried to accomplish this by designing the package to be as data-container agnostic as we could.
Check it out! The documentation is very comprehensive.
Documentation: MLDataPattern.jl 0.1 documentation
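To give a rough feel for the lazy shuffling and partitioning described above, here is a minimal sketch. The random feature matrix `X`, the targets `Y`, and the 70/30 split are made up for illustration, and the snippet assumes the usual convention that the last array dimension indexes the observations. Exact behavior may differ between versions, so treat it as a sketch and check the documentation.

```julia
using MLDataPattern

# Toy data (made up for illustration): 4 features, 150 observations.
# The last array dimension is assumed to index the observations.
X = rand(4, 150)
Y = rand(150)

# Lazily shuffle features and targets together, then split off
# roughly 70% of the observations for training. No data is copied
# here; the results are views into X and Y.
(train_X, train_Y), (test_X, test_Y) = splitobs(shuffleobs((X, Y)), at = 0.7)

# Iterate over the training portion in mini-batches of 10 observations.
for (x, y) in eachbatch((train_X, train_Y), size = 10)
    # x holds 10 observations (columns), y the 10 matching targets
    @assert size(x, 2) == length(y) == 10
end
```

Because the shuffling and splitting only create views, the same calls should also work on data that is expensive to copy, which is the main point of the lazy design.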
Closing Words
Let me know what you think. Any kind of feedback or criticism is very welcome!
Big thanks to @tbreloff and @oxinabox for design and code contributions to the data access patterns!