MLJ - A machine learning toolbox for Julia

ablaom · April 30, 2019, 3:25am

MLJ - Machine Learning in Julia

MLJ is a new, flexible framework for composing and tuning supervised and unsupervised learning models. It covers models currently scattered across assorted Julia packages, as well as wrapped models from other languages. The MLJ project also seeks to focus efforts in the Julia ML community, in particular by improving the interoperability and maintainability of key ML packages.

The package has been developed primarily at The Alan Turing Institute but enjoys a growing list of advisors and contributors. If you like the project, please star the GitHub repo to boost the prospects of a pending funding review.

Quick links

☞ MLJ vs ScikitLearn.jl

☞ Video from London Julia User Group meetup in March 2019 (skip to demo at 21’39)

☞ Basic Usage and Tour

☞ Building a self-tuning random forest

☞ An MLJ docker image (including tour)

☞ Implementing the MLJ interface for a new model

☞ How to contribute

☞ Julia Slack channel: mlj.

Key implemented features

Learning networks. Flexible model composition beyond traditional pipelines.
Automatic tuning. Automated tuning of hyperparameters, including those of composite models. Tuning is implemented as a model wrapper, so it composes with other meta-algorithms.
Homogeneous model ensembling.
Registry for model metadata. Metadata is available without loading model code. The registry forms the basis of a “task” interface and facilitates model composition.
Task interface. Automatically match models to specified learning tasks, to streamline benchmarking and model selection.
Clean probabilistic API. Improves support for Bayesian statistics and probabilistic graphical models.
Data container agnostic. Present and manipulate data in your favorite Tables.jl format.
Universal adoption of categorical data types. Enables model implementations to properly account for classes seen in training but not in evaluation.
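
The "tuning as a model wrapper" idea from the list above can be illustrated with a toy, dependency-free Julia sketch. Nothing here is MLJ's actual API: `ToyModel`, `ConstantShrinker`, `ToyTuned`, `fit`, `predict` and `rms` are hypothetical names made up for this illustration, and the half/half holdout split is an arbitrary assumption. The only point is that the wrapper is itself a model, so anything that accepts a model also accepts the tuned version.

```julia
# Toy illustration only: none of these names belong to MLJ.
abstract type ToyModel end

# A "model" is just a container of hyperparameters.
struct ConstantShrinker <: ToyModel
    lambda::Float64   # shrinks the predicted constant towards zero; 0 gives the mean
end

# `fit` returns whatever `predict` needs (here, a single number).
fit(m::ConstantShrinker, y) = sum(y) / (length(y) + m.lambda)
predict(::ConstantShrinker, fitresult, n) = fill(fitresult, n)

# Root-mean-square error between predictions and targets.
rms(yhat, y) = sqrt(sum(abs2, yhat .- y) / length(y))

# The tuning wrapper is itself a ToyModel, so it composes like any other model.
struct ToyTuned{M<:ToyModel} <: ToyModel
    candidates::Vector{M}
end

function fit(t::ToyTuned, y)
    half = div(length(y), 2)
    train, valid = y[1:half], y[half+1:end]
    # Score every candidate on the held-out half.
    errors = [rms(predict(m, fit(m, train), length(valid)), valid)
              for m in t.candidates]
    best = t.candidates[argmin(errors)]
    return (best, fit(best, y))   # refit the winner on all the data
end

predict(::ToyTuned, fitresult, n) = predict(fitresult[1], fitresult[2], n)

# Usage: the wrapper is fitted and queried exactly like a plain model.
y = [1.0, 2.0, 3.0, 4.0]
tuned = ToyTuned([ConstantShrinker(0.0), ConstantShrinker(10.0)])
fitresult = fit(tuned, y)
predict(tuned, fitresult, 2)   # → [2.5, 2.5]
```

Because `ToyTuned <: ToyModel`, a wrapped model can be wrapped again (for example `ToyTuned([tuned])`), which is the sense in which a wrapper design composes with other meta-algorithms.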

Some planned enhancements

Integrate deep learning packages, such as Flux.jl.
Model-agnostic gradient descent tuning using automatic differentiation.
Enhance support for time series and sparse data.
Add support for heterogeneous/distributed architectures.
Package common learning network architectures (linear pipelines, stacks, etc.) as simple one-line operations.
Implement systematic benchmarking for models matching a given task.
Automated estimates of CPU and memory requirements for a given task/model.
Implement DAG-style scheduling.
Extend and integrate existing loss function libraries to better handle probabilistic prediction.
Add interpretable machine learning measures.
Add online learning support.

Feedback and offers of help are very welcome!

