What is the relation between MLJ and Flux?

The thing we did right with DifferentialEquations.jl is that we used multiple dispatch to build a system where anyone can add algorithms. It's all described here:

https://www.sciencedirect.com/science/article/abs/pii/S0965997818310251
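
To make that concrete, here's a toy version of the pattern. These names (`ODEProb`, `Euler`, `Midpoint`, `solve`) are invented for illustration and aren't the actual DifferentialEquations.jl internals; the point is just that the interface owns a generic `solve` function, and new algorithms plug in as new types with new methods:

```julia
# Toy dispatch-based extension pattern; all names are illustrative only.

struct ODEProb{F}                 # du/dt = f(u, t), scalar state for brevity
    f::F
    u0::Float64
    tspan::Tuple{Float64,Float64}
end

abstract type AbstractAlg end

# The interface package ships one algorithm...
struct Euler <: AbstractAlg
    dt::Float64
end

function solve(prob::ODEProb, alg::Euler)
    t, u = prob.tspan[1], prob.u0
    while t < prob.tspan[2]
        u += alg.dt * prob.f(u, t)        # explicit Euler step
        t += alg.dt
    end
    return u
end

# ...and a third-party package can add another one later, in its own repo,
# just by defining a new type and a matching `solve` method:
struct Midpoint <: AbstractAlg
    dt::Float64
end

function solve(prob::ODEProb, alg::Midpoint)
    t, u = prob.tspan[1], prob.u0
    while t < prob.tspan[2]
        k = prob.f(u, t)
        u += alg.dt * prob.f(u + alg.dt / 2 * k, t + alg.dt / 2)  # midpoint step
        t += alg.dt
    end
    return u
end

# Both now work through the same entry point:
solve(ODEProb((u, t) -> -u, 1.0, (0.0, 1.0)), Euler(0.001))     # ≈ exp(-1)
solve(ODEProb((u, t) -> -u, 1.0, (0.0, 1.0)), Midpoint(0.001))
```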

MLJ is building a very similar system for machine learning. Of course, they have to write the first batch of packages and wrappers for the system themselves, because without it being useful there won't be buy-in. At this point, though, if you create a fancy new machine learning algorithm and want MLJ users to use it, you can fairly easily add a few lines of interface code (sketched below) and people can then call your algorithm as part of an ensemble.
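
To give a sense of what those few lines look like, here's a minimal sketch against the MLJModelInterface fit/predict contract. `MyRidge` and its hyperparameter are hypothetical, and the real contract has more to it (metadata and scitype trait declarations), so treat this as a shape, not a recipe:

```julia
import MLJModelInterface
const MMI = MLJModelInterface
using LinearAlgebra: I

# Hypothetical deterministic regressor with one hyperparameter.
mutable struct MyRidge <: MMI.Deterministic
    lambda::Float64
end
MyRidge(; lambda = 1.0) = MyRidge(lambda)

# `fit` must return (fitresult, cache, report).
function MMI.fit(model::MyRidge, verbosity, X, y)
    Xm = MMI.matrix(X)                            # tabular input -> matrix
    coefs = (Xm'Xm + model.lambda * I) \ (Xm'y)   # closed-form ridge solve
    return coefs, nothing, (; nrows = size(Xm, 1))
end

MMI.predict(::MyRidge, coefs, Xnew) = MMI.matrix(Xnew) * coefs
```

With that contract implemented, MLJ's generic machinery (resampling, tuning, ensembles) can drive the model like any other.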

SciKitLearn, on the other hand, is very top-down: the algorithms that can be used are the ones in the blessed repo. The paper above describes some of the advantages we've seen from a confederated system:

  1. Original authors more readily adopt the system because they keep ownership of their package and get academic credit. Over time we've seen many of these packages migrate to a standardized organization that helps with maintenance, but that isn't the case for all of them.

  2. Multiple implementations of the same algorithm can coexist, which allows for different performance characteristics and makes a nice system for benchmarking in research.

  3. Even if someone ignores the whole ecosystem and writes a cool new method/implementation in a way that is incompatible, you can just make a new package that slaps the common interface on it (see the sketch after this list), and now it's usable from the system without requiring any code to move to a new repo. Easy peasy.

  4. Being confederated, if some people don't agree with a certain code structure, that's fine. You can work in different repos in a way that doesn't affect users.
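
A sketch of that wrapper trick from point 3, assuming a hypothetical package `CoolMethod` whose native `coolfit`/`coolpredict` API ignores the MLJ conventions (every name here is invented):

```julia
# Hypothetical glue package, e.g. CoolMethodMLJ.jl; `CoolMethod` and its
# coolfit/coolpredict functions are invented for this sketch.
import MLJModelInterface
const MMI = MLJModelInterface
import CoolMethod

mutable struct CoolWrapper <: MMI.Deterministic
    iters::Int
end
CoolWrapper(; iters = 100) = CoolWrapper(iters)

# Translate between the common contract and the package's native API;
# the original author's repo is never touched.
function MMI.fit(model::CoolWrapper, verbosity, X, y)
    inner = CoolMethod.coolfit(MMI.matrix(X), y; iters = model.iters)
    return inner, nothing, NamedTuple()
end

MMI.predict(::CoolWrapper, inner, Xnew) =
    CoolMethod.coolpredict(inner, MMI.matrix(Xnew))
```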

In total, I think there are a lot of advantages to this approach, and I'm glad MLJ has gone down this route.

Back to the original question: Flux is a deep learning library, so it's completely different from MLJ.
