Absolutely. The main difference right now is providing an intuitive programming model while also being able to take advantage of optimisations and new hardware accelerators easily. There’s been less in the way of new PL features that support ML so far, but projects like Myia have a good opportunity to start exploring that area.
As existing languages go, Julia and Swift are by far the best suited, but they’re ultimately still general-purpose languages that weren’t designed with ML in mind. They inherently bring engineering challenges and expressiveness issues that something more specialised might not have.
An engineering example: the Swift docs give a good idea of the challenges involved in extracting TensorFlow-compatible graphs from a program. It sounds like it should be pretty easy to turn m = Chain(Dense(10, 5, relu), ...) into a graph, for example, until you realise that a model might do m.layers[3] = Dense(...) halfway through the forward pass.
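To make the problem concrete, here is a rough sketch in plain Julia. ToyChain and ToyDense are hypothetical stand-ins for Flux's Chain and Dense, not the real types, and forward! is an invented name; the point is only that the layer list can change while the forward pass is running, so no fixed graph describes the model.

```julia
# Hypothetical stand-ins for Flux's Dense and Chain, written in plain Julia
# so the sketch runs without any packages.
struct ToyDense
    W::Matrix{Float64}
    b::Vector{Float64}
end
ToyDense(nin::Integer, nout::Integer) = ToyDense(randn(nout, nin), randn(nout))
(d::ToyDense)(x) = d.W * x .+ d.b

struct ToyChain
    layers::Vector{Any}   # a mutable container: entries can be swapped at any time
end

# A forward pass that rewrites the model while it is running.
function forward!(m::ToyChain, x)
    for i in eachindex(m.layers)
        x = m.layers[i](x)
        if i == 1
            # Once layer 1 has run, replace layer 2 with a differently shaped layer.
            m.layers[2] = ToyDense(5, 3)
        end
    end
    return x
end

m = ToyChain(Any[ToyDense(10, 5), ToyDense(5, 2)])
y = forward!(m, randn(10))

# The model was built with a 2-element output, but the pass returns 3 elements.
println(length(y))
```

A graph extracted by looking only at how m was constructed would predict a 2-element output, which is exactly the kind of mismatch a compiler has to rule out or handle.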
While these things are solvable, mutable data structures cause a lot of issues here, as well as with AD and other optimisations, and are not even necessary for the way people code against data frames and GPUs. A new language could easily have a functional data model and simplify things hugely.
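As a small illustration of what a functional data model could look like, the sketch below keeps the layers in an immutable tuple, so "changing" a layer means building a new model. FrozenChain and with_layer are hypothetical names invented for this example; this is one possible design under those assumptions, not how Flux or any existing library works.

```julia
# Immutable model: layers live in a tuple, which cannot be modified in place.
struct FrozenChain{T<:Tuple}
    layers::T
end

# Apply each layer in turn to the input.
(m::FrozenChain)(x) = foldl((acc, layer) -> layer(acc), m.layers; init = x)

# "Updating" returns a new model and leaves the original untouched.
# `with_layer` is a hypothetical helper, not a library function.
with_layer(m::FrozenChain, i::Integer, layer) =
    FrozenChain(ntuple(j -> j == i ? layer : m.layers[j], length(m.layers)))

# Two trivial "layers" so the arithmetic is easy to follow.
double(x) = 2 .* x
addone(x) = x .+ 1

m1 = FrozenChain((double, addone))
m2 = with_layer(m1, 2, x -> x .- 1)

println(m1([1.0, 2.0]))   # [3.0, 5.0]
println(m2([1.0, 2.0]))   # [1.0, 3.0]
```

Because m1 can never change after construction, anything derived from it (a graph, a gradient, a GPU copy) stays valid for as long as m1 exists.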
For an expressiveness issue, consider my ideal definition of the Dense (FullyConnected) layer:
Dense(in, out) = x -> W * x .+ b
where W = randn(out, in), b = randn(out)
The Flux docs actually introduce layering this way, but in real layers we have to define a struct and a bunch of boilerplate. To actually make it work we need to be able to treat closures as data structures (to move them to the GPU, for example) and perhaps have nicer ways to name and refer to closures based on where they came from (i.e. not (::#9)). These really seem like general language-level features that just happen not to be supported anywhere.
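For comparison, here is roughly how the two styles look in Julia as it stands. This is a simplified sketch: dense_closure and DenseLayer are hypothetical names, and the struct is a cut-down illustration of the boilerplate, not Flux's actual Dense definition.

```julia
# Closure version: close to the ideal definition, and it runs today.
function dense_closure(nin, nout)
    W, b = randn(nout, nin), randn(nout)
    return x -> W * x .+ b
end

# Struct version: the boilerplate needed so the parameters are named fields
# that other code can find, copy, or move to another device.
struct DenseLayer
    W::Matrix{Float64}
    b::Vector{Float64}
end
DenseLayer(nin::Integer, nout::Integer) = DenseLayer(randn(nout, nin), randn(nout))
(d::DenseLayer)(x) = d.W * x .+ d.b

f = dense_closure(10, 5)
d = DenseLayer(10, 5)
x = randn(10)

println(size(f(x)))   # (5,)
println(size(d(x)))   # (5,)

println(typeof(d))    # DenseLayer
println(typeof(f))    # a compiler-generated name that says nothing about "Dense"
```

Both versions compute the same thing; the difference is that the struct has a meaningful type name and explicit fields, while the closure gets an opaque generated name.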
ML has a bunch of cases like this, where certain patterns seem unusual or even downright backwards from the traditional software engineering standpoint, but turn out to be valid use cases that just aren’t prioritised by mainstream languages. Novel abstractions and language features could help hugely here, which is part of what makes the field exciting for PL people.