Swift for TensorFlow rationale

Thanks. I didn’t mean hacking as something negative; I’ve actually been a subscriber to 2600 Magazine. I meant specifically modifying a language interpreter to implement a new “niche” feature that might never even make it upstream. It is starting to seem to me that Julia is attaining something special not just by being a modern language built on LLVM, but by sticking to a few extra principles and clear priorities, and embracing meta-programming early on might turn out to be a huge strategic move for Julia, even if no concrete plans were in sight. I guess this will all become clearer, maybe this year!

4 Likes

RIP: Swift for TensorFlow Shuts Down (2021-Feb-12)
“Swift for TensorFlow was an experiment in the next-generation platform for machine learning, incorporating the latest research across machine learning, compilers, differentiable programming, systems design, and beyond. It was archived in February 2021.”

https://github.com/tensorflow/swift

4 Likes

I do hope this didn’t come as a surprise to anyone…

3 Likes

Looking at the repo, it does surprise me: a lot of contributors, well-written tutorials, a significant code base, and a lot of stars. Admittedly I know nothing about Swift and very little about TensorFlow, so maybe there’s something I’m missing. In any case, making ML a first-class citizen in a language would be an awesome development, whatever the language, IMHO, so I don’t see the rationale for stopping development.

Edit: just saw this Welp! Swift for TensorFlow Archived - #8 by ToucheSir

Oh, I wasn’t trying to say anything bad about the project; it is indeed an impressive piece of work. I was just referring to Google’s consistent lack of follow-through on things like this.

Just last year, when Chris Lattner left, they assured everyone that the project was in good hands, had strong roadmaps, and all that. Then, just over a year later, it’s cancelled.

In some ways it’s worse that the project had such strong documentation and tutorials, a significant codebase, and all those stars. If it didn’t have those things, far fewer people would have had the rug pulled out from under them when it was suddenly canned.

11 Likes

I would definitely agree with that sentiment. I never cared for Google’s way of doing software development, or indeed many other aspects of their business methods. It did say in the link, though, that most of the differentiable programming work is making its way into Swift itself. But it must be tough for the devs to see their baby go away.

The tragedy to me is that had Julia been chosen as the language for this, rather than Swift, the work done would probably live on even if/when Google pulled the plug. Swift is a nice language, but ML, autodiff, and the like are not what the existing Swift community is about. This was a great opportunity lost for Google, but, as said above, no surprise.

2 Likes

I don’t look at it as Google dropping yet another project so much as the project receiving the ultimate vindication. The AD capability has been upstreamed to Swift proper, as requested by the core Swift team, and carried out by the key S4TF people. Google may no longer be supporting it officially, but some of their devs are still participating as part of the upstreaming team. Had Chris Lattner stayed at Google Brain, we could have envisioned the same thing happening, just maybe sooner. But it’s exciting that Swift is now differentiable at ground level. This also vindicates Julia’s approach, which has shared similar philosophies.

Some nice Swift syntax:

import _Differentiation   // the module that ships the upstreamed AD

@differentiable(reverse)  // older S4TF toolchains spelled this just @differentiable
func f(_ x: Float) -> Float {
    x * x
}
let dfdx = gradient(of: f)
dfdx(3) // 6.0

Are there any sizeable projects in Swift right now that use this AD capability?

Richard Wei of Apple’s CoreML team says work continues on the AD front, so I assume this means that Apple will be taking advantage of this for their own purposes.

That probably means it’s safe from bitrot.

Not sure what’s especially “nice” about this, when the same thing looks much the same in other languages:

julia> using Zygote   # or `using Flux`, which re-exports `gradient`

julia> f(x::Number) = x*x
f (generic function with 1 method)

julia> dfdx(x) = gradient(f,x)[1]
dfdx (generic function with 1 method)

julia> dfdx(3)
6

julia> dfdx(3.0)
6.0

In [1]: from jax import grad

In [2]: f = lambda x: x*x

In [3]: dfdx = grad(f)

In [4]: dfdx(3.0)
Out[4]: DeviceArray(6., dtype=float32)
1 Like

I didn’t mean nicer, just nice; specifically, about as nice as Flux.

I also don’t mean to promote Swift, which I know little about, so much as to put S4TF being archived in context. I also know little about JAX, but I do have some questions. Just how much of NumPy and Python have they recreated: enough that people don’t run into the limitations inherent to Cython/PyPy/Numba? Can they autodiff through ODE integrators (I believe they can do reverse adjoints)? And even though they say it is composable, how far does that go: can you autodiff with a new custom type, or is it limited to Float32/64? Is JAX as promising as Julia?

As for Swift, I’m not sure what the limitations are, but it seems sensible to have AD built into a high-level language. I suspect Swift is currently the most direct path from prototyping an algorithm that uses differentiable programming to distributing it on consumer phones.
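
To make that last point a bit more concrete, here is a minimal sketch (mine, not from the thread) of what a prototype using the upstreamed Swift AD can look like, assuming a toolchain that ships the _Differentiation module: a toy gradient-descent loop on a quadratic loss.

import _Differentiation

// Toy loss, minimized at w = 3.
@differentiable(reverse)
func loss(_ w: Float) -> Float {
    (w - 3) * (w - 3)
}

var w: Float = 0
let learningRate: Float = 0.1
for _ in 0..<50 {
    let dw = gradient(at: w, of: loss)   // reverse-mode derivative of loss at w
    w -= learningRate * dw
}
print(w) // ≈ 3.0

Since the AD lives in the language rather than in a separate framework, code like this could in principle ship inside an ordinary Swift app, which is what the prototype-to-phone argument above amounts to.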