Is it a good time for a PyTorch developer to move to Julia? If so, Flux? Knet?

If anyone knows PyTorch well enough, it might be useful to have that as an additional comparison?

Wow, the statistics cheatsheet is great! Once DataFrames hits 1.0, we should submit a PR for this.

That^

In the short term, https://github.com/MikeInnes/Mjolnir.jl

In the medium term, https://github.com/JuliaLang/julia/pull/33955 and future PRs building on it will allow faster and more composable versions of Zygote, Cassette, and other code-transform packages built on typed IR.

A side question: why does Flux.jl have a dependency on Juno.jl? Will any Flux features be missing if I use VSCode with the Julia extension instead of Atom with Juno?

Juno.jl is an extremely lightweight dependency along the same lines as RecipesBase.jl; it lets packages integrate with Juno without depending on the heavier Atom.jl. I can’t speak from experience about which features that enables in Juno, though.

Glancing at the source, it appears to simply be used for defining a nice, foldable representation for the Juno REPL, which they might be able to do by depending on TreeViews instead?
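
For reference, the TreeViews.jl interface is tiny. Here is a hypothetical sketch of what such a foldable representation could look like, using a made-up Layer type rather than Flux’s actual internals:

import TreeViews

struct Layer                 # made-up container type, for illustration only
    name::String
    children::Vector{Any}
end

TreeViews.hastreeview(::Layer) = true
TreeViews.numberofnodes(l::Layer) = length(l.children)
TreeViews.treelabel(io::IO, l::Layer, ::MIME"text/plain") = print(io, l.name)
TreeViews.treenode(l::Layer, i::Int) = l.children[i]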

I am tempted to add PyTorch code for comparison, but I fail to see the point of comparing frameworks using such a small network. Am I missing something? 🙂

@bafonso PyTorch has a very different approach from Keras. In fact, one of the main reasons I moved from Keras to PyTorch was the breakdown of the training loop.
Where in Keras you would have something like model.fit(data), in PyTorch you break the “fit” down into:

for epoch in range(epochs):
    gen = DataGenerator(....)
    for x, y in gen:
        model.zero_grad()               # clear gradients from the previous step
        output = model(x)               # forward pass
        loss = criterion(output, y)
        loss.backward()                 # backpropagate
        optimizer.step()                # update weights
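
For comparison, an explicit training loop in Flux looks much the same. This is a minimal sketch using Flux’s implicit-parameter API; the model, optimiser, and dummy data below are made up for illustration:

using Flux

model = Chain(Dense(10, 32, relu), Dense(32, 1))    # toy model
opt = Descent(0.01)                                 # plain gradient descent
loss(x, y) = Flux.Losses.mse(model(x), y)
ps = Flux.params(model)

epochs = 10
data = [(rand(Float32, 10, 16), rand(Float32, 1, 16)) for _ in 1:100]   # dummy (x, y) batches

for epoch in 1:epochs
    for (x, y) in data
        gs = gradient(() -> loss(x, y), ps)    # backward pass
        Flux.Optimise.update!(opt, ps, gs)     # update weights
    end
end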

There are many reasons this breakdown is more powerful than the Keras approach. One example is that you can easily call the optimizer step only after, say, every 5 batches, essentially increasing the batch size 5 times without using more GPU RAM, which is important for large inputs (video, 3D volumes, …), like this:

for epoch in range(epochs):
    gen = DataGenerator(....)
    for i, (x, y) in enumerate(gen):
        output = model(x)
        loss = criterion(output, y)
        loss.backward()                 # gradients accumulate across batches
        if (i + 1) % 5 == 0:            # apply the update every 5th batch
            optimizer.step()
            model.zero_grad()
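
The same trick can be sketched in Flux by accumulating gradients by hand before applying the update. This is only an illustration, reusing the model, opt, loss, ps, and data names from the sketch above and assuming every parameter receives a gradient on each batch:

acc = IdDict()                                     # per-parameter gradient accumulator
for (i, (x, y)) in enumerate(data)
    gs = gradient(() -> loss(x, y), ps)
    for p in ps
        acc[p] = get(acc, p, zero(p)) .+ gs[p]     # sum gradients across batches
    end
    if i % 5 == 0                                  # apply the update every 5th batch
        for p in ps
            Flux.Optimise.update!(opt, p, acc[p])
        end
        empty!(acc)                                # reset the accumulator
    end
end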

@Alon, I understand the differences between Keras and PyTorch; I’ve used both, and now I am trying out some small projects in Julia. I was just saying that I think it would be beneficial to use bigger networks to compare performance, as opposed to a small network. I guess using a small network lets you see the syntax differences more clearly, but it won’t let you conclude anything about real-world performance. 🙂

Can you elaborate on how to use Revise? I’m very curious about it!

Load your package via using Revise, MyPackage. Then it will track your files and reload them automatically when there are changes. (You can also load Revise in your Julia startup.jl file so that it is always used.)
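
To load it from startup.jl, the Revise docs suggest guarding it roughly like this:

try
    using Revise
catch e
    @warn "Error initializing Revise" exception=(e, catch_backtrace())
end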

If you do ] add Revise to your project in the VSCode extension once, then every time you use VSCode afterwards I believe it will automatically load it in the background.

The only trick, I think, is that if you activate separate manifest/project files (which I strongly encourage), you would need to add the package to each of them.
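
For example, with per-project environments you would run something like this in the Pkg REPL for each project (the path here is hypothetical):

(@v1.9) pkg> activate ./MyProject
(MyProject) pkg> add Revise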

I think VSCode loads Revise.jl automatically on startup these days. It’s a setting for the extension.
