Converting PyTorch to Flux while keeping performance

ToucheSir · May 29, 2022, 8:26pm

Yes, but note here how you’re calling sum(xs) and not sum(f, xs).
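
For anyone skimming, the two call forms look like this in plain Julia. This is only a sketch: `xs` and `f` are placeholder names, not the code from earlier in the thread, and how each form behaves under automatic differentiation depends on that earlier context.

```julia
# Placeholder data and function; neither comes from the original code.
xs = Float32[1, 2, 3, 4]
f = abs2

total_plain = sum(xs)          # reduces the array as-is
total_mapped = sum(f, xs)      # applies f to each element during the reduction
total_broadcast = sum(f.(xs))  # materializes f.(xs) first, then reduces it
```

The last two give the same value. They differ in whether an intermediate array is allocated.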

You may need to load LoopVectorization (using LoopVectorization) in order for Tullio to generate a fully optimized kernel. More importantly, I would extract deformation_indexed[:, 1, :] into its own local variable, which could save a lot of compute and memory overhead.
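
A minimal sketch of both suggestions, assuming a 3-dimensional `deformation_indexed`. The array sizes, the `weights` operand and the contraction itself are made up for illustration, since the original kernel is not shown here.

```julia
using LoopVectorization  # loaded alongside Tullio so Tullio can use it for its kernels
using Tullio

# Hypothetical stand-ins; shapes and contents are assumptions.
deformation_indexed = rand(Float32, 64, 3, 128)
weights = rand(Float32, 128)

# Extract the slice once, outside the kernel. Plain indexing copies into a
# contiguous 64x128 matrix; `@view deformation_indexed[:, 1, :]` would avoid the copy.
d1 = deformation_indexed[:, 1, :]

# Illustrative contraction over k, using the local variable instead of the slice expression.
@tullio out[i] := d1[i, k] * weights[k]
```

Whether the copy or the view is faster will depend on the real access pattern, so it is worth benchmarking both with BenchmarkTools.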

Also, what is register? It seems like there is more code here that may have an influence on performance (e.g. if register is a mutable struct), so a MWE would be much appreciated.
