Significant compile time latency in Flux with a GAN

I wouldn’t expect precompile to show up in the trace (did you make sure the package was already precompiled?), but I don’t know enough about compilation to comment. Ideally, any tracing would start capturing only from the point where pullback is called (i.e. after all packages are imported and the forward pass is warmed up), but again, I’m not sure whether that’s possible.

The reason I asked about CPU-only perf is that the cross-post in “Extremely high first call latency Julia 1.6 versus 1.5 with multiphysics PDE solver” (#25, by Alexander-Barth) might be indicative of https://github.com/JuliaGPU/GPUCompiler.jl/issues/65 (lots of time spent in the LLVM middle end during GPU compilation).
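
For what it's worth, here is a rough sketch of how I'd try to separate the pieces on CPU: warm up the forward pass first, then time the first and second pullback calls on their own. The tiny `Chain`, the `loss` function and the input size are hypothetical stand-ins, since your GAN isn't shown here. Nothing is moved to the GPU, so GPU compilation shouldn't be involved at all.

```julia
using Flux, Zygote

# Hypothetical stand-in model and data; substitute the real generator/discriminator.
model = Chain(Dense(4, 8, relu), Dense(8, 1))
x = rand(Float32, 4, 16)

# Hypothetical scalar loss, only here so there is something to differentiate.
loss(m, x) = sum(abs2, m(x))

# 1. Warm up the forward pass so its compilation is not counted in the AD timings.
t_forward = @elapsed loss(model, x)

# 2. First pullback + backward call: mostly compilation of the backward pass.
#    Timed at top level on purpose; inside a function, the callees could be
#    compiled before the timer starts.
t_first = @elapsed begin
    y, back = Zygote.pullback(loss, model, x)
    grads = back(one(y))
end

# 3. Second call with the same argument types: roughly steady-state runtime.
t_second = @elapsed begin
    y, back = Zygote.pullback(loss, model, x)
    grads = back(one(y))
end

println("forward warm-up:       ", t_forward, " s")
println("first pullback + back: ", t_first, " s")
println("second pullback + back:", t_second, " s")
```

If `t_first` is still large on CPU, the latency is probably coming from the Zygote/Julia side rather than from GPU compilation. If it only blows up once the model and data are on the GPU, that would point more towards the GPUCompiler issue.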
