How to efficiently evaluate a Flux.jl neural network millions of times on the GPU?

jmair · October 17, 2022, 7:16pm

You want to pass the inputs to the neural network as a batch, so your input is y = CUDA.ones(32, N), where N is the number of inputs you want to process in parallel. This is the easiest way to parallelise the execution. You should get an output matrix of size 31 by N, with one column per input.
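As a rough sketch of what that looks like in code: the thread only gives the input size (32) and the output size (31), so the hidden layer, its width, the activation, and the value of N below are placeholder assumptions, not the actual model from the question. The names model_d and y_d are likewise just illustrative.

```julia
using Flux

# Hypothetical architecture: only the 32 inputs and 31 outputs come from the thread,
# the hidden layer of width 64 and the relu activation are assumptions.
model = Chain(Dense(32 => 64, relu), Dense(64 => 31))

N = 1_000                    # assumed batch size: number of inputs evaluated in one call
y = ones(Float32, 32, N)     # one input per column, so the batch is a 32×N matrix

# `gpu` moves the model and the data to the GPU when a working CUDA setup is loaded;
# without one it leaves them on the CPU, so the sketch still runs.
model_d = gpu(model)
y_d = gpu(y)

out = model_d(y_d)           # a single batched forward pass over all N columns
@show size(out)              # one 31-element output column per input column
```

With CUDA.jl loaded, creating the batch directly on the device with CUDA.ones(32, N) avoids the host-to-device copy. For millions of evaluations, a few large batches like this should generally be much cheaper than calling the model once per input vector.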
