Hello everyone. I am new to Julia, but somewhat experienced in physics-informed neural networks.
I have some questions about several features of NeuralPDE, their compatibility with GPUs, and some choices made in the example in the documentation here.
Specifically:

- The code from the documentation does not use my GPU at all out of the box. After searching, I found that replacing `using CUDA` with `using LuxCUDA` solves the issue. Are the two packages equivalent, or is there something wrong with my system?
- If I choose the `GridTraining` strategy (as in the example), 100% of my GPU is used. However, if I choose `QuasiRandomTraining` with the default sampling algorithm, 0% (zero) of my GPU is used and all the calculations are carried out by a single CPU core. If I change the sampling algorithm to `SobolSample()`, my GPU is indeed used, but only at ~30%. Is there any reason for this behaviour?
- In the example, after the network is trained, the following lines are used to evaluate the solution on a grid for plotting and analysis:

  ```julia
  phi = discretization.phi
  ts, xs, ys = [infimum(d.domain):0.1:supremum(d.domain) for d in domains]
  u_real = [analytic_sol_func(t, x, y) for t in ts for x in xs for y in ys]
  u_predict = [first(Array(phi(gpu([t, x, y]), res.u))) for t in ts for x in xs for y in ys]
  ```

  If I understand correctly, the grid points `t, x, y` are moved to the GPU (where the parameters `res.u` also reside) and then the solution is evaluated. However, this is extremely slow. If I instead replace the last line with

  ```julia
  u_predict = [first(Array(phi([t, x, y], cpu(res.u)))) for t in ts for x in xs for y in ys]
  ```

  meaning that I move the parameters `res.u` to the CPU, I get a significant gain in speed. Is there any reason to prefer one over the other, and if not, why does the documentation use the slow version?
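On that last point, I suspect the per-point `gpu([t, x, y])` calls each launch a separate host-to-device transfer. A batched evaluation along these lines might avoid that entirely (just a sketch, assuming `phi` accepts a matrix whose columns are individual points, like a plain Lux chain; I have not verified this is the intended API):

```julia
# Sketch: assemble all grid points into one 3×N matrix, move it to the GPU
# once, and evaluate phi in a single batched call instead of N separate ones.
# Assumes phi, res, ts, xs, ys and gpu are defined as in the MWE below, and
# that phi accepts column-wise batched input.
points = reduce(hcat, [[t, x, y] for t in ts for x in xs for y in ys])  # 3×N matrix
u_predict_batched = vec(Array(phi(gpu(points), res.u)))
```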
Below is an MWE, which is basically the example from the documentation with a few lines added by me regarding the points above (my fixes are commented out):
```julia
using NeuralPDE, Lux, Random, ComponentArrays
using Optimization
using OptimizationOptimisers
import ModelingToolkit: Interval
using QuasiMonteCarlo
using BenchmarkTools
using CUDA # This does not recognise my GPU
# using LuxCUDA # This does

@parameters t x y
@variables u(..)
Dxx = Differential(x)^2
Dyy = Differential(y)^2
Dt = Differential(t)
t_min = 0.0
t_max = 2.0
x_min = 0.0
x_max = 2.0
y_min = 0.0
y_max = 2.0

# 2D PDE
eq = Dt(u(t, x, y)) ~ Dxx(u(t, x, y)) + Dyy(u(t, x, y))

analytic_sol_func(t, x, y) = exp(x + y) * cos(x + y + 4t)

# Initial and boundary conditions
bcs = [u(t_min, x, y) ~ analytic_sol_func(t_min, x, y),
       u(t, x_min, y) ~ analytic_sol_func(t, x_min, y),
       u(t, x_max, y) ~ analytic_sol_func(t, x_max, y),
       u(t, x, y_min) ~ analytic_sol_func(t, x, y_min),
       u(t, x, y_max) ~ analytic_sol_func(t, x, y_max)]

# Space and time domains
domains = [t ∈ Interval(t_min, t_max),
           x ∈ Interval(x_min, x_max),
           y ∈ Interval(y_min, y_max)]

# Neural network
inner = 25
chain = Chain(Dense(3, inner, Lux.σ),
              Dense(inner, inner, Lux.σ),
              Dense(inner, inner, Lux.σ),
              Dense(inner, inner, Lux.σ),
              Dense(inner, 1))

# strategy = GridTraining(0.05) # This uses 100% of my GPU
strategy = NeuralPDE.QuasiRandomTraining(500, resampling = true) # This uses 0% of my GPU
# strategy = NeuralPDE.QuasiRandomTraining(500, sampling_alg = SobolSample(), resampling = true) # This uses 30% of my GPU

ps = Lux.setup(Random.default_rng(), chain)[1]
ps = ps |> ComponentArray |> gpu .|> Float64

discretization = PhysicsInformedNN(chain, strategy, init_params = ps)

@named pde_system = PDESystem(eq, bcs, domains, [t, x, y], [u(t, x, y)])
prob = discretize(pde_system, discretization)
symprob = symbolic_discretize(pde_system, discretization)

callback = function (p, l)
    println("Current loss is: $l")
    return false
end

res = Optimization.solve(prob, Adam(0.01); callback = callback, maxiters = 100)

phi = discretization.phi
ts, xs, ys = [infimum(d.domain):0.1:supremum(d.domain) for d in domains]
u_real = [analytic_sol_func(t, x, y) for t in ts for x in xs for y in ys]
u_predict = [first(Array(phi(gpu([t, x, y]), res.u))) for t in ts for x in xs for y in ys] # This is slow
# u_predict = [first(Array(phi([t, x, y], cpu(res.u)))) for t in ts for x in xs for y in ys] # This is faster

@btime u_predict = [first(Array(phi(gpu([t, x, y]), res.u))) for t in ts for x in xs for y in ys]
@btime u_predict = [first(Array(phi([t, x, y], cpu(res.u)))) for t in ts for x in xs for y in ys]
```
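As a sanity check that the fast CPU-side evaluation is actually equivalent, something like this could be appended after the MWE (a sketch; the tolerance is arbitrary, and I hoist `cpu(res.u)` out of the comprehension so the parameters are moved only once):

```julia
# Check that evaluating with GPU-resident and CPU-resident parameters gives
# the same solution values, so the faster version is safe for plotting.
ps_cpu = cpu(res.u)  # move the trained parameters to the CPU once
u_gpu = [first(Array(phi(gpu([t, x, y]), res.u))) for t in ts for x in xs for y in ys]
u_cpu = [first(phi([t, x, y], ps_cpu)) for t in ts for x in xs for y in ys]
maximum(abs.(u_gpu .- u_cpu))  # should be at floating-point round-off level
```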
And some additional info:

```
(@v1.9) pkg> st
Status `~/.julia/environments/v1.9/Project.toml`
  [c29ec348] AbstractDifferentiation v0.5.3
  [6e4b80f9] BenchmarkTools v1.3.2
  [052768ef] CUDA v5.0.0
⌃ [13f3f980] CairoMakie v0.10.10
⌅ [b0b7db55] ComponentArrays v0.14.1
  [587475ba] Flux v0.14.6
  [59287772] Formatting v0.4.2
  [f6369f11] ForwardDiff v0.10.36
⌃ [e9467ef8] GLMakie v0.8.10
  [033835bb] JLD2 v0.4.35
  [b964fa9f] LaTeXStrings v1.3.0
⌃ [b2108857] Lux v0.5.5
  [d0bbae9a] LuxCUDA v0.3.1
  [dde8697e] MakiePublication v0.3.2
⌃ [961ee093] ModelingToolkit v8.70.0
⌃ [315f7962] NeuralPDE v5.8.0
⌅ [3bd65402] Optimisers v0.2.20
  [7f7a1694] Optimization v3.19.1
⌃ [36348300] OptimizationOptimJL v0.1.9
⌃ [42dfb2eb] OptimizationOptimisers v0.1.5
  [91a5bcdd] Plots v1.39.0
  [d330b81b] PyPlot v2.11.2
⌅ [8a4e6c94] QuasiMonteCarlo v0.2.19
  [e88e6eb3] Zygote v0.6.65
  [02a925ec] cuDNN v1.2.0
```
```
$ nvidia-smi
Wed Oct 11 19:51:23 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06    Driver Version: 525.125.06    CUDA Version: 12.0   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:2E:00.0 Off |                  N/A |
| N/A   46C    P8     2W /  60W |    351MiB /  6144MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1378      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A    306393      C   ...ros/julia-1.9.3/bin/julia      344MiB |
+-----------------------------------------------------------------------------+
```
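For completeness, these are the standard CUDA.jl calls I can run from the Julia side to confirm whether the GPU is visible, in case the output is useful for diagnosing the first question:

```julia
using CUDA
CUDA.functional()   # true if CUDA.jl can actually use the GPU
CUDA.versioninfo()  # prints driver, runtime and device details
```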
Sorry for the long post, but I wanted to be as detailed as possible.
Thank you in advance for your help.