Nvidia Jetson Nano


Julia already runs on the Raspberry Pi. Would this be a good system for prototyping Flux.jl applications?
Maybe a bit short on memory.


My Jetson Nano arrived this morning. ARMv8, 64-bit architecture.
Master version of Julia has been compiling all day!


If prototyping is recognized to be the phase where developer time is more important than hardware cost, then I guess that answers your question.


Does the prebuilt AArch64 binary work?


It is ideal for low-cost DIY projects 🙂


“For the price, the Jetson Nano TensorRT inference performance is looking very good.”
“The Jetson Nano did come out much faster than the ODROID-XU4 for the multi-threaded Rust benchmarks.”

“Overall this is arguably the best sub-$100 Arm developer board we’ve seen to date depending upon your use-cases. The Jetson Nano will certainly open up NVIDIA Tegra SoCs to appearing in more low-cost DIY projects and other hobbyist use-cases as well as opening up GPU/CUDA acceleration that until now has really not been possible on the low cost boards.”


Simon, I built version 1.2.0-DEV from GitHub yesterday. The build worked perfectly and we tested CuArrays yesterday.
I do not have the board at work today. I will report back regarding the prebuilt AArch64 binary this evening, UK time. I see no reason why it would not work.


I can confirm the Jetson Nano runs the AArch64 build for the Raspberry Pi,
version 1.0.3 LTS, downloaded from the julialang.org site.

Can anyone suggest benchmarks or tests I could run?


If you are still looking for a benchmark, maybe you can test the speed of inference of the pre-trained vgg19 model (https://github.com/FluxML/Metalhead.jl) ?
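A minimal timing sketch for that suggestion, assuming Metalhead.jl's `VGG19()` constructor and a dummy WHCN input (the model names and input layout are assumptions; I have not run this on the Nano):

```julia
# Hedged sketch: time a single VGG19 forward pass on a dummy input.
# Assumes Metalhead.jl provides VGG19() with pre-trained weights
# (downloaded on first use) and that the model is callable on a
# 224×224×3×1 Float32 array in width-height-channel-batch layout.
using Metalhead, BenchmarkTools

vgg = VGG19()
x = rand(Float32, 224, 224, 3, 1)  # dummy 224×224 RGB image, batch of 1
@btime $vgg($x)                    # forward pass only; no preprocessing
```

Using dummy input keeps the benchmark focused on inference speed rather than image loading and preprocessing.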

Something as mundane as a matrix multiplication would already be quite interesting to me:

using CuArrays
using BenchmarkTools

A = randn(Float32,1000,10000); cuA = CuArray(A)
B = @btime Array(cuA*cuA');

(I get 3.214 ms on a GeForce GTX 1080, but the GPU is actually busy with other work too)


CuArray test (note: the Nano has a 128-core GPU onboard):

A is   64 ×   64:   426 µs
A is  128 ×  128:   644 µs
A is  256 ×  256:   2.538 ms
A is  512 ×  512:   4.262 ms
A is 1024 × 1024:  15.633 ms
A is 1000 × 1000:  15.703 ms


I have access to a boat load of rackmount GPU servers if you would like some specific tests run 🙂

Sadly I have Julia 1.1.0 installed on my pet 1080 server, and CuArrays has a conflict with the SIUnits package. Grrr…


Just got mine yesterday; glad to know that Julia will run.


One question: the prebuilt Flux model vgg19 would not run, as it seems to need more than 4 GB of RAM.
Is there an easy way to estimate how much RAM a model will consume?
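A rough lower bound is the memory taken by the parameters themselves: sum the byte sizes of every trainable array. A sketch, assuming Flux's `params` collects all parameter arrays (the `Chain`/`Dense` model here is just a stand-in example):

```julia
# Rough sketch: estimate parameter memory of a Flux model by summing
# the byte size of each trainable array. Actual peak RAM will be
# higher, since activations and gradients also need storage.
using Flux

model = Chain(Dense(1000, 500, relu), Dense(500, 10))  # stand-in model
nbytes = sum(sizeof, Flux.params(model))
println("parameter memory: ", round(nbytes / 1024^2; digits = 2), " MiB")
```

Bear in mind this only counts weights; intermediate activations during a forward pass (and gradients, if training) can easily multiply the real footprint, which is likely why vgg19 blows past 4 GB.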