Hello, I have never worked with a GPU and it’s time to replace our server, so…
Which entry-level NVIDIA GPU would be suitable for neural network operations / Flux?
I mean something that would still be noticeably faster than a good CPU (…otherwise…) … would e.g. the RTX 3070 Ti (8 GB, around $600) be enough, or would you definitely suggest the RTX 40 series (e.g. the 4090, 24 GB), which is also noticeably more expensive (around $1600)?
In the CUDA.jl README the only requirement given is a “CUDA-capable GPU with compute capability 3.5 (Kepler) or higher”, but I have no idea what that means. It’s a pity there isn’t a more explicit list of models, or at least suggestions for entry-level / serious work / state-of-the-art computation for complete GPU beginners…
A quick search indicates Kepler is ~10 years old. I would imagine both 3xxx and 4xxx cards will be totally fine. I’m facing a similar choice, and my thought process has been more along the lines of getting something cheap (say a 4060) for testing/prototyping until I understand my needs better.
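For anyone else puzzled by the README wording: compute capability is NVIDIA’s version number for a GPU’s architecture generation, not a performance score. Here is a minimal sketch of the comparison, in plain Python so it runs without a GPU. The table holds the capability numbers as I understand them from NVIDIA’s published list, and `meets_requirement` is a made-up helper name, so please check the official list for a specific card.

```python
# Compute capability (major, minor) by architecture / card family.
# Values are my reading of NVIDIA's published list; verify for your exact card.
COMPUTE_CAPABILITY = {
    "Kepler (e.g. GTX 780)": (3, 5),
    "Pascal (GTX 10 series)": (6, 1),
    "Turing (RTX 20 series)": (7, 5),
    "Ampere (RTX 30 series, RTX A4000)": (8, 6),
    "Ada Lovelace (RTX 40 series)": (8, 9),
}

MINIMUM = (3, 5)  # the "3.5 (Kepler) or higher" from the CUDA.jl README


def meets_requirement(capability, minimum=MINIMUM):
    """Return True if a (major, minor) capability is at least the minimum."""
    # Tuples compare element by element, so (8, 6) >= (3, 5) is True.
    return tuple(capability) >= tuple(minimum)


if __name__ == "__main__":
    for family, cap in COMPUTE_CAPABILITY.items():
        status = "ok" if meets_requirement(cap) else "too old"
        print(f"{family}: {cap[0]}.{cap[1]} -> {status}")
```

So the requirement only rules out very old cards; it says nothing about whether a card is fast enough or has enough memory.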
yep, I fully agree with you…
You should be aware that putting an RTX card in a server probably goes against NVIDIA’s terms of service (if you care). Besides that, I would probably not recommend a 4060. The 8 GB of VRAM will likely limit you pretty quickly if you are trying to do anything serious (the RTX 3060, ironically, has a 12 GB version). Depending on how much you are willing to spend, I would probably go with one of
- 3060 (12 GB)
- 3090 (24 GB)
- 4080 (16 GB)
- 4090 (24 GB)
In general, the VRAM determines what you can run, while the speed of the GPU determines how fast it runs.
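To put rough numbers on the VRAM side, here is a back-of-the-envelope sketch. It assumes Float32 parameters and an Adam-style optimizer that keeps two extra values per parameter, and it ignores activations, which often dominate at large batch sizes, so treat the result as a lower bound. The function names and defaults are my own assumptions, not anything from Flux or CUDA.jl.

```python
GIB = 2**30  # bytes per GiB


def training_memory_gib(n_params, bytes_per_param=4, optimizer_states=2):
    """Lower-bound memory (GiB) for weights + gradients + optimizer state.

    Assumes every copy uses the same precision; activations are NOT included.
    """
    copies = 1 + 1 + optimizer_states  # weights, gradients, optimizer state
    return n_params * bytes_per_param * copies / GIB


def fits(n_params, vram_gib, **kwargs):
    """True if the lower-bound estimate fits in the given VRAM."""
    return training_memory_gib(n_params, **kwargs) <= vram_gib


if __name__ == "__main__":
    for n in (25_000_000, 350_000_000, 1_000_000_000):
        need = training_memory_gib(n)
        cards = [v for v in (8, 12, 16, 24) if need <= v]
        print(f"{n:>13,} params: at least {need:5.2f} GiB; candidate card sizes: {cards}")
```

For inference only, `optimizer_states=0` and dropping the gradient copy would bring the estimate down a lot, which is part of why a smaller card can be fine for prototyping.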
No, I wasn’t aware there were licence terms on a hardware product.
Edit: not an expert, but:
“We recognize that researchers often adapt GeForce and TITAN products for non-commercial uses or their other research uses that do not operate at data center scale. NVIDIA does not intend to prohibit such uses,” a spokesperson for the California chip architects said.
I have a 3070Ti that is pretty good for what I do. I have occasionally run into memory constraints, though. The speedup is very nice.
Not all nets will get the same speedup. A convolutional net or a feedforward net gets a very good speedup. A recurrent net, not as much.
Curious, what speedups are we talking about vs CPU?
I don’t have any benchmarking results. For what I do, I would say it’s in the neighborhood of 20X faster, maybe more. I don’t tend to use the CPU much for the things I can use the GPU for, so it’s hard to remember how well it worked.
Depending on the type of server, you might run into trouble fitting a consumer card inside.
We have rack-mounted 2U servers, and all the PCI slots are at most two slots high.
You might want to take a look at the NVIDIA RTX A4000, which is reasonably priced (we paid around $1000, I think) and has 16 GB of VRAM. It might not be as fast, though; we haven’t done any comparisons.
For comparison, the RTX 4080 costs about the same ($1150 on Amazon), has 50% more CUDA cores, a 60% higher clock frequency and roughly twice as fast memory. The A4000 will be a bunch faster for Float64 compute and uses a lot less power, but other than that it is pretty thoroughly outclassed.
Oh yeah, for sure.
Just wanted to give a perspective from a use case where the server would not fit any higher-tier consumer card inside.
Can you elaborate a bit more, as I am not sure I understood… Our IT “team” has an office with two racks (mostly routers, I think) and we would like to put the server with the GPU card there… what exactly do I have to check in terms of physical compatibility between the server “space” and the GPU card?
So we have these, and you can see in the pictures that it is pretty tight at the back, where the GPUs would go. The side of the page tells you the number of PCI-E connections and the maximum dimensions that should fit. FH (full height) is equivalent to two slots in a typical PC case. RTX 30-series and especially 40-series cards can be as thick as 4 slots.
If it is a prebuilt server (HP, Dell, SuperMicro etc.), just check the specification of the server model; it usually tells you how many GPUs, and of what size, you can fit there.
SuperMicro additionally has a chart of compatible hardware for each server model.
There are also sometimes quirks related to the proprietary nature of server architectures.
At our lab I am a self-taught “IT team” and was surprised that I could not connect GPUs with standard 8-pin power connectors, because SuperMicro has a different pin layout, for some reason. So power delivery could also be problematic if not checked beforehand.
If, however, you want to build the server yourself and just put it in a rack-mounted chassis, just be sure you pick a big enough one.
I’ve also been thinking about purchasing a GPU for our institute. However, I thought maybe it makes sense to start off with the cloud in order to get a gist of my requirements and benefits. Does anybody have experience with JuliaHub? I used to think that they were mostly B2B, but it seems that their current business model encourages individual users. Their current pricing is per thread hour, which leaves me wondering about the cost of using GPU resources.
One thing to think about is power supply, too. Adding a GPU to a pre-built server may not work, even if the card fits the enclosure, if the server does not supply enough power.
Another option is an eGPU, with a dedicated power supply. For some types of work that have limited I/O operations, it can work well, and it’s portable.
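The power-supply check can be sketched as simple arithmetic. The server numbers below are placeholders I made up for illustration (a hypothetical 800 W supply with 350 W of existing load), and the card wattages are nominal board power as I recall them from NVIDIA’s spec sheets. The 20% headroom is a common rule of thumb rather than a specification, so use the figures from your own server’s and card’s datasheets.

```python
def power_headroom_w(psu_watts, base_load_watts, gpu_watts, margin=0.2):
    """Watts left after adding the GPU, keeping `margin` of the PSU in reserve."""
    usable = psu_watts * (1 - margin)
    return usable - base_load_watts - gpu_watts


def psu_is_sufficient(psu_watts, base_load_watts, gpu_watts, margin=0.2):
    """True if the supply covers the existing load plus the GPU with headroom."""
    return power_headroom_w(psu_watts, base_load_watts, gpu_watts, margin) >= 0


if __name__ == "__main__":
    psu, base = 800, 350  # hypothetical server: 800 W PSU, 350 W without a GPU
    # Nominal board power; verify against the vendor's datasheet.
    for card, watts in {"RTX A4000": 140, "RTX 4090": 450}.items():
        ok = psu_is_sufficient(psu, base, watts)
        verdict = "within the power budget" if ok else "needs a bigger PSU"
        print(f"{card} ({watts} W): {verdict}")
```

Even when the wattage works out, the connector type and pin layout still need checking separately.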