How to choose a workstation for optimal performance

drarnau · March 26, 2021, 6:04pm

Hello,

I am looking for some advice on how to choose a workstation. My budget is around $5000.

I simulate structural economic models using Julia. My code typically uses big arrays (through which I iterate using for loops), and involves large Monte Carlo simulations and minimisation algorithms. I parallelise as much as I can.

As I understand it, it would be beneficial for me to have a machine with as many cores as possible and quite a lot of RAM. However, I am not sure how to balance these two. What is the trade-off? Also, does the quality of the cores matter?

Is there anything else that I should take into account apart from RAM and cores/CPU?

Any help is appreciated.

Thanks!

sampope · March 26, 2021, 7:40pm

Just my $0.02 experience.

We’ve been happy with workstations based on AMD Threadripper and IBM POWER9 CPUs running Linux. Lots of cores, lots of memory bandwidth, and more or less affordable. Luckily, RAM is cheap, especially if you don’t need ECC RAM.

If you disable hyper-threading, you may find the raw cores are far faster, depending on the operating system (Windows, Linux, BSD, etc.).

The latency between the RAM and the CPU will matter more than sheer RAM speed if you’re constantly sending data between RAM and the CPU.

nilshg · March 27, 2021, 7:03am

My gut feel is that you’re more likely to be CPU-bound than memory-bound, so that’s probably where the marginal CHF is better spent, but you might also want to think about a GPU if parts of your algorithms are amenable to that.

That said, thinking back to my PhD days when I was estimating similar models, I would usually just run a toy version locally, and send the full model (with finer grids etc.) to the HPC cluster.

JeffreySarnoff · March 27, 2021, 9:05am

Each of the major components (CPU, GPU, RAM, SSD (solid state drive[s])) is only as helpful as the way they integrate in a given system. So very good matching (which really means selecting a vendor and model) will be more helpful than trying to max out one or two of the components. Julia has very good support for numeric processing on some GPUs, and large Monte Carlo simulations are a good candidate for acceleration that way. To play that part safely, choose a recent model GPU from ??? (looking for guidance on a non-NVIDIA GPU that our stuff works well on, if any) with more than the minimum GPU memory. I’d say at least 2x the minimum, and 4x is better, since you would be using that memory for calculation rather than for rendering very high resolution (or well animated) graphics. Other people know more about those specifics.

I have found that 1TB of fast, reliable SSD storage (e.g. Samsung 960 Pro, or for next-level speed maybe Sabrent 2TB Rocket 4 [I have no direct knowledge of Sabrent]) is really needed, because some of that goes to swapping and other system nonsense. Again, it becomes a matter of budget balancing. Here, however, 1.5TB will be better (esp. on Windows), especially if you are dealing with huge, changeable data sets/simulations and it helps if they load rather quickly. As a good way to satisfice, a 1TB SSD plus a 0.5 or 1TB fast hard drive gives you a place to keep archival info and less often used large resources while still having at hand the things you use the most. Bear in mind that such a setup requires that you do the management (what goes into which store) yourself [another reason that 1.5 or 2TB of SSD is nicer … if you do not need a huge amount of fast storage, that’s fine too].

For the CPU, something quite modern (it need not be the fastest of them all, just current, with good specs and some local acclaim) with a healthy number of “threads” is going to help performance. Since you are getting a workstation rather than a portable computer, your base costs are going to be less than for similar performance in a desktop/laptop. Do not ignore the sound or the heat that your box will generate. You should choose models that include good cooling with quiet fans; best of all are boxes that have noise abatement built in. Both the CPU and the GPU will have their own fans (probably, and that’s best), so each needs to be rather quiet unless your box lives away from your keyboard and monitor.

Do go with a vendor/model that you choose from what others recommend. It is easy to be misled by the write-ups generally available. Members of the Julia community who are on top of this information have proven highly reliable.

LaurentPlagne · March 27, 2021, 9:36am

I agree with @nilshg

That said, thinking back to my PhD days when I was estimating similar models, I would usually just run a toy version locally, and send the full model (with finer grids etc.) to the HPC cluster.

If GPGPU is targeted (Monte Carlo…), the problem is that good NVIDIA stuff is almost impossible to buy at a decent price these days (partly due to this huge ecological scandal of cryptomining).

I would recommend having a snappy (high boost frequency + NVMe) and silent machine for development, and testing different node architectures in the cloud (hourly-paid A100/V100 on AWS, for example) adapted to your different types of heavy computation. That way you can buy the most suitable machine when prices and availability are back to normal (assuming this eventually happens).

JeffreySarnoff · March 27, 2021, 10:17am

editing that part of my response – asking for GPU guidance (what our stuff works well with that is not NVIDIA – if anything)

LaurentPlagne · March 27, 2021, 11:10am

Precisely. It has only recently become possible to easily test each architecture (in the cloud) and choose, with full knowledge, the machine that best suits one’s own needs.

I was just making the remark that production machines can deliver very high computing throughput while being disappointing for development because of sub-optimal single-core performance (boost frequency…).

drarnau · March 27, 2021, 1:09pm

Thanks all for the advice.

Just to clarify. The Monte-Carlo part is not the bottleneck for me. It just uses a lot of memory. The issue is that I work with arrays in which what one has to do to one element of the array depends on what has been done to another. Hence, GPUs seem not a great solution.
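To make the distinction concrete, here is a minimal sketch of the two access patterns, written in Python for brevity (the same logic carries over to Julia). The function names and the toy recurrence are hypothetical and not taken from any actual model; they only illustrate why an update that depends on the previous element cannot simply be handed to thousands of GPU threads, while an element-wise update can.

```python
def independent_update(values):
    """Each output depends only on its own input, so every element
    could be computed at the same time (GPU/SIMD friendly)."""
    return [v * v + 1.0 for v in values]


def dependent_update(values, decay=0.5):
    """Each output depends on the previous output, so the loop has to
    run in order (a loop-carried dependency)."""
    out = []
    previous = 0.0
    for v in values:
        previous = decay * previous + v  # needs the result of the prior step
        out.append(previous)
    return out


data = [1.0, 2.0, 3.0, 4.0]
elementwise = independent_update(data)
sequential = dependent_update(data)
```

Whether a given model can be restructured to expose more independent work (for example across simulated individuals rather than along the dependent dimension) depends on the model, so treat this only as an illustration of the pattern.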

Re HPC on the cloud. My impression is that using Amazon is quite a pain. Moreover, the work process involves a back and forth that requires changes to the model. That is, it’s not about having a toy model that then just needs to be scaled up. One doesn’t know ex-ante whether the scaled-up version will do the job. Otherwise, my life would be pretty easy.

I’m after simpler questions, like: is there a rule of thumb for choosing RAM and CPUs? It seems that having 18 CPUs and 16GB RAM is useless. What should the ratio of RAM to CPUs be?

Thanks!

Elrod · March 27, 2021, 1:26pm

4 GiB per core is more than enough for most of my workloads, but it depends on just how memory hungry your program is. I’ve struggled with code where the GC only rarely freed memory – i.e. it’d free after the function completed, but memory would grow without bound as the program ran – and then not even 8 GiB per core was enough.

So I’d see how much RAM your Monte-Carlo program is consuming now, and scale it up for however many cores you plan to buy vs have now.
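As a rough sketch of that scaling, in Python for brevity: the function name, the example numbers, and the 25% headroom factor below are all hypothetical, and the calculation assumes memory use grows linearly with the number of parallel workers, which may not hold if large arrays are shared between them.

```python
def required_ram_gib(measured_gib, current_workers, planned_workers, headroom=1.25):
    """Scale measured peak RAM use to a planned worker count.

    Assumes (hypothetically) that each worker needs the same amount of
    memory, and adds a safety margin for the OS and garbage collector.
    """
    per_worker = measured_gib / current_workers
    return per_worker * planned_workers * headroom


# Example: 12 GiB observed with 4 workers, planning for 16 workers.
estimate = required_ram_gib(12.0, 4, 16)
print(f"Estimated RAM needed: {estimate:.0f} GiB")
```

For comparison, 16 GB spread over 18 cores is under 1 GB per core, well below the 4 GiB per core mentioned above.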

Tamas_Papp · March 27, 2021, 1:43pm

I think that $5000 is an insane amount of money to spend on a workstation that will mostly be idling.

I would go with a reasonable AMD Ryzen 5000 CPU, max out the RAM in the motherboard (128 GB is easily accessible these days), buy ECC RAM (again, with AMD chipsets, it is now easy and not expensive at all), and spend the remaining $2–3000 on cloud computing time.

$5000 “workstations” in the classic sense do not really make sense for computational economics these days — in fact, a lot of grant agencies and university administrators are getting very difficult to convince to finance this kind of hardware instead of cloud time (optionally, in the university’s own cluster), and they have a point.

drarnau · March 27, 2021, 1:56pm

Thanks, Tamas.

Which cloud service do you recommend? I typically work with models that are solved by brute force (due to kinks in the policy function) either with value function iteration or backwards induction (life-cycle models). I use a version of TikTak (similar to yours) for the calibration/estimation in which I parallelise the local optimisation stage.

cgeoga · March 27, 2021, 1:57pm

I’m actually thinking about a similar question myself, although with a much lower budget. My current thinking, with the help of somebody who knows much more about this, is that while AMD CPUs are very exciting and I’d much rather buy one of those, a new Intel CPU might also be pretty appealing because of the integrated graphics. Personally, I don’t need a GPU right away, so I might buy a new Intel Rocket Lake CPU, which has AVX-512 (@Elrod approved, if I’m not mistaken) and good enough integrated graphics that I can still use it for normal stuff effectively. Once the GPU market settles down a bit, I’ll buy a good GPU for a fraction of the price that they sell for now—as in, at MSRP. And so long as you’re a bit aggressive about making sure you’re utilizing AVX, performance is probably going to be pretty comparable to what you’d get with a cooler AMD CPU.

Tamas_Papp · March 27, 2021, 4:48pm

Our institute has computing infrastructure and I have access to various other clusters via coauthors so I am not really familiar with the commercial options. Also, these days my Julia code is fast enough to run on a home server for initial exploration, and then I keep fiddling with it and it never gets to the cluster.

However, I have heard colleagues say good things about “off-peak” options; almost all providers have them these days. Explore and go with whatever is cheapest/easiest for you. The whole point is that it is easy to switch.

That said, for models that are solved by brute force (due to kinks in the policy function), I would try to think hard about a continuous parameterization with some trick (disregard if you already considered this and found it impossible).

Tamas_Papp · March 27, 2021, 4:52pm

I am not sure this would be a concern for this kind of machine unless you have a shortage of slots. Basic video cards go for about 40 EUR, or 5–10 EUR used (they will not run high-spec games, but will be better than integrated graphics anyway).

RoyiAvital · March 27, 2021, 5:00pm

I’d wait a few months and buy a Zen 3 based Threadripper with 16 / 24 cores and 64 / 128 GB of memory.
It will be as fast as can possibly be achieved within this very generous budget.

Oscar_Smith · March 27, 2021, 5:15pm

IMO Threadripper probably isn’t worth it unless you are going to 32 cores. Otherwise the increase in platform costs is probably not worth it.

RoyiAvital · March 27, 2021, 10:42pm

That’s not accurate as the main advantage of the Threadripper isn’t the number of cores but having Quad Channel memory.

If you want to fully utilize the cores, even 8 and above, you need the appropriate memory bandwidth. Threadrippers have twice the memory bandwidth of Ryzens.
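A back-of-the-envelope sketch of where the factor of two comes from, in Python for brevity. Theoretical peak bandwidth is channels × bus width × transfer rate, with a 64-bit (8-byte) bus per DDR4 channel. The DDR4-3200 speed and the 16-core count are assumptions chosen for illustration, and real sustained bandwidth will be lower than these peak figures.

```python
def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s.

    channels             -- number of memory channels (2 = dual, 4 = quad)
    mega_transfers_per_s -- e.g. 3200 for DDR4-3200 (assumed speed)
    bus_bytes            -- 8 bytes for a 64-bit DDR4 channel
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000.0


cores = 16  # assumed core count, for the per-core comparison only
for label, channels in (("dual channel", 2), ("quad channel", 4)):
    total = peak_bandwidth_gb_s(channels, 3200)
    print(f"{label}: {total:.1f} GB/s total, {total / cores:.1f} GB/s per core")
```

At the same memory speed, the quad-channel figure is exactly double the dual-channel one; whether a given workload actually needs that bandwidth is a separate question.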

jzr · March 27, 2021, 11:10pm

It would be a useful resource to have utilization measurements for some common tasks. If I were deciding whether to invest in Threadripper, I’d want to know if my application would saturate two channels or if the bottlenecks are elsewhere.

RoyiAvital · March 28, 2021, 6:48pm

All BLAS operations on modern cores with regular memory speed will be saturated by 6-8 cores (i.e. bandwidth-limited) on a 2-channel setup.
You can get around this by buying high-quality, high-speed memory (~3800 on AMD and ~4200 and above on Intel), which gains some headroom of 20–30% (i.e. you can do 10 cores with high-speed memory).

Anything on top of that will require Quad Channel memory.

Think about the main difference between CPUs and GPUs: it’s the memory system. Once you want to work like a GPU (many threads, SIMD) you need memory bandwidth.

With a budget of $5000, get a quad-channel system.

Oscar_Smith · March 28, 2021, 8:10pm

Lots of Monte Carlo stuff is very processing-dependent and not that RAM-dependent though (depends a lot on the specifics).
