Back in 2015/2016, we designed and acquired the “octopus” GPU supercomputer at the Swiss Geocomputing Centre, with 136 consumer electronics GPUs, and we were probably in exactly the same situation as you are now. On the one hand, even with B2B accounts, consumer electronics companies were not very helpful when it came to acquiring large amounts of hardware, in particular the latest GPUs. On the other hand, companies that specialize in building clusters or supercomputers typically provide extensive help in the design stage as well as an extensive warranty on the system as a whole, but this obviously increases the price substantially. With a small or medium-sized academic budget, you likely do not want to pay a large amount for such additional services.
We were lucky to find a company that fit our needs exactly, and which I can thus warmly recommend without hesitation for building your academic cluster/supercomputer: Colfax. This is approximately how it worked for us in 2015/2016:
- we assembled our system on their website with the available parts and submitted it for an offer
- they contacted us
- we told them that component X was not available among the parts to choose from and that we would like to replace component Y with component X
- we had some rather brief discussions on a few specific parts (expert-to-expert discussion; if you want a company that does the design of the cluster for you, then you probably need to approach them differently, or contact another company, and pay the cost of this service)
- they assembled a test node, ran a 24h stress test on it and shipped it to us
- we ran our own tests
- we ordered the remaining compute nodes with the same configuration
- they assembled them, ran a 24h stress test on each node and shipped them
Only the GPUs were consumer electronics components; all other parts are so-called professional parts, and the compute nodes are rack-mountable in our case. Here you can see the details of the octopus supercomputer. The overall price was about the same as it would have been had we bought all the parts from places like Newegg. Note that we got an academic discount. Octopus has been in successful use for scientific multi-GPU computations since 2016.
I hope this helps!