I have way more RAM than VRAM, and the code runs fine on CPU. I thought using CUDA unified memory would let the GPU tap system RAM, but it still OOMs.
I’m asking because I’m getting a new laptop for ML work. Does unified memory mean VRAM is no longer a hard ceiling, and therefore less important as a spec? (I know having the GPU tap system RAM is way slower, but at least it runs.)
What you’re describing sounds more like an integrated GPU sharing physical memory with the CPU. Apple’s Unified Memory Architecture does something similar in hardware. CUDA’s Unified Memory is an unrelated software abstraction that lets GPU and CPU code access data through the same pointer, even though the data is actually migrated between CPU and GPU memory behind the scenes.
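To make the “same pointer on both sides” idea concrete, here’s a minimal toy sketch in CUDA C++ (my own example, not from anyone’s actual code): one buffer allocated with `cudaMallocManaged` is written by the host, updated by a kernel, then read back by the host, all through the one pointer, with the runtime handling migration.

```cuda
// Toy sketch of CUDA Unified Memory: one managed pointer is valid
// on both host and device; the runtime migrates pages on demand.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void add_one(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;  // device writes through the shared pointer
}

int main() {
    const int n = 1 << 20;
    int *data = nullptr;
    cudaMallocManaged(&data, n * sizeof(int));  // managed allocation

    for (int i = 0; i < n; ++i) data[i] = i;    // host writes directly

    add_one<<<(n + 255) / 256, 256>>>(data, n); // kernel uses same pointer
    cudaDeviceSynchronize();                    // wait before host touches it

    printf("data[0] = %d\n", data[0]);          // host reads the result
    cudaFree(data);
    return 0;
}
```

Note this only oversubscribes GPU memory on platforms that support demand paging; whether a managed allocation larger than VRAM works depends on the OS and GPU generation.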
Cool, I didn’t know it could split a single variable. I’m on Windows with CUDA 12 but a really old GPU, so I probably won’t bother debugging. For other folks: you can have CUDA.jl allocate unified buffers by default by adding a LocalPreferences.toml to your environment folder with these lines:
[CUDA]
default_memory = "unified"