As far as I understand, the current go-to way to get multithreaded sparse matrix operations is to call MKL through MKLSparse.jl.
Now, imagine I have a machine with several NUMA nodes and want to run multithreaded sparse operations on each node in parallel. To do that, I have a set of Julia workers, each started on its own NUMA node, and each worker uses MKLSparse.jl to run the operations. How do I control the affinity of the threads MKL uses, so that MKL called by the worker on the i-th node only uses threads on that same node? Could you also remind me how to control the number of MKL threads? (That is done with an environment variable, right?)
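To make the setup concrete, here is a rough, untested sketch of what I have in mind. It assumes that MKL runs on the Intel OpenMP runtime, so that `MKL_NUM_THREADS` and `KMP_AFFINITY` are read when the library loads, and that the logical CPUs of each NUMA node are numbered contiguously (check with `lscpu` or `numactl --hardware`). `NODES`, `CORES_PER_NODE` and `mkl_env` are hypothetical names made up for illustration. Whether this is the right or recommended approach is exactly what I am asking.

```julia
using Distributed

# Hypothetical machine layout: 2 NUMA nodes with 8 cores each (adjust as needed).
const NODES = 2
const CORES_PER_NODE = 8

# Build the environment for the worker meant to live on NUMA node `node` (0-based).
# Assumption: CPUs node*cores ... node*cores + cores - 1 belong to that node.
function mkl_env(node; cores = CORES_PER_NODE)
    first_cpu = node * cores
    last_cpu = first_cpu + cores - 1
    return [
        "MKL_NUM_THREADS" => string(cores),   # number of MKL threads per worker
        "MKL_DYNAMIC" => "FALSE",             # ask MKL not to change the thread count itself
        "KMP_AFFINITY" => "granularity=fine,proclist=[$(first_cpu)-$(last_cpu)],explicit",
    ]
end

# Start one local worker per NUMA node; the variables are set at process launch,
# i.e. before the worker has a chance to load MKL.
for node in 0:NODES-1
    addprocs(1; env = mkl_env(node))
end
```

Note that this only constrains the OpenMP threads MKL creates. The Julia worker process itself would still need to be bound to its node separately, for example by launching it under `numactl`.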