Distributed Arrays

Hello, how can I divide DistributedArrays?
A=drand(5,5)
B=drand(5,1)
C=A\B
I'm doing this but still getting an error.

I played around with the DArray struct a bit; what you want is probably the following:

C = A.localpart\B.localpart

Not sure this would be any faster than using Arrays, though.

The backslash division operator \(A, B) farms out to one of several matrix factorizations depending on the properties of its inputs. Unfortunately, distributed matrix factorization is fundamentally difficult due to the many-to-many nature of those factorizations, which imposes a huge network I/O overhead that effectively kills any gains you might achieve from using multiple machines; the processors will spend most of their time just waiting for data from the network. Solving only the local parts won’t work, since it ignores the many-to-many reality of factorizing.
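To make both points concrete, here is a small single-process sketch using plain Arrays rather than DistributedArrays. The matrix sizes, variable names, and the idea of treating the first three rows and columns as one worker's "local part" are assumptions made purely for illustration. It checks which factorization `\` matches for a few input shapes, then compares a local-block solve against the corresponding part of the full solution.

```julia
using LinearAlgebra, Random

Random.seed!(1)                      # fixed seed so the run is repeatable
A = rand(6, 6) + 6I                  # dense, square, well-conditioned
b = rand(6)

x = A \ b                            # general square matrix: solved via LU
@assert x ≈ lu(A) \ b

U = UpperTriangular(A)               # triangular input: solved by back-substitution
@assert U \ b ≈ Matrix(U) \ b

T = rand(10, 3); t = rand(10)        # tall matrix: least-squares solution via QR
@assert T \ t ≈ qr(T) \ t

# Pretend rows and columns 1:3 are the "local part" held by one worker.
x_local = A[1:3, 1:3] \ b[1:3]
println("full solution, first block: ", x[1:3])
println("local-block solve:          ", x_local)
println("local solve matches full solution: ", x_local ≈ x[1:3])
```

The local solve drops the coupling term from the off-diagonal block `A[1:3, 4:6]`, which is why it cannot agree with the full solution unless that block happens to contribute nothing.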

If you’re only using a single machine, the linear algebra library will use a shared-memory routine dispatched across as many threads as you’ve made available with BLAS.set_num_threads, and there’s nothing to be gained by using DistributedArrays:

julia> using LinearAlgebra  # needed for BLAS on Julia 0.7 and later

julia> A = rand(8000, 8000); B = rand(8000);

julia> BLAS.set_num_threads(1)

julia> @time A\B;
  9.291382 seconds (7 allocations: 488.404 MiB)

julia> BLAS.set_num_threads(4)

julia> @time A\B;
  4.121306 seconds (7 allocations: 488.404 MiB, 1.08% gc time)

There may be some iterative solvers that can be effectively parallelized across machines for your problem, but the domain specificity of those requires more information about what you’re actually trying to solve.
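As one illustration of the kind of iterative method meant here, below is a minimal single-process sketch of Jacobi iteration. It is not a recommendation for the original problem: Jacobi only converges for certain matrices (strictly diagonally dominant ones, for example), and the function name `jacobi_blocks`, the block layout, and the test matrix are all assumptions made for the example. The point is that each sweep needs only matrix-vector products on row blocks, so each worker could update its own rows and exchange just the vector `x` instead of pieces of a factorization.

```julia
using LinearAlgebra, Random

# Hypothetical helper: Jacobi iteration with the rows split into blocks.
# Each block stands in for the rows that one worker would own.
function jacobi_blocks(A, b, blocks; tol = 1e-10, maxiter = 10_000)
    x = zeros(length(b))
    d = diag(A)
    for iter in 1:maxiter
        x_new = similar(x)
        for rows in blocks                        # blocks are independent within a sweep
            r = b[rows] - A[rows, :] * x          # residual needs only this block's rows of A
            x_new[rows] = x[rows] + r ./ d[rows]  # Jacobi update for those rows
        end
        if norm(x_new - x) <= tol * norm(x_new)
            return x_new, iter
        end
        x = x_new
    end
    error("Jacobi did not converge in $maxiter iterations")
end

Random.seed!(1)
n = 200
A = rand(n, n) + n * I          # strictly diagonally dominant, so Jacobi converges
b = rand(n)
blocks = [1:100, 101:200]       # two pretend workers

x, iters = jacobi_blocks(A, b, blocks)
println("converged in ", iters, " sweeps; residual norm = ", norm(A * x - b))
```

Whether anything like this pays off across machines still depends on the structure of the actual system, which is why more detail about the problem is needed.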


I am not claiming expertise, but @stillyslalom highlights a good point about parallel programming. Look at how long it takes to set up and move the data for a parallel operation. If that is comparable to the time the computation itself takes, parallelising it will not help.
Or, to put it better: if you are distributing across separate servers, go for the low-hanging fruit of coarse parallelisation.
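A rough sketch of what coarse parallelisation can look like, assuming the workload can be split into many independent systems. That assumption, the function name `solve_one`, and the problem sizes are illustrative and not taken from the original question. Each task builds and solves a whole small system locally with an ordinary `\`, so only a seed goes out and a single number comes back.

```julia
using Distributed

# addprocs(4)  # uncomment to start 4 local workers; without workers pmap runs on the master process

@everywhere using LinearAlgebra, Random

# Hypothetical task: build and solve one independent system entirely on one worker.
@everywhere function solve_one(seed)
    rng = MersenneTwister(seed)
    A = rand(rng, 200, 200) + 200I
    b = rand(rng, 200)
    x = A \ b
    return norm(A * x - b)        # ship back a single number, not the matrix
end

residuals = pmap(solve_one, 1:8)  # one task per seed, spread over the available workers
println("largest residual: ", maximum(residuals))
```

Because the tasks never talk to each other, the network cost is limited to sending the seeds and collecting the results, which is the situation where spreading work over separate servers tends to be worthwhile.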