Parallelising a broadcast of an allocation-heavy function

I would like to parallelise a broadcast of a complicated function on my machine.

Being new to parallel computing, I'm getting swamped and lost in all the posts, comments, and open issues on it. My context is a little more involved than the ones in the tutorials, and the documentation for the various packages and procedures isn't comprehensive or novice-proof enough for me to know whether I can just run my code in parallel as is.

My context is as follows. I have a function `f(x, y)` whose definition involves root-finding, interpolation, and a lot of other machinery. I would like to compute

```julia
x = LinRange(0.0, 2e3, Nx)
y = LinRange(0.0, 5e2, Ny)
f_grid = @time [f(x_, y_) for y_ in y, x_ in x]
heatmap(x, y, f_grid)
```

where the comprehension takes a long time given the root-finding, etc.

```
Nx = 11; Ny = 9;
52.993936 seconds (173.35 M allocations: 4.156 GiB, 2.45% gc time)
```

```
Nx = 31; Ny = 25;
466.169140 seconds (1.39 G allocations: 33.223 GiB, 2.28% gc time)
```

Mind you, my target is `Nx = 1001; Ny = 801;`, which should suffice for my plotting purposes.
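To put that target in perspective, here is a rough back-of-the-envelope extrapolation from the two timings above. It assumes the cost per grid point stays roughly constant as the grid grows, which I have not verified; the variable names are just for illustration.

```julia
# Timings reported above, as (Nx, Ny, seconds).
runs = [(11, 9, 52.993936), (31, 25, 466.169140)]

# Seconds per grid point for each run (assumes cost scales linearly with points).
per_point = [t / (nx * ny) for (nx, ny, t) in runs]

# Target grid size.
n_target = 1001 * 801

# Serial runtime estimates in days, one per measured run.
est_days = [p * n_target / 86_400 for p in per_point]

println("seconds per point: ", per_point)
println("target points:     ", n_target)
println("estimated days:    ", est_days)
```

Under that assumption the serial run would take somewhere around five to six days, which is why I am looking at parallelism at all.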

As per my title, I could alternatively compute the same grid with a broadcast:

```julia
f_grid = @time f.(x', y)
```
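As a sanity check that the comprehension and the broadcast really produce the same matrix (rows indexed by `y`, columns by `x`), here is a small sketch. `f_demo` is a cheap, hypothetical stand-in for my actual `f`, which is far too slow to use for this.

```julia
# Hypothetical cheap stand-in for the real f(x, y).
f_demo(x, y) = sin(x) + y^2

x = LinRange(0.0, 2e3, 11)
y = LinRange(0.0, 5e2, 9)

# Comprehension: the first iterator (y_) runs along the first dimension (rows).
grid_comp = [f_demo(x_, y_) for y_ in y, x_ in x]

# Broadcast: x' is a 1×Nx row, y is a length-Ny column, so the result is Ny×Nx.
grid_bcast = f_demo.(x', y)

println(size(grid_comp))          # dimensions are (Ny, Nx)
println(grid_comp == grid_bcast)  # same entries in the same layout
```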

Questions:

1. Should I first go back to my code and focus on squeezing out as many allocations as I can? (As prompted by this comment.)
• If so, feel free to explore the flame graph produced by ProfileView.jl here in my repo. (I have no idea whether it gives any insights.)
2. If not, or once Q1 is done:
• Should I run this type of code (with root-finding, interpolation, automatic differentiation, summations, and loops in loops in loops) on the CPU or on my NVIDIA GPU?
• Will it be safe to just run Julia with N cores and then use Strided.jl? (An elegant design imo.)
• I have two CPU cores and 8 logical cores. Should I use N = 8 or N = 2? This post doesn't specify exactly how the author changed the number of workers. (What is the difference between workers and threads? A Google search leaves it somewhat unclear to me.)
• Do I have to use `@everywhere` on all the functions that my function `f` depends on? (Prompted by this comment.)
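To make the last two questions concrete, here is a minimal sketch of the two approaches as I currently understand them. It is not a recommendation: `f_demo` and `threaded_grid` are hypothetical names, `f_demo` is a cheap stand-in for my real `f`, and the threaded version assumes `f` returns a `Float64` and is safe to call from several threads at once. As far as I can tell, threads share memory inside one Julia process (started with e.g. `julia --threads 8`), whereas workers are separate processes, which is why definitions have to be sent to them with `@everywhere`.

```julia
using Base.Threads
using Distributed

x = LinRange(0.0, 2e3, 11)
y = LinRange(0.0, 5e2, 9)

# --- Threads: one process, shared memory ---------------------------------
f_demo(x, y) = sin(x) + y^2   # hypothetical stand-in for the real f

function threaded_grid(f, x, y)
    out = Matrix{Float64}(undef, length(y), length(x))
    # Each thread fills whole columns, so no two threads write the same entry.
    @threads for j in eachindex(x)
        for i in eachindex(y)
            out[i, j] = f(x[j], y[i])
        end
    end
    return out
end

f_grid_threads = threaded_grid(f_demo, x, y)
println("threads available: ", nthreads())

# --- Workers: separate processes, nothing shared --------------------------
addprocs(2)                               # the count of 2 is an arbitrary choice
@everywhere f_demo(x, y) = sin(x) + y^2   # workers need their own definition

points = [(x_, y_) for y_ in y, x_ in x]
vals = pmap(p -> f_demo(p...), vec(points))   # one task per grid point
f_grid_workers = reshape(vals, size(points))

println("workers available: ", nworkers())
rmprocs(workers())                        # shut the extra processes down again
```

If this picture is right, the answer to the `@everywhere` question would be "only when using workers, and then for everything `f` calls", but I would appreciate confirmation.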

I think that’s enough preliminary questions to present my confusion.