Parallel merge in Cilk+ vs Julia

As my first Julia program, I had some fun debugging a Julia implementation of a parallel merge, which I had already developed in Cilk+. I thought I’d make my first post by pointing the following out to warn others developing divide and conquer algorithms with @spawn.

Here’s my Cilk+ function:

// parallel merge of sorted sub-arrays i ∈ [low1..up1) and i ∈ [low2..up2) of a into b 
// starting at index start
void parMerge(int *a, int low1, int up1, int low2, int up2, int *b, int start){
	int k1 = up1 - low1;
	int k2 = up2 - low2;
	int mid1, mid2;
// Sequential cutoff
	if(k1+k2 < 100000)
                // usual sequential non-recursive merge
		sequentialMerge(a, low1, up1, low2, up2, b, start);
		if(k1 >= k2){
			mid1 = (low1+up1-1)/2;
			mid2 = binarySearch(a, low2, up2, mid1);
		} else{
			mid2 = (low2+up2-1)/2;
			mid1 = binarySearch(a, low1, up1, mid2)-1;
		cilk_spawn parMerge(a, low1, mid1+1, low2, mid2, b, start);
		start = start + mid1 - low1 + 1 + mid2 - low2;
		parMerge(a, mid1+1, up1, mid2, up2, b, start);

Note that I’ve updated start after the first recursive call to parMerge, which works fine since the value of start has already been used in that recursive call.

Here’s my Julia version, modified to deal with 1-based indexing:

# Merge a[lo1:hi1] and a[lo2:hi2] into b starting at start
function parMerge!(b, a, lo1, hi1, lo2, hi2, start)
	k1 = hi1 - lo1 + 1
	k2 = hi2 - lo2 + 1
	mid1, mid2 = 0, 0
	# sequential cutoff
	if k1+k2 < 100000
		sequentialMerge!(b, a, lo1, hi1, lo2, hi2, start)
	if k1 >= k2
		mid1 = (lo1+hi1)>>> 1
		mid2 = binarySearch(a, lo2, hi2, mid1)
		mid2 = (lo2+hi2)>>>1
		mid1 = binarySearch(a, lo1, hi1, mid2)-1
		mid2 += 1
	first = @spawn parMerge!(b, a, lo1, mid1, lo2, mid2-1, start)
	parMerge!(b, a, mid1+1, hi1, mid2, hi2, start + mid1 - lo1 + 1 + mid2 - lo2)

Note that here I didn’t update start before the second recursive call, but instead put the expression into the last parameter.

I had initially updated start before the second recursive call (as in the Cilk+ program). What happened was that the first recursive call used the value of start that was updated in the following line, leading to out of bound array accesses. From my limited understanding of Julia, I take it that this is because the expression in the @spawn macro is “escaped” (esc), which results in variables referring back to the scope where the macro is called.

@spawn builds a closure for the thunk which it passes to the task. This can indeed make variable references tricky.