After fixing the line D[i,p] = ldiv!(lu!(B), Dl) and changing your program to use in-place functions, I’d think further optimizations would be microscopic.
I don’t know if you’ll get any speedup, but in the function H(i,n,X,Y,p,γ,sh), you could move the square root in the assignment of c to the exponents within D(c, _, 2,γ,sh).