How to efficiently do a matrix multiplication?

What exactly are you timing?

As @sgaure points out, R is not defined within myAllocTest.
I assume you meant to add R = similar(A) inside the function, but even with that change I couldn’t reproduce your timings.

using BenchmarkTools

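function myAllocTest()
    p = rand(50,1);   # unused: p is overwritten inside the loop
    A = rand(50,50);
    # similar(A) returns uninitialized arrays, so W, D and R hold arbitrary values
    W = similar(A);
    D = similar(A);
    R = similar(A);   # added, since R was not defined in the original function
    m = 2.0

    for i=1:50
        p = D[i:end, i]   # slicing copies the column
        D[i:end, :] = D[i:end, :] - (m*p)*(p'*R[i:end,:])
        W[:, i:end] = W[:, i:end] - (W[:,i:end]*p)*(m*p)';
    end
end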


@btime myAllocTest() # 1.003 ms (839 allocations: 4.11 MiB)

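Part of that time goes to allocating the matrices up front, which I don’t think you want to measure. You could separate this into two functions: one that only does the work on preallocated matrices, and one that sets them up and benchmarks the first, interpolating the arguments with $.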

using BenchmarkTools

function mytest!(W,D,R,m)
    for i = 1:50
        p = D[i:end, i]
        D[i:end, :] .-= (m*p) * (p' * R[i:end,:])
        W[:, i:end] .-= (W[:, i:end] * p) * (m*p)'
    end
end

function bench()
    W, D, R = (rand(50,50) for _ in 1:3)
    m = 2.0
    @btime mytest!($W, $D, $R, $m)
end

bench() # 711.449 μs (710 allocations: 3.05 MiB)
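
Most of the remaining allocations come from the slices: D[i:end, :], R[i:end, :] and W[:, i:end] each make a copy on the right-hand side. One thing you could try is wrapping those lines in @views. This is only a sketch and I haven't timed it. The names mytest_copy! and mytest_views! are made up for this example, and I replaced the hard-coded 50 with size(D, 1) so it can be checked on small inputs.

```julia
# Reference version: same loop as mytest! above, with the loop bound
# taken from the matrix size (an assumption for this sketch).
function mytest_copy!(W, D, R, m)
    for i in 1:size(D, 1)
        p = D[i:end, i]
        D[i:end, :] .-= (m * p) * (p' * R[i:end, :])
        W[:, i:end] .-= (W[:, i:end] * p) * (m * p)'
    end
    return nothing
end

# Same computation, but the slices on the right-hand side are views
# instead of copies.
function mytest_views!(W, D, R, m)
    for i in 1:size(D, 1)
        p = D[i:end, i]   # kept as a copy so that p never aliases D
        @views D[i:end, :] .-= (m * p) * (p' * R[i:end, :])
        @views W[:, i:end] .-= (W[:, i:end] * p) * (m * p)'
    end
    return nothing
end
```

The matrix products still allocate their temporaries, so this will not bring the allocation count to zero. You can time it the same way as above, with @btime mytest_views!($W, $D, $R, $m) inside bench().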