I had a very similar question, which got a lot of nice answers:
Simple Mat-Vec multiply (understanding performance, without the bugs)
My favorite by far was to use @tullio to avoid writing loops at all and just use Einstein index notation:
return @tullio x[i]*y[i]