ForwardDiff with a function using broadcasting

I am trying to use ForwardDiff to check some analytic gradients for an MSE loss with a logistic hypothesis function. Looking at the loss for a single data point, we have

mse(x) = (1/2)*((y - logistic.(transpose(x)*w))^2)

where x and w are dimensionally consistent arrays. When I call

 ForwardDiff.gradient(mse,X)

with w and y defined globally and x passed in, I get values very different from the gradients I had calculated by hand. Is this the correct use of ForwardDiff?

Please provide an MWE.
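
For reference, here is a rough sketch of what such an MWE might look like. It keeps the loss exactly as written in the question and compares ForwardDiff against the chain-rule gradient. Everything else is an assumption, since the post does not show it: the definition of `logistic`, the values of `w`, `y` and `x`, and the helper name `mse_grad`.

```julia
using ForwardDiff

# Assumed definition: the post does not say where `logistic` comes from.
logistic(z) = 1 / (1 + exp(-z))

# Hypothetical data: weights w, scalar target y, one data point x (all vectors/scalars).
w = [0.5, -1.0, 2.0]
y = 1.0
x = [1.0, 2.0, 3.0]

# Loss from the question. For vectors x and w, transpose(x)*w is a scalar,
# so the broadcast logistic.(...) also returns a scalar.
mse(x) = (1/2)*((y - logistic.(transpose(x)*w))^2)

# Hand-derived gradient with respect to x (hypothetical helper):
# d/dx [ 1/2 (y - p)^2 ] = -(y - p) * p * (1 - p) * w, with p = logistic(x'w).
function mse_grad(x)
    p = logistic(transpose(x)*w)
    return -(y - p) * p * (1 - p) * w
end

g_ad = ForwardDiff.gradient(mse, x)
g_analytic = mse_grad(x)

# Compare the automatic and analytic gradients.
println(g_ad ≈ g_analytic)
```

Under these assumptions the two gradients should agree, so the call pattern itself looks fine when `x` is a single vector. Note that the call in the question passes `X` rather than `x`. If that is a full data matrix, `transpose(X)*w` is a vector and the scalar loss above no longer applies, but that is only a guess without seeing the actual data.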