I have been trying to use the Flux.jl library for simple machine learning. In my case, each data point is an array of some fixed length greater than 1000, and I haven't been able to convert the following basic gradient-descent algorithm, with one hidden layer of 100 neurons, to Flux.

```
using LinearAlgebra   # for norm
using Flux: sigmoid   # or define sigmoid(x) = 1 / (1 + exp(-x))

n = 100                          # hidden-layer size
x1 = randn(10_000)               # one scalar input per sample (randn(length(10000)) gave a single value)
Sin_ = sin.(x1)                  # target values, one per sample
learning_rate = 0.1
W1 = randn(n, 1)                 # input -> hidden weights
W2 = randn(1, n)                 # hidden -> output weights
X1_W1 = randn(n, length(x1))     # preallocated buffers
X1_W1_s = randn(n, length(x1))
X2_W2 = randn(1, length(x1))
dw1 = randn(n, 1)
dw2 = randn(1, n)
for i in 1:10
    X1_W1 .= W1 * x1'            # feed-forward: hidden pre-activation
    X1_W1_s .= sigmoid.(X1_W1)   # hidden activation
    X2_W2 .= W2 * X1_W1_s        # output layer
    err = norm(X2_W2' .- Sin_) / sqrt(length(X2_W2))   # RMS error (X2_W2 transposed to match Sin_)
    println(err)
    dw2 .= (0.5 / length(X2_W2)) .* (X1_W1_s * (X2_W2' .- Sin_))'
    W2 .= W2 .- learning_rate .* dw2
    dw1 .= (0.5 / length(X2_W2)) .* (X1_W1_s * (X2_W2' .- Sin_))' * X1_W1_s .* (1 .- X1_W1_s) * x1
    W1 .= W1 .- learning_rate .* dw1
    W1 .= W1 / norm(W1)          # renormalise the weights
    W2 .= W2 / norm(W2)
end
```

Could someone be generous enough to help convert the above to the Flux paradigm? I would deeply appreciate it. Thank you.
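For reference, here is my rough guess at what the Flux version might look like (untested; it assumes the newer explicit-gradient API with `Flux.setup`, i.e. Flux ≥ 0.14, and uses `mse` in place of my hand-rolled error), but I'm not sure it's idiomatic or equivalent:

```julia
using Flux

x = randn(Float32, 1, 10_000)   # 1 feature × 10_000 samples (Flux expects features in rows)
y = sin.(x)                      # targets, same shape as the model output

# One hidden layer of 100 sigmoid neurons, linear output
model = Chain(Dense(1 => 100, sigmoid), Dense(100 => 1))

opt_state = Flux.setup(Descent(0.1), model)   # plain gradient descent, lr = 0.1
loss(m, x, y) = Flux.mse(m(x), y)

for epoch in 1:10
    grads = Flux.gradient(m -> loss(m, x, y), model)
    Flux.update!(opt_state, model, grads[1])
    println(loss(model, x, y))
end
```

I left out the per-step weight renormalisation (`W ./ norm(W)`) from my loop above, since I don't know the right way to express that in Flux.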