So you’re not wrapping it in a function? That’s the first thing the performance tips advice against
After putting it in a function, some cheap performance gains could be achieved by sprinkling @view in front of some of the alpah[t, :] = ... calls

It is in a function thats not the point I’m just trying to optimize the second loop. I was just testing the timing of each in REPL. What does @view do?

help?> @view
@view A[inds...]
Creates a SubArray from an indexing expression. This can only be applied
directly to a reference expression (e.g. @view A[1,2:end]), and should not
be used as the target of an assignment (e.g. @view(A[1,2:end]) = ...). See
also @views to switch an entire block of code to use views for slicing.
julia> A = [1 2; 3 4]
2×2 Array{Int64,2}:
1 2
3 4
julia> b = @view A[:, 1]
2-element SubArray{Int64,1,Array{Int64,2},Tuple{Base.Slice{Base.OneTo{Int64}},Int64},true}:
1
3
julia> fill!(b, 0)
2-element SubArray{Int64,1,Array{Int64,2},Tuple{Base.Slice{Base.OneTo{Int64}},Int64},true}:
0
0
julia> A
2×2 Array{Int64,2}:
0 2
0 4

Learned something here as well: it should have been @views not @view

If you give code that people can copy paste and experiment with, you are likely to get more answers. Having to figure out all the arguments, and dimensions is too painful and running code in the head can be missleading.

If the dimensions used here is representative for your problem you could consider StaticArrays with something like:

T = 1000000
A = @SMatrix [.3 .7; .2 .8]
p = @SVector [.8, .2]
beta = fill(zeros(SVector{2, Float64}), T)
c = zeros(T)
B = copy(beta)
using StaticArrays
function f(beta, c, A, B, T)
beta[T] = c[T] * ones(eltype(beta))
@inbounds for t in (T-1):-1:1
beta[t] = beta[t+1] .* (A * B[t+1]) * c[t]
end
end

which runs in 0.007 seconds. Not sure the translation is 100% correct but it should do the same amount of work.

Use them when you have low dimensional arrays (like coordinates in space). They are awesome and will give tremendous speed increase for dimensions like in your problem (make sure the code is type stable though).

@kristoffer.carlsson thanks for the help. It seemed like I was getting the wrong results using my loop and so I wrote it out in straight for loops (rather than using matrix multiplication) can you explain why the matrix method is resulting in increasing errors?

## Backward recursion
beta = zeros(T, N)
beta[T,: ] = c[T]
@inbounds for t in (T-1):-1:1
beta[t,: ] .= (Diagonal(beta[t+1,: ]) * (A * B[t+1,: ])) .* c[t]
end

the following code should be equivalent but doesn’t accumulate errors

beta = zeros(T, N)
beta[T,: ] = c[T]
for t in (T-1):-1:1
for i in 1:N
for j in 1:N
beta[t,i] += A[i,j] * B[t+1, j] * beta[t+1, j]
end
end
beta[t,: ] *= c[t]
end

Please provide a working example that can be copy/pasted into Julia. The post from @kristoffer.carlsson is a good example. With Kristoffer’s example, I can copy and paste the section of code, and it runs (after moving the using StaticArrays line to the top).

using Distributions
n=1000000
qidx = rand(1:100, n)
x = rand([0,1], n)
theta = randn()
gamma=randn(2)
d = randn(100)
A = [.3 .7; .2 .8]
p=[.8, .2]
N = length(p)
M = length(gamma)
T = length(x)
alpha = zeros(T, N)
beta = zeros(T, N)
c = zeros(T)
B = zeros(T, N)
@inbounds for t in 1:T
g = cdf(Logistic(d[qidx[t]] - gamma[1]), theta)
s = cdf(Logistic(theta - gamma[2]), d[qidx[t]])
B[t,: ] = x[t] == 1 ? [1.0 - s, g] : [s, 1.0 - g]
end
## Forward recursion
alpha[1,: ] = p .* B[1,: ]
c[1] = 1.0 / sum(alpha[1,: ])
alpha[1,: ] *= c[1]
@inbounds for t in 2:T
alpha[t,: ] = (alpha[t-1,: ]' * A) * diagm(B[t,: ])
c[t] = 1.0 / sum(alpha[t,: ])
alpha[t,: ] *= c[t]
end
log_likelihood = -sum(log.(c))

Now the following two loops should be equivalent from a mathematical perspective but I am getting different results

## Backward recursion
beta = zeros(T, N)
beta[T,: ] = c[T]
@inbounds for t in (T-1):-1:1
beta[t,: ] .= (Diagonal(beta[t+1,: ]) * (A * B[t+1,: ])) .* c[t]
end

vs just using for loops

beta = zeros(T, N)
beta[T,: ] = c[T]
for t in (T-1):-1:1
for i in 1:N
for j in 1:N
beta[t,i] += A[i,j] * B[t+1, j] * beta[t+1, j]
end
end
beta[t,: ] *= c[t]
end

You seem to misunderstand minimum here. Minimum means strip away everything that is not relevant to the problem, every single line should have a purpose for the question you are asking.

This is a minimum example:

T = 3
c = rand(T)
B = rand(T, 2)
A = rand(2, 2)
function f1(c, T, A, B)
beta = zeros(T, N)
beta[T,: ] = c[T]
for t in (T-1):-1:1
for i in 1:N
for j in 1:N
beta[t,i] += A[i,j] * B[t+1, j] * beta[t+1, j]
end
end
beta[t,: ] *= c[t]
end
return beta
end
function f2(c, T, A, B)
beta = zeros(T, N)
beta[T,: ] = c[T]
for t in (T-1):-1:1
beta[t,: ] .= (Diagonal(beta[t+1,: ]) * (A * B[t+1,: ])) .* c[t]
end
return beta
end
r1 = f1(c, T, A, B)
r2 = f2(c, T, A, B)

A = rand(2,2); b = rand(2); β = rand(2);
# 1
beta1 = β .* A * b
# 2
beta2 = zeros(2);
for i in 1:N
for j in 1:N
beta2[i] += β[j] * A[i,j] * b[j]
end
end
beta2

They are not. The β[j] is wrong and should be β[i].
The corresponding change in your code is beta[t+1, j] → beta[t+1, i]

Ah thanks @kristoffer.carlsson ! What would the corresponding change be to the matrix notation if I actually wanted the result as is written with teh for loops.