using LinearAlgebra

function lu_nopivot(A)
    n = size(A, 1)
    L = Matrix{eltype(A)}(I, n, n)
    U = copy(A)
    for k = 1:n
        L[k+1:n,k] = U[k+1:n,k] / U[k,k]
        U[k+1:n,:] = U[k+1:n,:] - L[k+1:n,k]*U[k,:]
    end
    return L, U
end

But calling the function as L, U = lu_nopivot(A) gives the error MethodError: no method matching *(::Vector{Float64}, ::Vector{Float64}) on the line L[k+1:n,k]*U[k,:]. I tried the MATLAB version and it worked fine. What could be the reason this fails?

The things you are multiplying aren't matrices; based on the error, you can't use * on two Vector{Float64}s. You might also be running an old version of the code; what happens if you restart your REPL?

julia> using LinearAlgebra
julia> x = [1.0; 2.0; 3.0]; y = [0.0; 1.0; 2.0];
julia> x*y
ERROR: MethodError: no method matching *(::Vector{Float64}, ::Vector{Float64})
Closest candidates are:
*(::Any, ::Any, ::Any, ::Any...) at ~/packages/julia-1.7.0/share/julia/base/operators.jl:655
*(::StridedMatrix{T}, ::StridedVector{S}) where {T<:Union{Float32, Float64, ComplexF32, ComplexF64}, S<:Real} at ~/packages/julia-1.7.0/share/julia/stdlib/v1.7/LinearAlgebra/src/matmul.jl:44
*(::StridedVecOrMat, ::Adjoint{<:Any, <:LinearAlgebra.LQPackedQ}) at ~/packages/julia-1.7.0/share/julia/stdlib/v1.7/LinearAlgebra/src/lq.jl:266
...
Stacktrace:
[1] top-level scope
@ REPL[15]:1
julia> dot(x,y)
8.0

x*y works in Matlab because Matlab doesn't distinguish scalars, vectors, and matrices (e.g. a Matlab scalar is really a 1 x 1 complex matrix). Julia is more particular about types in order to achieve efficiency and generality.
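In Julia you say which product you mean explicitly: dot(x, y) (or x' * y) for the inner product, and x * y' for the outer product. Continuing the session above:

julia> x' * y
8.0

julia> x * y'
3×3 Matrix{Float64}:
 0.0  1.0  2.0
 0.0  2.0  4.0
 0.0  3.0  6.0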

I think the confusion here is that, even though this is selecting the k-th row of U, the result of a 1d slice in Julia is always treated as a "column vector" (i.e. a 1d slice is a 1d array, no matter what dimension is sliced).
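So one minimal fix for the failing line (a sketch; other spellings work too) is to keep the pivot row two-dimensional, either by slicing with k:k or by transposing the 1d slice into a row vector:

U[k+1:n,:] = U[k+1:n,:] - L[k+1:n,k] * U[k:k,:]           # (n-k)x1 times 1xn: a rank-1 update
# or, equivalently:
U[k+1:n,:] = U[k+1:n,:] - L[k+1:n,k] * transpose(U[k,:])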

I would again like to point out that lu(A, NoPivot()) will be way, way faster.
For a 1000x1000 matrix, lu_nopivot takes 2.4 seconds, while lu(A, NoPivot()) takes 185 ms (about 13x faster).

Also, lu allocates 7 MB vs 11 GB of RAM for lu_nopivot (over 1000x less), since every slice like U[k+1:n,:] in the hand-written loop allocates a fresh copy.
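For anyone who wants to reproduce this kind of comparison, a sketch along these lines should work (assuming the BenchmarkTools package is installed and lu_nopivot uses the fixed update line from above; exact numbers depend on the machine):

using LinearAlgebra, BenchmarkTools

A = randn(1000, 1000)
@btime lu_nopivot($A)        # hand-written loop: allocates a copy per slice
@btime lu($A, NoPivot())     # stdlib routine, no pivoting, optimized kernels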

Definitely, but anyone who is doing LU without pivoting (which is numerically unstable) doesn't care about performance anyway; it's only used for learning exercises.
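A tiny illustration of that instability (again using lu_nopivot with the fixed update line from above): with a pivot of 1e-20 the multiplier is 1e20, and computing 1 - 1e20 rounds away the original entry, so L*U no longer reproduces A:

julia> A = [1e-20 1.0; 1.0 1.0];

julia> L, U = lu_nopivot(A);

julia> L * U   # the (2,2) entry should be 1.0 but comes back as 0.0
2×2 Matrix{Float64}:
 1.0e-20  1.0
 1.0      0.0

julia> F = lu(A); F.L * F.U   # with partial pivoting: exact, up to the row swap
2×2 Matrix{Float64}:
 1.0      1.0
 1.0e-20  1.0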