# Variation in computation time for execution of the exact same code

Here’s another strange one. What is the purpose of `nt`? And why create `U` as a 3D array, but then throw away one of the dimensions? Why not just make it a 2D array in the first place? And why make the third dimension have length `NDT`, but only use `1:NDT-4`. Just make it length `NDT-4` in the first place:

``````function stepSource()
U = zeros(Float64, nc, NDT-4) #intermediate voltage variable (starts at t=0)
U[1, 50:end] .= 300_000 #step function at n = 50, change first element to change number of phases
return U  #short four to account for offset in initialization
end
``````
2 Likes

The ‘nt’ variable line is just one I forgot to delete - good catch. Regarding the extra dimension of U: Later in the code (in a line I deleted from this abbreviated block, but shown below). I copy the values of U to another matrix that has to be in three dimensions to play well with other calculations.

``````    #call one of the above functions to define a source type
Vs[:,1,5:NDT] = stepSource()#surgeSource()#zeroSource()
``````

I appreciated the detailed look. This has been a very helpful thread. Looks like I have plenty of homework tonight cleaning up my code per the above suggestions.

Have you tried using the profiler? It can be useful in determining where you spend most of your time. Also `--track-allocations` can be useful.

1 Like

What is the function of the \$ prefix in the function calls?

1 Like

But `stepSource` returns a 2d array, not a 3d one. What I’m saying is that inside `stepSource` you create a 3d array, and then create a 2d slice which you return. The fact that it started with a 3d array has no effect outside the function, except that it causes an extra memory allocation.

2 Likes

That makes sense. My recollection is that that using slices forced me into extra dimensions due to later calculations with other 3D elements. However, I am reworking the code with the StaticArray format you suggested. As an example here is what I have attempted for the three-phase sine source. Each sub-vector stores three voltages (one for each phase of a three-phase voltage source) at a given time step. This seems to output the right numerical data. One item that puzzles me is illustrated in the print statements. The first two make sense. In the third I am trying a quick way to get all the first elements of each vector for later plotting purposes. I could do this with another variable and loop, but is there an easier way?

``````using Plots
using LinearAlgebra
using StaticArrays

function threePhaseSine(θ)
Δt = 2.0e-3; ω = 2.0*π*60.0  #
Aphase(nt) = sqrt(2/3)*cos.(ω * Δt * nt .+ θ)  #nt is time step
Bphase(nt) = sqrt(2/3)*cos.(ω * Δt * nt .- 2*π/3 .+ θ)
Cphase(nt) = sqrt(2/3)*cos.(ω * Δt * nt .+ 2*π/3 .+ θ)
Vsin = [MVector{3,Float64}([Aphase(nt), Bphase(nt), Cphase(nt)]) for nt in 1:10]

return Vsin
end

Vabc = threePhaseSine(0)

display(Vabc)
println("First vector of Vabc = ",Vabc[1])
println("First element of vector 1 of Vabc = ",Vabc[1][1])
println("First element of all vectors of Vabc = ",Vabc[:][1])  #this is unexpected

plot(Vabc[:][1]) #attempt to plot the voltage for Phase A as a function of time (doesn't work)
``````

`Vabs[:]` is just a copy of `Vabs`, so `Vabs[:][1]` is the same as writing `Vabs[1]`, except you make an extra copy.

You probably want `first.(Vabs)`.

Drop the square brackets around the phase functions here, this allocates a normal array and then converts it to an MVector.

In fact, it’s better to drop the `{3,Float64}` since the former is automatically understood from the number of inputs, and the element type is inferred, and will then allow also other types than `Float64`, so:

``````Vsin = [MVector(Aphase(nt), Bphase(nt), Cphase(nt)) for nt in 1:10]
``````

Also consider whether you can use an SVector instead.

BTW: The dots you put inside the `Aphase` etc. functions don’t really make any sense, since `threePhaseSine` only works for scalar input. You can, however, broadcast the outer function over vector input, as `threePhaseSine.(θ)`.

1 Like

Roger on the dots. I see there is no `second.(Vabs)` (i.e. for accessing the second or third elements of each vector). Is there a more general way to access?

Likewise, if I am using MVector or Mmatrix elements (because they have to be updated), how do I index certain blocks and update in place? For example in the code below, Vout is a vector of vector of vectors (?? multiple conductors, space, and time). I want to write a small array of Voltages (trappedChargeVoltage) to each conductor (all nc), each spatial step (all NDZ), but only the first time step (NDT = 1).

``````using StaticArrays
using LinearAlgebra

nc = 3
NDZ = 5
NDT = 10

VV = [MVector{nc,Float64}(zeros(nc)) for _ in 1:NDZ] #initialize vector of vectors to store update voltages at eachs spatial node
Vout = [[MVector{nc,Float64}(zeros(nc)) for _ in 1:NDZ] for _ in 1:NDT] #initialize array to store all output variables
#Vout contains voltages for conductors (nc) at at each spatial step (NDZ), and each time step (NDT)

trappedChargeV = [SVector{3,Float64}(10.0,0.0,-10.0)]
VV .= trappedChargeV
first.(Vout) .= trappedChargeV
display(first.(Vout))
``````

There is `first` and `last`, otherwise use `getindex.(Vabs, n)` to get the nth index.

You can update with an `SVector` too, but then you have to replace the entire subvector, for example:

``````VV = [zero(SVector{nc, Float64}) for _ in 1:NDZ]
# or
VV = Vector{SVector{nc, Float64}}(undef, NDZ)

trappedChargeV = SVector(10.0,0.0,-10.0)
VV .= (trappedChargeV,)
``````

Note that again you create unnecessary temporary arrays with `MVector{nc,Float64}(zeros(nc))` since `zeros(nc)` creates an ordinary array which is converted to a StaticArray. Use unstead `zero(MVector{3,Float64})`, etc.

But here you may as well make it easier on yourself by just writing

``````trappedChargeV = SVector(10.0,0.0,-10.0)  # <- no need for {3, Float64}
VV = fill(trappedChargeV, NDZ)
``````

This will not work:

``````first.(Vout) .= trappedChargeV
``````

The left side creates a temporary array and assigns to that, and it disappears without touching `Vout`.

I would do this:

``````for v in Vout
v[1] = trappedChargeV
end
# or
setindex!.(Vout, (trappedChargeV,), 1)
``````

Notice that this works for `SVector`s.

1 Like

Note that keeping `nc = 3` as a variable is a performance issue.

If you can, write them as literal values. Compare the following three functions. `foo` is yours, `bar` is the same but with literal instead of `nc`, and `baz` uses `SVector`. Also note that the first two don’t do what you (probably) want.

``````using BenchmarkTools, StaticArrays

function foo()
nc = 3
NDZ = 5
NDT = 10

VV = [MVector{nc,Float64}(zeros(nc)) for _ in 1:NDZ]

trappedChargeV = [SVector{3,Float64}(10.0,0.0,-10.0)]
VV .= trappedChargeV

Vout = [[MVector{nc,Float64}(zeros(nc)) for _ in 1:NDZ] for _ in 1:NDT]
first.(Vout) .= trappedChargeV

return Vout, VV
end

function bar()
NDZ = 5
NDT = 10

VV = [MVector{3,Float64}(zeros(3)) for _ in 1:NDZ]

trappedChargeV = [SVector{3,Float64}(10.0,0.0,-10.0)]
VV .= trappedChargeV

Vout = [[MVector{3,Float64}(zeros(3)) for _ in 1:NDZ] for _ in 1:NDT]
first.(Vout) .= trappedChargeV

return Vout, VV
end

function baz()
NDZ = 5
NDT = 10

trappedChargeV = SVector(10.0, 0.0, -10.0)
VV = fill(trappedChargeV, NDZ)

Vout = [[zero(SVector{3, Float64}) for _ in 1:NDZ] for _ in 1:NDT]
setindex!.(Vout, (trappedChargeV,), 1)

return Vout, VV
end
``````

Benchmarks:

``````julia> @btime foo();
74.461 μs (665 allocations: 45.28 KiB)

julia> @btime bar();
3.046 μs (140 allocations: 10.03 KiB)

julia> @btime baz();
642.392 ns (14 allocations: 2.58 KiB)
``````
1 Like

It’s surprising to me that constant propagation didn’t handle the `foo` case…

3 Likes

Is there a good way to make something like this work, or is it best just to go to MVector.

``````vec1 = SVector{3}(rand(3))
for i = 1:4
vec1 .+= SVector{3}(rand(3))
end
``````

Just do `vec1 += `.

3 Likes

Yeah, there’s no need to update in-place. You can think of `StaticArray`s as values instead of containers.

No, it creates a new instance. But it will be super-fast, since it’s a stack-allocated value type.

``````SVector{3}(rand(3))
``````

write

``````rand(SVector{3})
``````

then you will avoid creating an unnecessary temporary array.

Observe:

``````julia> function foo()
vec1 = rand(MVector{3})
for i = 1:4
vec1 .+= rand(MVector{3})
end
return vec1
end

julia> function bar()
vec1 = rand(SVector{3})
for i = 1:4
vec1 += rand(SVector{3})  # <- no dot!
end
return vec1
end

julia> @btime foo();
88.801 ns (5 allocations: 160 bytes)

julia> @btime bar();
74.856 ns (0 allocations: 0 bytes)  # <- no allocations!
``````
2 Likes

Almost done implementing the staticarray approach and will post some figures showing the performance gains. Two more items that are stumping me.

1. I need to perform an elementwise absolute value operation on a result in the form of that shown below. I think I understand why what I have won’t work. But is there a straightforward way to do this?
``````vec1 = [[SVector{3,Float64}(-i,-i,-i) for i in 1:4] for _ in 1:10]
abs.(vec1)
``````
1. I need to save the vec1 static-array output to a file. I normally save matrix results out as a csv file using writedlm, but that seems tedious as it keeps all the brackets. Any suggestions for a convenient way to store out the data? The intent would be to read the data back in for later post processing and plotting. Or is there a convenient way to convert the data back to a non-static array form where manipulating contents is a bit easier.

May be worth pointing out that with a couple modifications, `MArray`s don’t allocate either:

``````julia> @inline inlinerand(::Type{<:MVector{3}}) = MVector(rand(),rand(),rand())
inlinerand (generic function with 1 method)

julia> function foo2()
vec1 = inlinerand(MVector{3})
for i = 1:4
vec1 .+= inlinerand(MVector{3})
end
return SArray(vec1)
end
foo2 (generic function with 1 method)

julia> @btime foo2()
47.758 ns (0 allocations: 0 bytes)
3-element SArray{Tuple{3},Float64,1,3} with indices SOneTo(3):
2.4573027826314195
2.1526819466919456
2.4786087715660896
``````
2 Likes

Yes, this doesn’t work since `vec1` is a vector of vectors of vectors. It’s a bit awkward to perform broadcasting to arbitrarily deep levels, but if you dig around on this forum you might find some nice solutions. Here’s one, I don’t think it’s the best one:

``````myabs(x) = abs(x)
myabs(x::AbstractArray) = myabs.(x)
``````

Then `myabs(vec1)` should work.

BTW FYI, you can make a matrix of `SVector`s, if you want:

``````julia> vec1 = [SVector{3,Float64}(-i,-i,-i) for i in 1:4, j in 1:10]
4×10 Array{SArray{Tuple{3},Float64,1,3},2}:
[-1.0, -1.0, -1.0]  [-1.0, -1.0, -1.0]  [-1.0, -1.0, -1.0]  …  [-1.0, -1.0, -1.0]  [-1.0, -1.0, -1.0]
[-2.0, -2.0, -2.0]  [-2.0, -2.0, -2.0]  [-2.0, -2.0, -2.0]     [-2.0, -2.0, -2.0]  [-2.0, -2.0, -2.0]
[-3.0, -3.0, -3.0]  [-3.0, -3.0, -3.0]  [-3.0, -3.0, -3.0]     [-3.0, -3.0, -3.0]  [-3.0, -3.0, -3.0]
[-4.0, -4.0, -4.0]  [-4.0, -4.0, -4.0]  [-4.0, -4.0, -4.0]     [-4.0, -4.0, -4.0]  [-4.0, -4.0, -4.0]
``````

Or maybe you could define a recursive helper function, like this:

``````rdot(f, x) = f(x)  # recursive dot
rdot(f, x::AbstractArray) = rdot.(f, x)
``````

and then call

``````vec2 = rdot(abs, vec1)
``````

This would work with other functions beside `abs` as well.

1 Like

Results update

It took me a couple of days to re-write the old allocation heavy code per the static array suggestions above, but got it working and here are the results:

Old Code (allocation heavy due to use of base array and slicing approach):

``````455.657741 seconds (585.01 M allocations: 73.176 GiB, 88.12% gc time)
``````

Revised Code (making use of static vector/matricies (StaticArrays.jl))

``````23.023161 seconds (71.79 M allocations: 3.910 GiB, 66.59% gc time)
``````

Between the above improvements and those discussed on a related thread that’s about a 1200X improvement to my original ugly code. Thanks to all who contributed comments and suggestions!

5 Likes