# Variation in computation time for execution of the exact same code

1 Like

But `stepSource` returns a 2d array, not a 3d one. What Iām saying is that inside `stepSource` you create a 3d array, and then create a 2d slice which you return. The fact that it started with a 3d array has no effect outside the function, except that it causes an extra memory allocation.

2 Likes

That makes sense. My recollection is that that using slices forced me into extra dimensions due to later calculations with other 3D elements. However, I am reworking the code with the StaticArray format you suggested. As an example here is what I have attempted for the three-phase sine source. Each sub-vector stores three voltages (one for each phase of a three-phase voltage source) at a given time step. This seems to output the right numerical data. One item that puzzles me is illustrated in the print statements. The first two make sense. In the third I am trying a quick way to get all the first elements of each vector for later plotting purposes. I could do this with another variable and loop, but is there an easier way?

``````using Plots
using LinearAlgebra
using StaticArrays

function threePhaseSine(Īø)
Īt = 2.0e-3; Ļ = 2.0*Ļ*60.0  #
Aphase(nt) = sqrt(2/3)*cos.(Ļ * Īt * nt .+ Īø)  #nt is time step
Bphase(nt) = sqrt(2/3)*cos.(Ļ * Īt * nt .- 2*Ļ/3 .+ Īø)
Cphase(nt) = sqrt(2/3)*cos.(Ļ * Īt * nt .+ 2*Ļ/3 .+ Īø)
Vsin = [MVector{3,Float64}([Aphase(nt), Bphase(nt), Cphase(nt)]) for nt in 1:10]

return Vsin
end

Vabc = threePhaseSine(0)

display(Vabc)
println("First vector of Vabc = ",Vabc[1])
println("First element of vector 1 of Vabc = ",Vabc[1][1])
println("First element of all vectors of Vabc = ",Vabc[:][1])  #this is unexpected

plot(Vabc[:][1]) #attempt to plot the voltage for Phase A as a function of time (doesn't work)
``````

`Vabs[:]` is just a copy of `Vabs`, so `Vabs[:][1]` is the same as writing `Vabs[1]`, except you make an extra copy.

You probably want `first.(Vabs)`.

Drop the square brackets around the phase functions here, this allocates a normal array and then converts it to an MVector.

In fact, itās better to drop the `{3,Float64}` since the former is automatically understood from the number of inputs, and the element type is inferred, and will then allow also other types than `Float64`, so:

``````Vsin = [MVector(Aphase(nt), Bphase(nt), Cphase(nt)) for nt in 1:10]
``````

Also consider whether you can use an SVector instead.

BTW: The dots you put inside the `Aphase` etc. functions donāt really make any sense, since `threePhaseSine` only works for scalar input. You can, however, broadcast the outer function over vector input, as `threePhaseSine.(Īø)`.

1 Like

Roger on the dots. I see there is no `second.(Vabs)` (i.e. for accessing the second or third elements of each vector). Is there a more general way to access?

Likewise, if I am using MVector or Mmatrix elements (because they have to be updated), how do I index certain blocks and update in place? For example in the code below, Vout is a vector of vector of vectors (?? multiple conductors, space, and time). I want to write a small array of Voltages (trappedChargeVoltage) to each conductor (all nc), each spatial step (all NDZ), but only the first time step (NDT = 1).

``````using StaticArrays
using LinearAlgebra

nc = 3
NDZ = 5
NDT = 10

VV = [MVector{nc,Float64}(zeros(nc)) for _ in 1:NDZ] #initialize vector of vectors to store update voltages at eachs spatial node
Vout = [[MVector{nc,Float64}(zeros(nc)) for _ in 1:NDZ] for _ in 1:NDT] #initialize array to store all output variables
#Vout contains voltages for conductors (nc) at at each spatial step (NDZ), and each time step (NDT)

trappedChargeV = [SVector{3,Float64}(10.0,0.0,-10.0)]
VV .= trappedChargeV
first.(Vout) .= trappedChargeV
display(first.(Vout))
``````

There is `first` and `last`, otherwise use `getindex.(Vabs, n)` to get the nth index.

You can update with an `SVector` too, but then you have to replace the entire subvector, for example:

``````VV = [zero(SVector{nc, Float64}) for _ in 1:NDZ]
# or
VV = Vector{SVector{nc, Float64}}(undef, NDZ)

trappedChargeV = SVector(10.0,0.0,-10.0)
VV .= (trappedChargeV,)
``````

Note that again you create unnecessary temporary arrays with `MVector{nc,Float64}(zeros(nc))` since `zeros(nc)` creates an ordinary array which is converted to a StaticArray. Use unstead `zero(MVector{3,Float64})`, etc.

But here you may as well make it easier on yourself by just writing

``````trappedChargeV = SVector(10.0,0.0,-10.0)  # <- no need for {3, Float64}
VV = fill(trappedChargeV, NDZ)
``````

This will not work:

``````first.(Vout) .= trappedChargeV
``````

The left side creates a temporary array and assigns to that, and it disappears without touching `Vout`.

I would do this:

``````for v in Vout
v[1] = trappedChargeV
end
# or
setindex!.(Vout, (trappedChargeV,), 1)
``````

Notice that this works for `SVector`s.

1 Like

Note that keeping `nc = 3` as a variable is a performance issue.

If you can, write them as literal values. Compare the following three functions. `foo` is yours, `bar` is the same but with literal instead of `nc`, and `baz` uses `SVector`. Also note that the first two donāt do what you (probably) want.

``````using BenchmarkTools, StaticArrays

function foo()
nc = 3
NDZ = 5
NDT = 10

VV = [MVector{nc,Float64}(zeros(nc)) for _ in 1:NDZ]

trappedChargeV = [SVector{3,Float64}(10.0,0.0,-10.0)]
VV .= trappedChargeV

Vout = [[MVector{nc,Float64}(zeros(nc)) for _ in 1:NDZ] for _ in 1:NDT]
first.(Vout) .= trappedChargeV

return Vout, VV
end

function bar()
NDZ = 5
NDT = 10

VV = [MVector{3,Float64}(zeros(3)) for _ in 1:NDZ]

trappedChargeV = [SVector{3,Float64}(10.0,0.0,-10.0)]
VV .= trappedChargeV

Vout = [[MVector{3,Float64}(zeros(3)) for _ in 1:NDZ] for _ in 1:NDT]
first.(Vout) .= trappedChargeV

return Vout, VV
end

function baz()
NDZ = 5
NDT = 10

trappedChargeV = SVector(10.0, 0.0, -10.0)
VV = fill(trappedChargeV, NDZ)

Vout = [[zero(SVector{3, Float64}) for _ in 1:NDZ] for _ in 1:NDT]
setindex!.(Vout, (trappedChargeV,), 1)

return Vout, VV
end
``````

Benchmarks:

``````julia> @btime foo();
74.461 Ī¼s (665 allocations: 45.28 KiB)

julia> @btime bar();
3.046 Ī¼s (140 allocations: 10.03 KiB)

julia> @btime baz();
642.392 ns (14 allocations: 2.58 KiB)
``````
1 Like

Itās surprising to me that constant propagation didnāt handle the `foo` caseā¦

3 Likes

Is there a good way to make something like this work, or is it best just to go to MVector.

``````vec1 = SVector{3}(rand(3))
for i = 1:4
vec1 .+= SVector{3}(rand(3))
end
``````

Just do `vec1 += `.

3 Likes

It updates in-place, correct?

Yeah, thereās no need to update in-place. You can think of `StaticArray`s as values instead of containers.

No, it creates a new instance. But it will be super-fast, since itās a stack-allocated value type.

Also, instead of

``````SVector{3}(rand(3))
``````

write

``````rand(SVector{3})
``````

then you will avoid creating an unnecessary temporary array.

Observe:

``````julia> function foo()
vec1 = rand(MVector{3})
for i = 1:4
vec1 .+= rand(MVector{3})
end
return vec1
end

julia> function bar()
vec1 = rand(SVector{3})
for i = 1:4
vec1 += rand(SVector{3})  # <- no dot!
end
return vec1
end

julia> @btime foo();
88.801 ns (5 allocations: 160 bytes)

julia> @btime bar();
74.856 ns (0 allocations: 0 bytes)  # <- no allocations!
``````
2 Likes

Almost done implementing the staticarray approach and will post some figures showing the performance gains. Two more items that are stumping me.

1. I need to perform an elementwise absolute value operation on a result in the form of that shown below. I think I understand why what I have wonāt work. But is there a straightforward way to do this?
``````vec1 = [[SVector{3,Float64}(-i,-i,-i) for i in 1:4] for _ in 1:10]
abs.(vec1)
``````
1. I need to save the vec1 static-array output to a file. I normally save matrix results out as a csv file using writedlm, but that seems tedious as it keeps all the brackets. Any suggestions for a convenient way to store out the data? The intent would be to read the data back in for later post processing and plotting. Or is there a convenient way to convert the data back to a non-static array form where manipulating contents is a bit easier.

May be worth pointing out that with a couple modifications, `MArray`s donāt allocate either:

``````julia> @inline inlinerand(::Type{<:MVector{3}}) = MVector(rand(),rand(),rand())
inlinerand (generic function with 1 method)

julia> function foo2()
vec1 = inlinerand(MVector{3})
for i = 1:4
vec1 .+= inlinerand(MVector{3})
end
return SArray(vec1)
end
foo2 (generic function with 1 method)

julia> @btime foo2()
47.758 ns (0 allocations: 0 bytes)
3-element SArray{Tuple{3},Float64,1,3} with indices SOneTo(3):
2.4573027826314195
2.1526819466919456
2.4786087715660896
``````
2 Likes

Yes, this doesnāt work since `vec1` is a vector of vectors of vectors. Itās a bit awkward to perform broadcasting to arbitrarily deep levels, but if you dig around on this forum you might find some nice solutions. Hereās one, I donāt think itās the best one:

``````myabs(x) = abs(x)
myabs(x::AbstractArray) = myabs.(x)
``````

Then `myabs(vec1)` should work.

BTW FYI, you can make a matrix of `SVector`s, if you want:

``````julia> vec1 = [SVector{3,Float64}(-i,-i,-i) for i in 1:4, j in 1:10]
4Ć10 Array{SArray{Tuple{3},Float64,1,3},2}:
[-1.0, -1.0, -1.0]  [-1.0, -1.0, -1.0]  [-1.0, -1.0, -1.0]  ā¦  [-1.0, -1.0, -1.0]  [-1.0, -1.0, -1.0]
[-2.0, -2.0, -2.0]  [-2.0, -2.0, -2.0]  [-2.0, -2.0, -2.0]     [-2.0, -2.0, -2.0]  [-2.0, -2.0, -2.0]
[-3.0, -3.0, -3.0]  [-3.0, -3.0, -3.0]  [-3.0, -3.0, -3.0]     [-3.0, -3.0, -3.0]  [-3.0, -3.0, -3.0]
[-4.0, -4.0, -4.0]  [-4.0, -4.0, -4.0]  [-4.0, -4.0, -4.0]     [-4.0, -4.0, -4.0]  [-4.0, -4.0, -4.0]
``````

Or maybe you could define a recursive helper function, like this:

``````rdot(f, x) = f(x)  # recursive dot
rdot(f, x::AbstractArray) = rdot.(f, x)
``````

and then call

``````vec2 = rdot(abs, vec1)
``````

This would work with other functions beside `abs` as well.

1 Like

Results update

It took me a couple of days to re-write the old allocation heavy code per the static array suggestions above, but got it working and here are the results:

Old Code (allocation heavy due to use of base array and slicing approach):

``````455.657741 seconds (585.01 M allocations: 73.176 GiB, 88.12% gc time)
``````

Revised Code (making use of static vector/matricies (StaticArrays.jl))

``````23.023161 seconds (71.79 M allocations: 3.910 GiB, 66.59% gc time)
``````

Between the above improvements and those discussed on a related thread thatās about a 1200X improvement to my original ugly code. Thanks to all who contributed comments and suggestions!

5 Likes