Struggling with a Point type for several dimensions

first-steps

#12

Thank you all very much for the help static array it is. My final decision is this: It is very much natural for a component of the point to change. This is a computational phyiscs code with a time dependent mesh: mesh movement is the point!

What I need to understand therefore, is it more performant to use an immutable version of the staticarray and pay the price of a copy when I quite naturally do require to update that point’s positions. May happen for all points, millions of them, very time step. Or, use a mutable version like the fieldvector example given, no copies when updating components but compiler doesn’t go full throttle because the state can change. For large calculations and this happening all the time what is likely to win on performance? We are taught not to copy in other languages is the reasons for the questions…std::move

Finally for context which may shape an answer, I’d add that this type will also hold in addition to spatial position information, the velocity field, momentum, and other dimension dependent data. It’s a key performance type for me. All of which will change their data, all the time. In this context am I best to use a copy variant or a mutable variant of the design?

Could a helper be written for the imutable copy variant that looks like operator when updating a component so I could write the whole code and change it to the mutable version later if the copy was too costly??? That way the interface for changing a component of the point was only ever so I can swap it without changing other code. That last one is more of a learning question

Thanks again!
Andy


#13

That should have read: operator “[“ “]”
Thanks.


#14

Your intuition may be misleading you here. The best approach in Julia is not to make these decisions prematurely, but write generic code, and experiment with various approaches.

Possibly something like

"Update x by a, return the new value, potentially change the original one."
function update!(x::Vector, a)
    x .+= a
    x
end

function update!(x::StaticVector, a)
    x .+ a
end

#15

What dimensionality would your points typically have? Would you ever work with individual points or would you always work with large arrays of them?


#16

So the point itself would store 2 or 3 doubles for x,y and possibly z. These would then be within another type say vertex. One point for current position one for old position (due to time stepping) one for vertex velocity (potentially old velocity also depending on time stepping method). So 3 or 4 point types within say a vertex type. Then huge array of vertices. Likewise for a face except many more. A face would have an array of vertex references (?) describing it. One point for face normal, one point for face Center. Then huge array of faces. This goes on for edges, cells and that’s just the mesh. There is then the solution data within cells and as I mentioned that type would hold dimension dependent data also like velocity.
Would you make a different decision based on this?

On a related note a thing that did occur to me down the line is the use of cuda and SoA versus AoS. My point layout already enforces one of those decisions. Is there a way in Julia to store an array of x and array of y and z and then define a point type that is simply a view into a specific x, y, z? So the data is not in the point ( but I can still perform linAlg like dot) and then I may be able to benefit from faster host device copy? First question way more important at the moment…


#17

Definitely use immutables - and StaticArrays or GeometryTypes! Lot’s of work went into making all operations as fast as possible and support most LinAlg operations!

Your mental model about immutables seems to be wrong: with immutables, you don’t need to worry about copies, since the compiler is able to figure out a fairly optimal mix of reuse & copies - and even if it decides to create a copy they’re very cheap!

Trust me, after working a lot with points and meshes, immutable 2-3 dimensional points are always faster or same speed, you can store them continuously in memory (can’t do that with mutables), it makes for cleaner and more readable functional code that looks more like math (since you’re avoiding to do stuff like p.x = p.x + 1.0 and do p = p + 1.0), you don’t accidentally mutate points in different functions, etc :wink:


#18

Ok. Happy to be wrong. So I have this type (no mutable before struct):

struct Point3D <: FieldVector{3, Float64}
    x::Float64
    y::Float64
    z::Float64
end

How do I cleanly update only one component of that type and replace the existing type with the newly modified one so:
p1 = Point3D(1.0, 2.0, 3.0)
p1 = p1.x + 1.0 ?? Gives p1 as a scalar
Confused, surely not:
p1 = ((p1.x + 1.0, p1.y, p1.z))


#19

last line should have been:
p1 = Point3D(p1.x + 1.0, p1.y, p1.z)


#20

I usually try to find the arithmetic operation that doesn’t involve updating single components :wink:
But of course, that’s not always possible, so you can do:

p = Base.setindex(p, p[2] + 1.0, 2)

or

p = p .+ Point(0, 1.0, 0)

I admit that it’s a bit more noise than the mutable version, but it should offer the same performance!
I (luckily?) don’t need to update single components in my code that often, so I don’t feel too bothered by it :wink:


#21

You can use Setfield.jl and do p2 = @set p1.x += 1 which is equivalent to (i.e., as efficient as) p2 = Point3D(p1.x + 1, p1.y, p1.z).


#22

You can use StructArrays.jl for SOA vs AOS:

array = StructArray{Point{2, Float32}}(rand(Float32, 100), rand(Float32, 100))
# and then just use `array` wherever you'd use `Vector{Point{2, Float32}}` and it will magically all work with SOA :) 

#23

Thanks again


#24

Just wanted to point out that immutable 3d/2d points being the most performant isn’t unique to Julia. I’ve written plenty of C++ physics/geometry processing code and I’ve always treated my point structs as immutable. Even for SoA you often want to write to a new buffer and then simply swap pointers after ( for random access pattern, serial update can be ok for mutation ). Cache can’t perform as well if it knows it’s data might be invalidated by a write.


#25

@Tamas_Papp kind of mentioned it briefly, but you may also want to check out in-place operators, usually denoted with a !

It might look something like

function updatePoint!(p1::Array{Point3D, 1}, i, nx, ny, nz)
  p1[i] = Point3d(p1[i].x + nx, p1[i].y + ny, p1[i].z + nz)
  end;

for updating a single point i. Then to move to broadcasting

function updatePoints!(p1::Point3D, nx, ny, nz)
  p1 = Point3d(p1.x + nx, p1.y + ny, p1.z + nz)
  end;

and call as updatePoints!.( vector/matrix of point3d, same sized nx, ny, nz ) to update many at once

Add: Ah, points .= updatePoints!.(points, … in this case?


#26

How do the other approaches perform?

struct Point3D ##<: FieldVector{3, Float64}
    x::Float64
    y::Float64
    z::Float64
end

pv = Point3D.(collect(1.0:0.1:100000),1.0,1.0)
n = length(pv)
xp,yp,zp = rand(n), rand(n), rand(n)
@time for iter=1:74 
  pv .= updatePoints!.(pv, xp,yp,zp);
  end;

Gives 0.99sec for near 74,000,000 simple point updates


#27

Btw, you can also write this roughly as:

pv = rand(SVector{3, Float64}, n)
pv .= pv .+ SVector.(xp, yp, zp)

#28

Just a smidge slower, ~0.993 to ~1.003 sec with the FieldVector part uncommented and using SVector

using StaticArrays

sv = SVector.(xp, yp, zp)
@time for iter=1:74 pv .= pv .+ sv; end;

And even closer with the pv = line, ~0.997


I think @time was pretty accurate, i did use a few runs

BenchmarkTools.Parameters(5.0, 10000, 1, 0.0, true, false, 0.05, 0.01)
times: [1.00113e9, 1.00184e9, 1.0025e9, 1.00361e9, 1.00452e9]
mem: 3552, alloc: 148
BenchmarkTools.Parameters(5.0, 10000, 1, 0.0, true, false, 0.05, 0.01)
times: [9.89299e8, 9.8971e8, 9.90509e8, 9.91875e8, 9.91978e8, 9.92714e8]
mem: 4736, alloc: 148
BenchmarkTools.Parameters(5.0, 10000, 1, 0.0, true, false, 0.05, 0.01)
times: [1.00491e9, 1.00499e9, 1.00532e9, 1.00674e9, 1.00691e9]
mem: 3552, alloc: 148


with loopsize 1
x = @benchmark for iter=1:1 pv .= pv .+ sv; end;
times: “min: 1.3489198e7, mean: 1.3560357623306233e7” (13.x millis)
mem: 48, alloc: 2, samples: 369

x2 = @benchmark for iter=1:1 pv .= px.updatePoints!.(pv, xp,yp,zp); end;
times: “min: 1.3358128e7, mean: 1.3417005728494624e7”
mem: 64, alloc: 2, samples: 372

x3 = @benchmark for iter=1:1 pv .= pv .+ sv; end; -Fieldvec
times: “min: 1.3453972e7, mean: 1.3529067089189189e7”
mem: 48, alloc: 2, samples 370


#29

You should probably use @btime from Benchmarktools to time that, since it’s not in a function and you are using a global variable.


#30

Putting it in one line is pretty important here, since that will then fuse the calculation together, without allocating a temporary Vector{Point} :wink:
Also, that’s how I would benchmark things;


using StaticArrays
pv = SVector.(collect(1.0:0.1:100000),1.0,1.0)
n = length(pv)
xp,yp,zp = rand(n), rand(n), rand(n)
function update_point(p1::SVector, nx, ny, nz)
    SVector(p1[1] + nx, p1[2] + ny, p1[3] + nz)
end

# move the expression you want to benchmark into function scope 
# (optional with benchmark tools and $ infront of input params)
test1(pv, xp, yp, zp) = pv .= pv .+ SVector.(xp, yp, zp)
test2(pv, xp, yp, zp) = pv .= update_point.(pv, xp, yp, zp)

using BenchmarkTools
# If you just want the time
@btime test1($pv, $xp, $yp, $zp)
@btime test2($pv, $xp, $yp, $zp)

# if you want to compare results + and have it statistically sound!
b1 = @benchmark test1($pv, $xp, $yp, $zp)
b2 = @benchmark test2($pv, $xp, $yp, $zp)

judge(minimum(b1), minimum(b2)).time == :invariant

Btw, I think StaticArrays should deprecate fieldvectors… Now, that one can overload getfield (getproperty), one can also get svector.x without a struct.
If anyone would like to have that, I’d be open to accept a PR to GeometryTypes adding getproperty overloads for the Point type (which is just a StaticVector inheriting form StaticArrays) :wink:


#31

include("points2.jl")

13.316 ms (0 alloc, 0 byte) 13.332 ms minimum? in b1
13.321 ms (0 alloc, 0 byte) 13.322 ms minimum for b2
true