Struggling with a Point type for several dimensions

first-steps

#1

Hi,

Very new to Julia (and Discourse), download 1.0.2 yesterday. I have a large code base in python that I want to port. Python code is slow, hence the move. I have another large code base in c++ (c++14) that is also very mature. I don’t want to back port the python to the c++ code, I’d rather move wholesale to Julia for the python part initially. I mention c++ as I make extensive use of templating and meta programming to speed my c++ code up, something I lost on the python parts of the code, and I’d like to get back via Julia syntax.
An important design principle that I want to replicated is the code (computational physics) being independent of dimensions: I already have this capability in python and c++ (at compile time). So define a 2D Point and a 3D Point with dot product etc defined, and then only when I read the mesh in (or write out the solution) do I actually define the dimension (cos I know it). The rest of the code is dimension independent and works on vector/tensor maths. So far so good, but how (with only several hour behind me) do i define a Point type that is “templated” on the dimensions. I want it to be a compile time aspect, so I benefit from the speed. Attempts so far (that don’t work are):

mutable struct Point{S}
data::Array{Float64}(undef, S)
end

I don’t want to do:

mutable struct TwoDPoint
x::Float64
y::Float64
end

because I’m hoping that the Array type used will then give me existing dot, inner, outer product, addition subtraction etc etc.

Can anyone help, this is probably the first class/type I need to get going. I can probably learn a lot from just replicating my existing point class in the new language. Design goals again are, as much information as possible at compile for speed, use existing linAlg methods if already implemented for a base type, fixed length array since two doubles or 3 doubles is as much as I’d store.

Thanks
Andy


#2

I would suggest that you wrap a tuple, which can be parametrized by the length, something like this should get you started:

julia> struct Point{S}
           x::NTuple{S,Float64}
       end

julia> p = Point{2}((1.0, 2.0))
Point{2}((1.0, 2.0))

julia> p = Point{3}((1.0, 2.0, 3.0))
Point{3}((1.0, 2.0, 3.0))

Unless you write this code for learning purposes I suggest that you have a look at the following packages which already implements things like this:

  • Tensors.jl for tensor operations like dot, inner/outer products etc. (Disclaimer: I am one of the authors of this package.)
  • StaticArrays.jl for general fixed size arrays.

#3

Thank you very much for your help and speedy response. I am keen to learn, so since I only need a few operators dot inner etc, I’d like to hand roll it rather than use a library (at least in the short term but thanks for the recommendation).

Could I ask: how might I use dot or vecdot with a struct like the above? Secondly, why will it not work using an Array as the member type (I just tired to convert you example to do so and failed)? I tried to convert to Array, simply because I’m under the impression I need to use the Array type to benefit from the operators in LinearAlgebra, is that the case? or will NTuple work also with those operators in LinearAlgebra?

Thanks,
Andy


#4

I would add, having just found this out, I can’t use tuple. I need the elements of the point to be mutable. So I need to change x and y without destroying the point and creating a new one. So far hand rolled code looks like this, but I can’t use a linearAlgebra provided dot function, since it fails for tuple:

module point2D
    using LinearAlgebra
    export x, y, Point, dot
    mutable struct PointType{S}
#        x::Float64
#        y::Float64
        data::NTuple{S,Float64}
#        PointType{2}(x::Float64,y::Float64) = new{2}(x, y)
    end

    x(p::PointType{2}) = p.data[1]
    y(p::PointType{2}) = p.data[2]
    Point(x::Float64, y::Float64) = PointType{2}((x, y))
    Point() = PointType{2}((0.0, 0.0))
    function dot(u::PointType{2}, v::PointType{2})
        return u.data[1]*v.data[1] + u.data[2]*v.data[2]
    end

    x(p::PointType{3}) = p.data[1]
    y(p::PointType{3}) = p.data[2]
    z(p::PointType{3}) = p.data[3]

    Point(x::Float64, y::Float64, z::Float64) = PointType{3}((x, y, z))
    Point() = PointType{3}((0.0, 0.0, 0.0))

    # test code
    p1 = Point(1.0, 9.0)
    p2 = Point(5.0, 4.0)
    r = dot(p1, p2)
    println(r)
end

This is all academic now and back to the drawing board as I need to mutate the types within the point so tuple is out. I would like the above with an Array where I provide “S”. Can you help with that. I would like to not have to pay the price for a variable length array when I know the size is 2 or 3, but I’d like that type to be compatible with linAlg functions. Thoughts?

Thanks again,
Andy


#5

StaticArrays.jl has a fixed length mutable vector called MVector.

julia> using StaticArrays

julia> struct PointType{S}
           data::MVector{S,Float64}
       end

julia> PointType(args...) = PointType(MVector(args...))
PointType

julia> tt = PointType(2.,3.,4.)
PointType{3}([2.0, 3.0, 4.0])

julia> tt.data[2] = 0.
0.0

julia> tt
PointType{3}([2.0, 0.0, 4.0])


#6

And if you want to forward multiple functions, this macro (adapted from @forward) could be helpful

using MacroTools
macro forward2(ex, fs)
    @capture(ex, T_.field_) || error("Syntax: @forward T.x f, g, h")
    T = esc(T)
    fs = isexpr(fs, :tuple) ? map(esc, fs.args) : [esc(fs)]
    :($([:($f(x::$T, y::$T) =
           (Base.@_inline_meta; $f(x.$field, y.$field)))
         for f in fs]...);
      nothing)
end

import LinearAlgebra: dot
import Base: +, -
@forward2 PointType.data dot, +, -

x = PointType(1., 2.)
y = PointType(2., 3.)
dot(x, y)
x - y
x + y

#7

It sounds like StaticArrays provides exactly what you’re looking for with FieldVector. See http://juliaarrays.github.io/StaticArrays.jl/stable/pages/api.html#FieldVector-1 . The new Point3D type inherits dot and the rest of the linear algebra commands.


#8

StaticArrays are so fast, you can often come out ahead by using StaticArrays even if you are looking for a mutable array. If you need to update a component of your Point3D, make a new Point3D out of the component you want to update along with the remaining (unchanged) components. You might need some helper functions for this.


#9

You can use the built in setindex for that.

julia> using StaticArrays

julia> struct Point3D <: FieldVector{3, Float64}
           x::Float64
           y::Float64
           z::Float64
       end

julia> p = Point3D(1,2,3)
3-element Point3D:
 1.0
 2.0
 3.0

julia> p2 = setindex(p, 5, 1)
3-element Point3D:
 5.0
 2.0
 3.0

#10

Strongly second this. It’s very unnatural for points to be mutable and compilers are very good at reasoning about fixed-size immutable structures like this.


#11

Others have posted really good suggestions, but I figured it might help at least to clear up a syntatical misunderstanding, as you dive deeper on your julia voyage.

You tried this:

mutable struct Point{S}
    data::Array{Float64}(undef, S)
end

There are a couple of issues with this:

  1. The type specification in the struct definition should only specify the type, not a value. The type in this case is Array{Float64, 1} (or Vector{Float64}, which is an alias).
  2. The Array{T, N} type doesn’t keep its size as part of the type information, only the number of dimensions (1D vector vs. 2D matrix, etc…)
  3. Even if you want to mutate the data inside the array, you don’t need your struct to be mutable, because you wouldn’t be changing the binding (for a given x::Point, x.data would never be changed to point to a different array, you’d just be mutating that array.

As others have mentioned, for performance reasons you probably want to use Tuples or StaticArrays, which know their length based on their type information, but for learning sake if you want to use an Array the syntax would be:

struct Point
    data::Vector{Float64}
end

#12

Thank you all very much for the help static array it is. My final decision is this: It is very much natural for a component of the point to change. This is a computational phyiscs code with a time dependent mesh: mesh movement is the point!

What I need to understand therefore, is it more performant to use an immutable version of the staticarray and pay the price of a copy when I quite naturally do require to update that point’s positions. May happen for all points, millions of them, very time step. Or, use a mutable version like the fieldvector example given, no copies when updating components but compiler doesn’t go full throttle because the state can change. For large calculations and this happening all the time what is likely to win on performance? We are taught not to copy in other languages is the reasons for the questions…std::move

Finally for context which may shape an answer, I’d add that this type will also hold in addition to spatial position information, the velocity field, momentum, and other dimension dependent data. It’s a key performance type for me. All of which will change their data, all the time. In this context am I best to use a copy variant or a mutable variant of the design?

Could a helper be written for the imutable copy variant that looks like operator when updating a component so I could write the whole code and change it to the mutable version later if the copy was too costly??? That way the interface for changing a component of the point was only ever so I can swap it without changing other code. That last one is more of a learning question

Thanks again!
Andy


#13

That should have read: operator “[“ “]”
Thanks.


#14

Your intuition may be misleading you here. The best approach in Julia is not to make these decisions prematurely, but write generic code, and experiment with various approaches.

Possibly something like

"Update x by a, return the new value, potentially change the original one."
function update!(x::Vector, a)
    x .+= a
    x
end

function update!(x::StaticVector, a)
    x .+ a
end

#15

What dimensionality would your points typically have? Would you ever work with individual points or would you always work with large arrays of them?


#16

So the point itself would store 2 or 3 doubles for x,y and possibly z. These would then be within another type say vertex. One point for current position one for old position (due to time stepping) one for vertex velocity (potentially old velocity also depending on time stepping method). So 3 or 4 point types within say a vertex type. Then huge array of vertices. Likewise for a face except many more. A face would have an array of vertex references (?) describing it. One point for face normal, one point for face Center. Then huge array of faces. This goes on for edges, cells and that’s just the mesh. There is then the solution data within cells and as I mentioned that type would hold dimension dependent data also like velocity.
Would you make a different decision based on this?

On a related note a thing that did occur to me down the line is the use of cuda and SoA versus AoS. My point layout already enforces one of those decisions. Is there a way in Julia to store an array of x and array of y and z and then define a point type that is simply a view into a specific x, y, z? So the data is not in the point ( but I can still perform linAlg like dot) and then I may be able to benefit from faster host device copy? First question way more important at the moment…


#17

Definitely use immutables - and StaticArrays or GeometryTypes! Lot’s of work went into making all operations as fast as possible and support most LinAlg operations!

Your mental model about immutables seems to be wrong: with immutables, you don’t need to worry about copies, since the compiler is able to figure out a fairly optimal mix of reuse & copies - and even if it decides to create a copy they’re very cheap!

Trust me, after working a lot with points and meshes, immutable 2-3 dimensional points are always faster or same speed, you can store them continuously in memory (can’t do that with mutables), it makes for cleaner and more readable functional code that looks more like math (since you’re avoiding to do stuff like p.x = p.x + 1.0 and do p = p + 1.0), you don’t accidentally mutate points in different functions, etc :wink:


#18

Ok. Happy to be wrong. So I have this type (no mutable before struct):

struct Point3D <: FieldVector{3, Float64}
    x::Float64
    y::Float64
    z::Float64
end

How do I cleanly update only one component of that type and replace the existing type with the newly modified one so:
p1 = Point3D(1.0, 2.0, 3.0)
p1 = p1.x + 1.0 ?? Gives p1 as a scalar
Confused, surely not:
p1 = ((p1.x + 1.0, p1.y, p1.z))


#19

last line should have been:
p1 = Point3D(p1.x + 1.0, p1.y, p1.z)


#20

I usually try to find the arithmetic operation that doesn’t involve updating single components :wink:
But of course, that’s not always possible, so you can do:

p = Base.setindex(p, p[2] + 1.0, 2)

or

p = p .+ Point(0, 1.0, 0)

I admit that it’s a bit more noise than the mutable version, but it should offer the same performance!
I (luckily?) don’t need to update single components in my code that often, so I don’t feel too bothered by it :wink: