Accessing a array type from a struct without declaring a new array

I have a function that needs to access a matrix from a struct, but every time the function does this inside this loop, it’s like it creates a new matrix m = foo.bar, or at least the time to run and the allocation necessary is significantly larger than when I put in the function the matrix “distance” as the input instead of “instance” and using “instance.distances”. Anyone could tell me what is the problem?

“solution” is an array{Int64, 1}
“instance” is my struct
“distances” is an array{Float64, 2

instance

    # [...]
    i = 1
    while i < length(solution)
        sum += instance.distances[solution[i], solution[i + 1]];
        i += 1
    end
    # [...]

As written that should not allocate anything. It would be nice if you provided a working example of what you are experiencing. The allocations probably come from other source or reason.

1 Like

Ok. The difference between the two codes is this line and another one that’s literally the same thing,
sum += instance.distances[solution[i], solution[i + 1]] to sum += distances[solution[i], solution[i + 1]]. This is when I put to run:

instance = instance_TSP(distances, points)
s2 = [8, 1, 5, 9, 7, 13, 4, 12, 6, 11, 2, 10, 3];
D = instance.distances
@benchmark calculate_objective(s2, D)
# Output:
`BenchmarkTools.Trial: 
  memory estimate:  16 bytes
  allocs estimate:  1
  --------------
  minimum time:     45.454 ns (0.00% GC)
  median time:      47.273 ns (0.00% GC)
  mean time:        48.686 ns (0.90% GC)
  maximum time:     1.516 μs (96.61% GC)
  --------------
  samples:          10000
  evals/sample:     990`

But if I put to run this:

@benchmark calcular_objetivo(s2, instance)
# Output:
`BenchmarkTools.Trial: 
  memory estimate:  224 bytes
  allocs estimate:  14
  --------------
  minimum time:     437.693 ns (0.00% GC)
  median time:      453.261 ns (0.00% GC)
  mean time:        481.157 ns (1.33% GC)
  maximum time:     7.831 μs (94.02% GC)
  --------------
  samples:          10000
  evals/sample:     199
`

This is how I declared the struct types:

struct point
    coordenadas::Tuple{Float64,Float64};
end;

struct instance{T <: AbstractFloat}
    distances::Array{T};
    point::Array{point};
end;

I don’t understand what your code is doing, but I see that you are using abstract types for your struct fields, which can lead to allocations and slowdown:

Array{T} is an abstract type, since it does not specify how many dimensions it has. You should specify dimensionalities, like this:

struct instance{T <: AbstractFloat}
    distances::Array{T, 2}  # same as Matrix{T}
    point::Array{point, 1}  # same as Vector{point}
end

(I’m guessing about your dimensions, replace with the appropriate value.)

Also, it isn’t idiomatic to use semicolons, the only place you need them is in the REPL, when you want to suppress printing of a return value.

And finally, there is a strong convention to use CapitalCase for type names, so Instance and Point, instead of instance and point.

4 Likes

As an example of the above:

julia> struct A
         d::Vector{Int}
       end

julia> struct B
         d::Vector
       end

julia> a = A([1,2,3])
A([1, 2, 3])

julia> b = B([1,2,3])
B([1, 2, 3])

julia> function f(x)
         s = 0
         for i in 1:length(x.d)
           s += x.d[i]
         end
         s
       end
f (generic function with 1 method)

julia> using BenchmarkTools

julia> @btime f($a)
  2.473 ns (0 allocations: 0 bytes)
6

julia> @btime f($b)
  94.774 ns (0 allocations: 0 bytes)
6

Note the huge difference in performance (there are no allocations here in either case, but in a different example there could be in when using B). The difference is that in A the vector can only be of Ints, while the elements of b.d could be of any type. Therefore, the compiler can build a specialized version of the function f knowing that a.d will be a vector of integers, and that is fast, but it cannot know in advance that the vector d of structure of type B is of integers or anything else, so it cannot specialize its operations.

The same happens in your case, but with the size of the arrays. So you either build your structure with a fixed type and size of array (Vector{T} or Matrix{T}, or Array{T,2}, etc), or just let the entire type of the array be on the signature of the type, such as:

julia> struct A{T}
         m::T
       end

julia> a = A([1,2,3])
A{Vector{Int64}}([1, 2, 3])

Now a function will be specialized for exactly A{Vector{Int64}}, and if you start an instance of A with a different type of array, like:

julia> a = A([1 2
              3 4])
A{Matrix{Int64}}([1 2; 3 4])


functions that work on these will be specialized to A{Matrix{Int64}} instead.

3 Likes