Is it "dangerous" to extend base methods for tuples?

I have to work on a structure (a gradient over multiple layers) that I represent as a vector of tuples of floats/arrays, where the tuples in the vector may differ in structure, e.g.

a = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]),([1.0,2.0,3.0],1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1,0.2,0.3]),([0.1,0.2,0.3],0.1)]

I then have to perform some operations on this structure, like a = a - 0.1 * b (gradient descent) or computing averages.

I can either write complex, ugly functions or resort to extending Base methods for tuples:

import Base.+
import Base.-
import Base.*
import Base./
+(a::Tuple,b::Tuple) = a .+ b
-(a::Tuple,b::Tuple) = a .- b
*(a::Tuple,b::Number) = a .* b
*(a::Number,b::Tuple) = a .* b # needed so that 0.1 * b works, not only b * 0.1
/(a::Tuple,b::Number) = a ./ b

Then I can directly write a = a - 0.1 * b or avg = +([a,a,a]...)/3.

But is this considered dangerous or bad practice? Could adding/subtracting two tuples, or multiplying/dividing one by a scalar, have a meaning other than the obvious one?

Well, this is type piracy and you should not do it. In short: you are defining methods of functions (which you do not "own") dispatching on types which you don't own either.

I recommend creating your own struct to represent the data. If you want to use fixed length arrays, you can use the StaticArrays.jl package.
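
A minimal sketch of that idea, assuming a hypothetical wrapper type `LayerGrad` (the name is made up for illustration, it is not from any package) that holds one layer's tuple. Because you own the type, adding methods of `+`, `-`, `*` and `/` for it is not piracy:

```julia
# Hypothetical wrapper around one layer's tuple of arrays/floats.
struct LayerGrad{T<:Tuple}
    parts::T
end

# Methods on Base operators, dispatching on a type we own.
# The dots broadcast over the tuple; each element uses the ordinary Base method.
Base.:+(a::LayerGrad, b::LayerGrad) = LayerGrad(a.parts .+ b.parts)
Base.:-(a::LayerGrad, b::LayerGrad) = LayerGrad(a.parts .- b.parts)
Base.:*(a::LayerGrad, s::Number) = LayerGrad(a.parts .* s)
Base.:*(s::Number, a::LayerGrad) = a * s
Base.:/(a::LayerGrad, s::Number) = LayerGrad(a.parts ./ s)

a = [LayerGrad(([1.0 2.0; 3.0 4.0], [1.0, 2.0, 3.0])), LayerGrad(([1.0, 2.0, 3.0], 1.0))]
b = [LayerGrad(([0.1 0.2; 0.3 0.4], [0.1, 0.2, 0.3])), LayerGrad(([0.1, 0.2, 0.3], 0.1))]

c = a .- 0.1 .* b          # gradient descent step, layer by layer
avg = sum([a, a, a]) ./ 3  # average of several gradients
```

The underlying tuples stay reachable through the `parts` field, e.g. `c[1].parts[1]` is the updated matrix of the first layer.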

In this specific case, you can use the @. macro to automatically add dots to your calculations.

How? It doesn't work for me:


a = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]),([1.0,2.0,3.0],1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1,0.2,0.3]),([0.1,0.2,0.3],0.1)]

@. a = a - b*1.0 # ERROR: MethodError: no method matching *(::Tuple{Array{Float64,2},Array{Float64,1}}, ::Float64)

c = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]),([1.0,2.0,3.0],1.0)]
@. c = sum([a,a,a])/3 # ERROR: DimensionMismatch("array could not be broadcast to match destination")
using Statistics
@. c = mean([a,a,a]) # ERROR: DimensionMismatch("array could not be broadcast to match destination")
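
As far as I can tell, the first error comes from `@.` adding only one level of dots: `a .- b .* 1.0` broadcasts over the outer vector and then calls plain `*` on each tuple, which has no method. A rough sketch of a workaround that needs no new methods, assuming one explicit `map` over the layers plus dots inside each tuple (the names `a_new` and `avg` are just for illustration):

```julia
a = [([1.0 2.0; 3.0 4.0], [1.0, 2.0, 3.0]), ([1.0, 2.0, 3.0], 1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1, 0.2, 0.3]), ([0.1, 0.2, 0.3], 0.1)]

# map walks the vector of layers; the dots walk the elements of each tuple.
# The innermost - and * are the ordinary array/scalar methods from Base.
a_new = map((x, y) -> x .- 0.1 .* y, a, b)

# Same pattern for an average of three gradients.
avg = map((x, y, z) -> (x .+ y .+ z) ./ 3, a, a, a)
```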

In the end I solved it by renaming the functions (as suggested in the link posted by @tamasgal):

gradSum(a::Tuple,b::Tuple) = a .+ b
gradSum(a::Tuple) = a
gradSub(a::Tuple,b::Tuple) = a .- b
gradMul(a::Tuple,b::Number) = a .* b
gradDiv(a::Tuple,b::Number) = a ./ b
# For summing more than two I had to resort to this function:
function gradSum(▽ₛ)
    o = ▽ₛ[1]
    for i in 2:length(▽ₛ)
        o = gradSum.(o,▽ₛ[i])
    end
    return o
end

a = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]),([1.0,2.0,3.0],1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1,0.2,0.3]),([0.1,0.2,0.3],0.1)]

c = gradSub.(a,gradMul.(b,0.1))
d = gradDiv.(gradSum([a,a,a]),3)
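
The explicit loop for summing more than two gradients could probably also be written as a fold. A small sketch, assuming the same `gradSum` and `gradDiv` tuple methods as above and a non-empty list of gradients (the names `total` and `avg` are only illustrative):

```julia
gradSum(a::Tuple, b::Tuple) = a .+ b
gradDiv(a::Tuple, b::Number) = a ./ b

a = [([1.0 2.0; 3.0 4.0], [1.0, 2.0, 3.0]), ([1.0, 2.0, 3.0], 1.0)]

# Combine the gradients pairwise; each step adds two vectors of tuples layer by layer.
total = reduce((x, y) -> gradSum.(x, y), [a, a, a])
avg = gradDiv.(total, 3)
```

This does the same pairwise accumulation as the loop, just without the index bookkeeping.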

Sorry, I was too quick to answer, should have tested the code before :frowning:

Actually I was referring to something like this:

julia> using StaticArrays


julia> struct Gradient{T} <: FieldVector{4, T}
           d1::T
           d2::T
           d3::T
           d4::T
       end

This will give you quite a few basic operations for free, as seen here:

julia> g = Gradient(1.0, 2.0, 3.0, 4.0)
4-element Gradient{Float64} with indices SOneTo(4):
 1.0
 2.0
 3.0
 4.0

julia> g * 2
4-element Gradient{Float64} with indices SOneTo(4):
 2.0
 4.0
 6.0
 8.0

julia> g - g
4-element Gradient{Float64} with indices SOneTo(4):
 0.0
 0.0
 0.0
 0.0

julia> gradients = [g, Gradient(3, 2, 1, 0), Gradient(0.1, 0.2, 0.1, 0.9)]
3-element Array{Gradient,1}:
 [1.0, 2.0, 3.0, 4.0]
 [3, 2, 1, 0]
 [0.1, 0.2, 0.1, 0.9]

julia> gradients .* 5
3-element Array{Gradient,1}:
 [5.0, 10.0, 15.0, 20.0]
 [15, 10, 5, 0]
 [0.5, 1.0, 0.5, 4.5]

The problem here is not the tuples, but that you want to apply two semantically different methods depending on the element type. If your code had Vectors instead of Tuples, it would have the same problem: you want Base.*, which for matrices means matrix multiplication, when the elements are matrices, and .*, which is element-wise multiplication, when the elements are tuples or vectors. One solution is to define an operator that actually does what you want:

⋆(a::Tuple, b::Tuple) = a .⋆ b
⋆(a::Vector, b::Vector) = a .⋆ b
⋆(a::Any, b::Any) = a * b

a = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]), ([1.0,2.0,3.0], 1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1,0.2,0.3]), ([0.1,0.2,0.3], 0.1)]
c = a ⋆ b

gives

2-element Array{Tuple{Array{Float64,N} where N,Any},1}:
 ([0.7 1.0; 1.5 2.2], [0.1, 0.4, 0.8999999999999999])
 ([0.1, 0.4, 0.8999999999999999], 0.1)