I have to work on a structure (a gradient of multiple layers) that I represent as a vector of tuples of Float/Arrays, where the different tuples in the array may have different structure, e.g.
a = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]),([1.0,2.0,3.0],1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1,0.2,0.3]),([0.1,0.2,0.3],0.1)]
I then have to perform some operations on this structure, like a = a - 0.1 * b (gradient descent) or computing averages.
I can either write complex, ugly functions or resort to extending Base methods for tuples:
import Base.+
import Base.-
import Base.*
import Base./
+(a::Tuple,b::Tuple) = a .+ b
-(a::Tuple,b::Tuple) = a .- b
*(a::Tuple,b::Number) = a .* b
/(a::Tuple,b::Number) = a ./ b
Then I can directly do a = a - 0.1 * b or avg = +([a,a,a]...)/3.
But is this considered dangerous or bad practice? Could summing/subtracting two tuples, or multiplying/dividing a tuple by a scalar, have a meaning other than the obvious one?
Well, this is type piracy and you should not do it. In short: you are defining methods of functions (which you do not "own") dispatching on types which you don't own either.
I recommend creating your own struct to represent the data. If you want to use fixed-length arrays, you can use the StaticArrays.jl package.
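To illustrate the "own struct" suggestion, here is a minimal sketch. The wrapper name `LayerGrad` and its field `parts` are hypothetical (they do not come from this thread); the point is only that methods of `Base.+` and friends dispatching on a type you define yourself are not type piracy.

```julia
# Hypothetical wrapper type: the name `LayerGrad` and field `parts` are assumptions.
struct LayerGrad{T<:Tuple}
    parts::T   # e.g. (weight_matrix, bias_vector) for one layer
end

# These methods dispatch on a type we own, so extending Base here is fine.
Base.:+(x::LayerGrad, y::LayerGrad) = LayerGrad(x.parts .+ y.parts)
Base.:-(x::LayerGrad, y::LayerGrad) = LayerGrad(x.parts .- y.parts)
Base.:*(x::LayerGrad, s::Number)    = LayerGrad(x.parts .* s)
Base.:*(s::Number, x::LayerGrad)    = x * s
Base.:/(x::LayerGrad, s::Number)    = LayerGrad(x.parts ./ s)

a = [LayerGrad(([1.0 2.0; 3.0 4.0], [1.0, 2.0, 3.0])), LayerGrad(([1.0, 2.0, 3.0], 1.0))]
b = [LayerGrad(([0.1 0.2; 0.3 0.4], [0.1, 0.2, 0.3])), LayerGrad(([0.1, 0.2, 0.3], 0.1))]

a_new = a - 0.1 * b      # gradient-descent step on every layer
avg   = (a + a + a) / 3  # average of three gradients
```

With this, the vector-level `-`, `+`, `*` and `/` from Base apply the wrapper's methods to each layer, so the original one-line expressions work unchanged.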
In this specific case, you can use the @. macro to automatically add dots to your calculations.
How? It doesn't work for me:
a = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]),([1.0,2.0,3.0],1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1,0.2,0.3]),([0.1,0.2,0.3],0.1)]
@. a = a - b*1.0 # ERROR: MethodError: no method matching *(::Tuple{Array{Float64,2},Array{Float64,1}}, ::Float64)
c = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]),([1.0,2.0,3.0],1.0)]
@. c = sum([a,a,a])/3 # ERROR: DimensionMismatch("array could not be broadcast to match destination")
using Statistics
@. c = mean([a,a,a]) # ERROR: DimensionMismatch("array could not be broadcast to match destination")
In the end I solved it by renaming the functions (as suggested in the link posted by @tamasgal):
gradSum(a::Tuple,b::Tuple) = a .+ b
gradSum(a::Tuple) = a
gradSub(a::Tuple,b::Tuple) = a .- b
gradMul(a::Tuple,b::Number) = a .* b
gradDiv(a::Tuple,b::Number) = a ./ b
# For summing more than two I had to resort to this function:
function gradSum(gs)
    o = gs[1]
    for i in 2:length(gs)
        o = gradSum.(o, gs[i])
    end
    return o
end
a = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]),([1.0,2.0,3.0],1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1,0.2,0.3]),([0.1,0.2,0.3],0.1)]
c = gradSub.(a,gradMul.(b,0.1))
d = gradDiv.(gradSum([a,a,a]),3)
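For comparison, the same two results can probably be obtained without named helpers, by letting `map` walk the outer vector and broadcasting over each tuple inside it. This is only a sketch using the `a` and `b` from this post; it also shows why a single `@.` is not enough, since the dots have to be applied one level deeper.

```julia
a = [([1.0 2.0; 3.0 4.0], [1.0, 2.0, 3.0]), ([1.0, 2.0, 3.0], 1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1, 0.2, 0.3]), ([0.1, 0.2, 0.3], 0.1)]

# map walks the outer vector; the dotted expression broadcasts over each tuple,
# so every array (or scalar) inside a tuple is updated as x_i - 0.1 * y_i.
c = map((x, y) -> x .- y .* 0.1, a, b)

# Sum three gradients tuple by tuple, then divide each tuple element by 3.
total = reduce((x, y) -> map((s, t) -> s .+ t, x, y), [a, a, a])
d = map(t -> t ./ 3, total)
```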
Sorry, I was too quick to answer; I should have tested the code first.
Actually I was referring to something like this:
julia> using StaticArrays
julia> struct Gradient{T} <: FieldVector{4, T}
           d1::T
           d2::T
           d3::T
           d4::T
       end
since this will give you quite a few basic operations for free, as seen here:
julia> g = Gradient(1.0, 2.0, 3.0, 4.0)
4-element Gradient{Float64} with indices SOneTo(4):
1.0
2.0
3.0
4.0
julia> g * 2
4-element Gradient{Float64} with indices SOneTo(4):
2.0
4.0
6.0
8.0
julia> g - g
4-element Gradient{Float64} with indices SOneTo(4):
0.0
0.0
0.0
0.0
julia> gradients = [g, Gradient(3, 2, 1, 0), Gradient(0.1, 0.2, 0.1, 0.9)]
3-element Array{Gradient,1}:
[1.0, 2.0, 3.0, 4.0]
[3, 2, 1, 0]
[0.1, 0.2, 0.1, 0.9]
julia> gradients .* 5
3-element Array{Gradient,1}:
[5.0, 10.0, 15.0, 20.0]
[15, 10, 5, 0]
[0.5, 1.0, 0.5, 4.5]
The problem here is not the tuples, but that you want to apply two semantically different methods depending on the element type. If your code had Vectors instead of Tuples, it would have the same problem. You want to apply Base.*, which is defined for matrices and means matrix multiplication, when the elements are matrices, and you want to apply .*, which is element-wise multiplication, when the elements are tuples or vectors. One solution is to define an operator that actually does what you want:
⊗(a::Tuple, b::Tuple) = a .⊗ b
⊗(a::Vector, b::Vector) = a .⊗ b
⊗(a::Any, b::Any) = a * b
a = [([1.0 2.0; 3.0 4.0], [1.0,2.0,3.0]), ([1.0,2.0,3.0], 1.0)]
b = [([0.1 0.2; 0.3 0.4], [0.1,0.2,0.3]), ([0.1,0.2,0.3], 0.1)]
c = a ⊗ b
gives
2-element Array{Tuple{Array{Float64,N} where N,Any},1}:
([0.7 1.0; 1.5 2.2], [0.1, 0.4, 0.8999999999999999])
([0.1, 0.4, 0.8999999999999999], 0.1)
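The first entry of that result is a matrix product, not an element-wise one. A small check with the same two matrices makes the difference between `*` and `.*` concrete:

```julia
A = [1.0 2.0; 3.0 4.0]
B = [0.1 0.2; 0.3 0.4]

matprod  = A * B    # matrix multiplication: rows of A times columns of B
elemprod = A .* B   # element-wise multiplication: A[i,j] * B[i,j]
```

Here `matprod` is approximately `[0.7 1.0; 1.5 2.2]`, matching the output above, while `elemprod` is approximately `[0.1 0.4; 0.9 1.6]`.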