Is it possible to solve a "convex" neural network using `Convex.jl`?

Hi,

I’m trying to solve a convex optimisation problem. More specifically, I build a neural network NN(x, y) which is convex in the argument y. Such a neural network can be constructed as suggested in the paper Input Convex Neural Networks.
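
For reference, the layer recurrence I'm implementing is (roughly) the partially input convex variant from that paper, in my notation:

$$
u_{i+1} = \tilde g_i\big(\tilde W_i u_i + \tilde b_i\big), \qquad
z_{i+1} = g_i\Big(W_i^{(z)}\big(z_i \circ [W_i^{(zu)} u_i + b_i^{(z)}]_+\big) + W_i^{(y)}\big(y \circ (W_i^{(yu)} u_i + b_i^{(y)})\big) + W_i^{(u)} u_i + b_i\Big),
$$

where $\circ$ is elementwise multiplication; with $W_i^{(z)} \ge 0$ elementwise and each $g_i$ convex and nondecreasing, the output is convex in $y$.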

So I want to find the optimal variable y^* which minimises NN(x, y) for a given x.
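
That is, for each fixed $x$,

$$
y^\star = \operatorname*{arg\,min}_{y} \; \mathrm{NN}(x, y),
$$

which is a convex problem since $\mathrm{NN}(x, \cdot)$ is convex.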

However (as I feared), a naive construction of the NN using Flux cannot be minimised with Convex.jl. That is probably because Convex.jl supports only a predefined set of operations, so that it can guarantee that the problem is convex.

I wonder whether this is possible at all.

+) edit: I will mostly use the ReLU activation (which is convex and nondecreasing, as the ICNN construction requires).
++) edit: Here’s my implementation code (actually, I don’t understand the role of @functor):

using Flux
using Flux: @functor
using Flux: glorot_uniform

using Convex


"""
Wz should be nonnegative
"""
struct ICNN_Layer
    # x-path
    W̃
    b̃
    g̃
    # y-path
    Wz
    Wzu
    bz
    Wy
    Wyu
    by
    Wu
    b
    g
end

function ICNN_Layer(uin::Integer, uout::Integer,
        zin::Integer, zout::Integer,
        y::Integer,  # y's dim
        g̃=identity, g=identity;
        initW = glorot_uniform, initb = zeros)
    layer = ICNN_Layer(
        initW(uout, uin), initb(uout), g̃,  # W̃, b̃, g̃ (x-path)
        max.(initW(zout, zin), 0.0), initW(zin, uin), initb(zin),  # "Wz", Wzu, bz
        initW(zout, y), initW(y, uin), initb(y),  # Wy, Wyu, by
        initW(zout, uin), initb(zout), g  # Wu, b, g
    )
    return layer
end

@functor ICNN_Layer  # lets Flux traverse the struct's fields as trainable parameters

function (a::ICNN_Layer)(input)
    u, z, y = input
    # params
    W̃, b̃, g̃ = a.W̃, a.b̃, a.g̃
    Wz, Wzu, bz = a.Wz, a.Wzu, a.bz
    Wy, Wyu, by = a.Wy, a.Wyu, a.by
    Wu, b, g = a.Wu, a.b, a.g
    # network operation
    u_next = g̃.(W̃*u .+ b̃)
    z_next = g.(
        Wz * (z .* max.(Wzu*u .+ bz, 0.0))
        + Wy * (y .* (Wyu*u .+ by))
        + (Wu * u .+ b)
    )
    return u_next, z_next, y
end

function _selector(input)
    u, z, y = input
    return z
end

# apply relu to a Convex expression via the pos atom (max(x, 0)), so the
# activation stays within Convex.jl's predefined operations
function Flux.relu(x::Convex.AdditionAtom)
    return Convex.pos(x)
end


## ICNN
act = relu
n_x = 5
n_h = 128
n_y = 3
n_V = 1
layers = [
          ICNN_Layer(n_x, n_h, n_h, n_h, n_y, act, act),
          ICNN_Layer(n_h, n_h, n_h, n_V, n_y, act, act),
          _selector,
         ]
# Wz of the first layer is meaningless; z_0 = zeros(n_h) is a convention
m(x, y) = foldl((input, m) -> m(input), layers, init = (x, zeros(n_h), y))

x = collect(1:n_x)
y = collect(n_x+1:n_x+n_y)
ẑ = m(x, y)
println(ẑ)


## Convex.jl + ICNN
y_opt = Variable(n_y)
minimize(m(x, y_opt))

And the error:

ERROR: LoadError: MethodError: no method matching zero(::Convex.AdditionAtom)
Closest candidates are:
  zero(::Type{Pkg.Resolve.VersionWeight}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Resolve/versionweights.jl:15
  zero(::Type{Pkg.Resolve.FieldValue}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Resolve/fieldvalues.jl:38
  zero(::Type{ModelingToolkit.TermCombination}) at /home/jinrae/.julia/packages/ModelingToolkit/1qEYb/src/linearity.jl:67
  ...
Stacktrace:
 [1] generic_matvecmul!(::Array{Convex.AdditionAtom,1}, ::Char, ::Array{Float32,2}, ::Array{Convex.MultiplyAtom,1}, ::LinearAlgebra.MulAddMul{true,true,Bool,Bool}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:685
 [2] mul! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:81 [inlined]
 [3] mul! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:208 [inlined]
 [4] *(::Array{Float32,2}, ::Array{Convex.MultiplyAtom,1}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:51
 [5] (::ICNN_Layer)(::Tuple{Array{Int64,1},Array{Float64,1},Variable}) at /home/jinrae/.julia/dev/GliderPathPlanning/test/icnn.jl:53
 [6] (::var"#35#36")(::Tuple{Array{Int64,1},Array{Float64,1},Variable}, ::ICNN_Layer) at /home/jinrae/.julia/dev/GliderPathPlanning/test/icnn.jl:83
 [7] BottomRF at ./reduce.jl:81 [inlined]
 [8] _foldl_impl(::Base.BottomRF{var"#35#36"}, ::Tuple{Array{Int64,1},Array{Float64,1},Variable}, ::Array{Any,1}) at ./reduce.jl:58
 [9] foldl_impl at ./reduce.jl:48 [inlined]
 [10] mapfoldl_impl(::typeof(identity), ::var"#35#36", ::NamedTuple{(:init,),Tuple{Tuple{Array{Int64,1},Array{Float64,1},Variable}}}, ::Array{Any,1}) at ./reduce.jl:44
 [11] mapfoldl(::Function, ::Function, ::Array{Any,1}; kw::Base.Iterators.Pairs{Symbol,Tuple{Array{Int64,1},Array{Float64,1},Variable},Tuple{Symbol},NamedTuple{(:init,),Tuple{Tuple{Array{Int64,1},Array{Float64,1},Variable}}}}) at ./reduce.jl:160
 [12] #foldl#205 at ./reduce.jl:178 [inlined]
 [13] m(::Array{Int64,1}, ::Variable) at /home/jinrae/.julia/dev/GliderPathPlanning/test/icnn.jl:83
 [14] top-level scope at /home/jinrae/.julia/dev/GliderPathPlanning/test/icnn.jl:93
 [15] include(::String) at ./client.jl:457
 [16] top-level scope at REPL[1]:1

Yeah, that’s exactly right. Convex.jl applies a sequence of extended formulations that are only valid for convex problems via an operator-overloading approach, where only certain predefined functions (“atoms”) are supported, and it checks that these atoms are combined in valid ways.
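
For example, an objective built only from predefined atoms passes the convexity check and solves fine; here’s a minimal sketch, assuming a recent Convex.jl with SCS installed:

using Convex, SCS

x = Variable(3)
e = sumsquares(x) + sum(abs(x))  # sum of squares plus an ℓ1 term, built from atoms
vexity(e)                        # ConvexVexity(): the DCP check passes

problem = minimize(e, [x >= 1])
solve!(problem, SCS.Optimizer)

An arbitrary Flux layer has no atom to dispatch to, which is what you ran into.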

That being said, it’s not too difficult to write more atoms once you are familiar with Convex’s code, and with DiffOpt.jl (https://github.com/jump-dev/DiffOpt.jl, which differentiates convex optimization programs w.r.t. program parameters) it might be possible to differentiate through solves as well.


Thanks :slight_smile:

But I just realised that the error I asked about is quite weird.
While debugging, I found that the error comes from the line Wy * (y .* (Wyu*u .+ by)) in function (a::ICNN_Layer)(input) (the line that computes z_next).

Here’s the comparison:

  1. with y = [6, 7, 8] (that is, an array)
using Convex, Debugger  # Debugger.jl provides @bp and @enter

function test()
    Wy = rand(128, 3)
    Wyu = rand(3, 5)
    by = rand(3)
    # y = Variable(3)
    y = [6, 7, 8]
    @bp
    result = Wy * (y .* (Wyu*zeros(5) .+ by))
end

@enter test()

Result:

1|julia> result
128-element Array{Float64,1}:
...

  2. with y = Variable(3)
function test()
    Wy = rand(128, 3)
    Wyu = rand(3, 5)
    by = rand(3)
    y = Variable(3)
    # y = [6, 7, 8]
    @bp
    result = Wy * (y .* (Wyu*zeros(5) .+ by))
end

@enter test()

Result:

ERROR: MethodError: no method matching zero(::Convex.AdditionAtom)
Closest candidates are:
  zero(::Type{Pkg.Resolve.VersionWeight}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Resolve/versionweights.jl:15
  zero(::Type{Pkg.Resolve.FieldValue}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Pkg/src/Resolve/fieldvalues.jl:38
  zero(::Type{ModelingToolkit.TermCombination}) at /home/jinrae/.julia/packages/ModelingToolkit/1qEYb/src/linearity.jl:67
  ...
Stacktrace:
 [1] generic_matvecmul!(::Array{Convex.AdditionAtom,1}, ::Char, ::Array{Float64,2}, ::Array{Convex.MultiplyAtom,1}, ::LinearAlgebra.MulAddMul{true,true,Bool,Bool}) at /home/jinrae/julia-1.5.3/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:685
 [2] mul!(::Array{Convex.AdditionAtom,1}, ::Array{Float64,2}, ::Array{Convex.MultiplyAtom,1}, ::Bool, ::Bool) at /home/jinrae/julia-1.5.3/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:81
 [3] mul!(::Array{Convex.AdditionAtom,1}, ::Array{Float64,2}, ::Array{Convex.MultiplyAtom,1}) at /home/jinrae/julia-1.5.3/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:208
 [4] *(::Array{Float64,2}, ::Array{Convex.MultiplyAtom,1}) at /home/jinrae/julia-1.5.3/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:51
 [5] test() at /home/jinrae/.julia/dev/GliderPathPlanning/test/icnn.jl:105

@ericphanson
I found the reason.

As listed in the Operations table of the Convex.jl documentation, elementwise multiplication should be performed with dot(*)(x, y).
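
Rewriting the failing test with that form keeps the whole thing a single Convex expression, so the matrix product should go through (a sketch, same sizes as above):

using Convex, LinearAlgebra

function test_fixed()
    Wy = rand(128, 3)
    Wyu = rand(3, 5)
    by = rand(3)
    y = Variable(3)
    # dot(*) builds one elementwise-multiplication expression, so the outer
    # product is a Float64 matrix times a Convex expression, not a matrix
    # times an Array{MultiplyAtom,1}
    result = Wy * dot(*)(y, Wyu * zeros(5) .+ by)
    return result
end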

I think it’s a little bit uncomfortable as a user :frowning:

We should be able to support broadcasting; can you file an issue on the Convex.jl repository with a short example where it fails to work?
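
Roughly, the missing piece would have this shape; just a sketch of the idea, not the actual implementation:

# route broadcast * on expressions to the existing elementwise atom instead of
# letting Julia's generic fallback produce an Array of atoms
Base.Broadcast.broadcasted(::typeof(*), x::Convex.AbstractExpr, y::AbstractArray) = dot(*)(x, y)
Base.Broadcast.broadcasted(::typeof(*), x::AbstractArray, y::Convex.AbstractExpr) = dot(*)(x, y)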


Of course :slight_smile: my pleasure.
