Error when calling ScikitLearn from a function in a Package

In a package I am writing, I have a function that looks like this:

using ScikitLearn

function logistic_skl(points::AbstractMatrix{<:Real}, labels::AbstractVector{Bool})
    @sk_import linear_model: LogisticRegression
    log_reg = fit!(LogisticRegression(penalty="l2"), points', labels)
    w = vec(log_reg.coef_)
    b = only(log_reg.intercept_)
    return w, b
end

But the package containing this function (say MyPkg.jl) now fails to precompile, with the following error pointing to the function above:

ERROR: LoadError: LoadError: syntax: unsupported `const` declaration on local variable around /home/cossio/.julia/packages/ScikitLearn/Kn82b/src/Skcore.jl:187
Stacktrace:
 [1] top-level scope at /home/cossio/jl/MyPkg.jl/src/logistic_sklearn.jl:5
 [2] include(::Function, ::Module, ::String) at ./Base.jl:380
 [3] include at ./Base.jl:368 [inlined]
 [4] include(::String) at /home/cossio/jl/MyPkg.jl/src/MyPkg.jl:1
 [5] top-level scope at /home/cossio/jl/MyPkg.jl/src/MyPkg.jl:26
 [6] include(::Function, ::Module, ::String) at ./Base.jl:380
 [7] include(::Module, ::String) at ./Base.jl:368
 [8] top-level scope at none:2
 [9] eval at ./boot.jl:331 [inlined]
 [10] eval(::Expr) at ./client.jl:467
 [11] top-level scope at ./none:3
in expression starting at /home/cossio/jl/MyPkg.jl/src/logistic_sklearn.jl:5
in expression starting at /home/cossio/jl/MyPkg.jl/src/MyPkg.jl:26

How can I fix this?

The const comes from the expansion of @sk_import, and const declarations are only allowed at the top level. So put that line at the top level, not inside the function.
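
To see the same error without ScikitLearn, here is a minimal sketch. The names bad_def and f are made up for illustration, and the exact wording of the message may differ between Julia versions. The definition is built as a quoted expression so that the file itself still loads and the error only appears when the expression is evaluated.

```julia
# Hypothetical example: a function body containing `const`, similar in spirit
# to what a macro emitting a `const` declaration produces when used in a function.
bad_def = :(function f()
    const x = 1   # `const` applied to a local variable
    return x
end)

# Evaluating the definition triggers the lowering error, which we capture.
err = try
    eval(bad_def)
    nothing
catch e
    e
end

println(err === nothing ? "no error" : sprint(showerror, err))
```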

If I put it at the top level,

@sk_import linear_model: LogisticRegression
function logistic_skl(points::AbstractMatrix{<:Real}, labels::AbstractVector{Bool})
    log_reg = fit!(LogisticRegression(penalty="l2"), points', labels)
    w = vec(log_reg.coef_)
    b = only(log_reg.intercept_)
    return w, b
end

I get a segmentation fault:


signal (11): Segmentation fault
in expression starting at /home/cossio/jl/MyPkg.jl/test/logistic.jl:22
PyObject_Call at /home/cossio/.julia/conda/3/lib/libpython3.8.so.1.0 (unknown line)
Allocations: 383287667 (Pool: 382734699; Big: 552968); GC: 213
ERROR: Package MyPkg errored during testing (received signal: 11)

Line 22 of the test file, referenced in the error, simply calls logistic_skl with some data.

Likely some incompatibility in data layouts.
Which data types do you pass for points and labels?

Here is the data I am using. Note that if I run these lines of code from a console, it works fine. The problem comes when this is inside a package.

@sk_import linear_model: LogisticRegression
function logistic_skl(points::AbstractMatrix{<:Real}, labels::AbstractVector{Bool})
    log_reg = fit!(LogisticRegression(penalty="l2"), points', labels)
    w = vec(log_reg.coef_)
    b = only(log_reg.intercept_)
    return w, b
end

# Generate data.
n = 2 # dimensionality of data
N = 10 # number of positive examples
M = 10 # number of negative examples
points = randn(n, N + M) .+ [fill(5, n, N) fill(-5, n, M)]
labels = [trues(N); falses(M)]

# logistic regression
w, b = logistic_skl(points, labels)

This looks related,

I tried this, but it does not work:

    using ScikitLearn, PyCall
    const LogisticRegression = PyNULL()
    function __init__()
        @eval global LogisticRegression = pyimport("linear_model")
    end

Gives this error:

  Got exception outside of a @test
  LoadError: InitError: PyError (PyImport_ImportModule
  
  The Python package linear_model could not be found by pyimport. Usually this means
  that you did not install linear_model in the Python version being used by PyCall.

What if you do @eval @sk_import linear_model: LogisticRegression in the __init__ function?

Thanks! This works:

using ScikitLearn, PyCall

const LogisticRegression = PyNULL()

function __init__()
    @eval @sk_import linear_model: LogisticRegression
end

function logistic_skl(points::AbstractMatrix{<:Real}, labels::AbstractVector{Bool})
    log_reg = fit!(LogisticRegression(penalty="l2"), points', labels)
    w = vec(log_reg.coef_)
    b = only(log_reg.intercept_)
    return w, b
end

However, I do get a warning from modifying the const:

WARNING: redefinition of constant LogisticRegression. This may fail, cause incorrect answers, or produce other errors.

So I suppose there should still be a better way.
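
One possible way to avoid the warning, sketched here under some assumptions, is the pattern PyCall recommends for modules: keep the const bound to the PyNULL() placeholder and fill it in place with copy! inside __init__, so the constant is never rebound. As far as I understand, this also addresses the segfault seen earlier, since a Python object created at the top level of a package is stored in the precompile cache and its pointer is no longer valid when the package is loaded later. The module name SklearnWrapper is hypothetical, the sketch assumes scikit-learn is installed in the Python that PyCall uses, and it calls the Python fit method directly through PyCall rather than going through ScikitLearn.jl's fit!.

```julia
module SklearnWrapper  # hypothetical module name, stands in for MyPkg

using PyCall

# Placeholder object; safe to store in the precompile cache.
const LogisticRegression = PyNULL()

function __init__()
    # Runs every time the module is loaded, so the Python reference is fresh.
    # copy! mutates the existing PyObject instead of rebinding the const.
    copy!(LogisticRegression, pyimport("sklearn.linear_model").LogisticRegression)
end

function logistic_skl(points::AbstractMatrix{<:Real}, labels::AbstractVector{Bool})
    X = Matrix{Float64}(points')  # samples in rows, as scikit-learn expects
    y = Vector{Bool}(labels)      # plain Vector; also covers BitVector input
    log_reg = LogisticRegression(penalty="l2").fit(X, y)
    w = vec(log_reg.coef_)
    b = only(log_reg.intercept_)
    return w, b
end

end # module
```

Calling SklearnWrapper.logistic_skl(points, labels) with the data generated above should then return the weights and intercept as before, without redefining any constant.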