[PyCall] Pre-installing a python package required by a Julia package

Hello,
I’m developing OdsIO, a module that uses the Python ezodf (and, in turn, lxml) modules to import/export data from/to OpenDocument spreadsheets.

How do I guarantee that ezodf and lxml are available to the user? The PyCall page says to use pyimport_conda(modulename, condapkg), but I am not sure how to use it.
The problem is that I have never used conda distributions (I just use pip on Linux), so I don’t even understand the difference between a module and a conda package.

I searched the Anaconda package list and didn’t find ezodf.
However, another Anaconda package contains a reference to it.

So, how can I be sure that my users have ezodf and lxml available when they install OdsIO (and display a nice green “tests passed” badge on the GitHub page :slight_smile: )?


The pyimport_conda docstring says it all:

Returns the result of pyimport(modulename) if possible. If the module
is not found, and PyCall is configured to use the Conda Python distro
(via the Julia Conda package), then automatically install condapkg
via Conda.add(condapkg) and then re-try the pyimport. Other
Anaconda-based Python installations are also supported as long as
their conda program is functioning.

If PyCall is not using Conda and the pyimport fails, throws
an exception with an error message telling the user how to configure
PyCall to use Conda for automated installation of the module.

The third argument, channel is an optional Anaconda “channel” to use
for installing the package; this is useful for packages that are not
included in the default Anaconda package listing.

If the user’s distribution isn’t conda-based and they don’t have the package, pyimport_conda throws an error. You can catch this error, tell the user which Python packages they need to install, and ask them to load your package again afterwards.
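A minimal sketch of that catch-and-explain idea, not anything PyCall provides: import_or_explain is a hypothetical helper, and the importer is passed in as a function so that the sketch runs without Python. In a real package the importer would be a call to pyimport_conda, as in the commented line at the end.

```julia
# Hypothetical helper: `importer` is any function that takes a module name
# and either returns the imported module or throws.
function import_or_explain(importer::Function, modname::AbstractString)
    try
        return importer(modname)
    catch err
        # Replace the raw failure with instructions the user can act on,
        # keeping the original message for debugging.
        error("""
              Could not import the Python module "$modname".
              Install it for the Python that PyCall uses (for example with
              `pip install --user $modname`), then load this package again.
              Original error: $(sprint(showerror, err))""")
    end
end

# Intended use inside a package's __init__ (needs PyCall, so left commented out):
#   lxml = import_or_explain(m -> pyimport_conda(m, m), "lxml")
```

The pip command in the message is only a suggestion; the right install command depends on how the user's Python was set up.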

For Travis testing, use the before_install hooks to install the packages, as in https://github.com/JuliaPy/Pandas.jl/blob/master/.travis.yml#L11.


pyimport_conda will solve the installation problem for users whose Python comes from Anaconda (either Conda.jl, which installs its own Julia-specific Python install, or a full Anaconda install). This will include most Mac and Windows users, since Conda is the default Python used by PyCall on those systems.

So, in your module (to be precompile-safe as explained in the PyCall README), you would define const ezodf = PyNULL() and then, in your __init__() function, do:

copy!(ezodf, pyimport_conda("ezodf", "ezodf", "openglider"))

since it seems that ezodf is only available from the openglider channel in anaconda. Unfortunately, it looks like they only have a Linux package, not MacOS or Windows. Maybe you can work with the openglider people to provide Mac and Windows packages?
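Put together, the pattern described above might look like the sketch below. The module name OdsIOSketch is made up for illustration, the channel name is the one mentioned in this post, and the code assumes PyCall is installed and that the import (or the Conda install) succeeds on your platform.

```julia
module OdsIOSketch  # hypothetical name, for illustration only

using PyCall

# Placeholder created at precompile time; it holds no Python pointer yet.
const ezodf = PyNULL()

function __init__()
    # Runs every time the module is loaded. If the import fails and PyCall
    # uses Conda, pyimport_conda tries to install "ezodf" from the
    # "openglider" channel before retrying.
    copy!(ezodf, pyimport_conda("ezodf", "ezodf", "openglider"))
end

end # module
```

Functions in the module can then use ezodf like any other PyObject once the module has been loaded.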

In your package build script (in OdsIO/deps/build.jl), you can try to check for the availability of the Python packages you want and run pip or something, or suggest that the user run it. It is pretty hard to do this in a way that works on lots of platforms, though, because there are so many ways to install packages on different systems. See the BinDeps package, which lets you specify ways to install things in different package managers.
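As a rough sketch of the "check for availability" step, a build script could ask the Python interpreter whether each module can be imported. missing_modules is a hypothetical helper, not part of PyCall or BinDeps, and it assumes the path it is given points to a working Python executable.

```julia
# Hypothetical helper: returns the names in `modules` that `python` cannot import.
function missing_modules(python::AbstractString, modules)
    filter(modules) do m
        # `python -c "import m"` exits with a non-zero status when the import fails.
        cmd = pipeline(`$python -c "import $m"`, stdout=devnull, stderr=devnull)
        !success(cmd)
    end
end

# Intended use in deps/build.jl (needs PyCall, so left commented out):
#   todo = missing_modules(PyCall.python, ["ezodf", "lxml"])
#   isempty(todo) || println("Please install: ", join(todo, ", "))
```

This only detects what is missing; how to install it is still the platform-dependent part.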

Welcome to the wonderful world of external dependencies, unfortunately. Anaconda makes things a lot easier for Python packages, but not when there is no conda package for something.

Best bet for getting conda packages nicely built would be trying to get it added to conda-forge (community-driven packaging for conda). If you aren’t familiar with the process, contacting someone who also has an interest in the package (from the conda files list, it looks like that Linux package may have been built and uploaded by https://github.com/looooo ?) and seeing if they’d like to help would be worthwhile.

“Unfortunately, it looks like they only have a Linux package, not MacOS or Windows. Maybe you can work with the openglider people to provide Mac and Windows packages?”

I tried installing ezodf with pip on a clean Windows machine that had only Python installed, and it works great… I don’t know why that channel specifically lists linux64, as the package seems to be OS-independent.
In any case, that looks like a channel for something very specific… I may consider creating my own conda channel, but that will take me some time…

It might be reasonable to change PyCall.jl to try to use pip to install a package if there’s no Conda package.


I am trying to use build.jl to install the dependency through pip, as done here, but the content of the script is not executed… what am I missing?

I just put it under OdsIO/deps/build.jl… when is this script executed? Is it part of BinDeps or is it a Julia “feature”?

It’s executed when the user types Pkg.build("yourpackage"), which also happens automatically when it is first added.

Thank you, but it seems build.jl is not automatically run when I do Pkg.clone(pkg) and then import pkg (and hence I get the missing Python module error); I need to run Pkg.build(pkg) explicitly.
Is it because I am using Pkg.clone() instead of Pkg.add()? (I am using a clean .julia directory for every test.)

Conversely I have the build.jl script running if in my __init__() function I place @BinDeps.load_dependencies, but that’s giving me lots of warnings…

Yes, Pkg.clone does not run the build script.

Thank you all :slight_smile: :slight_smile:
On Linux, the build.jl script works great: it installs the required Python modules and the tests pass.

However, I also tested it on Windows, and there I got the following error:

I don’t know where that c option comes from… the relevant part of the build.jl script does:

using PyCall
const PACKAGES = ["ezodf", "lxml"]
@pyimport pip
args = String[]
if haskey(ENV, "http_proxy")
    push!(args, "--proxy")
    push!(args, ENV["http_proxy"])
end
push!(args, "install")
push!(args, "--user")
append!(args, PACKAGES)
println("Using pip to install required modules.")
pip.main(args)

Do you know how to make the script work on windows too ?

I did try to simulate the script from a windows terminal, and it works:

C:\Users\Admin\.julia\v0.5\Conda\deps\usr>python
Python 2.7.13 |Continuum Analytics, Inc.| (default, Dec 19 2016, 13:29:36) [MSC
v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> pip.main(["install", "--user", "ezodf", "lxml"])
Collecting ezodf
  Using cached ezodf-0.3.2.tar.gz
Collecting lxml
  Downloading lxml-3.7.3-cp27-none-win_amd64.whl (3.4MB)
    100% |################################| 3.5MB 273kB/s
Building wheels for collected packages: ezodf
  Running setup.py bdist_wheel for ezodf ... done
  Stored in directory: C:\Users\Admin\AppData\Local\pip\Cache\wheels\3c\24\ad\ea
85bf0713eeb5746872b62e5c84b6e05a00705b95179d5667
Successfully built ezodf
Installing collected packages: ezodf, lxml
Successfully installed ezodf-0.3.2 lxml-3.7.3
0
>>>

So why doesn’t it work from the script??? :neutral_face:

I tried on another machine and got the same outcome…
The workaround is to manually install the required Python packages before building the module, e.g.:


cmd
cd C:\Users\[YOUR_USER_NAME]\.julia\v0.5\Conda\deps\usr
C:\Users\[YOUR_USER_NAME]\.julia\v0.5\Conda\deps\usr>python
>>> import pip
>>> pip.main(["install", "--user", "ezodf", "lxml"])
>>> quit()

julia> Pkg.build("OdsIO")
julia> Pkg.test("OdsIO")
INFO: Testing OdsIO
INFO: OdsIO tests passed

It seems to me that the problem is not so much with OdsIO or the required Python modules… it’s more like the Conda build script is not initialising the Python installation properly…

I can’t reproduce this problem. I just tried:

julia> using PyCall
julia> pyimport("pip")["main"](["install","--user","lxml"])

and it worked fine for Conda on both Windows and Mac.

Ah, the problem occurs only for pyimport("pip")["main"](["install","--user","ezodf"]), which works fine on a Mac but gives the error you report on Windows.

It may be that it is spawning a python executable, and is running into trouble because the wrong python executable is in the PATH? (I tried this on a Windows machine that has a separate Python install in addition to Conda.)

I can reproduce the error only after uninstalling all the conda, pip, Python and Julia stuff from the %appdata% roaming and local folders (so, a really “clean” machine).

But I did find a solution that works on all OSs:
run(`$(PyCall.python) -m pip install --user --upgrade pip setuptools`)
run(`$(PyCall.python) -m pip install --user ezodf lxml`)

A silly question, however… I don’t know how to generalise the above code by putting variables inside the commands to run…
The following code fails with an error:

const PACKAGES = ["ezodf", "lxml"]

# Use proxy info, if any
proxy_arg=""
if haskey(ENV, "http_proxy")
    proxy_arg *= " --proxy "
    proxy_arg *= ENV["http_proxy"]
end

@pyimport pip

run(`$(PyCall.python) $(proxy_arg) -m pip install --user --upgrade pip setuptools`)
packages_arg = join(PACKAGES, " ")
run(`$(PyCall.python) $(proxy_arg) -m pip install --user $(packages_arg)`)

In particular the last row gives:

run(`$(PyCall.python) $(proxy_arg) -m pip install --user $(packages_arg)`)
/usr/bin/python: can't find '__main__' module in ''
ERROR: failed process: Process(`python '' -m pip install --user 'ezodf lxml'`, ProcessExited(1)) [1]
 in pipeline_error(::Base.Process) at ./process.jl:616
 in run(::Cmd) at ./process.jl:592
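The error above follows from how Julia interpolates values into backtick commands: a String always becomes exactly one argument (so the empty proxy_arg turns into '' and "ezodf lxml" becomes a single package name), while a Vector is spliced in as one argument per element. The sketch below only builds the commands and prints their argument lists, without running anything; the literal "python" stands in for PyCall.python so that it runs without PyCall.

```julia
python = "python"  # stand-in for PyCall.python

# A String always becomes exactly one argument, even when it is empty
# or contains spaces.
proxy_str = ""
pkgs_str = "ezodf lxml"
bad = `$python $proxy_str -m pip install --user $pkgs_str`
println(bad.exec)

# A Vector contributes one argument per element, and none when it is empty.
proxy_vec = String[]
pkgs_vec = ["ezodf", "lxml"]
good = `$python $proxy_vec -m pip install --user $pkgs_vec`
println(good.exec)
```

Note that --proxy is an option of pip rather than of the Python interpreter, so when a proxy is set it would presumably need to come after "-m pip", not before it.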

Done it! :slight_smile: :slight_smile:

In the end… an OS-independent way to install the required Python packages using pip, regardless of whether they are available in conda:

using PyCall

println("Running build.jl for the OdsIO package.")

# Change that to whatever packages you need.
const PACKAGES = ["ezodf", "lxml"]

# Use proxy info, if any
proxy_arg=String[]
if haskey(ENV, "http_proxy")
    push!(proxy_arg, "--proxy")
    push!(proxy_arg, ENV["http_proxy"])
end

# Import pip
try
    @pyimport pip
catch
    # If it is not found, install it
    println("Pip not found on your system. Downloading it.")
    get_pip = joinpath(dirname(@__FILE__), "get-pip.py")
    download("https://bootstrap.pypa.io/get-pip.py", get_pip)
    # --proxy is a pip option, so it goes after the script name, not after the interpreter
    run(`$(PyCall.python) $get_pip $(proxy_arg) --user`)
end

println("Installing required python packages using pip")
# --proxy must follow "-m pip"; the Python interpreter itself does not accept it
run(`$(PyCall.python) -m pip install $(proxy_arg) --user --upgrade pip setuptools`)
run(`$(PyCall.python) -m pip install $(proxy_arg) --user $(PACKAGES)`)

If your specific goal is to make a package that is available through pip easy to install from Julia, I recommend creating a package on anaconda.org. That gave me the fewest headaches. You can turn any package available via pip into such a package by following the instructions here:

https://conda.io/docs/build_tutorials/pkgs.html#build-a-simple-package-with-conda-skeleton-pypi

(FYI: PyPI is the repository that pip uses to install packages. )

It might be a bit of a pain the first time through to figure this out, but once you do figure it out, it’s easy to do for just about any python package you want to make use of.

Once you’ve created an anaconda package, you can follow @stevengj’s advice for using pyimport_conda.


One that works for me is:

import PyCall: pyimport

# See https://stackoverflow.com/questions/12332975/installing-python-module-within-code.
const PIP_PACKAGES = ["package1", "package2"]

sys = pyimport("sys")
subprocess = pyimport("subprocess")
subprocess.check_call([sys.executable, "-m", "pip", "install", "--user", "--upgrade", "--force-reinstall", PIP_PACKAGES...])