Julia-Python interface fails in Julia library, likely due to wrong SSL certificates

Hi all,

I am running into a hard-to-diagnose bug related to SSL certificates in a Julia package that calls Python through PyCall and/or PythonCall (we see the problem with both, so the source of the error is probably the same). In the package Sleipnir.jl, the Python setup consists of dependencies managed with conda plus some additional packages installed afterwards with pip (unfortunately, we need these extra pip installs). The whole library is tested in continuous integration with the complete setup, which should give an idea of how the different installs are handled. We currently test on both macOS and Ubuntu. Here is the CI workflow (the specific details of the Julia library do not matter; just know that some preprocessing is done in Python and some analysis in Julia):

name: Run Tests
on:
   pull_request:
    branches:
      - main
   push:
    branches: []
    tags: '*'
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ startsWith(github.ref, 'refs/pull/') }}
jobs:
  test:
    name: Julia ${{ matrix.version }} - ${{ matrix.os }} - ${{ matrix.arch }} - ${{ github.event_name }}
    runs-on: ${{ matrix.os }}
    defaults:
       run:
         shell: bash -el {0}
    strategy:
      fail-fast: false
      matrix:
        version:
          - '1'
        python:
          - 3.12
        os:
          - ubuntu-latest 
          - macos-latest
        arch:
          - x64
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python 🐍 ${{ matrix.python }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python }}  
      - name: Create environment with micromamba 🐍
        uses: mamba-org/setup-micromamba@v1
        with: 
          micromamba-version: '2.0.2-2'
          environment-file: ./environment.yml
          environment-name: oggm_env                # it is recommended to provide both the name and the yml file
          init-shell: bash
          cache-environment: false
          cache-downloads: false
      - name: Update certifi
        run: | 
            pip install --upgrade certifi
        shell: bash -el {0}
      - name: Set ENV Variables for PyCall.jl 🐍 📞
        run: | 
          echo "PYTHON=/home/runner/micromamba/envs/oggm_env/bin/python" >> "$GITHUB_ENV"
        shell: bash -el {0}
      - uses: julia-actions/setup-julia@v1
        with:
          version: ${{ matrix.version }}
          arch: ${{ matrix.arch }}
      - name: Check Julia SSL certifications 🔎
        run: |
          julia -e 'using NetworkOptions; println(NetworkOptions.bundled_ca_roots()); println(NetworkOptions.ca_roots_path()); println(NetworkOptions.ssh_key_path()); println(NetworkOptions.ssh_key_name()); println(NetworkOptions.ssh_pub_key_path())'
          echo "SSH_PATH=$(julia -e 'using NetworkOptions; println(NetworkOptions.bundled_ca_roots())')" >> "$GITHUB_ENV"
        shell: bash -el {0}
      - name: Install dependencies on Ubuntu
        if: matrix.os == 'ubuntu-latest'
        run: |
          sudo apt-get update
          sudo apt-get install -y libxml2 libxml2-dev libspatialite7 libspatialite-dev
          dpkg -L libxml2
          dpkg -L libspatialite7
          echo "LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH" >> "$GITHUB_ENV"
          # export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
      - name: Install dependencies on macOS
        if: matrix.os == 'macos-latest'
        run: |
          brew install libxml2 libspatialite
          echo "========= Checking on Installs =========="
          brew info libxml2
          echo "========= Checking on Installs =========="
          brew info libspatialite
          echo "PKG_CONFIG_PATH=/opt/homebrew/opt/libxml2/lib/pkgconfig" >> "$GITHUB_ENV"
      - name: Check that new paths have been exported
        if: matrix.os == 'macos-latest'
        run: |
          echo $PYTHON
          echo $PKG_CONFIG_PATH
      - uses: julia-actions/cache@v1
        with:
          cache-registries: "true"
          cache-compiled: "true"
      - name: Build Julia packages in Ubuntu
        uses: julia-actions/julia-buildpkg@v1
        if: matrix.os == 'ubuntu-latest'
        env:
          PYTHON : /home/runner/micromamba/envs/oggm_env/bin/python
          # The SSL certificate path can be read from the output of the step "Check Julia SSL certifications"
          SSL_CERT_FILE: /etc/ssl/certs/ca-certificates.crt
      - name: Build Julia packages in MacOS
        uses: julia-actions/julia-buildpkg@v1
        if: matrix.os == 'macos-latest'
        env:
          PYTHON : /home/runner/micromamba/envs/oggm_env/bin/python
      - name: Run tests in Ubuntu
        uses: julia-actions/julia-runtest@v1
        if: matrix.os == 'ubuntu-latest'
        env:
          PYTHON : /home/runner/micromamba/envs/oggm_env/bin/python
      - name: Run tests in MacOS
        uses: julia-actions/julia-runtest@v1
        if: matrix.os == 'macos-latest'
        env:
          PYTHON : /home/runner/micromamba/envs/oggm_env/bin/python
      - uses: julia-actions/julia-processcoverage@v1
      - uses: codecov/codecov-action@v3
        with:
          token: ${{secrets.CODECOV_TOKEN}}
          files: lcov.info

This workflow worked until a few months ago, when something started failing, even though most of the Python and Julia library versions are pinned. I suspect the problem comes from how SSL certificates are handled by Julia and Python, but I cannot find a solution. I have been changing the paths in the different installations and trying different ways of importing the Python libraries, but nothing seems to work in CI. What I find odd is that I can make this work on my local machine on macOS (I am not sure about Ubuntu), so I guess part of the problem comes from something specific to the CI environment.
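
To compare what each side actually sees, a small diagnostic along these lines could be run with the same interpreter that `PYTHON` points to, both locally and in CI. This is only a sketch: the function name `describe_ca_config` is made up for illustration, `certifi` is treated as optional, and the list of environment variables is my assumption about which ones OpenSSL, requests and curl commonly honour.

```python
import os
import ssl
import sys


def describe_ca_config():
    """Collect the CA-related settings visible to this Python interpreter."""
    paths = ssl.get_default_verify_paths()
    info = {
        "executable": sys.executable,
        "openssl_version": ssl.OPENSSL_VERSION,
        "default_cafile": paths.cafile,
        "default_capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
    }
    # Environment variables that can redirect certificate lookup (assumed list).
    for name in ("SSL_CERT_FILE", "SSL_CERT_DIR", "REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE"):
        info[name] = os.environ.get(name)
    # certifi is a third-party package, so its absence is not treated as an error.
    try:
        import certifi
        info["certifi_bundle"] = certifi.where()
    except ImportError:
        info["certifi_bundle"] = None
    return info


if __name__ == "__main__":
    for key, value in describe_ca_config().items():
        print(f"{key}: {value}")
```

Printing this next to the output of the `NetworkOptions` calls in the workflow would at least show whether Julia and Python are looking at the same certificate bundle.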

We had problems with SSL credential management in the past, which is why the workflow includes some SSL-related steps. However, these no longer seem to help. This issue is also a continuation of the related post that @JordiBolibar wrote, which we still haven’t fully solved.
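
One check that might help narrow this down is whether a given bundle path, such as the value passed as `SSL_CERT_FILE` in the workflow, can be loaded by Python at all. The sketch below uses only the standard `ssl` module; the helper name `count_ca_certs` is hypothetical, and returning `None` on failure is just a convention chosen for this example.

```python
import ssl


def count_ca_certs(bundle_path):
    """Return the number of CA certificates loaded from bundle_path, or None if loading fails."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        context.load_verify_locations(cafile=bundle_path)
    except OSError:
        # Covers a missing file as well as ssl.SSLError (a subclass of OSError)
        # raised when the file is not a usable PEM bundle.
        return None
    return context.cert_store_stats()["x509_ca"]


if __name__ == "__main__":
    # Path taken from the Ubuntu build step of the workflow.
    path = "/etc/ssl/certs/ca-certificates.crt"
    print(path, "->", count_ca_certs(path))
```

A `None` result on one runner and a positive count on another would suggest that the path itself differs between operating systems rather than the certificates being rejected.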

Any idea where the problem could be coming from and how to fix it? Here is the CI workflow, the Project.toml, the environment.yml (for PyCall), and how the setup of the Python libraries is carried out inside Sleipnir.jl.

I am happy to provide more details! To be honest, the bug is hard to even report, since I don’t understand exactly what is failing, but I hope someone in the community is more familiar with these topics and can give us a hand!
