Best practices to share simulations with collaborators?

I’m trying to ship my Julia simulation to a collaborator (who is not a programmer) and since a few attempts failed, I’m asking here for advice to do it better next time.

Setting:

I have basically one package and one script that my collaborator should run.
He should also be able to update the simulation code whenever I create a new version.

Question:

  • Is there some tool which creates, for example, a Windows installer (something like PyInstaller) and installs all needed dependencies and a shortcut to start a Julia script?
  • Is there a guide with best practices about how to share simulations with collaborators and which special cases to consider?

Attempts:

  1. Use Julia portable, set everything up and zip everything.
    What went wrong:
    • Windows’ zip doesn’t support long filenames, which are created when the packages are installed
  2. Use Julia portable, make a few .bat scripts
    What went wrong:
    • The path contained spaces and I didn’t escape everything correctly. (Ok, that was clearly my fault.)
  3. Install Julia + Atom and do the setup steps via Skype. (Following the Juno instructions)
    What went wrong:
    • The university network had atom.io blocked
  4. Use JuliaPro:
    What went wrong:
    • The registration for my collaborator didn’t work, since he never received the email

After all these attempts, I honestly feel a bit stupid and I started questioning my choice of Julia in the last week quite a bit.
Ok, 1. and 2. were actually almost successful: 1. should work when using 7z, and 2. with corrected batch files.
But 1-4 have in common that they all failed for unexpected reasons.
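
For what it's worth, the quoting problem from attempt 2 can be sidestepped if the launcher itself is written in Julia instead of batch. A minimal sketch, assuming a made-up install folder with spaces and a hypothetical script name run_simulation.jl; values interpolated into a backtick command are passed as single arguments, so no manual escaping is needed:

```julia
# Hypothetical install location containing spaces.
project_dir = raw"C:\Users\Some User\My Simulation"
script      = joinpath(project_dir, "run_simulation.jl")  # hypothetical script name

# Path to the Julia binary that is currently running.
julia_exe = joinpath(Sys.BINDIR, Base.julia_exename())

# Each interpolated value becomes exactly one argument, spaces included.
cmd = `$julia_exe --project=$project_dir $script`

# run(cmd)  # uncomment to actually launch the simulation
```

A .bat file would still be needed to start Julia in the first place, but it could then be a single line that only has to quote one path.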

I have not used Docker before, but it sounds potentially useful.

Aside from that, I would recommend creating a package with a Project.toml file and sharing it through GitHub or any method that works for your collaborator. Put something like this at the top of your main script:

cd(@__DIR__)
using Pkg
Pkg.activate("relative_path_to_package_folder")
# script below

Pkg.activate selects the environment of your package; the dependencies themselves are installed by Pkg.instantiate(), which you need to call at least the first time. So please check that with a fresh install. I usually point people to VSCode because it is easy to set up and has a GUI for GitHub. This has worked well when sharing code with collaborators who have limited programming experience.
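
A slightly more defensive variant of that header, as a sketch only: setup_environment is a made-up helper name, and it assumes the package folder contains a Project.toml. It fails with a readable message if the folder is wrong, which tends to be easier to debug remotely than a Pkg stack trace:

```julia
using Pkg

"""
Activate the project in `project_dir` and install the dependency versions
recorded in its Manifest.toml (or resolve them from Project.toml if no
manifest exists yet).
"""
function setup_environment(project_dir::AbstractString)
    # Stop early with a clear message if the folder is not a Julia project.
    isfile(joinpath(project_dir, "Project.toml")) ||
        error("No Project.toml found in $project_dir")
    Pkg.activate(project_dir)
    Pkg.instantiate()  # downloads anything that is still missing
    return nothing
end

# Hypothetical usage at the top of the main script:
# setup_environment(joinpath(@__DIR__, "relative_path_to_package_folder"))
```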

You might also check out PackageCompiler.jl. According to the documentation, you can create standalone executables.

DrWatson.jl is designed with things like this in mind, I’d suggest you take a look.

I’m actually already using DrWatson.jl. But it requires a Julia REPL to start with. My problems were all related to getting a Julia REPL running in a comfortable environment.

PackageCompiler would be an option. I use it to precompile Plots (it was actually one of the batch scripts in approach 2). Building a standalone app currently has the disadvantage that the binaries are very large, i.e. for each update of the simulation I would need to send a 300MB file.

I have installed Julia quite often on all kinds of systems (Linux, Mac, Windows, Windows portable). If it works, it works really well out of the box. But there are special cases to consider, which are mostly not Julia’s fault but just external [e.g. missing PowerShell; blocked by firewall].

My question is more about how to minimize possible special cases and find the most bullet-proof approach. [For example, I considered Julia+Atom or JuliaPro as bullet-proof… But they weren’t. Maybe just plain Julia is the most bullet-proof?]

I would go for that if your collaborator can deal with the REPL for simple commands. Understanding the workflow and configuring an IDE is more complicated than copying and pasting commands, not to mention that some IDEs are very resource-hungry. Plain Julia plus hosting the packages on GitHub makes things fairly easy (except for dealing with the paths to script and data files, because Windows users are often not aware of the directory structure).
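
One way to take the directory structure out of the picture is to build every path from the script's own location. A small sketch, assuming a hypothetical data folder next to the script; data_path and load_parameters are illustrative names, not part of any package:

```julia
"""
Build an absolute path to a file in the (hypothetical) `data` folder that
sits next to the script, regardless of where Julia was started.
"""
data_path(name::AbstractString; base::AbstractString = @__DIR__) =
    joinpath(base, "data", name)

"""
Read an input file as text, with an error message that names the full path
so a non-programmer can report exactly what is missing.
"""
function load_parameters(name::AbstractString; base::AbstractString = @__DIR__)
    path = data_path(name; base = base)
    isfile(path) || error("Expected input file not found: $path")
    return read(path, String)
end
```

With this, the collaborator can start Julia from anywhere and the script still finds its inputs.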

Maybe a good alternative is to share a notebook as well. I don’t have much experience with that, though.

Would this help? A portable Julia installation, which pops up VS Code when everything is done right in the script that you’d like the person to run.

Would it be possible to make a lightweight interface with PyJulia and create an executable with PyInstaller? This might work if PyInstaller includes non-Python dependencies.

Try the JuliaWin Portable Distribution.

Using 7zip or whatever? You can also make self-extracting archives with it.

That looks very promising. That’s exactly what I was looking for. I will try it!

Yes, 7z should work. Good idea with the self-extraction, I didn’t have this in mind. That would avoid the problem that someone unzips it with the wrong program.
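
Before packaging, it may also help to check which files are likely to cause trouble when unpacked. A rough sketch, assuming the failure in attempt 1 comes from the classic 260-character Windows path limit; long_paths is a made-up helper and the default target_prefix is only a placeholder for wherever the collaborator unpacks the archive:

```julia
"""
Return all files under `root` whose full path would be at least `limit`
characters long once unpacked into `target_prefix` (a hypothetical
destination folder on the collaborator's machine).
"""
function long_paths(root::AbstractString;
                    target_prefix::AbstractString = "C:\\Users\\collaborator\\Desktop",
                    limit::Integer = 260)  # assumed classic Windows path limit
    offenders = String[]
    for (dir, _, files) in walkdir(root)
        for file in files
            rel = relpath(joinpath(dir, file), root)
            # +1 accounts for the separator between prefix and relative path
            if length(target_prefix) + 1 + length(rel) >= limit
                push!(offenders, rel)
            end
        end
    end
    return offenders
end
```

Running this on the portable Julia folder gives a list of files that need a shorter unpack location.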

If any of these is true:

  • you have a spare server that your colleague can reach
  • your simulations are light enough computationally that you can use a free service
  • you have money to pay a cloud computational environment

Then the simplest thing may just be to set up your simulations as a notebook that your colleague can use, à la Jupyter…

Do you know a good tutorial or manual for doing that, given that one has a server on which to install Julia? I suppose we are assuming that Julia will be running on the server side, and that the user won’t need to install anything (Plots, etc).

Yep. I installed JupyterHub for my lab. Now I am on my mobile, trying to get my kid to fall asleep… 🙂
Tomorrow morning I’ll forward you my setup notes for installing JupyterHub with Julia on a server…

Good luck! I beat you by 10 minutes!

I am attaching my complete but messy notes for the installation of our lab Ubuntu 20.04 server and the Apache config file. Note there is much more there than what you need, as it is a computational server with JupyterHub, RStudio Server and a file server with NextCloud. But I am attaching it as there are some important parts for dealing with security (e.g. setting up a basic firewall) and installing the middleware in general (sections 12 and 13).
What you need for JupyterHub and Julia kernel specifically is this:

# ---------------------------------------------------
# 16 - Install Jupyter and Jupyter Hub

sudo apt-get install npm nodejs-legacy
sudo apt-get install python3-pip
pip3 install --upgrade pip
pip3 install jupyter
npm install -g configurable-http-proxy
pip3 install jupyterhub
pip3 install --upgrade notebook
cd /etc
jupyterhub --generate-config
nano jupyterhub_config.py

c.JupyterHub.base_url = '/jupyter/'
c.JupyterHub.logo_file = '/var/www/example/imgs/jupyter-lef-logo.png'

# Enable jupyterhub as a system service:
nano /lib/systemd/system/jupyterhub.service
nano /etc/systemd/system/jupyterhub.service
(edit both)
>>>>
[Unit]
Description=Jupyterhub

[Service]
User=root
Environment="PATH=/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
ExecStart=/usr/local/bin/jupyterhub -f /etc/jupyterhub_config.py

[Install]
WantedBy=multi-user.target
<<<<
sudo systemctl daemon-reload

# enable at startup:
systemctl enable jupyterhub.service

To start jupyterhub:
- manually (for test): jupyterhub
- as a service: sudo systemctl start jupyterhub (also: stop|status)
If you stop it, then run also `sudo pkill node` or jupyterhub will not restart as it will find the proxy port busy.

# Autostop idle kernels AND servers..
Download cull_idle_servers.py from https://github.com/jupyterhub/jupyterhub/blob/master/examples/cull-idle/cull_idle_servers.py
and put it in /etc/jupyter

python -m pip install tornado

nano /etc/jupyterhub_config.py
>>>>>>>>>
c.JupyterHub.services = [
    {
        'name': 'cull-idle',
        'admin': True,
        'command': 'python /etc/jupyter/cull_idle_servers.py --timeout=604800'.split(),
    }
]
<<<<<<<<<<
(1 week)

cd /etc/jupyter
cp /root/.jupyter/jupyter_notebook_config.py .
nano /etc/jupyter/jupyter_notebook_config.py
>>>>>>>>>>
c.MappingKernelManager.cull_idle_timeout = 86400
<<<<<<<<<<
(1 day. Only idle kernels are culled; a kernel that is still busy is not killed.)

Note that to stop and restart jupyterhub you also need to pkill node after stopping jupyterhub and before restarting it, otherwise jupyterhub would find the port busy and not resume.

For some reason the jupyter executable is not installed, only one of the various sub-commands; however, some programs expect to deal with just "jupyter".
So let's create it:
nano /usr/bin/jupyter
>>>>
#!/usr/bin/python3
# EASY-INSTALL-ENTRY-SCRIPT: 'jupyter-core==4.6.3','console_scripts','jupyter'
__requires__ = 'jupyter-core==4.6.3'
import re
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(
        load_entry_point('jupyter-core==4.6.3', 'console_scripts', 'jupyter')()
    )

<<<<<<

Check available kernels with: jupyter kernelspec list
Reach the notebooks at https://example.org/jupyter 

# ---------------------------------------------------
# 17 - Install Julia

mkdir /usr/lib/julia
cd /usr/lib/julia
wget https://julialang-s3.julialang.org/bin/linux/x64/1.4/julia-1.4.2-linux-x86_64.tar.gz
tar -xzf julia-1.4.2-linux-x86_64.tar.gz
cd /usr/bin
ln -s /usr/lib/julia/julia-1.4.2/bin/julia julia1.4
ln -s julia1.4 julia

Now, I found that the best way is to install Julia systemwide but have each user manage his own packages, including the Jupyter kernels.
What I do is add a line to adduser.local so that when I add a user with adduser mynewuser, a script is run that installs a minimal base of Julia packages (including the Jupyter kernel) for that user:

nano /usr/local/sbin/adduser.local
>>>
#!/bin/bash

user=$1
su "$user" -c "julia /usr/bin/initJuliaRepository.jl"
<<<<
chmod +x /usr/local/sbin/adduser.local

The content of "/usr/bin/initJuliaRepository.jl" is:
>>>
import Pkg

Pkg.update()
Pkg.add("IJulia")
Pkg.add("DataFrames")
Pkg.add("Plots")
Pkg.build("IJulia")
<<<
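
Once that script has run for a user, a quick way to confirm from Julia that a kernel was registered is to look in the per-user Jupyter data folder. A rough sketch, assuming the default Linux location ~/.local/share/jupyter/kernels (the location may differ if JUPYTER_DATA_DIR or XDG_DATA_HOME is set); julia_kernels is a made-up helper name:

```julia
"""
List the Julia kernel specs found in `kernel_dir`. The default is the usual
per-user Jupyter location on Linux, which is an assumption here.
"""
function julia_kernels(kernel_dir::AbstractString =
        joinpath(homedir(), ".local", "share", "jupyter", "kernels"))
    isdir(kernel_dir) || return String[]
    # A kernel spec is a folder that contains a kernel.json file.
    return filter(readdir(kernel_dir)) do name
        startswith(name, "julia") && isfile(joinpath(kernel_dir, name, "kernel.json"))
    end
end
```

An empty result for a freshly added user would suggest that the adduser.local hook did not run or that the IJulia build failed.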

The relevant parts of the Apache proxy server configuration are:

<IfModule mod_ssl.c>
	<VirtualHost _default_:443>
	DocumentRoot /var/www/example
        <Proxy *>
           Allow from localhost
        </Proxy>

        RewriteEngine on

        #JupyterHub...
        RewriteCond %{HTTP:Connection} Upgrade [NC]
        RewriteCond %{HTTP:Upgrade} websocket [NC]
        RewriteRule /jupyter/(.*) ws://127.0.0.1:8000/jupyter/$1 [P,L]
        RewriteRule /jupyter/(.*) http://127.0.0.1:8000/jupyter/$1 [P,L]
        #proxy to JupyterHub
        ProxyPass /jupyter/ http://127.0.0.1:8000/jupyter/
        ProxyPassReverse /jupyter/  http://127.0.0.1:8000/jupyter/
	    <Location "/jupyter">
            # Only here as `ProxyPreserveHost On` interferes with Rstudio server login
	        ProxyPreserveHost On
	    </Location>
        [....]

@SteffenPL ah cool, I found how our discussion originated
https://github.com/heetbeet/juliawin/issues/44

hope Juliawin works out for you