Local & remote workflow

question

#1

Suppose I have two machines, LAPTOP and SERVER. LAPTOP is resource-limited but convenient; SERVER is powerful but not super-convenient. Both run Linux and have the standard tools (I can ask for others on SERVER, which is not under my control).

My strategy for working on large problems is to develop a “toy” version on my laptop (e.g. using small simulated data or a subsample of the data), which I can benchmark and test until it is ready for a full run. Then I would like to continue working on the server. I may have to go back and forth a few times.

Some of the code is packaged and gets developed in tandem with the application. Some of the code isn’t packaged, but lives in a git repo. The latter is mostly “scripts”, not modules.

I am curious what workflow people use for this. I can

  1. mount the server directory, and work with that, and rsync back and forth,
  2. sync via private repos on GitHub or GitLab (committing the Manifest.toml),
  3. use Dropbox (but I would prefer to avoid this as I don’t like it much).

FWIW I am using Emacs, so remote editing is a breeze.


#2

As an even simpler alternative to option 2, you can push branches directly from one machine to another:

$ git remote add server ssh://me@server.edu/home/me/.julia/dev/MyPkg   # do this just once
$ git push server me/fix_issue_13

where the last line is pushing a branch of the current repo (MyPkg) to the server for testing.
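
A minimal sketch of the direct push above, using two scratch directories as stand-ins for the laptop and the server so it can be tried without network access. The directory layout, file name and commit identity are assumptions for illustration; with a real server the remote URL would be the ssh:// one shown above. One thing to be aware of: by default git refuses a push into the branch that is currently checked out in a non-bare repository, so this goes most smoothly when the pushed branch is not the one checked out on the receiving side.

```shell
set -e
work=$(mktemp -d)                       # scratch area; both "machines" live here
lap="$work/laptop/MyPkg"
srv="$work/server/MyPkg"                # plays the role of ~/.julia/dev/MyPkg on the server
mkdir -p "$work/laptop" "$work/server"
git init -q "$lap"
git init -q "$srv"

# Laptop side: create the branch and commit a change on it.
git -C "$lap" checkout -q -b me/fix_issue_13
echo 'fix for issue 13' > "$lap/fix.txt"
git -C "$lap" add fix.txt
git -C "$lap" -c user.name=Me -c user.email=me@example.com commit -q -m "Fix issue 13"

# Do this just once, then push the branch straight to the other repo.
git -C "$lap" remote add server "$srv"
git -C "$lap" push -q server me/fix_issue_13

# Server side: check out the pushed branch for testing.
git -C "$srv" checkout -q me/fix_issue_13
```

After the push the branch exists on the receiving side but is not checked out automatically; the final `git checkout` is what makes the files appear in the working tree there.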


#3

My approach to distributing packages between the two locations is to create a git folder in my home directory on the server, and store the packages there. Then I can create a custom registry, and work with those packages using Pkg3.

The drawback to this approach is that there are two copies of each package on the server, but I have sufficient disk space so that isn’t a concern.


#4

Definitely good advice here on using Git.
Another, less elegant approach would be to have a shared folder on laptop and server using sshfs.


#5

This Juno workflow works quite well imho:
http://docs.junolab.org/latest/man/faq.html#Connecting-to-a-Julia-session-on-a-remote-machine-1


#6

Another option which I like is running a Jupyter server on the remote and connecting to it from my laptop.

https://www.blopig.com/blog/2018/03/running-jupyter-notebook-on-a-remote-server-via-ssh/


#7

Related: for a calculation running for an ex ante unknown amount of time (potentially days), is there a way to somehow notify myself when it is done?

I am currently calling mailx to send myself an e-mail, which works, but I wonder what people use.
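
For reference, the mailx approach can be wrapped in a small shell function, sketched below. The function name `notify_done` and the variables `NOTIFY_TO` and `MAILER` are made up for this sketch; it only assumes a mailx that accepts `-s SUBJECT ADDRESS` and reads the message body from standard input.

```shell
# notify_done CMD [ARGS...]: run a command, then mail its exit status.
# NOTIFY_TO (recipient) and MAILER (mail program) are hypothetical knobs
# for this sketch; they default to the current user and to mailx.
notify_done() {
    start=$(date +%s)
    status=0
    "$@" || status=$?                   # keep going even if the command fails
    elapsed=$(( $(date +%s) - start ))
    printf 'Command: %s\nExit status: %s\nElapsed: %ss\nHost: %s\n' \
        "$*" "$status" "$elapsed" "$(uname -n)" |
        "${MAILER:-mailx}" -s "job finished (exit $status)" "${NOTIFY_TO:-$USER}"
    return "$status"                    # pass the original exit status through
}

# Example call (the script path is a placeholder):
#   NOTIFY_TO=me@example.com notify_done julia --project=. scripts/run.jl
```

Because the function returns the command's own exit status, it can be dropped in front of an existing command line without changing how surrounding scripts react to failures.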


#8

Your cluster manager should have a feature for that. I know that in Slurm's sbatch files you can specify options to that effect (--mail-type together with --mail-user).


#9

Thanks, but there is no cluster manager, just a powerful server I can ssh to.


#10

I discussed workflows for HPC over on the Beowulf list. Someone smart pointed out gitfs
https://wiki.archlinux.org/index.php/Gitfs

That sounds very interesting. @Tamas_Papp, could that be a part of your workflow?


#11

Thanks, but gitfs looks too automated (I want more control), and in any case my git workflow is super-convenient with magit.

Things are working out smoothly at the moment. In case someone is interested, this is how I do it:

  1. set up a private repo on GitLab, with ssh keys for smooth access to the repo from both machines.
  2. prepare a pilot run on the laptop on a git branch, push to the repo.
  3. pull the branch on the server. If it works out, merge to master; otherwise
    a. for minor tweaks, edit remotely via tramp,
    b. for larger changes, work on the laptop and push/pull again.

Long runs courtesy of tmux.
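
The three steps above can be rehearsed locally with the sketch below, where a bare repository stands in for the private GitLab repo and two clones stand in for the laptop and the server. The branch name `pilot`, the file `run.jl` and the commit identity are assumptions for illustration only.

```shell
set -e
work=$(mktemp -d)
hub="$work/hub.git"; lap="$work/laptop"; srv="$work/server"

# Small helper so commits work without any global git identity configured.
commit() { git -C "$1" -c user.name=Me -c user.email=me@example.com commit -q -m "$2"; }

# 1. The shared private repo (a bare repo here instead of GitLab).
git init -q --bare "$hub"
git -C "$hub" symbolic-ref HEAD refs/heads/master

# 2. Laptop: put a baseline on master, then prepare the pilot run on a branch.
git clone -q "$hub" "$lap" 2>/dev/null  # silences the "empty repository" warning
git -C "$lap" symbolic-ref HEAD refs/heads/master
echo 'println("baseline")' > "$lap/run.jl"
git -C "$lap" add run.jl
commit "$lap" "Baseline"
git -C "$lap" push -q origin master
git -C "$lap" checkout -q -b pilot
echo 'println("pilot run on toy data")' > "$lap/run.jl"
git -C "$lap" add run.jl
commit "$lap" "Pilot run"
git -C "$lap" push -q origin pilot

# 3. Server: get the branch, try it, and merge to master if it works out.
git clone -q "$hub" "$srv"
git -C "$srv" checkout -q pilot         # creates a local branch tracking origin/pilot
# ... the long run itself would happen here, inside tmux ...
git -C "$srv" checkout -q master
git -C "$srv" merge -q pilot            # fast-forward, since master has not moved
git -C "$srv" push -q origin master
```

On the real server the long run would be started inside a tmux session (for example `tmux new -s pilot`, with a session name of your choosing) and picked up later with `tmux attach -t pilot`, so that it survives a dropped ssh connection.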


#12

Regarding the notifications, I too would use mailx.
Some Googling reveals this (Python!) utility https://ntfy.readthedocs.io/en/latest/
It can use the Pushover service, among other backends: https://pushover.net/

Pushover has a REST API https://pushover.net/api
So it should be easy to send a notification to your phone and get it to play a little tune when your process ends.
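
A rough sketch of calling the Pushover message endpoint directly with curl, without going through ntfy. The function name `pushover_notify` and the variables `PUSHOVER_TOKEN`, `PUSHOVER_USER` and `CURL` are made up for this sketch; the token and user key come from your own Pushover account.

```shell
# pushover_notify MESSAGE: send MESSAGE through the Pushover REST API.
# PUSHOVER_TOKEN (application token) and PUSHOVER_USER (user key) must be
# set by the caller; CURL is a hypothetical override for the curl binary.
pushover_notify() {
    "${CURL:-curl}" --silent --show-error --fail \
        --form-string "token=${PUSHOVER_TOKEN}" \
        --form-string "user=${PUSHOVER_USER}" \
        --form-string "message=$1" \
        https://api.pushover.net/1/messages.json
}

# Example: notify once a long computation ends (script path is a placeholder):
#   julia scripts/run.jl; pushover_notify "run finished with exit status $?"
```

Using `--form-string` rather than `--form` keeps curl from interpreting a message that happens to start with `@` or `<` as a file reference.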


#13

For a workstation without a cluster manager, I find Task Spooler (ts, or tsp on Ubuntu etc.) very useful for handling a bunch of processes (e.g., limiting the number of concurrent processes, running some commands on success of another, automatic stdout storage). Similar tools: systemd-run --user, GNU parallel, … what else?

I remember that using the Slack notification API via curl was not difficult. But that probably is not very different from sending an email.


#14

I use unison, as I am often swapping between machines, sometimes before I want to commit code:
https://www.cis.upenn.edu/~bcpierce/unison/

I just synch a julia folder with all of my packages and scripts in it. Then edit them on whichever machine I’m on, and run

unison julia -auto

when I need to synch. It connects over ssh, works out which copies are newest, or prompts if there is a conflict. Then it does rsync or similar. I’ve had no dramas after three months using it.
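
For `unison julia` to work, unison needs a profile named julia that tells it which two folders to reconcile. The sketch below writes such a profile; the function name, the paths and the host name are placeholders and not the poster's actual setup.

```shell
# write_unison_profile DIR: write a profile called "julia" into DIR.
# All paths and the host name below are placeholders for illustration.
write_unison_profile() {
    mkdir -p "$1"
    cat > "$1/julia.prf" <<'EOF'
# Local copy of the folder
root = /home/me/julia
# Remote copy, reached over ssh (the double slash marks an absolute path)
root = ssh://me@server.edu//home/me/julia
EOF
}

# Typical use (unison looks for profiles in ~/.unison by default):
#   write_unison_profile "$HOME/.unison"
#   unison julia -auto
```

With the two roots stored in the profile, the command line only needs the profile name plus any flags such as `-auto`.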


#15

I used to use unison, but stopped about 5 years ago when I learned git. It is a nice tool, but once you are familiar with version control it pales in comparison. No history, no branches, tricky failure modes (e.g. on divergence).


#16

Sure, I use both! Basically both of my machines are identical and many things are synced (not just julia), and git is for pushing to the rest of the world or for when I actually need revisions.

But it doesn’t always make sense to make a revision and push in three or four repositories when I go to sit outside on the laptop or want to try something on the GPU. Even when all my git commands are keyboard shortcuts in rangerfm…


#17

I think one key feature that is missing in emacs Julia support is integration of tramp and the Julia REPL. I tried to follow the example in python.el but that seems specific to comint. If we can figure out how it works with ansi-term then we may get remote execution/completion almost for free. In other editors there’s nothing like tramp, and sshfs doesn’t play well with git.

Currently I just ssh to the remote machine and execute. All editing is via tramp. In case you don’t already know, you can alias your server in .ssh/config and set up key-based login, so no longish filenames and passwords when editing remote files.
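
A sketch of the alias and key-based login mentioned above. The alias `server`, the host name, the user name and the key file are all placeholders, and `add_ssh_alias` is a made-up helper that simply appends the stanza to a config file.

```shell
# add_ssh_alias FILE: append a host alias stanza to FILE.
# Host, HostName, User and IdentityFile values are placeholders.
add_ssh_alias() {
    cat >> "$1" <<'EOF'
Host server
    HostName server.edu
    User me
    IdentityFile ~/.ssh/id_ed25519
EOF
}

# One-time setup (commented out, since it touches real files and the network):
#   ssh-keygen -t ed25519            # create a key pair if there is none yet
#   ssh-copy-id me@server.edu        # install the public key on the server
#   add_ssh_alias "$HOME/.ssh/config"
# Afterwards "ssh server" logs in without a password prompt, and tramp can
# open remote files as /ssh:server:path/to/file
```

The same alias is also picked up by git, rsync and unison when they connect over ssh, so the short name works everywhere.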