Package for remote computation

Consider the following scenario: the user has access to a

  1. “workstation”, ideal for interactive debugging and development, and exploration of results (such as generating graphs for a paper)
  2. “server”, much more powerful but not ideal for interactive work, can install Julia, has full shell access, git and rsync

My usual workflow is to develop the Julia code in a package, e.g. working on a toy version with a much lower dimension that finishes in minutes instead of days, then sync data to the server, pull the package from a repo, run the computation, and rsync the data back. This is implemented by a bunch of scripts (Julia and Bash), and is somewhat fragile.
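To make the cycle concrete, here is a minimal Bash sketch of the steps above (push data, pull the package, run, fetch results). It is not the actual set of scripts: the host `user@server`, the directory `MyProject`, the `data/` and `results/` folders, and `scripts/run.jl` are all hypothetical placeholders. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the sync / pull / run / sync-back cycle.
# Every name below (host, directories, script) is a hypothetical placeholder.

REMOTE="${REMOTE:-user@server}"   # hypothetical SSH destination
PROJECT="${PROJECT:-MyProject}"   # hypothetical package checkout in the remote home directory
DRY_RUN="${DRY_RUN:-1}"           # 1 = only print the commands, 0 = also execute them

# Print a command; execute it only when DRY_RUN=0.
run() {
    printf '%s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# The three steps are chained with && so that a failure stops the cycle.
remote_cycle() {
    # 1. push the input data to the server
    run rsync -az data/ "$REMOTE:$PROJECT/data/" &&
    # 2. update the package on the server and run the computation
    run ssh "$REMOTE" "cd $PROJECT && git pull --ff-only && julia --project=. scripts/run.jl" &&
    # 3. fetch the results back to the workstation
    run rsync -az "$REMOTE:$PROJECT/results/" results/
}

remote_cycle
```

Running it with `DRY_RUN=0` executes the commands instead of just printing them. Keeping the steps in one function makes it easier to see where the fragility comes from: each step assumes the previous one left the server in the expected state.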

I am wondering if there is a more robust approach, e.g. whether someone has already written a package for this.


I find VS Code Remote (SSH) very helpful for this use case.


Thanks, I am looking for something independent of IDEs.

Developing and running code on a remote server is super common and works quite well nowadays. I use Jupyter (for plots), emacs ssh-editing + REPL in ssh, or tmux.

“much more powerful but not ideal for interactive work, can install Julia, has full shell access, git and rsync”

Why is that not ideal for interactive work? Why do you need IDE-independence?

I’ve also seen people use something like TeamViewer to remotely connect. That would be IDE independent, though it’s not the most pleasant thing IMO.

Like @liuyxpp, I also recommend the “Remote - SSH” extension of VS Code. I started using it recently, and I am satisfied with it.

In my case, though, I cannot use the REPL built into VS Code when using Remote - SSH, because it connects to a login node of the Linux cluster. Starting the built-in REPL starts it on that login node, which is not meant for heavy computation. For computation, I request access to a compute node. This is typically done by obtaining an interactive shell on a compute node from the shell of the login node. Then I start the REPL from that interactive shell.

This is fine for ordinary computation, but visualization is a challenge, because I cannot draw plots the way I do in the REPL built into VS Code. I tried X11 forwarding, but it failed because compute nodes usually don’t have a GPU for visualization. (I use GLMakie. Maybe other plotting backends work.)

My current solution is to use Remote - SSH to access files and develop code in the VS Code editor, but to run the REPL in a terminal with SIXEL support. Then I can plot directly in the REPL using SixelTerm.jl; see more details here. There is an effort to support SIXEL in VS Code’s built-in terminal, which would eliminate the need to run a separate terminal program.


VSCode + Remote SSH is convenient for working with files, developing and testing packages on the remote.

Pluto that’s permanently running on the server is great for interactive work, including plots.


Then you serve the page to be read remotely? How does one do that? The server needs an open http port, doesn’t it?

I can use the REPL built into VS Code, and it is very convenient. Amazingly, I can type “code file” in that REPL to open the file in the current VS Code window! Also, I can draw plots without any problem. VS Code is so powerful that I cannot live without it.

I don’t really understand the question. You just run a Pluto instance on the remote server permanently. There are different options on how to connect to it from the browser, depending on your preferences: eg, use an SSH tunnel, or run a real webserver on the remote.
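For the SSH tunnel option, a rough sketch of the two commands involved might look like this. It assumes Pluto's default port 1234 and a hypothetical host `user@server`; the helper function names are made up for illustration. The tunnel only needs the SSH access you already have, so no HTTP port has to be opened on the server, though a cluster's policy may still restrict long-running processes on login nodes.

```shell
# Sketch: reach a Pluto instance on the server through an SSH tunnel.
# Hypothetical names: user@server and the helper function names.
# Port 1234 is Pluto's default port.

REMOTE="${REMOTE:-user@server}"
PORT="${PORT:-1234}"

# Command to start Pluto on the server without opening a browser there.
# Run it inside tmux or similar so that it survives logging out.
pluto_server_cmd() {
    printf '%s\n' "julia -e 'using Pluto; Pluto.run(port=$PORT, launch_browser=false)'"
}

# Command to run on the workstation: forward local port $PORT to the same
# port on the server.
# -N: do not run a remote command; -L: local port forwarding.
pluto_tunnel_cmd() {
    printf '%s\n' "ssh -N -L $PORT:localhost:$PORT $REMOTE"
}

pluto_server_cmd
pluto_tunnel_cmd
```

With the tunnel up, the notebook should be reachable in the local browser at `http://localhost:1234`, using the URL (including the secret) that Pluto prints when it starts.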

That’s what I’m asking :-). Many times one does not have permission to run an HTTP server on a cluster.

I haven’t used it much recently because my laptop has been sufficient for most things, but when I tried it before, this package worked well for loading code remotely, running commands, and moving data back and forth:

Maybe not a Julian answer. You can have remote desktops on a cluster, using several methods, both commercial and open source.

PCOIP, as used in VMware Horizon https://www.amulethotkey.com/

Altair Access (HPC job submission portal for researchers and engineers)

NICE DCV (now part of AWS)

Guacamole https://guacamole.apache.org/ (I am not familiar with this)

On another axis of this discussion, you don’t need to keep moving data around.
You could try using an SSH filesystem.
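As a rough illustration with sshfs, mounting a remote directory so that results can be read in place might look like the sketch below. The host, the remote directory, the mount point, and the helper function names are hypothetical; the functions only print the commands.

```shell
# Sketch: mount a remote results directory locally with sshfs.
# All names are hypothetical placeholders.

REMOTE="${REMOTE:-user@server}"
REMOTE_DIR="${REMOTE_DIR:-MyProject/results}"     # path relative to the remote home directory
MOUNTPOINT="${MOUNTPOINT:-$HOME/server-results}"  # an existing, empty local directory

# Command to mount the remote directory at the local mount point.
sshfs_mount_cmd() {
    printf '%s\n' "sshfs $REMOTE:$REMOTE_DIR $MOUNTPOINT"
}

# Command to unmount again (Linux; on macOS use umount instead).
sshfs_unmount_cmd() {
    printf '%s\n' "fusermount -u $MOUNTPOINT"
}

sshfs_mount_cmd
sshfs_unmount_cmd
```

Once mounted, files under the mount point can be opened by local tools as if they were on the workstation, at the cost of every read going over the network.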

Are you aware that sshfs is currently unmaintained?

Quoting from https://github.com/libfuse/sshfs:

This Project is Orphaned

This project is no longer maintained or developed. Github issue tracking and pull requests have therefore been disabled. The mailing list (see below) is still available for use.

If you would like to take over this project, you are welcome to do so. Please fork it and develop the fork for a while. Once there has been 6 months of reasonable activity, please contact Nikolaus@rath.org and I’ll be happy to give you ownership of this repository or replace with a pointer to the fork.

I’m not sure what the practical implications of this are. But that banner is certainly a bit “scary”.

No, I was not aware of that. I used it successfully in the past when a certain company I was working for used a remote HPC facility.

To make my original reply to @Tamas_Papp clearer: HPC clusters very often have login or visualisation systems which are workstation-class systems, very often with GPUs.
You can have virtual desktops on these systems. I was listing several methods I have had experience with (except Guacamole).