I have a Julia application running under Linux. I would like to make a couple of remote function calls to another Julia process running on a specific Windows machine, because that Windows machine has special data-acquisition hardware for which only a Win32 driver is available.
I have Julia installed on both the Linux and Windows machine, and I have Microsoft’s OpenSSH port set up such that I can perform password-free ssh logins from Linux to Windows. However, the Linux and Windows machines have of course very different file paths.
The Julia documentation on distributed computing focuses at the moment very much on clusters of worker machines that share the same file-path name space, and where the user may not care much on which exact worker machine a job ends up.
Where do I start if, rather than creating a homogeneous compute cluster, I just want to call Julia functions on one particular machine (to control some hardware there) where Julia and my code are installed on very different file paths (because it’s a rather different operating system)?
Is there a lower-level API than the ClusterManager somewhere that I can use to open just one ssh channel from a Linux machine to a Windows machine and call a couple of Julia functions there?
While I haven’t done this before, I would start with addprocs (the second method which takes a vector of machine_spec strings as the first argument). You can specify the path to the julia executable on the remote worker using the exename keyword argument and the working directory with the dir keyword argument.
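As a minimal sketch of that suggestion: the host name, user name and Windows paths below are made up for illustration, and the helpers start_daq_worker and on_daq are hypothetical names, not part of Distributed. It also assumes that the ssh server on the remote machine starts a shell that addprocs can talk to.

```julia
using Distributed

# Hypothetical host name and paths, for illustration only.
const DAQ_MACHINE = "labuser@daq-pc"               # machine_spec: [user@]host[:port]
const DAQ_JULIA   = raw"C:\Julia\bin\julia.exe"    # julia executable on the remote machine
const DAQ_DIR     = raw"C:\Users\labuser\daq"      # working directory on the remote machine

# Start exactly one worker on the named machine and return its process id.
function start_daq_worker(machine = DAQ_MACHINE; exename = DAQ_JULIA, dir = DAQ_DIR)
    pids = addprocs([machine]; exename = exename, dir = dir)
    return first(pids)
end

# Run f(args...) on one specific process and wait for the result.
on_daq(f, pid, args...) = remotecall_fetch(f, pid, args...)
```

With that in place, pid = start_daq_worker() followed by on_daq(gethostname, pid) should return the name of the remote machine, which is a quick way to confirm that the call really ran there and not on some other worker.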
It turned out that the SSHManager in the Distributed package so far only supported ssh servers that invoke a POSIX shell (e.g., bash). I have now added to addprocs() a new keyword argument shell=:wincmd that causes SSHManager to prepare a remote command line for invoking julia --worker via cmd.exe. This is the default shell that sshd invokes in Microsoft’s OpenSSH port for Windows. See pull request #30614.
@mgkuhn I would look at the excellent MobaXterm https://mobaxterm.mobatek.net/
It includes an ssh server and maybe it could help with the path issues also.
I understand this is not working yet (Julia 1.2.0). Any idea how to get Windows-to-Windows communication going? I set up Windows-to-Windows ssh logins with OpenSSH using keys, and that part works, but Julia then fails to add processes on the remote machine.
I guess another way would be to make the OpenSSH server use some other shell that is POSIX-compatible?
My pull request #30614 has not yet been merged into master. I still have to do the slight refactoring requested by vtjnash, i.e. move the cmd.exe escaping algorithm into base/shells.jl (and also provide an equivalent for PowerShell and add some tests). I just got side-tracked by other projects; finishing this patch is still on my todo list. I also plan to add a separate option for whether to pass the cluster_cookie via stdin or via the command line, because whether communication via stdin/stdout works really depends on the exact Windows/OpenSSH/Julia version used: older versions required OpenSSH to do some horrid cmd.exe screen scraping that made stdin/stdout communication impractical, whereas with the ConPTY support added in more recent versions this may have become better now.
Support for the new Distributed.addprocs parameter shell=:wincmd, to specify that the shell that answers the ssh connection is Windows’ cmd.exe, was recently merged into the master branch (i.e., should now be working on Julia nightly 1.6.0-DEV).
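A rough sketch of how a call with the new parameter might look. The host name and the path to julia.exe are invented, and add_windows_worker and supports_shell_keyword are hypothetical helper names; only addprocs and its exename and shell keywords come from Distributed. The version check is only approximate, since early 1.6.0-DEV nightlies built before the merge would pass it without having the keyword.

```julia
using Distributed

# Hypothetical host and install path; adjust to your own setup.
machine   = "labuser@daq-pc"
win_julia = raw"C:\Julia\bin\julia.exe"

# Keyword arguments for a worker whose sshd hands the command line to cmd.exe.
wincmd_options = (exename = win_julia, shell = :wincmd)

# Approximate check that the calling Julia is new enough to know shell=:wincmd.
supports_shell_keyword() = VERSION >= v"1.6.0-DEV"

# Add one worker on `machine` and return its process id.
function add_windows_worker(machine; options...)
    supports_shell_keyword() ||
        error("addprocs(...; shell=:wincmd) needs Julia 1.6 or newer on the calling side")
    return first(addprocs([machine]; options...))
end
```

Usage would then be pid = add_windows_worker(machine; wincmd_options...), after which remotecall_fetch(Sys.iswindows, pid) is a simple way to check that the worker really runs on the Windows side.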
Thanks for posting this. I just spent an entire day trying to figure out why I was getting “‘sh’ is not recognized as an internal or external command.” The only problem is that I’m using the long-term stable version 1.0.5. Would this fix ever make it into 1.0.5 somehow, or would I have to upgrade to 1.6 when it is released?
The shell keyword argument to addprocs() is new functionality, rather than a bug fix, so I doubt anyone will backport it to 1.0.x.
So for now, I’d suggest you install nightlies to try it out. (You can have several Julia versions installed at the same time.) The calling party needs to be Julia 1.6 or newer.
If the worker runs an older version of Windows (probably pre-1809) or Julia (I think pre-1.3 or so, not sure) or OpenSSH for Windows (probably pre-v7.9), you may also have to add the keyword argument cmdline_cookie=true as a workaround if your connection attempt hangs. (This is because older, pre-ConPTY versions of OpenSSH for Windows have to capture the output of the Julia worker by screen-scraping text from an off-screen console window. They therefore won’t see some control characters correctly, such as the trailing newline in the worker’s output that signals that it is time for the caller to send the cookie. Until very recently, Windows really wasn’t meant to be ssh-ed into.)
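To illustrate the workaround, here is a small sketch that builds the keyword arguments with or without cmdline_cookie=true. The host, the path and the helper windows_worker_options (with its legacy flag) are hypothetical; shell and cmdline_cookie are the addprocs keywords discussed above.

```julia
using Distributed

# Hypothetical host and install path.
machine   = "labuser@old-win-pc"
win_julia = raw"C:\Julia\bin\julia.exe"

# Build keyword arguments for addprocs; legacy=true adds the cookie workaround
# for older (pre-ConPTY) Windows, Julia or OpenSSH versions on the worker.
function windows_worker_options(; legacy::Bool = false)
    opts = (exename = win_julia, shell = :wincmd)
    return legacy ? merge(opts, (cmdline_cookie = true,)) : opts
end
```

A call would then look like addprocs([machine]; windows_worker_options(legacy = true)...). It seems sensible to try without the workaround first and only switch it on if the connection attempt hangs, because it puts the cookie on the worker's command line instead of sending it over the ssh stdin channel.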