Ensure Julia is used to its full power

Hi there,

Just as a small background: I’m studying theoretical physics, which also includes a lot of computer simulations, from fluid dynamics to particle physics and cosmology. From my experience after asking people from all sorts of research centers, there are pretty much 2 languages for highly efficient programming (Fortran and C++) and another 2 for plotting and data analysis (Python and a little bit of R), and each time we face an analytical problem for which we need computer assistance we use Mathematica.
So after a brief mention of Julia by my professor (of a parallel computing class I took on the side), and after seeing its advantages, and since I want to spend as little time as possible creating working code, I decided to give Julia a go.

After following some of the guides on Julia Academy I’m running into a few issues, which aren’t new, but the solutions aren’t clear either:

  • Trying something as basic as using Plots takes a long time to pre-compile, and because I use the terminal to run my .jl file, each time I do so it has to pre-compile the whole thing. I’ve looked into packages such as PackageCompiler.jl and SnoopCompile.jl, but I’m unsure which one to use and how. Even if one of them manages to pre-compile all of the downloaded packages and ensure that they get called as fast as possible, when I’m sharing my code other users might not be aware of this and end up thinking my program is slow. Is there no way to ensure that the user has the correct package, already compiled, on their system, with the ease of use of something like Python, where you just have to ensure that the correct package is installed?
  • On a note similar to the previous question, would it be possible to have the same functionality as in Java, where you compile your code with javac (which compiles all required functions) and then run it with java, and if you change anything in the code you only have to recompile the file you changed?
  • Printing also seems to be taking a long time. I wrote a script while I was learning the language, and it took significantly longer to run with Julia than the equivalent did with Python. Am I missing something?

When I have the time I plan on trying to read a few books on the subject, the julialang website provides a few, but I’m afraid that the minor hiccups that are presented here will be enough for me to not be able to use Julia in the future as I would not have any support from my professors/peers.

I’ve seen after searching around that all of these issues have been addressed here or on the GitHub page, but I wasn’t able to understand some of the answers given and, more importantly, whether they were valid for my use case, as I would love to use Julia both for data analysis on my PC and, in the future, on a computer cluster!

Thanks in advance!

Julia’s fancy type system and compiled nature make it more powerful than Python, but those things (among others) make the runtime substantially heavier to load. Instead of using Julia as a scripting language, it’s recommended to keep a Julia session open (so that startup/compilation penalty only needs to be paid once) and use Revise.jl to automatically reload your code as you modify it. There’s also DaemonMode.jl, which keeps a background Julia process available to run scripts.

Instead of trying to fit Julia into your Python-style workflow, you’ll encounter less friction if you use the language as it’s been built to be used.

No, because the precompiled code varies with the Julia version and system architecture.

1.6 will vastly improve the latency (time to first plot).

Answering your dot-points:

  1. This is the best-known problem with Julia, generally called time-to-first-plot. People are actively working on it, and everyone who uses Julia is aware of it and has the same gripe as you. Plots will compile a lot faster in 1.6, as method invalidations have been fixed to a large extent. To understand it: this issue is a result of Julia doing so much work to compile very fast code while also allowing dynamism. There had to be a catch somewhere, right? That catch will become smaller over time.
  2. PackageCompiler.jl is for this, but it’s still not as easy to use as it could be. Think about how Java got these tools: some billions of dollars have been spent on Java development over the last few decades. Julia will get there.
  3. Printing takes that long only the first time, because things are being compiled. This is a known issue too… I would personally prefer a spinner that ran during compilation, with everything printed at once when it was done. And maybe you are mixing that up with script loading time? Running Julia as a script will take longer than Python unless the problem is large enough to offset compilation. Mostly we don’t use scripts; use a live REPL session that you don’t close, instead.
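To see the "only the first time" effect from point 3 for yourself, here is a minimal sketch that times the same call twice in one session. The function `sumsq` is a hypothetical toy, not something from the original script, and the actual timings will depend on your machine and Julia version.

```julia
# `sumsq` is a hypothetical toy function standing in for any code
# that has not been compiled yet in this session.
sumsq(xs) = sum(x -> x^2, xs)

xs = collect(1.0:1000.0)

# The first call triggers just-in-time compilation of `sumsq` for Vector{Float64}.
t_first = @elapsed sumsq(xs)

# The second call reuses the machine code compiled during the first call.
t_second = @elapsed sumsq(xs)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

In a session that stays open, only the first call pays the compilation cost; restarting Julia for every run pays it again each time.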

Most people here tolerate these issues because the runtime of our problem dwarfs them. E.g. my simulations would take years to run in python, but would be impossible to write in C++ given my deadlines.

This isn’t necessarily true. It’s possible that 1.7 will add native code precompilation, which would basically eliminate all load times from precompiled libraries (at least for functions run in the precompile script).

From your script it seems that you are from Brazil. You might want to take a look at this course I am giving now. The material has a series of pointers on how to use Julia productively:

https://github.com/m3g/CKP/blob/master/disciplina/simulacoes2.pdf

But basically, writing very quick scripts is not what Julia is the best for.

This post may also help you in developing a good workflow: Julia REPL flow coming from Matlab - #5 by lmiq

You can try to automate the PackageCompiler.jl workflow and ship that with the rest of your program, as a kind of installation phase the first time someone uses it. This is only really useful, though, if the user needs to run your program regularly.

Alright, and what if I want to do something like a heavy simulation that would run for a while, changing only its inputs?

That’s my main concern: if I were, for instance, doing a fluid simulation and changed only its initial values, would I have to pre-compile the whole thing instead of the small portion of the file I just changed? Is this where I could make use of something like those external packages to help pre-compile stuff?

Close, same language, Portugal.

Okay so my goal was to keep the workflow in the terminal with Julia, but it seems that the compile time of the script itself is what is really high which would make my workflow unsuitable for scripting.

I’ll look into your guides, as Julia is regarded as a really good language. Hopefully I can overcome those minor issues and successfully use Julia in my future work. Thanks!

That was something that I was looking for, because most of the time we end up using external libraries, such as Plots for example, and then we have a few files that we hardly change and a main file that we change all the time.

As such, I would love to be able to share my project so that the user downloads and compiles everything on their side and then just runs the program at will without having to worry about anything else.
Ideally the user would also be able to change the “main source code”, the one I wrote, without recompiling the whole thing. For instance, they would never change how Plots behaves, and as such it would never be recompiled.

There are two cases there. In the first, your package is ready and you will only run your simulation with different parameters, etc. In that case, if the simulation is really heavy, even if the compilation takes a lot of time (a minute, let’s say), it is unlikely that this compilation time matters for the total running time of your simulation (we assume here that a heavy simulation should take an hour, at least :) ). Thus, for really costly computations the compile time is really not important.

Some parts of the package are precompiled when you load them for the first time, so you would not be compiling everything every time, but that is perhaps a detail here.

Where these compilation times can be annoying is when you are developing your package, because one generally performs quick tests all the time. In this case, re-running the script from scratch every time is not the way to go with Julia. You should keep a Julia session alive and use Revise, or reload the files in that Julia session, as explained in the post above that describes possible workflows (everything there is terminal based).
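As a rough illustration of reloading files in a live session, the sketch below writes a small file, includes it, edits it, and includes it again, all without restarting Julia. The file name `simulation.jl` and the function `advance` are hypothetical, and the file is created in a temporary directory only to keep the example self-contained. Revise.jl, as I understand it, automates the re-include step for you.

```julia
# Hypothetical file and function names, written to a temporary directory
# so that this sketch is self-contained.
dir = mktempdir()
path = joinpath(dir, "simulation.jl")

# First version of the "simulation" code.
write(path, "advance(x) = x + 1\n")
include(path)
before = advance(10)

# Edit the file, as you would in an editor, and reload it in the same session.
write(path, "advance(x) = 2x\n")
include(path)
after = advance(10)

println("before edit: ", before, ", after edit: ", after)
```

Only the redefined function is recompiled on its next call; packages that are already loaded in the session are not loaded again.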

For that I guarantee that Julia is much better than C++ or Fortran. You will host your package in a GitHub account. Your users will only have to “add” your package. Dependencies will be satisfied automatically. If the package does heavy computation, the fact that the package gets compiled on execution should not be a problem, as mentioned above.

The source code will be available for them to see and modify if they want to.

Sharing a package in Fortran or C++ which depends on other libraries is very cumbersome. You have to either teach everyone how to compile it, download or install the dependencies, choose the compiler, etc, or share a binary, which must be built for every possible platform independently.

I find it very strange that nobody has mentioned Jupyter yet. For exploratory computation or debugging, Revise.jl inside a Jupyter Notebook gives a “less claustrophobic” experience (in my opinion) than the Julia REPL. (Note: I once used a computer only in text mode for 6 months, and I have no file manager, I do everything in Bash. Even so, when you are working with chunks of code, it is annoying not to be able to easily see all the code you are considering, and using arrow-up to find the line to re-execute is tedious. So I consider Jupyter a great quality-of-life improvement over using the REPL directly for anything that takes more than one or two minutes.)

Keep your Julia session open and use Revise.jl (maybe even with this workflow?) instead of restarting Julia over and over. Another option would be Pluto.jl.

I don’t know how your script is written, but if it’s not in the form of functions without globals, chances are you’re leaving a lot of performance on the table anyway, regardless of whether you restart Julia or not.
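A minimal sketch of what “functions without globals” means in practice: the two functions below compute the same sum, but one reads a non-constant global and the other receives the data as an argument. All names here are hypothetical, and the timings are only indicative.

```julia
# A non-constant global: its type could change at any time,
# so code that reads it cannot be specialized for Vector{Float64}.
data = rand(1_000)

# Reads the global directly.
function sum_global()
    total = 0.0
    for x in data
        total += x
    end
    return total
end

# Receives the data as an argument, so the compiler knows its concrete type.
function sum_arg(values)
    total = 0.0
    for x in values
        total += x
    end
    return total
end

# Warm up both functions so that compilation is not included in the timings.
sum_global()
sum_arg(data)

t_global = @elapsed sum_global()
t_arg = @elapsed sum_arg(data)

println("global:   ", t_global, " s")
println("argument: ", t_arg, " s")
```

Passing the data as an argument, or declaring the global `const`, lets the compiler generate specialized code for the loop.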

The first part of this already works if you put your code in a project (which is used to manage dependencies). The second part about not having to recompile the code is not currently possible, but may be possible in the future.

That depends on how addicted you are to Vim and gnu-screen or similar alternatives:

It would be interesting to get this working. I have been using vim daily for years (but with no gnu-screen on my personal machine, just tmux on the lab servers). Unfortunately, however, Jupyter has two advantages: it is a format to share with other researchers, and plots and tables are shown right below the cells in a convenient fashion (I use a plugin for tables to be able to interact with them, but it only works in Jupyter-like environments). But yes, if you use vim and do not need to use plots and tables often, this may be a better experience than Jupyter (the lack of vim hotkeys when writing in Jupyter annoys me a little).

There I am using this plugin: https://github.com/mroavi/vim-julia-cell

I do not know how people use Jupyter to share code; I might be doing something wrong. Every time I try to do that, I find that every package has to be installed in the specific Jupyter session that someone else opens, which makes it absolutely out of the question. Probably I just do not know how code is supposed to be shared.

Are you familiar with the startup.jl file? You can add some code that colleagues may use in their session (including calls to the plotting library). This will cost some extra time to initialize Julia, but this is probably less of an issue.

Oh, I keep a Project.toml and Manifest.toml in the same folder, and the first cell just imports Pkg, then activates and instantiates the project. The first time, or if you do not have the right versions, the packages will be installed, but every other time instantiate will just discover that the packages are already installed.
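A first cell along those lines could look roughly like this. The helper name `setup_environment` is hypothetical (it is not part of Pkg), and the sketch assumes the notebook’s working directory is the folder that holds the Project.toml and Manifest.toml.

```julia
import Pkg

# `setup_environment` is a hypothetical helper name, not part of Pkg.
# It assumes `dir` contains the Project.toml / Manifest.toml to use.
function setup_environment(dir::AbstractString = pwd())
    Pkg.activate(dir)      # switch to the environment described in `dir`
    Pkg.instantiate()      # install the recorded package versions if they are missing
    return nothing
end
```

Calling `setup_environment()` at the top of the notebook then gives everyone who opens it the same package versions; in practice the two Pkg calls can also simply be written inline in the first cell.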

I do the same thing. For a concrete example, see: https://github.com/rdeits/EdgeCameras.jl/tree/master/notebooks and the first cell of https://github.com/rdeits/EdgeCameras.jl/blob/master/notebooks/demo.ipynb
