Will Julia ever fix its "using ..." latency problems?

With a sysimage I managed to get the startup time down to usable speeds. I know that many people don’t care about startup time, but I can’t use the REPL, because I haven’t found any IDE workflow that is acceptable.
Thank you.

2 Likes

If you haven’t found an IDE you like, you can just use a Jupyter notebook as a REPL-like experience (but with persistence and multimedia)…

2 Likes

I know about the jupyter notebook, but

  1. You cannot split your code into multiple files.
  2. Jupyter notebook can’t work with *.jl files, but all libraries are *.jl files
  3. Backtraces will not show the correct file number
  4. There is no “go to definition” feature like in vs code.

What I actually just tried is writing in vscode and having a jupyter notebook that is just one line:
include("main.jl")

Same with pluto notebook

This works pretty ok.

Yes you can, using NBInclude.

(But if you are at the point of making lots of files, you should probably be creating modules and doing structured programming, and then calling your modules from your notebooks. In fact, with NBInclude you can actually use notebook files as part of your Julia modules if you want, but personally I would tend to use notebooks only for interactive exploration code and keep long-term code in modules.)

Jupyter notebook can’t work with *.jl files, but all libraries are *.jl files

include and import work just fine, as does Revise.jl for interactively running code as you edit modules.

Or do you mean that notebooks can’t be used to edit .jl files? I thought we were talking about REPL replacements here and interactive work? If you want an IDE or an editor, use vsCode or JupyterLab or …

I’m confused about whether you are talking about writing “libraries” (modules/packages) or about writing interactive scripts.

Backtraces will not show the correct file number

Backtraces should work fine in notebooks. For statements in the running notebook, they will give you line numbers like In[35]: line 15, i.e. line 15 of input cell In[35].

6 Likes

Check jupytext, you can use a .jl file as a notebook. That is super useful for version control too.

Also in vscode you can have a similar interactive experience using the Shift + Enter workflow and separating each piece of code you want to run (“cell”) with ##s directly in a .jl file

3 Likes

You asked pretty much the same thing a week ago, in a topic precisely about this, and got more or less the same answers.

Julia is developing very rapidly, but asking this every week is unlikely to result in new information.

18 Likes

Btw. a popular “misconception” is that Python is an interpreted language rather than a compiled one, which gives you the impression that there is no compilation at all. Python has a compiler, and it needs to compile the Python code to Python bytecode before interpreting it.
There is actually one situation where you really “feel” the compilation time: when installing packages via pip. If a package does not offer wheel binary distributions, the installation procedure has to compile every single file (every Python source file is compiled for your target system).

You can even see that there is compilation if you remove the compiled files, but of course the effect is not as dramatic as in Julia, which has a heavy-duty compilation chain under the hood. Btw. most of the slow compilation times in Julia come/came from invalidations. The trick is to be smart and not recompile things if it’s not needed, which many people have been working on for a very long time already.
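To make the bytecode step visible, here is a minimal sketch using only the standard library. The module name demo_module and the helper compile_to_bytecode are made up for illustration; py_compile and the built-in compile() are the real pieces.

```python
import py_compile
import tempfile
from pathlib import Path


def compile_to_bytecode(source: str) -> Path:
    """Write `source` to a throwaway module and byte-compile it.

    Returns the path of the resulting .pyc file. The module name
    `demo_module` is hypothetical and only used for this sketch.
    """
    workdir = Path(tempfile.mkdtemp())
    module_path = workdir / "demo_module.py"
    module_path.write_text(source)
    # py_compile.compile returns the path of the byte-compiled file.
    pyc_path = py_compile.compile(str(module_path), doraise=True)
    return Path(pyc_path)


SOURCE = "def double(x):\n    return 2 * x\n"

# In-memory compilation: the result is a code object holding raw bytecode.
code_object = compile(SOURCE, "<demo>", "exec")
print(type(code_object.co_code))

# On-disk compilation: this is the kind of file that `find ... -name "*.pyc"` removes.
pyc_file = compile_to_bytecode(SOURCE)
print(pyc_file.name)
```

Deleting such a .pyc file just means the interpreter has to redo this step on the next import.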

Here are some loading times after clearing the compiled files in a fresh virtual environment (find venv -name "*.pyc" -exec rm {} \+):

░ tamasgal@greybox.local:km3pipe  master py-3.8.6
░ 09:34:26 > ipython
Python 3.8.6 (default, Nov  6 2020, 18:54:28)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

>>> %time import matplotlib
CPU times: user 1.69 s, sys: 124 ms, total: 1.82 s
Wall time: 1.95 s

>>> %time import pandas
CPU times: user 1.41 s, sys: 482 ms, total: 1.89 s
Wall time: 2.43 s

Here is time including launching Python itself (after clearing the compiled files again):

░ tamasgal@greybox.local:km3pipe  master km3pipe py-3.8.6
░ 09:39:41 > time python -c "import pandas"
python -c "import pandas"  2.65s user 0.80s system 89% cpu 3.679 total

As you can see, Julia is not so far off. Given that such commands are executed only once per session (and that Revise, for example, makes it mostly unnecessary to reload packages, in contrast to Python, where you have to reload your complete session every time you change your code), I think it’s quite ridiculous to state that

Regarding:

Julia is faster than Python. Show me one piece of code in pure Python which is faster than Julia, without measuring the compilation time, since that’s not what’s behind a claim of being faster than Python. No-one said Julia will compile faster than Python, as far as I know :wink:

14 Likes

For tasks that are short pieces of simple code that run quickly in python, for example, Julia will probably never be faster the first time you do the operation. But for code that is numerically demanding, taking several minutes or more, Julia will very likely be faster, if the Julia code is written reasonably well. Julia is simply not the tool of choice for quick computations done interactively a single time.

Having said that, you can have a very responsive and quick interactive experience with Julia if you take some simple steps like running a first instance of the types of operations that you will do during the day one time in the morning while you’re having a coffee or whatever. Put those sorts of commands in the startup.jl file, and then leave the terminal open for later use. For example, load Plots, and do a simple dummy plot. The second call to the commands, later, with your real work, will be extremely fast.

1 Like

To be honest, I have to say that this differs dramatically from my experience, so maybe I’m doing something wrong.

When I use Python, more specifically IPython from the terminal, I never restart the session. If I’m working on a script, I can keep running it with %run [-i] [-n] myscript.py and all the entities defined in the script will be renewed. Every time a definition is evaluated, it overrides previous definitions. If I want to get rid of something, I can del or %xdel it. I can see what is defined in the interactive session with %who and %whos. If I want a clean workspace, I can do %reset or %reset_selective <regex> to delete a whole bunch of variables. If I’m working on a module, I can importlib.reload it.
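For readers who have not used it, the importlib.reload part of this workflow can be sketched outside IPython as follows. The module name scratch_mod and its contents are hypothetical, and bytecode caching is switched off here only so that two quick edits in a row cannot hit a stale cache file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumption for this sketch: skip writing .pyc files so that rapid
# successive edits are always re-read from source.
sys.dont_write_bytecode = True

# Create a throwaway module (hypothetical name: scratch_mod).
workdir = Path(tempfile.mkdtemp())
module_file = workdir / "scratch_mod.py"
module_file.write_text("def answer():\n    return 1\n")

sys.path.insert(0, str(workdir))
importlib.invalidate_caches()  # make sure the new file is visible to the import system

import scratch_mod

first = scratch_mod.answer()

# Edit the module on disk, as you would in a text editor ...
module_file.write_text("def answer():\n    return 'changed'\n")

# ... and re-execute it in the running session without restarting Python.
importlib.reload(scratch_mod)
second = scratch_mod.answer()

print(first, second)
```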

On the other hand, with Julia I have to continually restart the session, even with Revise. The first main reason is that Revise doesn’t work if you modify a structure. While developing, it’s super common to change type parameters, fields, field type annotations, inner constructors, etc. Every time you do so, a restart is required. The second reason is that you cannot use the interactive session with the same flexibility as in IPython, because what you define directly in the REPL to experiment is not subject to Revise and you are forced to stick with it forever since there is no way to delete or undefine it. Moreover, when working on a package, every time ] test is run you pay the price of the interminable loading time.

Now, I may be doing something wrong here (and I’m interested to learn a better workflow), otherwise in my opinion there are a lot more situations where you need to restart a Julia session than a Python session.

5 Likes

In Python, the issue is that anything you import is not automatically updated when it changes. When you are executing scripts, this is not an issue. There is some IPython magic for auto-reloading imports (the autoreload extension), but I never got this working correctly in Jupyter.
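A small sketch of what “not automatically updated” means in practice, using a hypothetical module stale_demo. It also shows one reason manual reloading can be confusing: names bound with from ... import keep pointing at the old objects even after a reload.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumption for this sketch: no .pyc caching, so edits are always re-read.
sys.dont_write_bytecode = True

workdir = Path(tempfile.mkdtemp())
module_file = workdir / "stale_demo.py"  # hypothetical module name
module_file.write_text("def version():\n    return 'old'\n")

sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import stale_demo
from stale_demo import version  # binds the function object that exists right now

# Change the source on disk.
module_file.write_text("def version():\n    return 'new code'\n")

# Editing the file alone changes nothing in the running session.
before_reload = stale_demo.version()

# An explicit reload updates the module object ...
importlib.reload(stale_demo)
after_reload = stale_demo.version()

# ... but the name imported with `from ... import` still refers to the old function.
stale_binding = version()

print(before_reload, after_reload, stale_binding)
```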

Pluto.jl allows redefinition of structs inside a notebook, and notebook files are just plain Julia files. Furthermore, it works together with Revise.jl for external dependencies.

3 Likes

Pluto.jl is nice, but it has its own limitations (you cannot sequentially modify a variable like you would in a script) and is not suitable for developing a package. Here I was talking about the workflow of developing a module, in the old-fashioned way of writing in a text editor and experimenting in an interactive terminal session. For that, Revise is not enough and I keep having to restart the session whenever I touch a struct or a global constant. I don’t even know why they are treated differently from functions; it seems to me that it should be more difficult to update/delete functions. When working on a script, Revise.includet is pretty useless: it doesn’t redefine any data, it only updates functions, so you have to use include anyway.

1 Like

I don’t think that Julia will completely replace scripting languages, because those will always be faster if the loading time is the critical part of the script.

At the same time, maybe you know these details already and people have mentioned them, but sometimes the problems are mostly about finding a good workflow. I have recently posted this in another thread, it is basic stuff and might be useful:

Another tool you can try out is https://github.com/dmolina/DaemonMode.jl , which I think is like Pluto in that it runs code in a newly-created module each time, without re-starting Julia itself.

4 Likes

The “right” solution to that is to work in a module and constantly re-evaluate that. Pluto does that under the hood; VSCode and Juno make it very easy to do manually.
You can also just run your tests in the REPL process, although that admittedly isn’t trivial to get right.

3 Likes

Thank you, jupytext looks promising, I will look into it.

That is what I used to do. The problem is that, unlike script execution, Shift + Enter does not stop if a command has an error.

For structs I just search and replace the struct name by a new one with a version number, e.g. MyStruct1, MyStruct2, etc. Works well enough.

I can’t repro that with e.g.

println("hi")
println("what's 0//0?")
println(0//0)
println("that didn't go so well")

I confused shift + enter with ctrl + enter. The latter is what I used to do, and it has the problem of not stopping on errors and not showing the correct line number in stack traces. The shift + enter workflow works better, thank you for the tip.

46 posts were split to a new topic: Redefining structs without restart?

" No-one said Julia will compile faster than Python, as far as I know " - well …

"
We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

(Did we mention it should be as fast as C?)"