Off-topic, but I once had to peel something with a screwdriver while camping. It is hard, so the comparison is on point.
One time I had to repair a cord yanked out of an IEC 60309 socket without a screwdriver (it was the blue 250V, also camping). Neither is fun
This is a really helpful discussion, as I was just thinking about this myself - maybe we can move this to a different topic?
Just this morning I uninstalled StatsPlots from my default environment, as it precompiles too frequently and I didn’t need it for what I was working on, which got me thinking “maybe I shouldn’t be working in the default environment”?
I also frequently do analysis that has to be submitted e.g. as evidence in court proceedings, the timing of which means that I might have to go back to it months (or years!) later, so reproducible environments are crucial.
How would an environment-based workflow look for a more script-based type of analysis? As an example, my analyses often have the following structure:
Raw data → clean in some script data_prep.jl
→ analyse in some script data_analysis
→ produce tables and figures in outputs.jl
I currently save all of this together in one folder, and start the scripts with using Pkg; cd(@__DIR__()); Pkg.activate("."), but that feels a bit hacky?
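One way to make the activation step feel less hacky is to drop the cd call and activate the folder that holds the script directly. This is only a minimal sketch, assuming a Project.toml and Manifest.toml are kept next to the three scripts; the helper name activate_script_env is hypothetical:

```julia
using Pkg

# Hypothetical helper: activate the environment stored in `dir`
# (the folder that holds Project.toml and Manifest.toml).
function activate_script_env(dir::AbstractString)
    Pkg.activate(dir)             # switch to the project-local environment
    return Base.active_project()  # path of the project file now in use
end

# At the top of data_prep.jl and the other scripts:
project_file = activate_script_env(@__DIR__)

# On a fresh machine, Pkg.instantiate() would then install the exact
# versions recorded in Manifest.toml (left commented out in this sketch).
# Pkg.instantiate()
```

Committing both Project.toml and Manifest.toml alongside the scripts is what lets the analysis be rerun with the same package versions later. Starting Julia with julia --project=. from that folder should have the same effect without any activation code in the scripts.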
Yes, if I go into detail, this is also how I usually work (especially the environment activation, since I have no time to debug package compatibility in the global environment). I should really write more tests while developing though! It pays off in the long run.
The 45s I was quoting was simply a toy example to illustrate that the iterative process of “developing code → running code → looking at result → adapt/develop code” is just quicker in Julia because of the accumulation of compiled functions. In MATLAB I was often just staring at the screen during the “running code” phase. In Julia, I don’t have this “luxury”!
I think a lot of us are in the same boat. We came from Matlab as scientists, writing long, slow scripts that got the job done. We heard the promise of Julia and are trying to transition. However, we now have to learn the Julia language and software engineering at the same time, since Julia is a general purpose, fully featured language.
I had never used or heard of packages, modules, libraries, environments, backends, unit testing, repositories, or IDEs before I started using Julia. Learning software development the right way has been much harder so far than learning the Julia language. Most Matlab to Julia tutorials cover syntax and convention differences, but not workflow differences. I could really use some comprehensive resources on learning modern software engineering and workflow from a basic Matlab user’s perspective. Workflow suggestions in the Julia Docs are too cryptic for me to follow.
You can find books on some things (eg people have written a lot about unit testing), but it would not be a good investment of your time because very little is specific to Julia, and the concepts are not that difficult anyway.
I would recommend picking these up by osmosis and example. Follow conversations here and research the topics that are new or unfamiliar. Look at code organization in packages — contributing small fixes that you need anyway is the best way to do it, because you get guidance from experienced users.
AFAICT this is how most people do it, especially the scientists (maybe CS people have courses on these things, but I am not sure). I realize that it’s a lot of new stuff, and it can seem daunting. So take it one piece at a time.
In addition, a lot of the things you mention above have slight variations based on personal preference, so it is hard to write the Single Definitive Guide. For example, people use various IDEs and some users are highly attached to their current IDE because they have become very productive in it and/or tailored it to their own needs.
The other side of this coin is that Julia is just a way more expressive language than MATLAB. So even though the tooling around Julia isn’t as good, I’ve noticed myself being way more productive in the last few months since I’ve picked up Julia than I ever was with MATLAB. I now feel like I am spending most of my time in MATLAB trying unsuccessfully to express the things that would be trivial to do in Julia. And I’m a pretty experienced MATLAB programmer.
It would be very helpful for me and others who came from MATLAB to Julia if you could provide a typical example.
@Tamas_Papp is right. Osmosis and example is the best path. I’m a pretty decent Matlab programmer and my development style is/was tuned to the apps I am currently working on. A book project is very different from a journal article, for example. I am moving to Julia and have found no problems in the prototyping phase. The differences from Matlab are not trivial, but not a show-stopping burden either. I started in March 2019 and have since done some fairly nontrivial stuff.
I’ve found that the Julia ecosystem is far better for papers (reproducible results using GitHub + notebooks) and for books (CI + notebooks + performance). The downside to Julia is that the user base is far smaller than Matlab’s and, for that reason, I am still using Matlab for some collaborative projects. The projects I do on my own have become all Julia.
Of course, being an old guy, my C, Matlab, and Julia codes look suspiciously like F77.
My Matlab finite element toolkit that has the same functionality (~80-90%) as FinEtools.jl has three times as many source lines.
Writing FinEtools.jl I haven’t used a debugger once. Writing the Matlab toolkit, I was in the debugger all the time.
OK, but where is the main difference?
My personal view: Julia is clean and consistent. Matlab is a mess. For instance, just consider that you cannot directly access an array that is returned from a function. It must be saved into a variable. Many more examples: implicit expansion, for instance.
For instance, just consider that you cannot directly access an array that is returned from a function.
How?
Matlab:
function [a] = f(a, b, i)
a(i) = b;
end
This cannot be done:
a = [0, 3, 5]
f(a, 1, 1)(1)
In Julia:
a = [0, 3, 5]
f(a, b, i) = (a[i] = b; a)  # return the array, as the Matlab version does; a[i] = b alone would return b
f(a, 1, 1)[1]
I see, this is fascinating. Is there any source where I can find these sorts of Julia/MATLAB differences?
Implicit expansion: http://hogwarts.ucsd.edu/~pkrysl/no-Matlab-support.html
I’ve been a Matlab user since R2012, and it was the first programming language I really used (I did some C in college). Here’s my take on Matlab:
One of the things I started to hate about Matlab, after trying Julia, is the lack of namespaces. It has only one “global” namespace and every time you want to make some code available you need to “add it to the path”. This is very risky if you work with different codebases that can conflict with each other. (Matlab has classes and that can alleviate the problem a little, but it’s not as expressive as Julia’s multiple dispatch.)
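To illustrate the namespace point, here is a minimal sketch in Julia, where two modules can define the same function name without conflict. The module names SignalTools and StatsTools and their functions are made up for this example:

```julia
# Two hypothetical code bases that both define a function called `normalize`.
module SignalTools
    normalize(x) = x ./ maximum(abs, x)   # scale to a peak magnitude of 1
end

module StatsTools
    normalize(x) = x ./ sum(x)            # scale so the entries sum to 1
end

x = [1.0, 2.0, 5.0]

# The module prefix makes it explicit which definition is meant.
peak_scaled = SignalTools.normalize(x)
sum_scaled  = StatsTools.normalize(x)
```

Neither definition shadows the other, so the order in which the two code bases are loaded does not matter.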
It’s hard to envision these kinds of things if Matlab is your only background (as was my case) and you learn to dance around them, but once you try something else (Julia in this case; Python is similar, but I like Julia’s Pkg much more), you start to realize that there is a big world with lots of goodies. This “ease of use” that Matlab promotes is a double-edged sword, because as codebases grow, they are harder to maintain and people are educated with the mentality of “everything in one place” (everything in one huge script, everything in one huge repo).
Another cool point in Julia is the keyword arguments of functions and the default values for arguments. This is something that drives me crazy when I use Matlab now, because I have to clutter my code with a lot of if, else, for loops over varargin or even worse, use the horrific input parser.
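As a small illustration of default values and keyword arguments, here is a sketch with a made-up function transform; the function and its argument names are hypothetical:

```julia
# Hypothetical function: `order` is an optional positional argument,
# `scale` and `shift` are keyword arguments with default values.
function transform(x, order = 1; scale = 1.0, shift = 0.0)
    return scale .* x .^ order .+ shift
end

x = [1.0, 2.0, 3.0]

a = transform(x)                  # all defaults: same values as x
b = transform(x, 2)               # only the positional default is overridden
c = transform(x, 2; shift = 1.0)  # one keyword set, `scale` keeps its default
```

No argument-count checks or varargin parsing are needed in the function body; the signature carries all of it.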
Then I could add Julia’s consistency when using broadcasting (you add a dot, you have elementwise operation). Matlab uses the dot notation for vectorization, but there are a lot of cases where it broadcasts automatically even if the user didn’t write the code like that. Some might call that convenience, I call it confusing.
You can sort of do metaprogramming in Matlab using the “eval”, but it will be painfully slow and it’s not advised. Julia has efficient macros.
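For a flavour of what a macro looks like, here is a minimal sketch. The macro name @show_source is made up; it receives the expression as data when the code is parsed, so no eval of strings happens at run time:

```julia
# Hypothetical macro: returns the source text of an expression
# together with the value the expression evaluates to.
macro show_source(ex)
    src = string(ex)                  # text of the expression, built at expansion time
    return :( ($src, $(esc(ex))) )    # tuple of (text, value)
end

n = 4
label, value = @show_source n^2 + 1   # value is n^2 + 1 evaluated with n = 4
```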
Matlab lacks basic types like tuples and dictionaries (it recently added containers.Map to mimic a dict)
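For comparison, a short sketch of the corresponding built-in types in Julia (the variable names are only illustrative):

```julia
point = (3.0, 4.0)                  # tuple: fixed-size and immutable
x, y = point                        # destructuring into two variables

limits = (lo = 0.0, hi = 1.0)       # named tuple: fields are accessed by name

units = Dict("length" => "m", "time" => "s")   # dictionary with string keys
units["mass"] = "kg"                           # insert a new key
time_unit = get(units, "time", "unknown")      # lookup with a fallback value
```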
Then there is the hilarious choice of Matlab to output row vectors from a lot of built-in/user functions, although the language is column-major. This forces me to add a lot of checks in my functions to see if my array is a column.
I could go on, but my 2c is that one needs to try stuff in both languages to see their advantages and drawbacks. For me Julia was an eye-opener.
And this is really only for .*, .^ and ./; you can’t vectorize arbitrary functions or operators.
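In Julia, by contrast, the dot works on any function. A minimal sketch, where clamp01 is a made-up scalar function:

```julia
# A plain scalar function, written with no thought given to arrays.
clamp01(v) = v < 0 ? 0.0 : (v > 1 ? 1.0 : float(v))

xs = [-0.5, 0.25, 1.5]

clamped = clamp01.(xs)            # the dot applies clamp01 to every element
lengths = length.(["a", "bc"])    # the same works for built-in functions
mixed   = max.(xs, 0.0) .+ 1      # and mixes freely with dotted operators
```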
The implicit expansion thing is a great example.
To be clear, implicit expansion is an extremely useful tool to save a bunch of ugly bsxfun(...)s littering your code. In a language that demands vectorization for performant code, it totally makes sense why they would introduce it. But… it’s also a terrible source of hidden bugs and code obfuscation, as this link points out.
What you need is something that handles broadcasting that
- has as small of a syntax footprint as possible,
- is opt-in so it is always clear when you are broadcasting, and
- throws an error when incompatible sizes are used without broadcasting.
This is why Julia’s dot broadcasting syntax is so cool. If a is a length-3 vector and B is a 3x3 matrix, a .+ B will broadcast a to the higher dimension and a + B will throw an error. All three of the above requirements are met. And Julia doesn’t even need to use broadcasting to have fast code.
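The a and B case can be tried directly. A small sketch with arbitrary example values:

```julia
a = [1, 2, 3]            # length-3 vector
B = [10 20 30;
     40 50 60;
     70 80 90]           # 3x3 matrix

C = a .+ B               # explicit broadcast: a is added to every column of B

# Without the dot the sizes are incompatible and Julia refuses to guess.
err = try
    a + B
catch e
    e                    # keep the exception object for inspection
end
```

Here err ends up holding a DimensionMismatch, so an accidental size mix-up surfaces right at the offending line instead of silently producing a matrix.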
Yes! This hit just about all of my worst pain points. Built-in functions sometimes returning row vectors and sometimes column (and having to constantly check for this), containers.Map almost never does what I want it to, and the input parser is just the worst thing ever (and that is the best way to handle keyword arguments in MATLAB).
I’ll add to the list that cell arrays have to be accessed with different syntax than arrays, so I’m constantly writing if iscell(...). This is especially bad with the whole char arrays as pseudo-strings thing (and now they have real strings, so there is always going to be the issue of supporting both).
Also, in MATLAB, for loops only iterate over a vector if it’s a row vector and pass the whole vector in as a loop variable if it’s a column vector. This, combined with not knowing whether you have row or column vectors due to the point above, makes the nice for thing=things syntax that is common in Python and Julia pretty dangerous, even though it’s supported.
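For comparison, a Julia for loop visits the elements one at a time whatever the shape of the array. A minimal sketch, where the helper visited is made up to record what the loop variable takes on each pass:

```julia
col = [1, 2, 3]          # 3-element Vector (column-like)
row = [1 2 3]            # 1x3 Matrix (row-like)

# Hypothetical helper: collect the value of the loop variable on each pass.
function visited(things)
    seen = Any[]
    for thing in things
        push!(seen, thing)
    end
    return seen
end

from_col = visited(col)  # three scalar elements
from_row = visited(row)  # also three scalar elements, not one whole row
```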