Julia for Scientists: summary slides

I gave a presentation on Julia today, for staff of my uni’s Psychology department.
Many interested people in the audience!

These are the slides:
[image: slide deck]

(It starts slow but quickly moves into nerdier terrain; the audience was varied, hence the different registers.)

I’m posting here as they might be useful for someone else discussing Julia with colleagues.

And also to get feedback: for example, is the explanation of why base Python / R / Matlab is slow correct?
i.e.,

  • many times more CPU instructions per line of source code
  • and these extra instructions are all type checking
  • …type checking that is run every time the line is run (also in inner hot loops)
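The per-iteration type checking described in the list above can be imitated inside Julia itself. This is a rough sketch, not a benchmark, and the helper `mysum` and both arrays are made up for illustration. With a `Vector{Float64}` the compiler knows every element's type ahead of time; with a `Vector{Any}` it has to look up each element's type at run time, on every pass through the loop, which is roughly the situation an interpreter for a dynamic language is in.

```julia
# Hypothetical helper, not from the slides: add up the elements of a vector.
function mysum(xs)
    total = 0.0
    for x in xs
        total += x   # with Vector{Any}, the type of `x` is checked on every iteration
    end
    return total
end

typed   = rand(1_000)           # Vector{Float64}: element type known to the compiler
untyped = Vector{Any}(typed)    # same values, but the element type is hidden

println(mysum(typed) == mysum(untyped))   # same result, different work per iteration
```

Timing both calls (for example with `@time`, after one warm-up call each) usually shows the `Vector{Any}` version to be noticeably slower.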

There’s also a slide on “Julia tips” where you might find something helpful.
Plus, shoutouts for @bkamins & @tim.holy :slight_smile:

Thank you for posting this. I’m currently doing up a presentation for data analysis for experiments using Julia and trying to sell people on learning to code instead of using Excel. Your slides have a lot of great points I haven’t thought of!

I’d be curious to read what other people think.

Nice presentation! The one thing that surprised me about the gripes section was

“Getting floats to print with lower precision is way more difficult than it should be for new users”

What do other languages do differently here? It seems to me R and Python print roughly the same things as Julia by default (with R chopping off a bit more)?

julia> 3/7
0.42857142857142855

>>> 3/7
0.42857142857142855

> 3/7
[1] 0.4285714

Yes (though Matlab has a shorter default).
But customizing the default is harder in Julia.

NumPy:

np.set_printoptions(precision=4)

R:

options(digits=4)

Julia: …
type pirating Float64’s show?
Which assumes you have learned how Julia’s display system works (a big topic!)

using Printf  # Ok, but, Python has f-strings; built-in!

# `show` has to write to `io` (not return a String), hence @printf rather than @sprintf.
Base.show(io::IO, x::Float64) = @printf(io, "%.4g", x)
# Or should this be for ::MIME"text/plain" rather?
# That would allow you to use `print(x)`/`show(x)` to temporarily see more digits.
# BUT, most composite datatypes containing floats would then not use our new compact printing.
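For a one-off value there are ways to get fewer digits without redefining `show`. A small sketch using only Base and the Printf standard library; the variable names are arbitrary:

```julia
using Printf

x = 3/7

rounded = round(x; digits=4)       # still a Float64, rounded to 4 decimals
sig     = round(x; sigdigits=2)    # still a Float64, 2 significant digits
text    = @sprintf("%.4g", x)      # a String with 4 significant digits

println(rounded)
println(sig)
println(text)
```

These only affect the one value. Containers and the REPL default are untouched, which is the gap described above.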

I miss the numpy context manager np.printoptions, really handy.
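Something in the spirit of `np.printoptions` can be approximated with a do-block. This is only a sketch: `with_compact` is a hypothetical helper, not part of Julia or any package, and it only helps for types that honour the `:compact` key of `IOContext` (Base's floats do).

```julia
# Hypothetical helper: call `f` with an IO that requests compact printing.
with_compact(f, io::IO=stdout) = f(IOContext(io, :compact => true))

with_compact() do io
    show(io, MIME"text/plain"(), [1/3, 2/3])   # elements printed with fewer digits
    println(io)
end

println([1/3, 2/3])   # outside the block: full precision again
```

Unlike the NumPy context manager, nothing global is changed here; the setting travels with the `io` object that is passed along.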

To be fair to Excel and the like, they have so many killer features that most programming languages don’t have:

  • Edit your plots by point and click. Plots are graphic – it’s silly to tweak them with code.
    • Pylustrator (for Matplotlib) is a great experiment that combines GUI-editing of plots with code/reproducibility (though it’s too buggy for practical use atm for me).
  • Your data is always easily visible / inspectable
    • Not so in the middle of a for loop in a function in a function; both at runtime and in your code editor
  • Reactivity!
    • Though props for the likes of Observable, natto.dev, Pluto.jl

In “why programming” I think a big missing item is “reproducibility”.

This is also missing from the Julia advantages… When you deal with packages, a huge advantage of Julia is the ease with which you can create thin environments, so that each project you deal with has its own (reproducible) environment…

Oh excellent point! I’ll add it

Yeah; I mention Project.toml / Manifest.toml (and talked about it in person); but might be good to add something.

Is there a big difference with R / Python here? (Yes it’s a bit more ergonomic in Julia, and Python has decision fatigue for its plethora of env/pkg management options; but they do the same thing I think)
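For readers who have not seen the workflow, a minimal sketch of a per-project environment using the Pkg standard library. The temporary directory stands in for a real project folder and is purely an assumption to keep the example runnable.

```julia
using Pkg

project_dir = mktempdir()        # stand-in for the folder holding your analysis
Pkg.activate(project_dir)        # packages added from now on are recorded in this folder

# Pkg.add("Example") would write Project.toml (direct dependencies) and
# Manifest.toml (exact versions of everything). It needs network access,
# so it is left commented out here.
# A collaborator then runs Pkg.instantiate() in the same folder to get
# the same versions.

println(Base.active_project())   # path of the Project.toml now in use
```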

The biggest difference here is BinaryBuilder.jl and Yggdrasil for reliably managing binary dependencies.

I hadn’t used Artifacts.jl yet. This looks like it could be amazing for scientists?!
Automatic downloading of datasets (too big for git) from a URL, plus checksum’ing, plus avoiding unnecessary duplicates? :open_mouth:

Oh! BinaryBuilder + Yggdrasil is great for binaries (like pandoc, pq, etc.) but not really meant for datasets. I would heartily recommend DataDeps.jl for the case you are thinking of though!
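The checksum part of that workflow is easy to picture with the SHA standard library. This sketch does not use DataDeps.jl or Artifacts; `file_sha256` is a hypothetical helper and the temporary file stands in for a downloaded dataset.

```julia
using SHA

# Hypothetical helper: hex-encoded SHA-256 of a file's contents.
file_sha256(path) = open(io -> bytes2hex(sha256(io)), path)

# Stand-in for a downloaded dataset (made-up contents).
path = tempname()
write(path, "subject,rt\n1,0.42\n2,0.57\n")

recorded = file_sha256(path)   # the hash you would store next to the URL

# Later, or on a colleague's machine: re-hash and compare before trusting the file.
file_sha256(path) == recorded || error("dataset changed or download was corrupted")
println(recorded)
```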

If you are trying to get people to code rather than use Excel, do you mind if I plug my package ClipData.jl? It might help people slowly transition their workflows from Excel to Julia.
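To illustrate the idea without installing anything: cells copied from a spreadsheet typically arrive on the clipboard as tab-separated text. The sketch below parses such a string by hand with Base functions. It is not ClipData.jl's implementation, and the `pasted` string is made-up example data (in an interactive session it could come from `clipboard()`).

```julia
# Made-up example of what a 3x2 spreadsheet selection looks like as text.
pasted = "subject\trt\n1\t0.42\n2\t0.57\n"

# One entry per row, each row split into cells; tolerate Windows line endings.
rows   = [split(line, '\t') for line in split(strip(pasted), r"\r?\n")]
header = String.(rows[1])
rt     = [parse(Float64, row[2]) for row in rows[2:end]]

println(header)
println(rt)
```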

I love that @pdeffebach.
“Copy-paste is the universal API”.


I updated the slides based on the feedback, thanks all.

(Changes:

  • Emphasize reproducibility
    • Mention data tracking & binary dep mgmt
  • More on how to type and find unicode characters / LaTeX names
  • Added link to top posts of Seven Lines of Julia
  • Various new illustrations
  • A more fun Syntax example :slight_smile:

)

Looks neat! I’ll keep it in mind.

Thanks for sharing!

I recommend mentioning the DrWatson package. It is perfect for beginners IMO. One doesn’t need to learn about environments and the code becomes truly reproducible.

Somewhere I found this solution for shortening printed output:

julia> struct IOContextDisplay <: AbstractDisplay ctx end;
       function Base.Multimedia.display(d::IOContextDisplay, x)
           io = IOBuffer()
           ctx = d.ctx(io)
           show(ctx, "text/plain", x)
           println(stdout, String(take!(io)))
       end
       disp = IOContextDisplay(x -> IOContext(x, :compact => true, :limit => true, :color => true))
       pushdisplay(disp)

or all in one line

julia> struct IOContextDisplay <: AbstractDisplay ctx end; function Base.Multimedia.display(d::IOContextDisplay, x) io = IOBuffer(); ctx = d.ctx(io); show(ctx, "text/plain", x); println(stdout, String(take!(io))) end; disp = IOContextDisplay(x -> IOContext(x, :compact => true, :limit => true, :color => true)); pushdisplay(disp); # prettyshow

There is a much easier way nowadays that is documented in the REPL manual:

julia> 1/pi
0.3183098861837907

julia> Base.active_repl.options.iocontext[:compact]=true
true

julia> 1/pi
0.31831
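What the `:compact` key does can also be checked outside the REPL with `sprint`, which is handy in scripts, where `Base.active_repl` is not defined. A small sketch reusing the value from above:

```julia
x = 1/pi

full    = sprint(show, x)                               # default printing
compact = sprint(show, x; context = :compact => true)   # same value, compact IOContext

println(full)
println(compact)
```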

I think we should probably have a better interface for this (i.e. a function in InteractiveUtils)

OhMyREPL.jl breaks this trick.

probably ? :slight_smile: