Julia for Scientists: summary slides

tfiers · December 8, 2022, 1:15am

I gave a presentation on Julia today, for staff of my uni’s Psychology department.
Many interested people in the audience!

These are the slides:

(It starts slow but quickly goes into more nerdy terrain; the audience is varied, hence the different registers.)

I’m posting here as they might be useful for someone else discussing Julia with colleagues.

And also to get feedback: for example, is the explanation of why base Python / R / Matlab is slow correct?
i.e.,

- manifold more CPU instructions per line of source code
- and these extra instructions are all type checking
- …type checking that is run every time the line is run (also in inner hot loops)
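
A rough way to see this effect without leaving Julia is to run the same loop over a `Vector{Float64}` and over a `Vector{Any}`. With `Any` elements the compiler cannot know the element type ahead of time, so every `+` goes through a run-time type check and dispatch, which is roughly the situation an interpreter is in on every line. This is only a sketch, and `total` is a made-up name, not something from the slides.

```julia
# Plain summation loop; Julia compiles one specialized version per argument type.
function total(xs)
    s = 0.0
    for x in xs
        s += x   # Float64 elements: a single add. Any elements: type check + dynamic dispatch.
    end
    return s
end

typed   = rand(1000)            # Vector{Float64}: element type known at compile time
untyped = Vector{Any}(typed)    # same values, but each element's type is only known at run time

total(typed)
total(untyped)                  # same result, reached through far more instructions
```

Timing both calls (for example with `@time`, after a warm-up call) should show the `Vector{Any}` version to be noticeably slower, though the exact factor depends on the machine and the Julia version.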

There’s also a slide on “Julia tips” where you might find something helpful.
Plus, shoutouts for @bkamins & @tim.holy

garrek · December 8, 2022, 4:20am

Thank you for posting this. I’m currently doing up a presentation for data analysis for experiments using Julia and trying to sell people on learning to code instead of using Excel. Your slides have a lot of great points I haven’t thought of!

I’d be curious to read what other people think.

nilshg · December 8, 2022, 10:52am

Nice presentation! The one thing that surprised me about the gripes section was

Getting floats to print with lower precision is way more difficult than
it should be for new users

What do other languages do differently here? It seems to me R and Python print roughly the same things as Julia by default (with R chopping off a bit more)?

julia> 3/7
0.42857142857142855

>>> 3/7
0.42857142857142855

> 3/7
[1] 0.4285714

tfiers · December 8, 2022, 11:30am

Yes (though Matlab has a shorter default).
But customizing the default is harder in Julia.

NumPy:

np.set_printoptions(precision=4)

R:

options(digits=4)

Julia: …
type pirating Float64’s show?
Which assumes you have learned how Julia’s display system works (a big topic!)

using Printf  # Ok, but, Python has f-strings; built-in!

# show must write to `io`; @sprintf would only return a string, so use @printf(io, ...)
Base.show(io::IO, x::Float64) = @printf(io, "%.4g", x)
# Or should this be for ::MIME"text/plain" rather?
# That would allow you to use `print(x)`/`show(x)` to temporarily see more digits.
# BUT, most composite datatypes containing floats would then not use our new compact printing.
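
For comparison, shortening a single value without redefining `show` (and so without type piracy) can be sketched as below. This changes only how that one value is formatted, not the session-wide default being asked about; the variable names are just for illustration.

```julia
using Printf

x = 3/7

s = @sprintf("%.4g", x)        # format as a string with 4 significant digits
r = round(x; sigdigits = 4)    # a new Float64, rounded to 4 significant digits

println(s)
println(r)
```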

t-bltg · December 8, 2022, 11:41am

I miss the numpy context manager np.printoptions, really handy.

tfiers · December 8, 2022, 11:53am

To be fair to Excel and the like, they have many killer features that most programming languages don’t have:

- Edit your plots by point and click. Plots are graphic – it’s silly to tweak them with code.
  - Pylustrator (for Matplotlib) is a great experiment that combines GUI-editing of plots with code/reproducibility (though it’s too buggy for practical use atm for me).
- Your data is always easily visible / inspectable
  - Not so in the middle of a for loop in a function in a function; both at runtime and in your code editor
- Reactivity!
  - Though props for the likes of Observable, natto.dev, Pluto.jl

sylvaticus · December 8, 2022, 12:03pm

In “why programming” I think a big missing point is “reproducibility”.

This is also missing in the Julia advantages… when you deal with packages, a huge advantage of Julia is the ease with which you can create thin environments, so that each project you deal with has its own (reproducible) environment…

tfiers · December 8, 2022, 12:04pm

Oh excellent point! I’ll add it

Yeah; I mention Project.toml / Manifest.toml (and talked about it in person); but might be good to add something.

Is there a big difference with R / Python here? (Yes it’s a bit more ergonomic in Julia, and Python has decision fatigue for its plethora of env/pkg management options; but they do the same thing I think)
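
For reference, the per-project workflow under discussion looks roughly like this in Julia. It is a sketch: the temporary directory stands in for a real project folder, and the package name in the comment is only an example.

```julia
using Pkg

project_dir = mktempdir()     # stand-in for your project folder
Pkg.activate(project_dir)     # later Pkg operations now apply to this environment only

# Adding a package (needs network access) records it in Project.toml and
# pins the exact resolved versions in Manifest.toml, e.g.:
#   Pkg.add("DataFrames")
#
# On another machine, with both files copied along:
#   Pkg.activate(project_dir); Pkg.instantiate()

println(Base.active_project())   # path of the Project.toml currently in use
```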

BeastyBlacksmith · December 8, 2022, 12:36pm

The biggest difference here is BinaryBuilder.jl and Yggdrasil for reliably managing binary dependencies.

tfiers · December 8, 2022, 6:36pm

I hadn’t used Artifacts.jl yet. This looks like it could be amazing for scientists?!
Automatic downloading of datasets (too big for git) from a URL, plus checksum’ing, plus avoiding unnecessary duplicates?

TheCedarPrince · December 8, 2022, 7:00pm

Oh! BinaryBuilder + Yggdrasil is great for binaries (like pandoc, pq, etc.) but not really meant too much for datasets. I would heartily recommend DataDeps.jl for the case you are thinking of though!

pdeffebach · December 8, 2022, 7:42pm

If you are trying to get people to code rather than use Excel, do you mind if I plug my package ClipData.jl? It might help people slowly transition their workflows from Excel to Julia.

tfiers · December 8, 2022, 8:59pm

I love that @pdeffebach.
“Copy-paste is the universal API”.

I updated the slides based on the feedback, thanks all.

(Changes:

- Emphasize reproducibility
  - Mention data tracking & binary dep mgmt
- More on how to type and find unicode characters / LaTeX names
- Added link to top posts of Seven Lines of Julia
- Various new illustrations
- A more fun Syntax example

)

garrek · December 8, 2022, 10:44pm

Looks neat! I’ll keep it in mind.

pjgorski · December 9, 2022, 9:17am

Thanks for sharing!

I recommend mentioning the DrWatson package. It is perfect for beginners IMO. One doesn’t need to learn about environments and the code becomes truly reproducible.

denius · December 9, 2022, 3:18pm

Somewhere I found this solution for shortening printed output:

julia> struct IOContextDisplay <: AbstractDisplay ctx end;
       function Base.Multimedia.display(d::IOContextDisplay, x)
           io = IOBuffer()
           ctx = d.ctx(io)
           show(ctx, "text/plain", x)
           println(stdout, String(take!(io)))
       end
       disp = IOContextDisplay(x -> IOContext(x, :compact => true, :limit => true, :color => true))
       pushdisplay(disp)

or all in one line

julia> struct IOContextDisplay <: AbstractDisplay ctx end; function Base.Multimedia.display(d::IOContextDisplay, x) io = IOBuffer(); ctx = d.ctx(io); show(ctx, "text/plain", x); println(stdout, String(take!(io))) end; disp = IOContextDisplay(x -> IOContext(x, :compact => true, :limit => true, :color => true)); pushdisplay(disp); # prettyshow

stevengj · December 9, 2022, 4:44pm

There is a much easier way nowadays that is documented in the REPL manual:

julia> 1/pi
0.3183098861837907

julia> Base.active_repl.options.iocontext[:compact]=true
true

julia> 1/pi
0.31831
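
The same `:compact` property can also be applied to a single call instead of the whole REPL session, which is close in spirit to a scoped setting like `np.printoptions`. A minimal sketch using `sprint` with its `context` keyword (variable names are just for illustration):

```julia
x = 1/pi

full    = sprint(show, x)                               # "0.3183098861837907"
compact = sprint(show, x; context = :compact => true)   # "0.31831"

println(full)
println(compact)
```

Because the property lives in the `IOContext` of that one call, nothing else in the session is affected.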

Oscar_Smith · December 9, 2022, 4:48pm

I think we should probably have a better interface for this (i.e. a function in InteractiveUtils)

denius · December 9, 2022, 10:13pm

OhMyREPL.jl breaks this trick.

joa-quim · December 9, 2022, 11:04pm

probably ?
