Adam, I would love to listen in, even though it is 30 years since I got anywhere near a beamline.
Hi - For the Julia HEP meeting - 10-11am central time on Friday 12/11 is emerging as the best time and is the only time that’s “all green” on the Doodle poll. So let’s do that! I’ll send out more info earlier that week on this discourse thread. Thanks everyone!! Looking forward to the meeting. – Adam
I’m interested too. And the time is good for me.
Can I ask how working with Julia compares with a NumPy + pandas workflow? I am also curious where you hit the limits of NumPy + pandas and how Julia works beyond those limits.
Adam,
why don’t we create an Indico page and attach contributions to it, as physicists like to do?
I can help with that.
Well, I don’t even know where to start… but to sum up: with Julia you can properly organise your code and logic using a nice and transparent type hierarchy; in contrast to the workflow with numpy/pandas where everything has to be a table or a contiguous array.
You simply have the feeling that you are working with the full set of the language instead of pushing everything into vectorised operations just because otherwise you end up with horribly slow code.
Thank you for sharing your experience. I used NumPy (and even some pandas) in the past, but never switched from it to Julia (I computed what I needed and moved on to things where I didn’t need Python), so I was interested in how this feels.
If you ever write a blog post about switching from the Python ecosystem to Julia, please let me know.
There are a lot of nice posts out there. What kind of file types are you working with usually and what are you doing in general?
I usually use Numpy/Scipy/Pandas/matplotlib whenever I need to process small tabular datasets, do some simple statistics and produce some plots. The main reason for this is that Python is much faster for plotting, can easily be shared with colleagues, and Julia would be overkill for this use case.
However, I will use Julia if:
- I am handling medium-sized datasets (~10MB–1GB) and need to do some non-trivial operations on them, which don’t vectorize well.
- The datasets contain four-vectors, in which case I want to use my custom LorentzVector type and only Julia allows me to do that efficiently.
- I am not processing tabular data but I still need very fast code (e.g. Monte-Carlo event generation).
- I am writing complex code, mostly for myself, and Julia makes it easier than Python.
- I am quickly prototyping something which does not involve any plots.
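To make the four-vector point concrete, here is a minimal sketch, written in Python for comparison, of the kind of custom type meant above. The class name, field names, and momenta are made up for illustration and are not taken from any package.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LorentzVector:
    """Hypothetical four-vector (E, px, py, pz) in natural units."""
    e: float
    px: float
    py: float
    pz: float

    def __add__(self, other):
        # Component-wise sum, e.g. to combine two decay products
        return LorentzVector(self.e + other.e, self.px + other.px,
                             self.py + other.py, self.pz + other.pz)

    def mass(self):
        # Invariant mass: m^2 = E^2 - |p|^2
        m2 = self.e**2 - (self.px**2 + self.py**2 + self.pz**2)
        # Guard against tiny negative values caused by rounding
        return math.sqrt(max(m2, 0.0))


# Two back-to-back massless particles with made-up momenta
p1 = LorentzVector(5.0, 0.0, 0.0, 5.0)
p2 = LorentzVector(5.0, 0.0, 0.0, -5.0)
pair_mass = (p1 + p2).mass()
print(pair_mass)
```

In Python every such object is a separate heap allocation, so looping over millions of them is slow unless the data is flattened back into arrays. In Julia an immutable struct of four Float64 fields can be stored inline in a Vector, which is presumably what "efficiently" refers to here.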
I would use plain, object-oriented Python (or C++/ROOT) if:
- I need to use some package (library) which works best with Python (C++/ROOT), or is slower with Julia+PyCall (this was the case for some nested sampling package).
- I intend to share the code widely or integrate it in a code base which enabled C++11 and moved to Python 3 only very recently, and is unlikely to support Julia for a few decades, if ever.
In my previous work I only needed to numerically compute a one-dimensional integral over a wide range of free parameters and make a lot of plots to understand the behavior of this function. This was in the context of calculating the interaction between two “big bodies” carried by a scalar field in a non-typical model (I wish I could write “nonstandard model”, but that would be ambiguous). It was an intermediate step toward the electromagnetic interaction, which I still don’t know how to implement in this particular case.
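As a rough illustration of that kind of task (a one-dimensional integral scanned over a free parameter), here is a minimal sketch. The integrand exp(-m x) on [0, 1] is only a placeholder with a known closed form, not the integrand from the model described above, and the names `simpson`, `integrand`, and `scan` are hypothetical.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(x, m):
    # Placeholder only: stands in for the real, model-specific integrand
    return math.exp(-m * x)


# Scan the free parameter m and compare with the exact result (1 - e^-m) / m
scan = {}
for m in (0.5, 1.0, 2.0, 4.0):
    numeric = simpson(lambda x: integrand(x, m), 0.0, 1.0)
    exact = (1.0 - math.exp(-m)) / m
    scan[m] = (numeric, exact)
    print(f"m = {m}: numeric = {numeric:.10f}, exact = {exact:.10f}")
```

The resulting table of values per parameter is what one would then plot to study the behavior of the function.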
But now I’m trying to expand into new fields of numerical work. My colleague is deep into chromodynamics calculations, and my pet project is to rewrite in Julia some Fortran code for the matrix part of a graph calculation. You know, Fortran is a good language, but I prefer to work in Julia. I know that calling Fortran from Julia is “zero-cost”, but Fortran is too rigid for my taste.
Matplotlib is one of the best parts of the Python ecosystem, I agree. But writing code with NumPy often feels suboptimal to me, and in some ways artificial.
I know this pain of outrageously outdated software. This software would probably still be using Python 2.x if it had not been abandoned at the beginning of 2020. And my colleagues will probably stop using Fortran when GNU abandons its compiler, but not a second before. Fortran was a milestone in computer science, but in 2020 it is not always the optimal choice.
@misha_mikhasenko I think Indico is a bit overkill for the first meeting. Having notes in a Google Doc will be enough. Maybe Indico will be useful for later meetings. Thanks though!
Hi all - Let’s do our first meeting at 10:15 - 11:15am (US Central time) on Friday December 11. Sorry for the 15 minute offset - I have another meeting before that will go a little past 10. See this Google doc for the agenda (feel free to add to it) and the Google Meet link. Thanks and looking forward to chatting with you all. – Adam
Just to be sure, you mean 4:15pm GMT or 5:15pm CET, right? In the past I have seen two definitions of “US Central Time”.
@tamasgal Yes, you are correct…
10:15am CST == 4:15pm GMT == 5:15pm CET
Timezones are important, but they sure are a pain
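For anyone who wants to double-check the conversion, here is a small sketch using only the Python standard library. It assumes the year is 2020 (when December 11 fell on a Friday) and uses fixed winter offsets (CST = UTC-6, GMT = UTC+0, CET = UTC+1) rather than a timezone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed winter offsets; no daylight saving applies in December
CST = timezone(timedelta(hours=-6), "CST")
GMT = timezone(timedelta(hours=0), "GMT")
CET = timezone(timedelta(hours=1), "CET")

# Assumption: the meeting year is 2020
start = datetime(2020, 12, 11, 10, 15, tzinfo=CST)

# Express the same instant in each zone
times = [start.astimezone(tz).strftime("%H:%M %Z") for tz in (CST, GMT, CET)]
print(" == ".join(times))
# prints: 10:15 CST == 16:15 GMT == 17:15 CET
```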
A link to the meeting please?
There is a calendar of meetings on the Discourse - a pointer to that would also be appreciated.
Here are the notes:
and the link to the Google Meet room: https://meet.google.com/xus-ggko-pqk
See you later!
Can someone sum up the problem discussed at the meeting about getting ROOT and LLVM to work together? I can’t remember much about it, but it was an important topic.
As far as I can recall, one of the main issues is that Julia uses a different LLVM (which is heavily patched) whereas ROOT has its own, and the two do not work well together. But I have not spent much time building a C++ wrapper for it, so I cannot provide more details.
There is a useful discussion in the ROOT.jl issue thread
https://github.com/JuliaHEP/ROOT.jl/issues/17