Julia's applicable context is getting narrower over time?

I think those people are getting too much of our attention. There are always people grumbling about new things. Time spent arguing with them is almost entirely wasted. Instead we should improve our stuff and encourage those who are working on it. Thus we will increasingly convince those who are watching and are willing to try something new.

See Fukuda’s parable, e.g. the explanation here

15 Likes

I guess these people getting too much attention is an instance of Cunningham's Law - Meta.

2 Likes

I was just trying to make a similar point on Quora about Julia.

Python users tend to swap very rapidly between “look, it’s nice that other languages have some nice features, but everyone understands Python; it’s easy” and “You can do that in Python, here are the steps: 1, 2, 3, 4, 5, 6, 7, 8…” The concrete example was that Python was fine for metaprogramming because you have metaclasses… Virtually everyone who uses Julia does metaprogramming because it’s easy. Hardly anyone in Python uses metaclasses as far as I can tell; it’s deep magic for an inner circle of Python hardcore types.
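For readers wondering what “easy metaprogramming” means here, a minimal sketch (my own toy example, not from the thread): in Julia an expression is an ordinary data structure, and a macro is just a function from expressions to expressions.

```julia
# Expressions are first-class values: quote code, then inspect the tree.
ex = :(x^2 + 2x + 1)
@assert ex.head == :call        # it's just a tree of symbols and arguments

# A macro receives an expression and returns an expression. This toy
# macro wraps any expression so it also prints its own source text.
macro traced(e)
    src = string(e)             # captured at macro-expansion time
    return quote
        println("evaluating: ", $src)
        $(esc(e))
    end
end

x = 3
y = @traced x + 1               # prints "evaluating: x + 1", returns 4
```

That is roughly the entire learning curve for a first macro, which is the sense in which people call it “easy”.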

The main point of Julia is that it works. I’ve been under extreme time pressure to parse some new epidemic data, develop a new model, fit it, and make forecasts.

DataFrames + DifferentialEquations + DynamicHMC + Plots = job done on the first pass.

In Python I’d still be fiddling with torchdiffeq, because the log-likelihood calculation required a convolution operation on top of solving the incidence rate over time, and as soon as you deviate from the beaten path in Python it all grinds rapidly to a halt. In Julia it’s just more Julia functions for the differentiable stack.

It’s not speed per se, because hardcore Pythonistas eventually get fast code; it’s the sheer ease of use for solving complex problems.

15 Likes

Same holds for R.

1 Like

Sorry, but I have to heavily disagree here. If you are talking about being a “user” of metaprogramming, ok; but if you are talking about people doing metaprogramming themselves, I would guess the vast majority of Julia users have never written a macro. And it certainly is not “easy” in an absolute sense, though maybe relatively easy compared to Python metaclasses, which I know nothing about.

12 Likes

Really?

One of the things I like about Julia is that I can fiddle with models using f(expr) → expr functions or macros, which saves copy-and-pasting and speeds up development.
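To make the f(expr) → expr workflow concrete, here is a hedged sketch (the function and model are invented for illustration): a plain Julia function that rewrites an expression, producing a model variant without copy-pasting.

```julia
# A plain expr -> expr function: swap every `sin` call for `cos`.
# (Illustrative only; real model-fiddling transforms would be richer.)
function swap_sin_for_cos(ex)
    ex isa Expr || return ex           # leave symbols and literals alone
    args = map(swap_sin_for_cos, ex.args)
    if ex.head == :call && args[1] == :sin
        args[1] = :cos
    end
    return Expr(ex.head, args...)
end

model   = :(a * sin(t) + b * sin(2t))
variant = swap_sin_for_cos(model)
# variant == :(a * cos(t) + b * cos(2t))
```

Because it is an ordinary function rather than a macro, it composes with other transforms and can be tested like any other code.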

Part of that view comes from many Julia sites, where one more or less reads that Julia gives “within a factor of two” of C performance. For example, the micro-benchmarks here:

The description of the benchmarks is clear in that they are not the most efficient implementation in every language; they are mostly “equivalent” implementations. Why, then, is C faster more frequently than any other language?

That question does not apply only to Julia. It could equally well be asked of the Fortran implementations, which there appear on average slower than the Julia ones (and nobody debates whether Fortran is a high-performance language).

3 Likes

Mandatory link at this point:

5 Likes

Nice to have a good example. I can’t build the Microbenchmarks currently (see Build fails on Arch Linux: deps/scratch/OpenBLAS.v0.3.10-0.x86_64-linux-gnu-libgfortran5 missing · Issue #43 · JuliaLang/Microbenchmarks · GitHub), but from the graph it looks like parseint is the case where Julia lags C the most. If I take out a @static version check, the Julia benchmark is

julia> function parseintperf(t)
           local n, m
           for i=1:t
               n = rand(UInt32)
               s = string(n, base = 16)
               m = UInt32(parse(Int64, s, base = 16))
               @assert m == n
           end
           return n
       end

and the C one is

long parse_int(const char *s, long base) {
    long n = 0;
    for (; *s; ++s) {
        char c = *s;
        long d = 0;
        if (c >= '0' && c <= '9') d = c-'0';
        else if (c >= 'A' && c <= 'Z') d = c-'A' + (int) 10;
        else if (c >= 'a' && c <= 'z') d = c-'a' + (int) 10;
        else exit(-1);

        if (base <= d) exit(-1);
        n = n*base + d;
    }
    return n;
}

    tmin = 10.0;
    for (int i=0; i<NITER; ++i) {
        t = clock_now();
        char s[11];
        for (int k=0; k<1000 * 100; ++k) {
            uint32_t n = dsfmt_gv_genrand_uint32();
            sprintf(s, "%x", n);
            uint32_t m = (uint32_t)parse_int(s, 16);
            assert(m == n);
        }
        t = clock_now()-t;
        if (t < tmin) tmin = t;
    }

(just snipping out the relevant bits). On my machine:

julia> @benchmark parseintperf(1000)
BenchmarkTools.Trial: 
  memory estimate:  93.75 KiB
  allocs estimate:  2000
  --------------
  minimum time:     94.564 μs (0.00% GC)
  median time:      103.344 μs (0.00% GC)
  mean time:        111.886 μs (4.97% GC)
  maximum time:     3.408 ms (96.54% GC)
  --------------
  samples:          10000
  evals/sample:     1

and if I compile just the minimum I need to run parseint

$ gcc perfint.c -o perfint
tim@diva:~/src/Microbenchmarks$ ./perfint 
c,parse_integers,0.118811

Note the C version reports the minimum time; for the Julia version, the “minimum”, “median”, and “mean” times are all faster than this. So despite what the benchmarks say, the Julia version is slightly faster. Moreover, if you look carefully, the C version parses directly into a native machine integer (long) and simply casts the result, whereas the Julia benchmark parses to Int64 and then converts to UInt32. If we define

julia> function parseintperf2(t)
           local n, m
           for i=1:t
               n = rand(UInt32)
               s = string(n, base = 16)
               m = parse(UInt32, s, base = 16)
               @assert m == n
           end
           return n
       end

then I get

julia> @benchmark parseintperf2(1000)
BenchmarkTools.Trial: 
  memory estimate:  93.75 KiB
  allocs estimate:  2000
  --------------
  minimum time:     87.846 μs (0.00% GC)
  median time:      92.150 μs (0.00% GC)
  mean time:        100.223 μs (5.38% GC)
  maximum time:     3.157 ms (96.51% GC)
  --------------
  samples:          10000
  evals/sample:     1

which is a considerable advantage for the Julia version.

This just illustrates my point: Julia is capable of matching C, it just comes down to how things are written. You can write great C code and lousy Julia code, or lousy C code and great Julia code, and when the quality difference is large the victor is predictable based on quality and not language. Fundamentally Julia is capable of everything C can do, and it can do oh-so-much-more besides.

EDIT: just remembered I should add optimization flags for C.

tim@diva:~/src/Microbenchmarks$ gcc -O3 perfint.c -o perfint
tim@diva:~/src/Microbenchmarks$ ./perfint 
c,parse_integers,0.089710

So C is capable of tying Julia, but not beating it.

20 Likes

I’d like to add one other point: defending these points takes a lot of time that I’d rather be using for other activities. I just hope everyone gets notions of magic out of their heads and realizes that compilers are compilers are compilers, and when you’re comparing the same compiler in different settings (LLVM on Julia vs C), your general expectation is that they should be equivalent. The exception comes when something gets in the way of the compiler being able to “understand” the code and generate proper optimizations, and Julia (unlike Python/Matlab/etc) was explicitly designed not to get in the way.

That’s not to say that there isn’t room for more compiler optimization, but a lot of microbenchmarks are comparing settings where Julia’s compiler long ago reached parity.

31 Likes

Seconding (3rding) this.
One very rarely needs to define new macros, and almost never needs them for performance reasons.
(One does need to use existing macros like @inbounds sometimes, but rarely needs to define new ones.)

I just checked, and Invenia’s whole internal ~400k LOC codebase defines only 3 macros.
Two of them are used only in tests,
and the last just generates a good error message for dimension mismatches.
I know we have a handful more in our open-source code, like Mocking.jl’s @mock and NamedDims.jl’s declare_matmul.
But still, I think that makes the point that one can write a whole ton of Julia code without ever needing to write a macro.
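For illustration, here is a hedged guess at what such a dimension-mismatch error macro might look like (the actual Invenia macro is internal and not shown in the thread; the name @check_dims is invented):

```julia
# Hypothetical sketch of a macro that throws a descriptive error when
# two arrays' sizes disagree. The macro captures the *names* of the
# expressions at expansion time, which a plain function could not do.
macro check_dims(a, b)
    na, nb = string(a), string(b)
    quote
        if size($(esc(a))) != size($(esc(b)))
            throw(DimensionMismatch(string(
                $na, " has size ", size($(esc(a))),
                " but ", $nb, " has size ", size($(esc(b))))))
        end
    end
end

A = rand(2, 3); B = rand(2, 3)
@check_dims A B          # sizes match, so this passes silently
```

The payoff is the error message: a mismatch reports the offending variable names and sizes, not just “dimensions must match”.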

saves C&Ping,

Might I recommend structuring your code such that a function can do this?
It is less powerful, and as such easier to reason about.
(E.g. you know it isn’t just going to assign to a local variable or something.)

12 Likes

I have never used Python metaclasses either but from a quick read they seem to be about modifying aspects of the workings and construction of classes. Although that can be a big deal in Python it’s far from the generality of Julia’s metaprogramming. A closer correspondence is the Python ast module, which I have used once to port a piece of Julia metaprogramming. It worked out okay, partly because the necessary code transformations perfectly matched the available methods in the ast object and partly because I only needed to mimic the existing Julia code. Had this not been the case I would have been stuck in the nightmares of opaque ast objects.
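As a small illustration of the ergonomic difference (my own sketch, not from the post above): in Julia the AST is a plain, transparent data structure, so rewriting code is ordinary data manipulation rather than working through a visitor API.

```julia
# Parsed Julia code is just an Expr tree you can inspect and edit
# with ordinary indexing; no visitor classes or node factories needed.
ex = Meta.parse("f(x) + g(y)")
@assert ex.head == :call && ex.args[1] == :+

# Rewriting is simply mutating the tree:
ex.args[1] = :-
@assert ex == :(f(x) - g(y))
```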

3 Likes

This is especially surprising since string(n, base=16) allocates a new object on the heap for every iteration, whereas the C version writes over and over again to the same stack-allocated static array. Presumably the Julia version would be even faster if you rewrote it to use pre-allocated string output. (Update: I tried it, and working in-place is 20% faster than parseintperf2 on my machine, but required forking Base.hex to replace string(n, base=16) with an in-place version, combined with StringViews.jl to call parse on the overwritten buffer without making a copy. But this is hardly “magic”—just pure Julia code.)

(On the other hand, the Julia version slows down if you use s = @sprintf "%x" n instead of string(n, base=16), reflecting the decades of extraordinary optimization effort that has gone into C printf implementations.)

Apples-to-apples cross-language benchmarking is hard!

13 Likes

Also, every time I see benchmarks based on code from different languages using some kind of rand() function it makes me wonder if those calls are comparable in performance, as their generation time is part of the overall benchmark time.
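One way to check that concern, at least on the Julia side, is to time the RNG calls alone against the full parse loop. A rough sketch (crude @elapsed timing; BenchmarkTools would be more rigorous, and the function names are mine):

```julia
# Compare the cost of rand(UInt32) alone with the full
# rand + string + parse round trip used in the benchmark.
function rand_only(t)
    n = zero(UInt32)
    for i in 1:t
        n = rand(UInt32)
    end
    return n
end

function rand_and_parse(t)
    local m
    for i in 1:t
        n = rand(UInt32)
        m = parse(UInt32, string(n, base=16), base=16)
    end
    return m
end

rand_only(10)            # warm up the JIT before timing
rand_and_parse(10)
t_rng   = @elapsed rand_only(100_000)
t_total = @elapsed rand_and_parse(100_000)
# If t_rng is a small fraction of t_total, RNG differences between
# languages cannot be skewing this particular benchmark much.
```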

Until the next release of the compiler and struct layout changes, or you try to port your code to a different computer or compiler. This type of unsafe micro-optimization costs more time in fixing maintenance problems than it is likely to save in run time.

4 Likes

I might be misreading, but the C code is doing the timed loop 100,000 times, while you’re only using 1,000 loops with Julia? I.e. for (int k=0; k<1000 * 100; ++k) { versus @benchmark parseintperf(1000)?

1 Like

I think at JuliaCon 2021 someone should give a talk titled “The fastest coding ever in the world, achieved with Julia”.
Or I hope I will have time to show my optimised workflow. I think this is another dimension of programming, and after this video everything else will look like the stone age. :o

I think the simplicity that comes with Julia is key to its future success. A pretty good analogy is electric versus petrol cars: electric cars are simpler by an order of magnitude, and it is easier to develop each part of the car. :slight_smile:

1 Like

Sorry, in my abbreviated snippet I omitted a key line in the C code:

print_perf("parse_integers", tmin / 100);

Full details are at https://github.com/JuliaLang/Microbenchmarks

3 Likes

It never occurred to me to consider using meta-classes as meta-programming. It’s been a while, but I seem to remember that you use meta-classes to inspect (and perhaps manipulate) meta-information about classes; you’re not really using code to transform code.

Am I misremembering this?

Hey @tim.holy - I agree that your time could certainly be better spent elsewhere, but I personally have learned and benefited from these posts you are putting here. I am reminded of a recent Slack discussion about people needing to defend Julia in the workplace. Each one of your statements has been solid and serves as great evidence to support usage. I am certainly bookmarking this post and will use it as a point of reference whenever discussions like this arise.

So personally, thank you so much for taking the time to go in-depth on the questions in the discussion. ~ tcp :deciduous_tree:

26 Likes