Matt Parker's Crappy Python Challenge

HexSpin · October 19, 2022, 3:18pm

I found this amusing and perhaps you will too… and maybe someone with Julia will make their mark.

(The scattergram does show a Julia competitor looking pretty good.)

ellocco · October 20, 2022, 8:00am

I searched the net a bit but had no luck finding anything related to Julia and this challenge.
Is there a published performance scattergram that includes a Julia implementation?

jmair · October 20, 2022, 8:40am

I thought this video was interesting. The main theme of the video was just describing the “two-language problem”, but not in those words. It seems like someone did have a go in Julia, and did much better than the original implementations, but still fell short of the C/C++ speeds.

Palli · October 20, 2022, 8:43am

0.0068 sec. (the C++ code; I’m up to the 5 min timestamp) is hard to beat in Julia, because Julia starts in 0.177 sec. for me (though I’ve cut startup in half with a sysimage, and you could time without startup or compilation). [EDIT: startup can be eliminated, see my comment further down.]

jmair · October 20, 2022, 8:52am

Would the benchmarking on our end include Julia startup time?

Palli · October 20, 2022, 9:20am

No, it’s just a predictable complaint people could have (e.g. they insist on timing that way at Debian’s Benchmark game), unfair as it is.

Palli · October 20, 2022, 9:45am

@HexSpin, @ellocco, Isn’t this related (or same type of code? I’ve yet to watch the full video): Rust / Julia Comparison Post and Optimal Quordle

Here’s Julia code that Parker linked to:

It uses Polyester threads. While good, it is still far behind the best, which will be hard to beat. It’s a good challenge (excluding Julia startup/compilation overhead), but if compiling to a binary with StaticCompiler.jl, I think the best is currently impossible to beat, since StaticCompiler.jl doesn’t support threads.

mkoculak · October 20, 2022, 12:17pm

Under the videos there are a bunch of links. Most of the benchmarking is done by Benjamin Paassen; here there is a spreadsheet of implementations he curates

Palli · October 20, 2022, 12:58pm

Hmm, we’re beaten by Python (and JavaScript, Go and Haskell) on time, even Python is 16x faster, so I think we can do better. And intriguingly the smallest code is Python at 18 lines (Julia’s is 447), and even that code is also 10x faster than Julia. It’s also intriguing to compete on size (lines of code or bytes): 5words538 - Python Repl - Replit

Oscar_Smith · October 20, 2022, 1:14pm

The Julia code could easily be a ton faster. It is using Set{Char}, so just switching to a bitmask data structure will be a massive improvement.
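
For anyone wondering what the bitmask idea looks like, here is a minimal sketch. It is written in Python rather than Julia, it is not taken from any of the submitted solutions, and the names `word_mask`, `has_distinct_letters` and `disjoint` are made up for illustration. It assumes words contain only the letters a-z. Each word becomes a 26-bit integer, so "do these two words share a letter?" is a single AND instead of a set intersection.

```python
def word_mask(word: str) -> int:
    """Encode a word as a 26-bit integer: bit i is set if letter chr(ord('a') + i) occurs.

    Assumption for this sketch: the word contains only the letters a-z.
    """
    mask = 0
    for ch in word.lower():
        mask |= 1 << (ord(ch) - ord("a"))
    return mask


def has_distinct_letters(word: str) -> bool:
    """True if no letter repeats, i.e. the popcount of the mask equals the word length."""
    return bin(word_mask(word)).count("1") == len(word)


def disjoint(mask_a: int, mask_b: int) -> bool:
    """Two words share no letter exactly when their masks have no common bit."""
    return mask_a & mask_b == 0


if __name__ == "__main__":
    fjord, gucks, words = (word_mask(w) for w in ("fjord", "gucks", "words"))
    print(disjoint(fjord, gucks))  # no shared letters
    print(disjoint(fjord, words))  # share o, r, d
```

In Julia the mask would presumably be stored in a `UInt32`; the point is only that the comparison becomes integer arithmetic.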

HexSpin · October 20, 2022, 1:50pm

Perhaps - but I’m hardly qualified to comment. I’m what you might call a “Matt Parker kinda coder, but in Julia”.

lrnv · October 20, 2022, 2:30pm

Seeing the video, I was surprised there was only one Julia version, and thought the community would do something to defend what is ours

HexSpin · October 20, 2022, 2:31pm

I had the same reaction - but not having that skill level, I thought I’d offer it up for the more able.

palday · October 20, 2022, 4:25pm

As I commented on Slack in #performance-helpdesk:

algorithmically it’s nothing special — the whole point was to do the algorithm done by Ben Paaßen in Python and show that you get a big speed boost just by moving to Julia and doing some fairly naive things
[it includes] None of the bit twiddling done by the fastest C implementations nor any of the scheduling of loading (if their run time is < 500ms, then I’m curious how they’re loading the file and doing the actual analysis).

This is also discussed in the README in that repo.

It’s also a bit more general in its formulation than Paaßen’s: it’s written as a recursive call, so it can search for cliques of any size and with any (constant) number of letters in the constituent words.
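
To illustrate what a recursive formulation for cliques of any size can look like, here is a rough Python sketch. It is not the code from that repo: it uses integer bitmasks instead of Set{Char}, and the names `word_mask` and `find_cliques` are hypothetical. It assumes words contain only the letters a-z.

```python
from typing import Iterator, Sequence, Tuple


def word_mask(word: str) -> int:
    """Encode a word as a 26-bit integer (assumes only the letters a-z)."""
    mask = 0
    for ch in word.lower():
        mask |= 1 << (ord(ch) - ord("a"))
    return mask


def find_cliques(
    masks: Sequence[int],
    size: int,
    start: int = 0,
    used: int = 0,
    chosen: Tuple[int, ...] = (),
) -> Iterator[Tuple[int, ...]]:
    """Yield index tuples of `size` words that pairwise share no letter.

    `used` is the union of the letters of the words chosen so far, so one AND
    checks a candidate against all of them at once. Only indices >= `start`
    are considered, so each clique is reported once, in increasing index order.
    """
    if len(chosen) == size:
        yield chosen
        return
    for i in range(start, len(masks)):
        if masks[i] & used == 0:
            yield from find_cliques(masks, size, i + 1, used | masks[i], chosen + (i,))


if __name__ == "__main__":
    words = ["ab", "cd", "ef", "ac"]
    masks = [word_mask(w) for w in words]
    for clique in find_cliques(masks, 3):
        print([words[i] for i in clique])
```

Nothing here depends on the words having five letters or on the clique size being five, which is the generality described above. It does no pruning beyond the disjointness check, so it is nowhere near the fast solutions.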

Related to “what counts as time?”, I had also commented in the same Slack thread:

And given comments on Patreon and elsewhere, I also don’t think people knew how to do timing in Julia, so I have no idea how they counted e.g. compilation, etc. If somebody has a C program with -O3 that takes several tens of seconds to compile but runs in < 1 second, then I think Julia is comparable, for a total runtime of ~40 seconds

fins · October 21, 2022, 11:22am

I only briefly looked at https://github.com/stew675/standup5x5/blob/master/525.c (which seems to be one of the fastest solutions to that challenge, though I may have missed something). This kind of C(++) code is so close to assembler, nailing the relevant portions of the algorithm to specific instructions (AVX, popcnt, “regular” boolean logic, …) for a specific piece of silicon, that I don’t think compiler options even contribute all that much to performance. Probably the author’s main reason for not coding in assembler directly was to have a (relatively) nice way to write the utility (non-performance-critical) parts and to not lose track of multithreading issues and maybe the overall control flow, which doesn’t even look too complicated in this piece of code.

My guess is that the best bet in Julia would be something along those lines: bit-fiddling and using intrinsics with as little overhead as possible from Julia, for maximum performance. But then the question really is: what’s even the point, as it is just competing over which language is more suited to doing assembler…

I think people often forget that there is only so much any given CPU can do within a given time, with a fixed instruction set, memory bandwidth, etc. Any high-level language can shine when it is able to save the CPU some work through clever algorithms. BUT the problem also has to be complex enough to even provide these opportunities for “work-pruning”. Once optimisation of algorithms and heuristics is exhausted, it’s “just” going to be a matter of how to tell the CPU what to do with as little overhead as possible.

HexSpin · October 21, 2022, 11:57am

Your observations seem correct to me. You ask what the point is … and I’d say pride : being beaten by C++ okay, but crushed by Java and Python… ouch. I guess this is why pride is a sin.

fins · October 21, 2022, 12:36pm

HexSpin wrote: “Your observations seem correct to me. You ask what the point is … and I’d say pride : being beaten by C++ okay, but crushed by Java and Python… ouch. I guess this is why pride is a sin.”

Haha, I totally get that. My 2 cents: effort in that direction is probably better invested in tackling one of the high-profile benchmarks, even though they are currently often evaluated with flawed metrics. Gain more traction first, then question (some of) the established metrics. This (nonetheless fun) challenge will likely be forgotten in 2…3 weeks.

Palli · October 21, 2022, 1:47pm

Startup-time is avoidable, see e.g. this benchmark (also its intriguing accuracy column):

@brenhinkeller I believe ~~it uses threading~~, so matching (and beating?) Parker’s challenge ~~should then be possible~~. Can someone also look into making that Julia code a tiny bit faster, to close the 0.46% speed-gap (less accurate if needed?), to claim top spot there?

EDIT: It doesn’t use threading (can’t?); I misremembered, it used @simd. The best code for Parker’s challenge uses threading (plus interesting tricks and improved algorithms), so it is very hard to beat. How difficult and far off might it be to support threading (Julia’s or Polyester threads) in StaticCompiler.jl?

fins · October 21, 2022, 5:11pm

This must be a new(er) implementation that hadn’t been reported before. I can’t imagine it doesn’t already use threading, with that result being almost on par with the best C(++) contenders.
Can’t seem to find a link anywhere. Do we have a repo link for the Julia code?

lol, just realized this is not regarding the Matt Parker challenge.

Palli · October 26, 2022, 5:20pm

This change alone would also help:
