I believe this is an excellent real-world example to showcase Julia’s superiority.
GaloisFieldNumbers is a reference (snapshotted) implementation that shows the beauty of Julia in 300 lines of code. It was released a few months ago in the Chinese community and was never posted here.
In summary:
- it’s about 300 lines of code, written in about 2-3 days.
- on CPU, it’s 1000 times faster than MATLAB’s C code, thanks to two key design choices:
  - composability: a scalar struct design that reuses the existing array implementation
  - generated functions: the lookup tables (used by multiplication/division) are generated dynamically without runtime overhead; this further enables SIMD.
- it’s GPU-ready without a single CUDA.jl-related line of code, because the scalar struct is a bits type.
- a MATLAB/Python + C solution could never reach this performance (because those languages don’t have generated functions)
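To make the two design ideas a bit more concrete, here is a minimal sketch of how they can fit together. It is not the actual GaloisFieldNumbers code: the names `GF`, `GF8`, `build_tables` and `tables` are made up for illustration, and I assume GF(2^8) with the primitive polynomial `0x11d` and generator 2.

```julia
# Illustrative sketch only; every name here is hypothetical.
# P is the primitive polynomial of GF(2^8), kept as a type parameter.
struct GF{P} <: Number
    x::UInt8
end
const GF8 = GF{0x11d}  # x^8 + x^4 + x^3 + x^2 + 1

# Plain helper that builds the exp/log tables for the generator α = 2.
function build_tables(poly)
    exptab = zeros(UInt8, 510)
    logtab = zeros(UInt8, 256)
    v = 1
    for i in 0:254
        exptab[i + 1] = v        # α^i
        logtab[v + 1] = i        # log_α(v)
        v <<= 1
        v > 0xff && (v ⊻= poly)  # reduce modulo the primitive polynomial
    end
    exptab[256:510] = exptab[1:255]  # α^(i+255) == α^i, so `*` needs no mod
    return Tuple(exptab), Tuple(logtab)
end

# The generator body runs at compile time for each field type; the tables
# are spliced into the generated method as constants.
@generated function tables(::Type{GF{P}}) where {P}
    exptab, logtab = build_tables(P)
    return :(($exptab, $logtab))
end

Base.zero(::Type{GF{P}}) where {P} = GF{P}(0x00)
Base.:+(a::GF{P}, b::GF{P}) where {P} = GF{P}(a.x ⊻ b.x)  # addition is XOR
function Base.:*(a::GF{P}, b::GF{P}) where {P}
    (iszero(a.x) || iszero(b.x)) && return GF{P}(0x00)
    exptab, logtab = tables(GF{P})
    return GF{P}(exptab[Int(logtab[a.x + 1]) + Int(logtab[b.x + 1]) + 1])
end

# Only scalar methods are defined above; arrays come for free via broadcasting.
a = GF8.(UInt8[1, 2, 3, 128])
b = GF8.(UInt8[7, 128, 3, 2])
a .* b           # elementwise field multiplication
a .+ b           # elementwise field addition
isbitstype(GF8)  # true
```

Nothing in the sketch is array-specific. Because `GF8` is a plain bits type wrapping a `UInt8`, generic array code can store and process it like any other number, which is the property the GPU point above relies on.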
Some of the materials are written in Chinese, and there’s also a talk about it from the JuliaCN meetup 2022 on bilibili
It’s 1000 times faster than MATLAB’s C code
Don’t ever try to conclude that Julia is 1000 times faster than C. It’s Julia plus the code design for a very specific example that makes this possible.
I was one of the core maintainers of JuliaImages and I’ve used Julia for over five years. Now I’m working full-time at TongYuan. It’s a busy life here, and this post was written quickly without much detailed explanation (sorry… ). If you have any questions about the source code, I’ll reply when I’m available.