Python big integer vs. Julia big integer arithmetic

And I am expected just to believe such authoritative claims without any evidence I can check out myself?

Yes, definitely, if you want to benchmark this properly in Maxima, you should write the timing in Maxima too, analogous to @btime or timeit.

But don't demand that people do this for you; we all have better things to do than to satisfy "please benchmark X against language Y" requests for lots of values of X and Y.

You can believe whatever you want; people here have no obligation to prove anything to you. But there have been vast numbers of comparisons of Julia performance to other languages in this forum and elsewhere, which you can easily find if you try.

6 Likes

I see a lot of evidence; why should we all lie to you?

Maybe just because I am after fair comparisons, and don't trust tools that calculate running times for small code snippets by doing some statistics and other "magic" where I am not able to see what they are actually doing?

If you want to optimize Julia for command-line usage, there are a few things you can do here:

  • use --startup-file=no (this makes sure you're not including the time it takes to run any of your startup scripts)
  • use -t1 (i.e. --threads=1). Julia is natively multithreaded, unlike Python, but acquiring OS threads at startup is a pretty big source of overhead that can be eliminated by acquiring only one thread.
  • use --compile=min. Julia's JIT compilation is great for runtime performance, but you seem to not care about runtime performance and have instead constructed a benchmark that's completely dominated by Julia's startup time, so we can trash runtime performance and come out ahead by reducing the time spent compiling.

After all that, I get:

before:

āÆ time julia -e 'powermod(3, 10000000000000, 102)'
0.41s user 0.59s system 241% cpu 0.410 total

and after:

āÆ time julia -t1 --startup=no --compile=min -e 'powermod(3, 10000000000000, 102)'
0.05s user 0.06s system 116% cpu 0.096 total

which is a pretty good uplift. But at the end of the day, this sort of benchmark is just not something Julia is particularly good at (though it's getting better at this every release).

If all you care about is the startup time of your programs, Python is a great option. If you later start doing calculations that are sensitive to the runtime performance, then julia will be ready and waiting for you.

8 Likes

ā€œLyingā€ is often done out of not being aware that it is not true. An authoritative person states something what is not true and all others just believe it without questioning it themselves. You see it everywhere on the Internet. Untrue statements spread around thousands of times ā€¦ It is the convenience of not testing oneself before passing some information as true further on - maybe the reason why this discussion lacks evidence using appropriate code?

They generally run the code snippet some thousands or millions of times and divide the total time by the number of runs. Most developers who depend on speed use such tools regularly, e.g. to compare different ways of doing things. Adjustments to an inner loop may speed things up by a factor of two, five, or a hundred.
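For the curious, this is easy to reproduce by hand, so there is no "magic" to take on faith. A minimal sketch using Python's standard timeit module (the repetition count here is an arbitrary choice):

```python
# Run the snippet many times and divide the total elapsed time by the
# repetition count, so the per-call cost emerges from timer noise.
import timeit

reps = 100_000  # arbitrary; pick enough repetitions to swamp timer overhead
total = timeit.timeit(lambda: pow(3, 10_000_000_000_000, 102), number=reps)
print(f"{total / reps * 1e6:.2f} microseconds per call")
```

Tools like @btime from Julia's BenchmarkTools do essentially the same thing, with extra care about warmup and outliers.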

If you look at the Julia development repository on GitHub you'll see that a regular thing that's done with new versions is to run a number of such benchmarks and compare them to previous versions (e.g. here: Performance regressions in linear algebra benchmarks with Hermitian and Triangular matrices · Issue #54810 · JuliaLang/julia · GitHub). If anything is noticeably slower it's investigated further. So, these tools are indispensable for development work.

4 Likes
~ $ time julia -t1 --startup-file=no --compile=min -e 'powermod(3, 10000000000000, 102)'

real	0m0.212s
user	0m0.148s
sys	0m0.069s
~ $ time maxima --very-quiet --batch-string='block([modulus:102], print(rat(3)^10000000000000), exit());'

block([modulus:102],print(rat(3)^10000000000000),exit())
warning: assigning 102, a non-prime, to 'modulus'
69 
                                    exit()

real	0m0.093s
user	0m0.070s
sys	0m0.030s
~ $ time python -c "print( pow(3, 10000000000000, 102) )"
69

real	0m0.034s
user	0m0.025s
sys	0m0.008s

All of these are interpreted "languages". If they all do it right… how come there are such big differences?

Julia is not (typically) an interpreted language. It is a just-in-time compiled language that tries to provide as many of the nice interactive features of an interpreted language as possible without sacrificing runtime performance.

When you load Julia you have to load around half a gigabyte of objects into RAM, because Julia includes a big heavy compiler, a sysimage with the precompiled methods of Base and many standard libraries, and a bunch of binary artifacts like OpenBLAS.

Fast startup was not a design goal for Julia, so it's not very good at it. It's getting better every release, but it's not at all clear whether a regular Julia REPL will ever load anywhere near as fast as a regular Python REPL.

7 Likes

Although Julia does contain an interpreter, it is not an interpreted language. In general, functions are compiled to native CPU code before being run. Compiling takes time; we've told you this numerous times. The startup time also varies with what is being done at startup. In particular, Julia loads a quite large system library, a compiler, and so on, and does quite a lot of initialization. Other languages may load things incrementally as needed.

Note that most of the time you measure for Python is also Python's startup time. The time to do the pow is negligible: it's doing a few hundred squarings, mods, and multiplications, a few-microsecond task in most languages, interpreted or compiled.
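This is easy to check from inside a running interpreter, where startup no longer pollutes the measurement (a sketch; the exact timing will vary by machine):

```python
import time

t0 = time.perf_counter()
result = pow(3, 10_000_000_000_000, 102)  # modular exponentiation, small residues
elapsed = time.perf_counter() - t0

print(result)                     # 69, matching the shell runs above
print(f"{elapsed * 1e6:.1f} us")  # typically a few microseconds, not tens of ms
```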

6 Likes
(defun powermod(n pow mod)
  (cond
       ((= pow 0) 1)
       ((= pow 1)(rem n mod))
       ((evenp pow)(powermod 
                      (rem (* n n) mod)
                      (ash pow -1)
                      mod))
       (t (rem (* n (powermod n (1- pow) mod))
                mod))))

thanks, here it is indented. I wouldn't have bothered except to point out to the non-Lispers in the audience that the importance of indentation in programming languages goes back a long time (in Lisp) and that typical editors do it for you. The misalignment of indentations is often useful for debugging. Some editors do a lot more, e.g. Emacs.
Some programmers even favor Emacs as the "console"… the home for the REPL.
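For comparison, here is a rough Python translation of the Lisp function above, using the same recursive square-and-multiply scheme (a sketch only; in practice Python's built-in three-argument pow(n, p, m) already does this):

```python
def powermod(n, p, m):
    # Recursive square-and-multiply, mirroring the Lisp version above.
    if p == 0:
        return 1
    if p == 1:
        return n % m
    if p % 2 == 0:
        # Even exponent: square the base (mod m), halve the exponent.
        return powermod(n * n % m, p >> 1, m)
    # Odd exponent: peel off one factor of n.
    return n * powermod(n, p - 1, m) % m

print(powermod(3, 10_000_000_000_000, 102))  # 69, agreeing with pow(3, ..., 102)
```

Because the exponent halves every other step, the recursion depth stays around 2·log2(p), well within Python's default limit even for this 10^13 exponent.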

If we were really timing bignum arithmetic instead of startup time for doing arithmetic on small numbers (modulo 102), it would be almost pointless to make a fuss about Lisp, because non-trivial bignum computations would probably go through GMP, the GNU multiple-precision library.
Possibly also pointless regarding Python, for the same reason… there are python-gmp bindings. Apparently several.

Regarding comparisons and whom to believe: I tend to believe claims by various programming systems that they run faster than Python. If they claim to run significantly faster than (well-written) C or C++, I might be more curious.

4 Likes

Unfortunately, if you don't trust software to do the benchmarking for you, you will not be able to time fast code. The benchmarks you have conceived are completely dominated by noise. The code itself will run in sub-microsecond time, but if you only trust timings you can "see with your own eyes", then I guess you are out of luck 🤷

5 Likes

I think I missed that part. I understand that there are situations where Julia can beat C (where one would need complicated optimizations in C), but I don't think that is likely to apply to powermod.

BTW, this isn't bignum; I'm not positive I understood you correctly on that point.

It seems so. The original Julia parser is written in Scheme: julia/src/julia-syntax.scm at master · JuliaLang/julia · GitHub

Yes… it comes down to having an appropriate task that takes time when scaled up. Misled by a posting in the original discussion thread this one was split off from, I erroneously assumed powermod() was suitable for showing the power of big-integer arithmetic, but it is trivial no matter the exponent, because the modulo keeps eliminating the need for actual big-integer operations, right?

In other words, this discussion needs another example of code performing big-integer arithmetic that can easily be scaled up, where the runtime grows at least linearly with the size or number of the values involved.
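One possible candidate (my own choice, not something proposed in the thread): factorials grow without bound, so the runtime is dominated by big-integer multiplication and visibly scales with n. A sketch:

```python
import math
import time

# n! is a genuinely big integer; doubling n should more than double the
# runtime, since the bignum operands themselves keep growing.
for n in (10_000, 20_000, 40_000):
    t0 = time.perf_counter()
    f = math.factorial(n)
    dt = time.perf_counter() - t0
    print(f"{n}! has {len(str(f))} digits, computed in {dt:.3f}s")
```

A benchmark like this measures the big-integer machinery itself (GMP, in Julia's case) rather than interpreter startup.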

1 Like

Julia draws inspiration from various dialects of Lisp, including Scheme and Common Lisp

Some people consider Julia to be a dialect of Lisp. There's even a built-in Lisp mode that lets you write with s-expressions.

I think all languages that support big integers do it by calling GMP routines, so they will essentially run exactly the same code for the big-integer operations. There are some differences, though. I think Julia can't do in-place operations on BigInt, and this may slow it down for some operations compared to languages which do enable in-place operations (it saves allocations).

1 Like

If you actually computed 3^100000 and then reduced it modulo 101 or 102, then you would be doing bignum arithmetic. It would not be a smart way of doing it, though.
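To make the difference concrete (a sketch): the naive route first builds an integer of roughly 48,000 digits and only then reduces it, while modular exponentiation keeps every intermediate result below the modulus.

```python
naive = pow(3, 100_000) % 101  # full bignum power (~48,000 digits), then reduce
smart = pow(3, 100_000, 101)   # square-and-multiply mod 101 throughout
assert naive == smart          # same answer, very different amounts of work
print(naive)
```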

1 Like

I was not thinking of the surface syntax of the language, but of the lessons of the implementations. Stuff like memory management, faster light-weight subroutine calls, how/where to store lexical values (shallow vs deep binding).

1 Like

I think you missed the larger point. (The syntax mode is just an Easter egg, but it emphasizes how much they did in fact take lessons from Lisp.)