Can Julia do this algorithm faster?

So I’ve developed the fastest algorithm to find Twin (Cousin) Primes. Here’s the paper on it.

If you’re REALLY interested in the math also check out these:

and its video: (Simplest) Proof of Twin Primes and Polignac's Conjectures - YouTube

So far I’ve personally implemented the algorithm in 6 different languages.

Crystal – Twinprimes generator, multi-threaded, using SSoZ (Segmented Sieve of Zakiya), written in Crystal · GitHub

Rust – Twinprimes generator, multi-threaded using rayon, using SSoZ (Segmented Sieve of Zakiya), written in Rust · GitHub

Nim – Twinprimes generator, multi-threaded, using SSoZ (Segmented Sieve of Zakiya), written in Nim · GitHub

C++ – Twinprimes generator, multi-threaded, using SSoZ (Segmented Sieve of Zakiya), written in C++. · GitHub

D – Twinprimes generator, multi-threaded, using SSoZ (Segmented Sieve of Zakiya), written in D · GitHub

Go – Twinprimes generator, multi-threaded, using SSoZ (Segmented Sieve of Zakiya), written in Go. · GitHub

I’ve been asked multiple times, “Hey, have you done a Julia version?”

But I don’t know no stinkin Julia! :face_with_hand_over_mouth:

So I keep seeing the propa…er PR about how good, fast, yada, yada it is, but talk is cheap!

Currently the Rust version is by far the fastest and uses the least memory. It was done with a lot of help from the Rust forum (really nice people over there), because I ain’t no (frequent) Rust programmer either.

So to stop the Julia fanboys asking me this, I’m requesting someone(s) who know what they’re doing to conjure up a Julia version (post a gist link) that will do it proud, to compare to the others. I’ll run some comparisons to test it, and post here. All I can offer is bragging rights if you’re faster, and a promise to learn enough Julia to really understand the implementation.

But seriously, I really would appreciate what you all consider an idiomatic Julia version to compare.


Is there a page where you compare the performance of the versions you have so far?

I’ll post some time comparisons for different inputs across all the versions, run on my laptop, probably tomorrow. I just optimized all the implementations over the past 2 weeks, so I have to redo the tests.


Is there a reason everything is in gists rather than an easily clonable repository?


If you remove the comment lines, the code comes in ~260-270 loc, depending on the language. I started using code for another sieve and just modded it to get this working, circa 2018. I was just doing it for myself. The code is just 1 file, so it didn’t seem worth it to create a repo for something I could distribute/link as a gist. Then I started doing other versions, and didn’t want to do all the work to keep separate repos for each one.

Bottom line is, I didn’t want to do all the work to learn GitHub for a single page of code. I might reconsider in the future, if I want to make it a bigger project, but that’s the answer to your question.


This doesn’t have to do with the sieve, but your proof of infinite twin primes is wrong.


Well, you have to cite where, and why, it’s wrong, don’t you?
Nobody has done that so far.
Email me offline with the specifics so I know what to explain/contest. OK?

Please take the time to thoroughly read, and understand, my papers/video, because it seems almost impossible for you to have done so in such a short time since I started this thread.

Here are some time comparisons I ran today, run on a quiet system after bootup.

System76 laptop (2016); i7 6700HQ, 2.6-3.5GHz, 4C|8T, 16GB memory
Linux 5.15.14, gcc 11.2.0, llvm 12.0.1

I use htop to monitor threads and memory use while the programs run.

I just show Rust and C++ because they were the 2 languages people seemed most interested in.

As a reference, I compare my results with primesieve

GitHub - kimwalisch/primesieve: 🚀 Fast prime number generator

a highly optimized C++ implementation of the Sieve of Eratosthenes (SoE). I used its current version (at the time of writing), primesieve-7.7.

I consider my code reference implementations. They perform all the separate pieces of the
algorithm but are not necessarily optimum for performance or memory use. However, they still
demonstrate the efficiency and inherent performance of the algorithm, which is the fastest
way to generate twin (cousin) primes, based on Prime Generator Theory (PGT) math.
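For anyone porting the algorithm, a simple correctness baseline can be useful before any optimization. The sketch below is my own illustration, not the SSoZ or any of the linked gists: a naive single-threaded Sieve of Eratosthenes that counts twin-prime pairs, which a new implementation's counts can be checked against on small inputs.

```rust
// Naive single-threaded Sieve of Eratosthenes, used only as a correctness
// baseline for small ranges. This is NOT the SSoZ, just a simple reference
// for verifying twin-prime counts from a new port.
fn count_twin_primes(limit: usize) -> usize {
    if limit < 5 {
        return 0;
    }
    let mut is_prime = vec![true; limit + 1];
    is_prime[0] = false;
    is_prime[1] = false;
    let mut p = 2;
    while p * p <= limit {
        if is_prime[p] {
            let mut m = p * p;
            while m <= limit {
                is_prime[m] = false;
                m += p;
            }
        }
        p += 1;
    }
    // Count pairs (p, p+2) with both members prime and p+2 <= limit.
    (3..=limit - 2)
        .filter(|&n| is_prime[n] && is_prime[n + 2])
        .count()
}

fn main() {
    // The 8 twin-prime pairs below 100:
    // (3,5),(5,7),(11,13),(17,19),(29,31),(41,43),(59,61),(71,73)
    println!("{}", count_twin_primes(100)); // prints 8
}
```

This is far too slow to reproduce the table entries below, but it agrees with any correct sieve on ranges small enough to check by hand.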

primesieve actually has 3 different routines, for small, medium, and large input number ranges. For inputs of about 16-20 digits (20 digits is the max for 64 bits) primesieve is faster because it uses look-up tables of compressed sieving primes. My implementations generate all their parameters from scratch, which slows them down when generating the sieving primes for large input values.

My code also doesn’t try to optimize cache hits, etc., like primesieve does, but all the code
to select which Prime Generator (PG) and segment size to use is confined to set_sieve_parameters, so you only have one place to optimize memory use and generator selection. People should appreciate that you can do the whole algorithm in less than 300 loc.

The SSoZ (Segmented Sieve of Zakiya) is designed to be run on multicore systems. The more cores the faster. I’ve only had an x86_64 system to test on, and am looking for people with
different hardware to run it on (ARM|M1, AMD, RISC, PowerPC, etc).

The table shows the times (secs) for Rust, C++, and primesieve for the single inputs shown.
I also show the count of twinprimes (all agree). My implementation also gives the largest
twinprime in the input range. This allows you to search for specific twinprimes using small ranges.

The major difference between the language implementations is how they do parallel processing.
For Rust I use the Rayon crate – GitHub - rayon-rs/rayon: Rayon: A data parallelism library for Rust – while C++ uses OpenMP.
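The shared idea behind both parallel backends is that each thread processes an independent chunk of the range and the per-chunk counts are summed at the end. The hypothetical sketch below illustrates that segment-splitting pattern using only `std::thread` (Rayon's `par_iter` does the same thing with a work-stealing pool); the trial-division `is_prime` stands in for the real per-segment sieving and is not the SSoZ.

```rust
use std::thread;

// Stand-in primality test; the real implementations sieve each segment
// instead of testing numbers one at a time.
fn is_prime(n: u64) -> bool {
    if n < 2 {
        return false;
    }
    let mut d = 2;
    while d * d <= n {
        if n % d == 0 {
            return false;
        }
        d += 1;
    }
    true
}

// Split [0, limit) into `chunks` disjoint segments, count twin pairs whose
// smaller member falls in each segment on its own thread, then sum.
fn count_twins_parallel(limit: u64, chunks: u64) -> u64 {
    let chunk = limit / chunks + 1;
    let handles: Vec<_> = (0..chunks)
        .map(|i| {
            let (lo, hi) = (i * chunk, ((i + 1) * chunk).min(limit));
            thread::spawn(move || {
                (lo.max(3)..hi)
                    .filter(|&n| is_prime(n) && is_prime(n + 2))
                    .count() as u64
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}

fn main() {
    println!("{}", count_twins_parallel(100, 4)); // prints 8
}
```

Because the segments are disjoint and each thread only reads its own bounds, no locking is needed; the count is the same regardless of how many chunks the range is split into.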

This data can be used as references for a Julia implementation to see how it compares when
running each version on the same system.

If there are any questions please feel free to ask here or email me.

         N          |   Rust |   C++  | PrmSev |   Twinprimes   |    Largest in Range   |
     10_000_000_000 |   0.35 |   0.45 |   0.51 |     27,412,679 |     9,999,999,702+/-1 |
     50_000_000_000 |   1.67 |   2.14 |   2.81 |    118,903,682 |    49,999,999,590+/-1 |
    100_000_000_000 |   3.41 |   4.24 |   5.91 |    224,376,048 |    99,999,999,762+/-1 |
    500_000_000_000 |  18.15 |  21.51 |  32.76 |    986,222,314 |   499,999,999,062+/-1 |
  1_000_000_000_000 |  37.67 |  44.48 |  69.25 |  1,870,585,220 |   999,999,999,960+/-1 |
  5_000_000_000_000 | 219.67 | 253.62 | 395.16 |  8,312,493,003 | 4,999,999,999,878+/-1 |
 10_000_000_000_000 | 482.51 | 543.74 | 825.71 | 15,834,664,872 | 9,999,999,998,490+/-1 |

Is your work published in any peer-reviewed journal? I was surprised not to see your name in:

considering you claim that you have proved the conjecture


Here’s the deal.

I created a branch of math I call Prime Generator Theory (PGT) because I wanted to create a fast(er) prime sieve method than I had seen. In 2019, after I had written/done the work on the twinprimes sieve, I wrote the paper on the Twinprimes Conjecture, because it was so obvious Polignac’s Conjecture was true.

But here’s the problem. My methodology is too simple for academic mathematicians to accept, because it doesn’t use (or need) any calculus, algebra, or probability analysis. But nobody can find anything mathematically wrong with it to point out where it fails, because I first developed all this software that uses it, which is prima facie proof it works. See the following Handbook.

I’ve sent my paper to many academic faculty members, but only one said he would take the time to try to understand it. I inquired about submitting it to one journal and they wanted $2K to have it peer reviewed before it could be published. Bunk that!

So at the end of 2021 I did the video on my paper, which actually explains it in more detail, and more simply, than the paper itself, but both complement each other, because I couldn’t put everything from the paper in the video (and vice versa).

Because my background is in engineering, it’s written the way an engineer writes, not an academic mathematician. That doesn’t mean it’s not true; it means mathematicians have to adapt to my presentation to understand it, and most don’t (can’t) do that. Too much ingrained thinking. And that’s why most will not read, understand, or accept it: it’s too far outside their status quo training, and that’s why they would never take this approach in the first place.

But basically all I present in the video is the logical and mathematical basis to show, as a Proof by Contradiction, you can never have a last pair of primes that differ by any gap size, which means there are an infinity of prime pairs of any gap size.

it’s because the proof is wrong (or rather isn’t actually a proof). The paper makes several statements that are just false, and the bulk of the proof relies on empirical evidence from data rather than actual proof.


There are many peer-reviewed journals that do not ask for money to publish (I would even say most of them). Your paper may be paywalled for some amount of time (possibly forever), but you will always be able to distribute the PDF yourself (and often have a preprint on arXiv). Unless your paper is very long (I could not check; I will not make an account), there must exist many journals that would take it.

1 You don’t need to open an account on the site in order to read/download papers. If you use the right browser/extensions to get rid of those popups you won’t be bothered; otherwise click past them and you can get to anything there.

2 arXiv said I had to be sponsored by someone before it would accept my papers, so I also posted them on viXra. Just put my name - Jabari Zakiya - in its search bar and all my papers posted there come up – e-Print archive, Jabari Zakiya

On The Infinity of Twin Primes and other K-tuples

The Use of Prime Generators to Implement Fast Twin Primes Sieve of Zakiya (SoZ)...

3 This is the 21st Century. I don’t need to try to get some stodgy publication to publish my work, to (not) be read by the 100s (?) of people who might read those journals. It’s not about allowing a handful of entities to control knowledge and use of my work, when I have access to the whole internet of humanity. I get orders of magnitude more looks/reads/downloads of my stuff from outside the U.S. than from within.

4 In 2012 I created a Wikipedia page for my general sieve, and it was taken down because they said original authors/creators of materials aren’t allowed to post their stuff; it has to be submitted by someone else with cited references. You’re encouraged to create a submission.

5 Apparently people don’t really bother to read stuff for comprehension anymore, because my papers are written to be understood, and the math applied. If you can’t read and understand my papers and video then there’s no chance in hell you’ll make any sense out of Terence Tao, etc, who only a handful of people in their specialties can probably thoroughly understand. (And I emailed him my paper, which he probably never bothered to look at.)

6 To do a Julia version, you don’t even need to know the math behind it, you just need to translate whatever version you use into Julia. Probably for most people the easiest thing to do is compile the C++ version and use it as a reference to check that you get the same outputs, then optimize afterwards.

I would disagree a bit on point 6, you would need to know the math to some degree to implement it. You’re right that one could learn the math via another language’s implementation, though I can’t say if that would be easier. I don’t know math or any of the languages used so far to participate in the challenge myself, and even if I did I’m only so-so at optimizing Julia (just the typical performance tips of type stability and reducing allocations).

As for the original question “can Julia be faster”, my gut feeling is a hard “maybe”. Julia is designed to allow people to write composable generic code and benefit from performance optimizations typical in lower-level languages; Julia isn’t magically faster than everything. Rust and C++ are some of the fastest languages, so if your implementations are done well, a more realistic goal of a Julia implementation would be to approach or match that performance.

If you want to bring your code (the C++, at least, I don’t know about the others) to Julia, implementing it from scratch isn’t necessary or typical. Some gold-standard numerical libraries in C and Fortran are wrapped for use in Julia code.


This is a false dichotomy. Publishing in a peer-reviewed journal does not hinder (at least not completely) the possibility of distributing almost the same material openly, like I said before. If you search well enough, you can even find a 100% open-publication journal that does not ask for a fee. I did not understand the countries bit, because it seems to imply that journals lock knowledge inside the US, but I am not in the US or a US citizen and have heavily relied on international journals for my work. Also, even if your work is publicly available, researchers may not read it if it is not published, because they will only take the time to read and cite papers that have already passed the sieve of peer review.

I already had the experience of someone emailing a paper to me. It was not pretty; the paper was nonsensical: the author believed he had proved that P == NP using curve fitting over the time increase obtained by increasing the knapsack capacity. There were layers of wrong. I showed my lab colleagues and we had a good laugh, but nobody had the heart to send him an answer back. The problems were so basic that anything we wrote would sound like we were patronizing him (he was an older professor while all of us were PhD candidates).


Alright, this is getting pretty far-afield from a discussion related to Julia — even for something in #offtopic. A better venue for discussing the proofs themselves would probably be Math Overflow or something like it.

Let’s try to keep focused on the possible benchmarks, please. A good benchmark shootout is always great, but @jzakiya please do realize that you’re requesting a good amount of free work from someone here.


As I said in the beginning, it was a request, so I realize no one may bother to try to do it.

However, the first version I did was in Nim, which I had to learn; after that it was all downhill for the other versions, because all I had to do was learn how to do xyz in them.

The Rust version is literally their forum community edition. I posted my original Rust attempt (that worked) and people just optimized some of the pieces, and it was then literally twice as fast.

If people are really interested in the math they should just email me, or start another thread, or check out the discussion in this forum, for example.

That’s a significantly different situation; it’s much easier to give optimization advice on code that exists than it is to translate 200-300 lines of code.


Yes, yes, of course it is, but that wasn’t the salient point.

In the Rust forums they love to code. They know it’s hard to learn, and I find they’re very patient answering the same questions from newbies over and over.

So when I provided them an extensive piece of working code they jumped on it, first because some were curious about what it was doing so they could understand its purpose, but mostly because they wanted to show off (my characterization) how good Rust is, period. They didn’t want my novice code to be what I put out to represent a Rust version of my algorithm (again, my characterization), when compared to the others.

It doesn’t really matter to me what language is the fastest, because they all have their place.
But the best way I learn a language is to use it to do something I care about. And if it can do that, then I keep it in my toolbag of languages to consider using to do other stuff with in the future.

We love code here as well. In particular performant code.

Do the same here and you will likely get a similar reaction. Unfortunately, you didn’t start this thread with Julia code but with a simple request instead.