Reading and writing BigInt to file

I’m writing a package that creates a cryptographic key consisting of big prime numbers, and I need to write them to a file and read them back. Is there a preexisting library that does this, or do I have to write the code myself? Is a four-byte length big enough for BigInt in general?

You could write the prime numbers, one per line, to an ordinary text file with println, then read them back with readlines and parse back to BigInt with parse(BigInt, line).
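
That round trip looks like this (the file name is arbitrary, and the primes here are just small sample Mersenne primes standing in for real key material):

```julia
# Sample Mersenne primes stand in for real key material.
primes = [big(2)^89 - 1, big(2)^107 - 1]

open("primes.txt", "w") do f
    for p in primes
        println(f, p)          # one prime per line, decimal text
    end
end

back = [parse(BigInt, line) for line in readlines("primes.txt")]  # == primes
```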

No. Not all BigInts fit in four bytes. They can be arbitrarily big.

A BigInt can have more than 4294967295 bytes? I seriously doubt I’ll need a prime number that big, but if BigInts can be bigger than that, then the library should use an eight-byte length.

To elaborate, there might be something out there more specific to GMP (which BigInt is built on), but writing out the integer in text can apply to any format. Here’s an example:

open("twotohundred.txt","w") do f
    print(f, BigInt(2)^100)
end

That converts the BigInt to a String with the function string in a deeper call, so it’ll look just as you’d expect. If you call string directly yourself, you can also use write instead of print. And you’re not limited to newlines: you can delimit the integers however you like, since you can print or write anything.

GMP’s integer data grows in word-size increments: that’s 4 bytes on 32-bit systems and 8 bytes on 64-bit systems. Converted to UTF-8 text in base 10, though, it takes 1 byte per decimal digit.
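
To put numbers on that, you can peek at the limb count through BigInt’s internal size field (an implementation detail of Base.GMP, so treat this as illustrative only):

```julia
n = big(2)^100

limb_bytes = abs(n.size) * sizeof(Base.GMP.Limb)  # raw GMP limb data
text_bytes = ndigits(n)                           # one byte per decimal digit
# For 2^100: 16 bytes of limb data vs 31 bytes of decimal text.
```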

Sure, if you have enough working memory for it.

No delimiter is necessary, because the number of bytes in the BigInt will be written before the bytes themselves.

The number will be written as raw bytes, 8 bits of information per byte, not as decimal digits, which carry only about 3.3 bits of information per byte.
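
A format like that, a byte-count prefix followed by the raw bytes, is only a few lines to implement. A sketch, with hypothetical helper names, 8-byte little-endian length, little-endian magnitude bytes, and sign handling omitted since primes are positive:

```julia
function write_bigint(io::IO, n::BigInt)
    bytes = digits(UInt8, n; base = 256)   # little-endian magnitude bytes
    write(io, htol(Int64(length(bytes))))  # 8-byte length prefix
    write(io, bytes)
end

function read_bigint(io::IO)
    len = ltoh(read(io, Int64))
    bytes = read(io, len)
    evalpoly(big(256), bytes)              # reassemble: sum(bytes[i] * 256^(i-1))
end

io = IOBuffer()
write_bigint(io, big(2)^521 - 1)
seekstart(io)
m = read_bigint(io)                        # == big(2)^521 - 1
```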

I’ve never actually tried this, but the GMP library webpage https://gmplib.org states that the precision is limited only by available memory, so on a 64-bit system a single BigInt could take more than 4 GB.

Definitely possible. GMP has functions for raw I/O like this, and even if it didn’t, we have access to the raw data via the BigInt struct; it’s just a matter of figuring out how much to read from the pointer. But I don’t know BigInt or ccall well enough to do this.

Is this really worth the extra effort? For a 4096-bit cryptographic key, this is less than a kilobyte of disk space saved…

It says it writes four bytes of size. However:

julia> big(256)^444;

julia> big(256)^2147483647;

julia> big(256)^2147483648;

julia> big(256)^4294967295;

julia> big(256)^4294967296;

julia> big(256)^5294967296;

julia> big(256)^5294967296%7
Killed

Julia can hold a number bigger than that, but this computer can’t compute with it.

It’s more effort to convert to decimal than to write the bytes as they are. Not that it matters much, since writing is I/O-bound. But most of the key files I’ve seen are either binary or base64-encoded, not decimal.
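
For the base64 style of key file, the stdlib Base64 can wrap whatever byte layout you pick; a sketch using little-endian magnitude bytes (one choice among many, not a standard key format):

```julia
using Base64

n = big(2)^127 - 1
bytes = digits(UInt8, n; base = 256)   # little-endian magnitude bytes
b64 = base64encode(bytes)              # printable, key-file style

decoded = base64decode(b64)
m = evalpoly(big(256), decoded)        # == n
```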

In that case, your computer is not going to find a prime number that needs more than 17 megabytes of storage either… :wink:

My god: 1 year, thousands of GPUs, and $2M just to find a 17 MB-ish BigInt we can instantiate in a few ms. I guess the eternal bragging rights are worth more than crypto or AI bot profits.

If you are not bothered by serialisation performance, you could use parse(BigInt, bytes2hex(x), base = 16) and, for the reverse, hex2bytes(string(x; base = 16)). This, however, adds unnecessary computation and memory overhead, so you may benefit from the int2octet and octet2int methods as implemented in CryptoGroups.jl.
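
One caveat with the hex route: hex2bytes requires an even number of hex digits, and string(x; base = 16) can produce an odd count, so a padding step is needed. A sketch (the helper names are my own, positive integers only):

```julia
function int_to_bytes(x::BigInt)
    s = string(x; base = 16)
    isodd(length(s)) && (s = "0" * s)  # hex2bytes needs an even digit count
    return hex2bytes(s)
end

bytes_to_int(bytes) = parse(BigInt, bytes2hex(bytes); base = 16)

x = big(2)^17                          # hex "20000" has 5 digits, so padding kicks in
y = bytes_to_int(int_to_bytes(x))      # == x
```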

iirc GMP BigInt performance for parsing from a string is quite bad (in my experience with integers of a couple MB).

That said, I believe there is a GMP method for serializing and deserializing based on their in-memory representation. The problem is that it is not exposed on the Julia side.

I think I played around with calling the C function directly, I could go back and find it if there’s interest (I think it’d be worth submitting a PR to Julia, too)

For serialising BigInt to the buffer there is:

_, written = GMP.MPZ.export!(@view(buffer[end-needed_bytes+1:end]), n;
                             order = 1,   # Big-endian
                             endian = 0,  # Native endian
                             nails = 0)   # Use all bits

However there does not seem to be an inverse method. At least I haven’t managed to find one.

Yes, we need a wrapper of mpz_import. Shouldn’t be hard to write. Could maybe improve the performance of the serialize and deserialize functions too?
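
A minimal wrapper might look like this. This is a sketch only, assuming GMP’s documented mpz_import signature and the :libgmp handle that Julia’s own ccalls already resolve; the function name mpz_import! is made up:

```julia
# void mpz_import(mpz_t rop, size_t count, int order, size_t size,
#                 int endian, size_t nails, const void *op)
function mpz_import!(rop::BigInt, bytes::Vector{UInt8};
                     order::Integer = 1,   # most significant byte first
                     endian::Integer = 0,  # native (irrelevant for 1-byte words)
                     nails::Integer = 0)   # use every bit of each byte
    ccall((:__gmpz_import, :libgmp), Cvoid,
          (Ref{BigInt}, Csize_t, Cint, Csize_t, Cint, Csize_t, Ptr{UInt8}),
          rop, length(bytes), order, 1, endian, nails, bytes)
    return rop
end

n = mpz_import!(BigInt(), UInt8[0x01, 0x00])   # == big(256)
```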

Serialization currently uses str = string(n, base = 62) and parse(BigInt, str, base = 62) to convert to/from a buffer more efficiently, without going through decimal, so that’s also an option. (62 is the biggest base you can represent with 1-byte [0-9A-Za-z] digits.)
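
For reference, that base-62 round trip in isolation, with a rough size comparison against decimal:

```julia
n = big(2)^521 - 1
str = string(n; base = 62)
m = parse(BigInt, str; base = 62)   # == n
# Base 62 is noticeably more compact than decimal: ~88 characters here vs 157.
```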

Okay, it looks like I’ll write my own code to do this.

Unless there is a need for a compact binary representation, text-editor-compatible output, as stevengj suggested, is best.
You could even use base = 60, just as a nod to Babylonian times and to keep a couple of sentinel values for begin, end, and other special digits (possibly even as prep for BigFloat serialization).