sizeof(str::AbstractString)
Size, in bytes, of the string str. Equal to the number of code units in str multiplied by the size, in bytes, of one code unit in str.
From this I understood that, for a string, sizeof and Base.summarysize should return the same value… What am I missing?
Some context: I want to convert a Vector of Strings into a Vector of some struct by splitting each string at some separator, then converting the resulting substrings to more appropriate types (Char, Int …) where possible.
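For concreteness, here is a minimal sketch of that kind of conversion. The `Record` struct, its field names, the `parse_record` helper, and the comma separator are all hypothetical and chosen only for illustration; the real data layout may differ.

```julia
# Hypothetical record type; the fields are illustrative assumptions.
struct Record
    tag::Char
    count::Int
end

# Split one line at `sep` and try to convert each piece.
# Returns `nothing` when the line does not have the expected shape.
function parse_record(line::AbstractString; sep = ',')
    parts = split(line, sep)              # Vector of SubString
    length(parts) == 2 || return nothing
    tag_str = strip(parts[1])
    length(tag_str) == 1 || return nothing
    count = tryparse(Int, strip(parts[2]))  # `nothing` if not an integer
    count === nothing && return nothing
    return Record(first(tag_str), count)
end

lines = ["a,1", "b,22", "oops"]
# Keep only the lines that parsed successfully.
records = [r for r in parse_record.(lines) if r !== nothing]
```

Using `tryparse` rather than `parse` avoids an exception on malformed input, which matches the "if possible" requirement.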
sizeof('z') == 4 because a Char is stored as a 32-bit value (see ?Char). This is required so any Unicode codepoint can fit in a Char.
sizeof("z") == 1 because encoding “z” in UTF-8 takes only one byte.
Base.summarysize('z') == 4 because a Char is a simple value type.
Base.summarysize("z") == 9 because… hmm, I’m not sure: I think this counts 8 bytes for the pointer to the region of memory that holds the string, and 1 byte for the string itself. But shouldn’t it also count some bytes for storing the length of the string?
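These four values can be checked in the REPL. The snippet below is a small sketch; note that `Base.summarysize` is not exported and the value for strings assumes a 64-bit machine. The string `"é"` is an extra example (not from the posts above) to show that `sizeof` counts UTF-8 code units rather than characters.

```julia
c = 'z'
s = "z"

sizeof(c)             # 4: a Char is a 32-bit value
sizeof(s)             # 1: "z" takes one byte in UTF-8
Base.summarysize(c)   # 4
Base.summarysize(s)   # 9 on a 64-bit machine

# A non-ASCII string: one character, but two UTF-8 code units.
t = "é"
length(t)             # 1 character
ncodeunits(t)         # 2 code units
sizeof(t)             # 2 bytes

# The relation stated in the sizeof docstring:
# number of code units times the size of one code unit.
matches_doc = sizeof(t) == ncodeunits(t) * sizeof(codeunit(t))
```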
The reason that this is 9 is that, internally, a String consists of both an array of bytes (the UTF-8 code units of the encoded string) and an internal length::Int field, and summarysize includes the size of that Int. sizeof(Int) == 8 on a 64-bit machine, and 1+8 == 9. (Technically, a String object may have an even bigger footprint in memory: not only may it implicitly include a 1-byte NUL terminator for ease of passing to C, but a heap-allocated Julia value can also have a preamble with a type tag and some other info.) In contrast, sizeof only gives you the size of the underlying String data and not of the Julia wrappers around it.
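A quick way to see this fixed overhead is to compare the two functions on a few strings. This is only a sketch: the sample strings are arbitrary, and since `Base.summarysize` is an unexported function, its exact accounting could differ between Julia versions.

```julia
samples = ["", "z", "héllo"]

# For each string, compute how much summarysize reports beyond the raw data.
overheads = map(samples) do s
    data  = sizeof(s)             # bytes of UTF-8 data only
    total = Base.summarysize(s)   # data plus the length field
    println(repr(s), ": sizeof = ", data,
            ", summarysize = ", total,
            ", difference = ", total - data)
    total - data
end

# Per the explanation above, every difference should equal sizeof(Int).
constant_overhead = all(==(sizeof(Int)), overheads)
```

If the explanation above holds on your Julia version, the difference is the same for every string, regardless of its length.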