A Python rant about types

Liso · July 24, 2020, 2:32pm

According to PEP 383, it is based on UTF-8b.

It seems that this type of solution was analyzed but abandoned in Python. See the explanation from that PEP:

“… , the approach of escaping each byte XX with the sequence U+0000 U+00XX has the disadvantage that encoding to UTF-8 will introduce a NUL byte in the UTF-8 sequence. As a consequence, C libraries may interpret this as a string termination, even though the string continues. In particular, the gtk libraries will truncate text in this case; other libraries may show similar problems.”
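To make the quoted point concrete, here is a small Python sketch. The `escaped` string is my own hand-built illustration of the rejected U+0000 U+00XX scheme (it is not code from the PEP), and the `split` on the NUL byte only mimics how a C library that expects NUL-terminated strings would see the data. The second half uses the `surrogateescape` error handler, which is the approach the PEP specifies instead.

```python
# An input that is not valid UTF-8: 0xFF can never appear in UTF-8.
raw = b"caf\xff"

# Rejected idea from the quote: escape byte XX as U+0000 U+00XX.
# (Built by hand here purely for illustration.)
escaped = "caf" + "\u0000" + "\u00ff"
as_utf8 = escaped.encode("utf-8")

# A C-style consumer treats the first NUL byte as the end of the string,
# so everything after it is silently lost.
c_view = as_utf8.split(b"\x00", 1)[0]

# surrogateescape: each undecodable byte XX becomes the lone surrogate U+DCXX,
# and encoding with the same handler restores the original bytes.
text = raw.decode("utf-8", errors="surrogateescape")
round_trip = text.encode("utf-8", errors="surrogateescape")

print(as_utf8)     # contains an embedded NUL byte
print(c_view)      # what a NUL-terminated reader would see
print(round_trip == raw)
```

Note that the round trip only works if the same error handler is used in both directions; encoding `text` with plain strict UTF-8 raises `UnicodeEncodeError` because of the lone surrogate.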

(Some security concerns about supporting everything are also described there.)
