Strftime & strptime bug #27239 is present on all platforms, not just Windows

ScottPJones · May 28, 2018, 6:08am

My point was that calling them “invalid UTF-8” leads people to believe they were supposed to be UTF-8, which was not the case at all. The function correctly returned them in the EUC-KR encoding, based on the setting of LC_TIME, as noted in the documentation for the strftime and strptime functions on the different platforms. Note that strftime is part of the ISO C standard, going back about 30 years, and that both functions are specified by POSIX and the Open Group Unix standard.

Most of the times over the years that I’ve seen people with “corrupted” text data, it has simply been a case of misidentification: the data was actually valid, it just wasn’t UTF-8.

Since String in Julia is supposed to be UTF-8, the strftime function needs to handle converting whatever the C library strftime or wcsftime function correctly returned, to the UTF-8 encoding required by the String type.

One solution to the problem is using wcsftime / wcsptime with transcode. However, it might be better for Julia to always set its locale to a UTF-8 one (in your case, ko_KR.UTF-8).
That might cause problems for C code that Julia calls, depending on whether that code uses locales correctly, but that’s probably not very likely.

Yes, it’s just that most people on Macs or Linux have LC_TIME set to *.UTF-8.
Here is the output from locale on my laptop:

01:02 $ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

What is it on your system? (I hope that Windows has the locale command; if not, you might need to write a little C program to get the setting of LC_TIME.)
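In case the locale command isn't available, something along these lines should be enough to find out. It is only a sketch, and the two function names are mine; setlocale itself is standard C.

```c
#include <locale.h>
#include <stddef.h>

/* Adopt the LC_TIME locale named by the environment (LC_ALL, LC_TIME, LANG
   on POSIX systems) and return its name, or NULL if the C library rejects
   that setting. */
const char *lc_time_from_environment(void)
{
    return setlocale(LC_TIME, "");
}

/* Return the name of the LC_TIME locale currently in effect,
   without changing it. */
const char *current_lc_time(void)
{
    return setlocale(LC_TIME, NULL);
}
```

Call lc_time_from_environment() from main and print the result with printf("%s\n", ...), checking for NULL first. The empty-string call matters because a C program always starts in the "C" locale until setlocale is called, so querying with NULL alone would just report "C".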
