Strftime & strptime bug #27239 is present on all platforms, not just Windows

My point was that calling them “invalid UTF-8” leads people to believe they were supposed to be UTF-8, which was not the case at all. The function correctly returned them in the EUC-KR encoding, based on the setting of LC_TIME, as noted in the documentation for the strftime and strptime functions on the different platforms. Note that both functions are part of the Open Group Unix standard and the Posix standard, and strftime is also part of the ISO C standard (strptime is not), going back at least 30 years.
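To make the LC_TIME dependence concrete, here is a minimal C sketch. The helper name `weekday_bytes` is made up for illustration, and the locale names you can pass depend on what is installed on your system (the spelling of the EUC-KR locale differs between platforms).

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format the weekday name for tm_wday = wday
 * (0 = Sunday) under the named LC_TIME locale. Returns the number of bytes
 * written to buf, or 0 if the locale is not available or buf is too small.
 * The bytes are in whatever encoding that locale uses, not necessarily
 * UTF-8. */
size_t weekday_bytes(const char *locale_name, int wday, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    buf[0] = '\0';

    /* Switch only the time category; NULL means the locale is unknown. */
    if (setlocale(LC_TIME, locale_name) == NULL)
        return 0;

    struct tm tm = {0};
    tm.tm_mday = 1;      /* keep the struct a plausible date */
    tm.tm_wday = wday;   /* %A only looks at the weekday field */

    return strftime(buf, size, "%A", &tm);
}
```

Called with a Korean EUC-KR locale, the weekday name should come back with two bytes per Hangul syllable; with ko_KR.UTF-8 the same name takes three bytes per syllable. Both results are valid, they are just in different encodings.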

Most of the times over the years that I’ve seen people with “corrupted” text data, it has simply been a case of misidentification: the data was actually valid, it just wasn’t UTF-8.

Since a String in Julia is supposed to be UTF-8, Julia’s strftime function needs to convert whatever the C library’s strftime or wcsftime function correctly returned into the UTF-8 encoding required by the String type.

One solution to the problem is to use wcsftime / wcsptime with transcode. However, it might be better for Julia to always set its locale to a UTF-8 one (in your case, ko_KR.UTF-8).
That might cause problems for C code that Julia calls, depending on whether that code uses the locales correctly, but that’s probably not very likely.
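The first option, sketched in C rather than Julia and covering only the formatting side: format with wcsftime, then encode the wide characters as UTF-8. The names `utf8_encode` and `strftime_utf8` are invented for this example. It assumes each wchar_t holds one Unicode code point, which is the case with glibc and on macOS but not on Windows, where wchar_t is UTF-16 and surrogate pairs would need to be combined first.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <time.h>
#include <wchar.h>

/* Encode one Unicode code point as UTF-8. Returns the number of bytes
 * written (1 to 4), or 0 for surrogates and out-of-range values. */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* lone surrogate */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Format tm with wcsftime under the current LC_TIME and store the result
 * in dst as NUL-terminated UTF-8. Returns the UTF-8 length in bytes, or 0
 * on failure. ASSUMPTION: one wchar_t == one Unicode code point. */
size_t strftime_utf8(char *dst, size_t size, const wchar_t *fmt,
                     const struct tm *tm)
{
    assert(dst != NULL && fmt != NULL && tm != NULL);

    wchar_t wide[256];
    size_t n = wcsftime(wide, sizeof wide / sizeof wide[0], fmt, tm);
    if (n == 0 || size == 0)
        return 0;

    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char enc[4];
        size_t len = utf8_encode((unsigned long)wide[i], enc);
        if (len == 0 || used + len >= size)
            return 0;                       /* bad code point or no room */
        for (size_t j = 0; j < len; j++)
            dst[used++] = (char)enc[j];
    }
    dst[used] = '\0';
    return used;
}
```

The second option needs no conversion code at all: a single `setlocale(LC_TIME, "ko_KR.UTF-8")` call at startup makes plain strftime return UTF-8, provided that locale is installed.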

Yes, it’s just that most people on Macs or Linux have LC_TIME set to *.UTF-8.
Here is the output from locale on my laptop:

01:02 $ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

What is it on your system? (I hope that Windows has the locale command; if not, you might need to write a little C program to get the setting of LC_TIME.)
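Such a program could be built around something like the following sketch. The function names are made up for illustration. Passing `""` to setlocale asks the C library for the user's default, which on Unix-like systems comes from the LC_ALL, LC_TIME and LANG environment variables; on Windows the C runtime is expected to take it from the system settings instead, so the printed name will look different there.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Adopt the user's default LC_TIME setting and return its name. If the
 * environment names a locale that is not installed, fall back to the name
 * of the locale that is currently active (normally "C"). */
const char *time_locale_name(void)
{
    const char *name = setlocale(LC_TIME, "");
    if (name == NULL)
        name = setlocale(LC_TIME, NULL);   /* query only, changes nothing */
    assert(name != NULL);
    return name;
}

/* Print the setting in the same style as the locale command's output. */
void print_time_locale(void)
{
    printf("LC_TIME=\"%s\"\n", time_locale_name());
}
```

Calling `print_time_locale()` from `main` is all that is needed to get the value.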
