Cyrillic symbols in comments

gvbelov · November 6, 2020, 5:42pm

When I put comments typed in Cyrillic in a file (“hello.jl”) and load the file with include(“hello.jl”), I get the error message: ERROR: LoadError: syntax: invalid UTF-8 syntax

Example:

println(“Hello !”) # Булгъар

Mason · November 6, 2020, 5:52pm

It appears that the quotation marks you use are the problem, not the comments:

julia> println(“Hello !”) # Булгъар
ERROR: syntax: invalid character "“" near column 9
Stacktrace:
 [1] top-level scope at none:1

julia> println("Hello !") # Булгъар
Hello !

Use ASCII quotation marks.

gvbelov · November 6, 2020, 5:54pm

The same file with this comment works fine:

println(“Hello !”) # some comment

gvbelov · November 6, 2020, 5:55pm

In the REPL, Cyrillic comments work too.

stevengj · November 6, 2020, 6:18pm

Probably you didn’t save the file in the UTF-8 encoding, but used some other encoding like UTF-16. What editor are you using?

An example in the REPL of invalid UTF-8 data can be generated by creating a string from random bytes:

julia> Meta.parse("# " * String(rand(UInt8, 10)))
ERROR: Base.Meta.ParseError("invalid UTF-8 sequence")
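
To check a file on disk rather than a string in the REPL, something like the following should work. It is only a sketch: `is_utf8_file` is a hypothetical helper name, not a Julia function, and the two temporary files are made up for the demonstration.

```julia
# Hypothetical helper (not part of Julia): true if the file's bytes are valid UTF-8.
is_utf8_file(path) = isvalid(String, read(path))

dir  = mktempdir()
good = joinpath(dir, "good.jl")
bad  = joinpath(dir, "bad.jl")

# Julia string literals are UTF-8, so this file is written as valid UTF-8.
write(good, "println(\"Hello !\") # Булгъар\n")

# Same code, but the comment holds the byte 0xff, which never occurs in valid UTF-8.
write(bad, [Vector{UInt8}("println(\"Hello !\") # "); 0xff; 0x0a])

println(is_utf8_file(good))  # true
println(is_utf8_file(bad))   # false
```

If the check returns false for your file, the editor saved it in some encoding other than UTF-8.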

gvbelov · November 6, 2020, 6:37pm

It does not depend on the editor; I have tried several (Notepad++, TED Notepad, …). For example, a file with comments like these works:

println(“Hello !”) # Հայերեն

println(“Hello !”) # სომეხური ნინა

println(“Hello !”) # میلیون نفر بۇ دیلده

stevengj · November 6, 2020, 6:45pm

Any decent editor will preserve the encoding of the file by default, so simply opening it and re-saving in another editor will not fix the encoding. You’ll have to change an editor setting somewhere to specify conversion to UTF-8. How to do this will vary with the editor.

For example, for Notepad++ see here. For TED Notepad there is an encoding option in the File menu. And so forth.

PS. I would strongly recommend using a modern programming editor like VS Code. (To change the encoding to UTF-8 in VS Code, there is a menu at the bottom of the file window.)
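
The conversion can also be done programmatically. Below is a minimal sketch, assuming the source encoding is Windows-1251 (the thread never establishes which encoding was actually used) and covering only ASCII plus the Russian alphabet. `cp1251_char` and `cp1251_to_utf8` are hypothetical helper names, not Julia functions. For real files the editor setting or a dedicated transcoding tool is the safer route.

```julia
# Hypothetical helper: decode one Windows-1251 byte to a Char.
# Only ASCII and the Russian alphabet are handled in this sketch.
function cp1251_char(b::UInt8)
    b < 0x80  && return Char(b)                    # ASCII bytes are unchanged
    b >= 0xC0 && return Char(0x0410 + (b - 0xC0))  # 0xC0..0xFF map to А..я
    b == 0xA8 && return 'Ё'
    b == 0xB8 && return 'ё'
    error("byte not covered by this sketch: 0x$(string(b, base=16))")
end

# Hypothetical helper: build a (UTF-8) Julia String from Windows-1251 bytes.
cp1251_to_utf8(bytes) = join(cp1251_char(b) for b in bytes)

# The comment text as it would be stored in Windows-1251 (assumed encoding).
raw = UInt8[0xC1, 0xF3, 0xEB, 0xE3, 0xFA, 0xE0, 0xF0]
println(cp1251_to_utf8(raw))  # Булгъар
```

Writing the resulting string back with `write(path, str)` would store it as UTF-8, since Julia strings are UTF-8 encoded.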

gvbelov · November 6, 2020, 6:56pm

Well, but why is this line OK

println(“Hello !”) # Հայերեն

and this is not

println(“Hello !”) # Булгъар

?
And what does the editor have to do with it?

stevengj · November 6, 2020, 6:57pm

It depends on how it is encoded in whatever encoding you are using and whether that happens to correspond to a valid UTF-8 sequence.

The editor determines what the default encoding is and how to change it, as I explained in my message above.
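
To make that concrete, here is a small sketch comparing the same word in two encodings. The Windows-1251 bytes are an assumption for illustration only, since it is not known which encoding the file was really saved in.

```julia
# "Булгъар" as UTF-8: every Cyrillic letter takes two bytes.
utf8_bytes = Vector{UInt8}("Булгъар")

# The same word in Windows-1251 (assumed encoding): one byte per letter.
cp1251_bytes = UInt8[0xC1, 0xF3, 0xEB, 0xE3, 0xFA, 0xE0, 0xF0]

println(length(utf8_bytes))             # 14
println(isvalid(String, utf8_bytes))    # true
println(isvalid(String, cp1251_bytes))  # false: 0xC1 can never start a valid UTF-8 sequence
```

Whether bytes from another encoding happen to pass as UTF-8 depends entirely on the byte values, which is why one script can appear to work while another fails.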

gvbelov · November 6, 2020, 7:02pm

Well. Did you try to make a file with the text

println(“Hello !”) # Булгъар

using and run it in REPL?

stevengj · November 6, 2020, 7:36pm

Yes. It works fine (both via julia foo.jl and by include("foo.jl") in the REPL), saved in UTF-8 encoding in a file foo.jl, once you correct the quotes to straight quotes:

println("Hello !") # Булгъар

(Use a programming editor like VS Code! Non-programming editors will sometimes “smart-correct” quotes "..." into curly quotes “...”, which is not what you want for programming. Or maybe your browser is doing that to your Discourse posts?)

gvbelov · November 6, 2020, 8:13pm

Thank you very much!
(I use the right quotes; they are probably converted in this window.)
Yet I do not understand the idea. It is a COMMENT; it is for me, not for the compiler/interpreter. One should be able to use any symbols there.
I thought the compiler just skips the text after the # symbol.

Skoffer · November 6, 2020, 8:22pm

It has to parse the text after # in order to find the newline symbol.
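
A toy sketch of the idea, which is not Julia's real lexer: to skip a comment, the scanner still has to decode it character by character until it reaches the newline, and a malformed character stops it. `comment_end` is a hypothetical name invented for this illustration.

```julia
# Toy sketch, not Julia's actual lexer: starting at the `#` at index i,
# return the index of the newline that ends the comment (nothing at end of input).
function comment_end(src::String, i::Int)
    while i <= ncodeunits(src)
        c = src[i]                                  # decode one character
        isvalid(c) || error("invalid UTF-8 sequence at byte $i")
        c == '\n' && return i
        i = nextind(src, i)                         # step to the next character
    end
    return nothing
end

ok = "x = 1 # Булгъар\ny = 2"
stop = comment_end(ok, findfirst('#', ok))
println(ok[stop] == '\n')  # true

# A comment holding bytes that are not valid UTF-8 makes the scan fail.
broken = "# " * String(UInt8[0xC1, 0xF3]) * "\n"
try
    comment_end(broken, 1)
catch err
    println(err)
end
```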

gvbelov · November 6, 2020, 8:27pm

Thank you! I finally figured it out!
