[ANN] NKFtool - a Julia package for converting encodings in Japanese texts using nkf

I am happy to announce the release of my first package, NKFtool, a wrapper for the command-line tool Network Kanji Filter, also known as nkf (https://osdn.net/projects/nkf/), which has been developed over three decades. The package lets you guess the encoding of Japanese text and convert it from one encoding to another.

This package offers two functions: nkf_guess and nkf_convert.


Use nkf_guess to guess the encoding of a string:

julia> using NKFtool

julia> nkf_guess(raw"こんにちわ")
"UTF-8"

Use nkf_convert to convert between encodings. It takes two arguments: the input string, and the nkf option string that specifies the conversion. You don't have to specify the input encoding because nkf guesses it.

julia> nkf_convert(raw"こんにちわ", "-s") |> nkf_guess
"Shift_JIS"
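
The same pattern should work with nkf's other output options. Here is a small sketch, assuming that nkf is installed and that nkf_convert forwards the standard nkf flags -e (EUC-JP) and -j (ISO-2022-JP) unchanged; the variable names are only for illustration:

```julia
using NKFtool

text = raw"こんにちわ"

# Convert to EUC-JP and ISO-2022-JP. These are nkf's own output flags,
# assumed here to be passed through as-is by nkf_convert.
euc = nkf_convert(text, "-e")
jis = nkf_convert(text, "-j")

# Ask nkf which encoding it detects for each result.
println(nkf_guess(euc))
println(nkf_guess(jis))

# Converting back with the default options is expected to restore the UTF-8 text.
println(nkf_convert(euc) == text)
```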

Call nkf_convert without the second argument to convert to UTF-8; the second argument defaults to "-w -m0" (output UTF-8, no MIME decoding).
Both nkf_guess and nkf_convert also accept an input stream, such as an open file, as the first argument.

See the following example:

julia> open("hello_sjis.txt", "w") do f
           print(f, nkf_convert(raw"こんにちわ", "-s"))
       end

julia> encoding = open("hello_sjis.txt") do f
           nkf_guess(f)  # guess the encoding from the stream
       end
"Shift_JIS"

julia> hello_utf = open("hello_sjis.txt") do f
           nkf_convert(f)  # convert to UTF-8 (the default)
       end
"こんにちわ"

This package requires nkf to be installed on your system.
Consult the documentation:
In English -> https://hsugawa8651.github.io/NKFtool.jl/dev/man/guide/
In Japanese -> https://hsugawa8651.github.io/NKFtool.jl/dev/man/guideja/
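
Since the package depends on an external nkf executable, it can help to confirm that one is reachable before loading NKFtool. A minimal check using only Julia's standard library, assuming the executable is named nkf and is looked up on your PATH (how NKFtool itself locates nkf is not shown here):

```julia
# Sys.which returns the full path of an executable found on PATH,
# or nothing when no such executable exists.
nkf_path = Sys.which("nkf")

if nkf_path === nothing
    @warn "nkf was not found on PATH; install it before using NKFtool"
else
    println("nkf found at ", nkf_path)
end
```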

I welcome your feedback, bug reports, suggestions, and PRs.

GitHub -> https://github.com/hsugawa8651/NKFtool.jl
