Is Julia well-suited for string manipulation?

stevengj · January 13, 2023, 5:25pm

I don’t think Julia is at an inherent disadvantage — you can certainly write heavily optimized text-processing code like CSV.jl and Parsers.jl in Julia. For example, a few years back I helped optimize a Japanese tokenizer called TinySegmenter.jl, and the result was considerably faster than optimized implementations of a similar algorithm in Ruby, Python, and Perl.

However, if you write naive string-processing code in Julia in the same style as Python, allocating zillions of temporary strings, then Julia performs worse: the language isn’t designed for code that does lots of small allocations in critical inner loops. It’s similar to people who port numerical code from Matlab (or NumPy) to Julia line by line and often find that their initial Julia port is slower, as in this recent thread, for example: “MATLAB outperforms Julia (20 times faster) running this nested loop”.
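To make the allocation point concrete, here is a minimal sketch of the two styles. The function names `join_naive` and `join_buffered` are made up for illustration and are not taken from any of the packages mentioned above. The first version builds a new temporary string on every iteration; the second writes into a single `IOBuffer` and converts to a `String` once at the end.

```julia
# Hypothetical example: concatenate words separated by spaces.

# Naive style: every iteration allocates a brand-new String,
# copying everything accumulated so far.
function join_naive(words)
    s = ""
    for w in words
        s = s * w * " "
    end
    return s
end

# Buffered style: append into one growable buffer and
# materialize a String only once at the end.
function join_buffered(words)
    io = IOBuffer()
    for w in words
        print(io, w, ' ')
    end
    return String(take!(io))
end

words = ["julia", "string", "processing"]
result_naive = join_naive(words)
result_buffered = join_buffered(words)
```

Both functions return the same string, so you can swap one for the other and compare them with `@time` on a large input. How big the difference is will depend on the input size and your Julia version, so measure rather than assume.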

Of course, for particular tasks a particular language may benefit from some heavily optimized library, for which Julia does not yet have an equivalent. It’s also true that the majority of people writing high-performance libraries for Julia have thus far been focused more on numerical computations than text processing.
