tree-sitter may actually be the simplest and most useful path to go down here. It gives you fast parsers that produce CSTs carrying all the token info you need for highlighting, plus an actual syntax tree that you can do other cool things with.
Even though Atom (which is where tree-sitter originated, I believe) looks like it’ll eventually be abandoned, tree-sitter seems to have found a second life in neovim, so it’s probably not going to be abandoned itself and should accumulate parsers for plenty of languages in the long term. Writing new parsers also looks relatively straightforward compared to the regex nightmares of tmLanguage.
Regarding the textmate/vscode grammar approach that I discussed above: we’d need to wrap the oniguruma regex lib, since there appear to be subtle differences between its regex syntax and Julia’s PCRE that can’t really be glossed over. I went down a deep rabbit hole getting oniguruma to compile and trying to wrap it, and it was more effort than it’s worth.
So I’ll probably start wrapping tree-sitter grammars and integrating them with Highlight.jl in the near future. As far as I can tell, basing our highlighting on tree-sitter seems reasonably future-proof and a good return on the time invested.
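
To make the "CST gives you all the token info" point a bit more concrete, here’s a rough sketch of the idea. It’s in Python rather than Julia and doesn’t use the real tree-sitter bindings: the `Node` type, the node kinds and the hand-built tree are all made up for illustration, and it uses character offsets where tree-sitter itself reports byte offsets. The point is only that once every leaf has a kind and a source range, highlighting boils down to a tree walk.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a CST node: a kind plus a source range."""
    kind: str
    start: int  # character offset (an assumption; tree-sitter uses bytes)
    end: int
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> Iterator[Node]:
    """Yield the leaf nodes (the tokens) in source order."""
    if not node.children:
        yield node
        return
    for child in node.children:
        yield from leaves(child)


def highlight_spans(source: str, root: Node) -> List[Tuple[str, str]]:
    """Pair each token's text with its kind; gaps between tokens become 'text'."""
    spans = []
    pos = 0
    for leaf in leaves(root):
        if leaf.start > pos:
            spans.append(("text", source[pos:leaf.start]))
        spans.append((leaf.kind, source[leaf.start:leaf.end]))
        pos = leaf.end
    if pos < len(source):
        spans.append(("text", source[pos:]))
    return spans


# Hand-built tree for a tiny input; a real parser would produce this for us.
source = "x = 1 + 2"
tree = Node("assignment", 0, 9, [
    Node("identifier", 0, 1),
    Node("operator", 2, 3),
    Node("binary_expression", 4, 9, [
        Node("integer", 4, 5),
        Node("operator", 6, 7),
        Node("integer", 8, 9),
    ]),
])
spans = highlight_spans(source, tree)
```

Each `(kind, text)` pair can then be mapped to a style by whatever theme format the highlighter uses, and since the interior nodes are still around, the same tree can drive things like folding or structural selection.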