Writing a fast NLP tokenizer in Julia

Thanks oxinabox,

I’ve seen WordTokenizers and it looks really interesting. I read the GSoC blog post on SentencePiece and, as I understand it, WordTokenizers doesn’t train tokenizers itself, right?
It’s really fascinating that this readable Julia code leads to something faster than spaCy, though. I’ll be sure to take a closer look at the implementation.