Natural Language Processing: where do I start?


I do not know enough about machine learning to ask an informed question. I am not even sure whether this is a machine learning question. Maybe.

I will tell you what I want to achieve in the end:

In short, we want to classify data based on text. The process will be unsupervised machine learning. The data will be abstracts, keywords, and titles of scientific publications. We want to develop a meaningful subject classification system based on phrases and the distances between words in the data.
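To make the goal concrete, here is a minimal sketch of that kind of pipeline in plain Python (standard library only). It is an illustration under assumptions, not a recommended implementation: the sample titles are made up, the names `tokens`, `tfidf_vectors`, `cosine`, and `cluster` are hypothetical helpers, adjacent word pairs stand in for "phrases", and a simple greedy grouping with an arbitrary similarity threshold stands in for a real clustering algorithm.

```python
import math
from collections import Counter

# Made-up sample titles, for illustration only.
sample_titles = [
    "Deep learning for protein structure prediction",
    "Protein structure prediction with neural networks",
    "Galaxy formation in the early universe",
    "Dark matter and galaxy formation",
]

def tokens(text):
    """Lowercased words plus adjacent word pairs (bigrams) as crude phrases."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    words = [w for w in words if w]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weighted by TF-IDF."""
    counts = [Counter(tokens(d)) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts for term in c)
    n = len(docs)
    return [
        {t: tf * math.log((1 + n) / (1 + df[t])) for t, tf in c.items()}
        for c in counts
    ]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cluster(docs, threshold=0.1):
    """Greedy single-pass grouping (an assumption, not a standard algorithm):
    a document joins the first group whose first member is similar enough,
    otherwise it starts a new group. Returns lists of document indices."""
    vecs = tfidf_vectors(docs)
    groups = []
    for i, v in enumerate(vecs):
        for g in groups:
            if cosine(v, vecs[g[0]]) >= threshold:
                g.append(i)
                break
        else:
            groups.append([i])
    return groups

groups = cluster(sample_titles)
for g in groups:
    print([sample_titles[i] for i in g])
```

On these four titles the two protein papers end up in one group and the two galaxy papers in another. With real abstracts the threshold, the tokenization, and the grouping step would all need far more care.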

I know that Python has the well-developed NLTK library and good tutorials, and I still know Python better than Julia, but I would prefer to do this in Julia. Unfortunately, a lot of the related tools in Julia are not yet usable in v1.0. And there are so many Julia packages that I do not know where to start exploring.

I would appreciate some advice.


Did you see this:


No. Thanks for the link.