Natural Language Processing: where do I start?

johann.spies · September 26, 2018, 11:15am

I do not know enough about machine learning to ask an informed question. I am not even sure whether this is a Machine Learning question. Maybe.

I will tell you what I want to achieve in the end:

In short, we want to classify data based on text, using unsupervised machine learning. The data will be abstracts, keywords, and titles of scientific publications. We want to develop a meaningful subject classification system based on phrases and the distance between words in the data.

I know that Python has the well-developed NLTK and good tutorials, and I still know Python better than Julia, but I would prefer to do this in Julia. Unfortunately, a lot of the related tools in Julia are not yet usable in v1.0. And there are so many Julia packages that I do not know where to start exploring.

I would appreciate some advice.

cormullion · September 26, 2018, 11:22am

Did you see this:

johann.spies · September 26, 2018, 11:50am

No. Thanks for the link.

Liso · November 10, 2018, 10:09am

I was trying a little

and found that matchall was removed in Julia 1.0

HISTORY.md says:
matchall has been deprecated in favor of collect(m.match for m in eachmatch(r, s)) (#26071).

It seems an unbelievable design decision, at least at first glance.

But there seems to be a mistake in HISTORY.md too:

matchall(r,s) = collect(m.match for m in eachmatch(r, s))  # this didn't work
matchall(r,s) = collect(m for m in eachmatch(r, s))  # this could help to run WebScraping.ipynb
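The two definitions above return different things, which may be why only one of them suited the notebook. Here is a minimal sketch of the difference, assuming Julia 1.x. The function names matchall_strings and matchall_matches and the sample text are made up for illustration and are not part of Base or of the notebook.

```julia
# Two candidate replacements for the removed `matchall` (Julia 1.x).
# Both function names are hypothetical, chosen only for this sketch.

# Variant from HISTORY.md: collects only the matched text of each match.
matchall_strings(r::Regex, s::AbstractString) =
    collect(m.match for m in eachmatch(r, s))

# Second variant: collects the full RegexMatch objects,
# which also carry captures and offsets.
matchall_matches(r::Regex, s::AbstractString) =
    collect(eachmatch(r, s))

text = "julia 1.0 and python 3.7"   # made-up sample input

strings = matchall_strings(r"\d+\.\d+", text)
matches = matchall_matches(r"\d+\.\d+", text)

println(eltype(strings))              # element type of the string variant
println(eltype(matches))              # element type of the RegexMatch variant
println([m.match for m in matches])   # matched text recovered from the objects
```

Code that goes on to read fields such as .match or .captures from each result needs the second form. Code that expects plain strings needs the first.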

A minor problem could be the HTTP/1.1 429 Too Many Requests response from stackexchange… (which is probably understandable)
