Reading a Word '.doc' file

Hi,
How can I read the content of an MS Word ‘.doc’ file and search it for keywords?
I’d like to do a quick scan of some CVs.

Thanks

The easiest way is probably to use an external utility (e.g. Pandoc) to convert the Word .doc file into a text file (see https://gist.github.com/aembleton/1eb889bc443996a508df). Once that preprocessing is done, it should be straightforward to load the text file from Julia and parse/search it.
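
For what it's worth, here is a rough sketch of that approach done from Julia directly. It assumes the `pandoc` executable is installed and on your PATH, and that the CVs are .docx files: as far as I know Pandoc reads .docx but not the older binary .doc format, so legacy files may need converting first. The function names (`doc_to_text`, `find_keywords`, `scan_cvs`) are made up for illustration.

```julia
# Hypothetical helpers; assumes the `pandoc` executable is available on PATH.

# Convert a document to plain text by shelling out to Pandoc
# and capturing what it writes to standard output.
function doc_to_text(path::AbstractString)
    return read(`pandoc --to=plain $path`, String)
end

# Return the keywords that occur in `text`, ignoring case.
function find_keywords(text::AbstractString, keywords)
    lowered = lowercase(text)
    return [k for k in keywords if occursin(lowercase(k), lowered)]
end

# Scan every .docx file in a folder and record which keywords each CV contains.
function scan_cvs(dir::AbstractString, keywords)
    hits = Dict{String,Vector{String}}()
    for name in readdir(dir)
        endswith(lowercase(name), ".docx") || continue
        text = doc_to_text(joinpath(dir, name))
        hits[name] = find_keywords(text, keywords)
    end
    return hits
end
```

Something like `scan_cvs("cvs", ["julia", "sql"])` would then give a dictionary mapping each file name to the keywords found in it. This is plain substring matching, so "java" would also match "javascript"; a regex with word boundaries may be a better fit if that matters.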

You could maybe use the Python docx package (if you mean .docx rather than .doc) via PyCall.
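
A minimal sketch of what that might look like, assuming PyCall.jl is installed and the `python-docx` package is available in the Python environment PyCall uses. The helper names `docx_text` and `keyword_counts` are hypothetical. Note that this only reads body paragraphs; text inside tables would need to be collected separately.

```julia
using PyCall

# Import the python-docx package (its Python module name is `docx`).
const docx = pyimport("docx")

# Join the text of all body paragraphs of a .docx file into one string.
function docx_text(path::AbstractString)
    doc = docx.Document(path)
    return join([p.text for p in doc.paragraphs], "\n")
end

# Count case-insensitive occurrences of each keyword in `text`.
function keyword_counts(text::AbstractString, keywords)
    lowered = lowercase(text)
    return Dict(k => count(lowercase(k), lowered) for k in keywords)
end
```

Usage would be along the lines of `keyword_counts(docx_text("cv.docx"), ["julia", "sql"])`.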
