I’m facing a strange problem, and I’m having some trouble figuring out why it’s happening.
I have a script that downloads a PDF from an S3 bucket, extracts the text from the PDF, and writes the document to MongoDB using Mongoc.jl.
I’ve parallelized the script using Threads.@threads. I’ve also placed a try-catch around the function that extracts the text and writes it to MongoDB:
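function extractandwrite()
    # A single client and collection handle, shared by every thread below
    client = Mongoc.Client()
    database = client["resolvvi"]
    collection = database["autos"]
    autos = obterdataframeautos()
    # size(autos) returns a (rows, cols) tuple, so ask for the row count explicitly
    p = Progress(size(autos, 1))
    Threads.@threads for i in 1:size(autos, 1)
        try
            @suppress extrairauto(autos[i, :], collection)
        catch
            # Skip rows that fail; note this skips next!(p) as well
            continue
        end
        next!(p)
    end
end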
The thing is, when I leave this running, I get (at random) an error saying 17244 iot instruction (core dumped). This kills the REPL, so the error seems to come from somewhere beneath the Julia code itself… Any idea what this error is?
The error is hard to reproduce because it happens randomly. I have thousands of PDFs that I’m parsing, and sometimes the error occurs at, for example, PDF 103, and sometimes at PDF 234. My guess is that the problem is in the multi-threading.
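To make that guess concrete: in the code above, every thread uses the same client and collection, and as far as I understand the underlying C driver's client object is not meant to be used from several threads at once. Below is a minimal sketch of the alternative I have in mind, where each task creates its own client and works through its own chunk of rows. It uses only Base Julia so it runs without a database. FakeClient, process! and run_with_per_task_clients are hypothetical names standing in for Mongoc.Client(), extrairauto and my real loop; they are not part of Mongoc.jl.

```julia
# Hypothetical stand-in for a database client; NOT part of Mongoc.jl.
struct FakeClient
    written::Vector{Int}
end
FakeClient() = FakeClient(Int[])

# Hypothetical stand-in for extrairauto: records which row was "written".
process!(client::FakeClient, i::Int) = push!(client.written, i)

# Each chunk is handled by one task, which owns its client exclusively.
function process_chunk(chunk)
    client = FakeClient()          # created inside the task, never shared
    for i in chunk
        process!(client, i)
    end
    return client.written
end

function run_with_per_task_clients(n::Int; ntasks::Int = Threads.nthreads())
    chunksize = max(1, cld(n, ntasks))
    chunks = collect(Iterators.partition(1:n, chunksize))
    # One task per chunk; the parenthesized macro call keeps parsing unambiguous.
    tasks = map(chunk -> Threads.@spawn(process_chunk(chunk)), chunks)
    # fetch rethrows any exception from a task instead of hiding it.
    return reduce(vcat, fetch.(tasks); init = Int[])
end

results = run_with_per_task_clients(10)
```

In the real script, the line creating FakeClient() would presumably become Mongoc.Client() plus the database and collection lookups, so that no handle crosses a thread boundary. I haven’t verified that this removes the crash; it is only the pattern I would test first.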