Why is walkdir + regex so much slower than glob?

Hi!

A coworker and I were both writing Julia code that did essentially the same thing as a specific GNU find command.

For my implementation, I used a regex for the filename and walkdir to find the files, something like this:

for (root, dirs, files) in walkdir(".")
    # This is how I handled not going into directories with false information that I didn't want to parse later
    filter!(d -> d != "MisleadingDirectoryName", dirs)
    for file in files
        # occursin returns a Bool; match() returns nothing when there is no match
        if occursin(MATCH_REGEX, file)
            # do input
        end
    end
end

My coworker, on the other hand, did this:

files = glob("MATCH_GLOB", ".")
for file in files
    # Skip anything under the misleading directory
    occursin("MisleadingDirectoryName", file) && continue
    # do input
end

Given this, his code finishes on a huge directory structure in 20 seconds, whereas mine takes about 20 minutes. He has been coding in Julia a lot longer, so his code is far more elegant, but my question is: why does it run so much faster? Is regex really that slow? Or is walkdir the bottleneck?
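
One way to narrow this down is to time the directory traversal and the regex matching separately. The following is only a rough sketch: the helper names `walk_names` and `count_matches`, the pattern `r"\.txt$"`, and the throwaway directory tree are all hypothetical stand-ins, not the real code or data from above.

```julia
# Collect every file name under `root` using walkdir only (no regex work).
function walk_names(root)
    names = String[]
    for (dir, dirs, files) in walkdir(root)
        append!(names, files)
    end
    return names
end

# Count names matching `pattern`; this touches no filesystem at all.
count_matches(names, pattern) = count(name -> occursin(pattern, name), names)

# Build a tiny temporary tree so the sketch runs anywhere (hypothetical layout).
root = mktempdir()
mkpath(joinpath(root, "a", "b"))
touch(joinpath(root, "a", "one.txt"))
touch(joinpath(root, "a", "b", "two.txt"))
touch(joinpath(root, "a", "b", "three.log"))

# Time the traversal on its own.
t0 = time()
names = walk_names(root)
t_walk = time() - t0

# Time the regex matching on its own.
t0 = time()
n = count_matches(names, r"\.txt$")
t_regex = time() - t0

println("walk: ", t_walk, " s, regex: ", t_regex, " s, matches: ", n)
```

Pointing `root` at the real directory and swapping in the real regex should show which of the two phases accounts for most of the 20 minutes.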

Thanks!

walkdir is probably just written badly for handling nested directory trees. I may fix this.
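
For comparison, here is a minimal sketch of a hand-written recursive traversal built on `readdir` and `isdir`. It is not how `walkdir` is implemented and not a proposed fix, just something to benchmark against. The helper name `visit_files`, its `skip` keyword, and the sample tree are assumptions made up for illustration.

```julia
# Hypothetical helper: call `f(path)` for every file under `root`, never
# descending into directories whose name appears in `skip`.
# Note: isdir follows symlinks, so a symlink cycle would recurse forever.
function visit_files(f, root; skip=("MisleadingDirectoryName",))
    for name in readdir(root)
        path = joinpath(root, name)
        if isdir(path)
            # Prune before descending, so skipped subtrees are never read.
            name in skip || visit_files(f, path; skip=skip)
        else
            f(path)
        end
    end
    return nothing
end

# Small temporary tree to exercise the helper (hypothetical layout).
root = mktempdir()
mkpath(joinpath(root, "keep", "nested"))
mkpath(joinpath(root, "MisleadingDirectoryName"))
touch(joinpath(root, "keep", "nested", "data.txt"))
touch(joinpath(root, "MisleadingDirectoryName", "ignored.txt"))

found = String[]
visit_files(root) do path
    occursin(r"\.txt$", path) && push!(found, path)
end
println(found)
```

Timing this against the `walkdir` loop on the same tree would indicate whether the traversal itself is where the time goes.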
