What is the correct way to ignore some files/directories in `walkdir`

I am trying to use JLFmt and FilePaths to write a script (~/.julia/config/format.jl) that can format all Julia code under my project root (a typical Julia package).

using FilePaths
using JLFmt

isjuliafile(path::AbstractPath) = isfile(path) && extension(path) == "jl"

isignored(path) = any(occursin(x, path) for x in (".git", ".idea", ".vscode"))

function formatjuliafiles(path)
	for (root, dirs, files) in walkdir(path)
	    for file in files
			any(occursin(x, abspath(file)) for x in (".git", ".idea", ".vscode")) && continue
	        isjuliafile(Path(file)) && format_file(basename(file), 4, 120; overwrite=true)
	    end
	    for dir in dirs
			any(occursin(x, abspath(dir)) for x in (".git", ".idea", ".vscode")) && continue
	        formatjuliafiles(joinpath(root, dir))
	    end
	end
end

formatjuliafiles(pwd())  # Project root

and I execute the file ~/.julia/bin/jlformat

#!/usr/bin/env bash

/usr/bin/env julia ~/.julia/config/format.jl

at my project root.
However, no matter how I modify the function formatjuliafiles, walkdir still goes through every file and directory under .git, .vscode, and .idea, which wastes a lot of time. I cannot filter the walkdir iterator with

function formatjuliafiles(path)
	for (root, dirs, files) in filter(!isignored, walkdir(path))
	    for file in files
			any(occursin(x, abspath(file)) for x in (".git", ".idea", ".vscode")) && continue
	        isjuliafile(Path(file)) && format_file(basename(file), 4, 120; overwrite=true)
	    end
	    for dir in dirs
			any(occursin(x, abspath(dir)) for x in (".git", ".idea", ".vscode")) && continue
	        formatjuliafiles(joinpath(root, dir))
	    end
	end
end

since it throws an error:

ERROR: LoadError: MethodError: no method matching filter(::getfield(Base, Symbol("##58#59")){typeof(isignored)}, ::Channel{Any})
Closest candidates are:
  filter(::Any, !Matched::Array{T,N}) where {T, N} at array.jl:2343
  filter(::Any, !Matched::BitArray) at bitarray.jl:1710
  filter(::Any, !Matched::AbstractArray) at array.jl:2355
  ...
Stacktrace:
 [1] formatjuliafiles(::String) at /Users/qz/.julia/config/format.jl:15
 [2] top-level scope at /Users/qz/.julia/config/format.jl:27
 [3] include at ./boot.jl:328 [inlined]
 [4] include_relative(::Module, ::String) at ./loading.jl:1094
 [5] include(::Module, ::String) at ./Base.jl:31
 [6] exec_options(::Base.JLOptions) at ./client.jl:295
 [7] _start() at ./client.jl:468
in expression starting at /Users/qz/.julia/config/format.jl:27

What’s wrong with my script?
BTW, I will post this script as a gist later to help the community.

I don’t think it’s necessary to call formatjuliafiles recursively, as walkdir already iterates over the whole tree on its own. All you need to do is skip any root that contains a name you want to ignore:

function formatjuliafiles(path)
  ignoredPaths = [".git", ".idea", ".vscode"]
  isignored(path) = any(occursin(ip, path) for ip in ignoredPaths)
  isjuliafile(path) = splitext(basename(path))[2] == ".jl"
  for (root, dirs, files) in walkdir(path)
    isignored(root) && continue
    for file in files
      !isjuliafile(file) && continue
      println("formatting: " * joinpath(root,file))
      #format_file(joinpath(root,file), 4, 120; overwrite=true)
    end
  end
end

There are definitely more performant ways to do this, but the function does what it’s supposed to…
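
As for the MethodError in the question: the message shows that `filter` had no method for a `Channel` in that Julia version, and the predicate would in any case receive the whole `(root, dirs, files)` tuple rather than a path. A minimal sketch of a lazy alternative uses `Iterators.filter`, which accepts any iterator. The function name `juliafiles` and the `ignored` keyword are made up for illustration, and `splitpath` assumes Julia 1.1 or later. Matching whole path components relative to the starting directory also avoids treating `.github` as `.git`.

```julia
# Hypothetical helper: list .jl files, skipping every tuple whose root
# lies inside an ignored directory. Only Base functions are used.
function juliafiles(path; ignored=(".git", ".idea", ".vscode"))
    # The predicate receives the whole (root, dirs, files) tuple.
    # Components are taken relative to `path`, so directories above
    # the starting point never count as ignored.
    keep(t) = !any(in(ignored), splitpath(relpath(t[1], path)))
    found = String[]
    for (root, dirs, files) in Iterators.filter(keep, walkdir(path))
        for file in files
            endswith(file, ".jl") && push!(found, joinpath(root, file))
        end
    end
    return found
end
```

Note that this only hides the unwanted tuples: walkdir has still read the ignored directories by the time the filter sees them, so it does not save any traversal time.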

Thank you. I was confused because I assumed walkdir behaved like Python’s os.listdir (which lists only one level of files and directories), but it actually goes through every file and directory…
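
To make that difference concrete, here is a small sketch that builds a throwaway directory tree and compares the two Base functions: `readdir` lists one level (like Python’s os.listdir), while `walkdir` yields one `(root, dirs, files)` tuple per directory in the whole tree (like Python’s os.walk). The function name `compare_listing` and the file names are invented for the demonstration.

```julia
# Hypothetical demo: build a tiny tree in a temporary directory and
# return what readdir and walkdir each report for it.
function compare_listing()
    mktempdir() do root
        mkpath(joinpath(root, "src", "nested"))
        touch(joinpath(root, "Project.toml"))
        touch(joinpath(root, "src", "A.jl"))
        touch(joinpath(root, "src", "nested", "B.jl"))

        # readdir: only the immediate entries of `root`
        top = readdir(root)

        # walkdir: every directory in the tree, each as its own root
        visited = [relpath(dir, root) for (dir, _, _) in walkdir(root)]

        return top, visited
    end
end

top, visited = compare_listing()
println("readdir sees:   ", top)
println("walkdir visits: ", visited)
```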

Here is the gist.
There is only one problem with the code: it ignores the package root, so for now we have to run it under <root>/src/. If someone has Julia files under <root>/scripts/ or elsewhere, they will not be formatted.

Disclaimer: I am the author of Continuables.jl.

Years later I stumbled upon this myself, and I find the accepted solution unsatisfactory: walkdir still recurses into the directories that should be skipped. This is especially costly for something like a .git directory, which you want to ignore.

The best and most flexible way I found to solve this was to use one of my older packages, Continuables.jl, which gives you Python-like generators in Julia.

Here is a fully working version that is easy to read while still letting you apply your own filters:

using Continuables

list_all_juliafiles(path=abspath(".")) = @cont begin
    if isfile(path)
        endswith(path, ".jl") && cont(path)
    elseif isdir(path)
        basename(path) in (".git",) && return
        for file in readdir(path)
            foreach(cont, list_all_juliafiles(joinpath(path, file)))
        end
    end
end

collect(list_all_juliafiles())
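
For readers who prefer to avoid an extra dependency, the same pruning idea can be sketched with Base alone by recursing over `readdir` by hand. This is only an illustration under my own assumptions, not part of Continuables.jl: the name `list_juliafiles` and the `ignored` keyword are hypothetical, and symlinked directories are skipped to avoid cycles.

```julia
# Hypothetical Base-only helper: collect .jl files under `path`
# without ever reading the contents of an ignored directory.
function list_juliafiles(path=pwd(); ignored=(".git", ".idea", ".vscode"))
    found = String[]
    for name in readdir(path)
        full = joinpath(path, name)
        if isdir(full) && !islink(full)
            # Prune here: an ignored directory is never descended into.
            name in ignored && continue
            append!(found, list_juliafiles(full; ignored=ignored))
        elseif isfile(full) && endswith(name, ".jl")
            push!(found, full)
        end
    end
    return found
end
```

Each returned path could then be handed to the formatter, for example with the `format_file(path, 4, 120; overwrite=true)` call used earlier in this thread.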