I am looking for a function that lists all files in all sub-directories according to a given search pattern.
I imagine that such a function already exists, but I failed to find it.
Below is my version:
# rdir: list files in all sub-directories that match a given pattern
s_dir = raw"C:\data\julia\scripts\example"
s_pattern = "hl.jl"
# drop files in directories that cannot be reached:
# walkdir(path; onerror = identity)
# https://discourse.julialang.org/t/hello-got-a-quick-question-i-am-using-walkdir-to-scan-the-file-system-for-file/54194
function rdir(s_dir::String, s_pattern::String)
    s_files = String[]
    if !isdir(s_dir)
        error(string("\"", s_dir, "\" is not a dir!"))
    else
        for (root, dirs, files) in walkdir(s_dir; follow_symlinks = false, onerror = identity)
            # why is this wrong?
            # global s_files
            println("Directories in $root")
            for i_dir in dirs
                println(joinpath(root, i_dir)) # path to directories
            end
            println("Files in $root")
            for i_file in files
                # why is this wrong?
                # global s_files
                if occursin(s_pattern, i_file)
                    println(string("Pattern: \"", s_pattern, "\": ", joinpath(root, i_file))) # path to files
                    push!(s_files, joinpath(root, i_file))
                    println(s_files)
                end
            end
        end
    end
    return s_files
end
s_files = rdir(s_dir, s_pattern)
println("s_files:")
println(s_files)
The strange thing is the behavior of the variable s_files.
Why is it not necessary to declare it as global in the for-loop?
And even stranger, why does declaring it do the opposite of what I would
like to achieve? If I declare it as global inside the for-loop,
the content is deleted.
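As far as I understand Julia's scoping rules, s_files is a local variable of the function, and the bodies of for-loops nested inside a function can see and modify the locals of the enclosing function without any declaration. Writing global s_files inside the loop would instead make the name refer to the module-level variable of the same name, which is a different array from the one the function returns. A minimal sketch of the first point, using a hypothetical function collect_evens that has nothing to do with the file system:

```julia
# Hypothetical example: a local array filled from inside a loop.
function collect_evens(n::Integer)
    evens = Int[]                 # local variable of the function
    for i in 1:n                  # the loop body inherits the enclosing local scope
        if iseven(i)
            push!(evens, i)       # mutates the local array, no declaration needed
        end
    end
    return evens
end

# This top-level `evens` is a separate, global variable that
# just happens to share its name with the local one above.
evens = collect_evens(6)
println(evens)
```

The same pattern applies to s_files in rdir: the array created at the top of the function is the one that push! fills and that return hands back.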
I do not know why, but it does not work on my computer, neither on Linux
with Julia v1.4 nor on MS Windows 10 with Julia v1.7.1.
The error message is the same in both cases:
@ellocco please read the README of the package more carefully, I am just typing random snippets of code here from a mobile device. The character / is special apparently and cannot be used at the start of the pattern. Have you tried omitting the character as the error message suggests?
You could use the Glob package with walkdir, for example:
import Glob
function rdir(dir::AbstractString, pat::Glob.FilenameMatch)
    result = String[]
    for (root, dirs, files) in walkdir(dir)
        append!(result, filter!(f -> occursin(pat, f), joinpath.(root, files)))
    end
    return result
end
rdir(dir::AbstractString, pat::AbstractString) = rdir(dir, Glob.FilenameMatch(pat))
(It might be useful to add walkdir support directly to Glob.jl. In general, it seems much better to have an iterator for this sort of thing, since a recursive directory tree can get huge.)
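The iterator idea mentioned above could look roughly like the following. This is only a sketch built on Base's walkdir and Iterators.flatten; the name eachmatchingfile and the predicate argument pred are my own assumptions and are not part of Glob.jl. It uses a plain predicate on the file name so that it runs without any package.

```julia
# Lazily yield the paths of all files under `dir` whose name satisfies `pred`.
# `eachmatchingfile` is a hypothetical name, not an existing Base or Glob.jl function.
function eachmatchingfile(pred, dir::AbstractString)
    return Iterators.flatten(
        (joinpath(root, f) for f in files if pred(f))
        for (root, dirs, files) in walkdir(dir)
    )
end

# Usage: nothing is collected up front; paths are produced one at a time.
for path in eachmatchingfile(f -> endswith(f, ".jl"), pwd())
    println(path)
end
```

Because the result is an iterator, the caller can stop early or call collect on it when a full vector is really needed.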
There is an enhanced Glob around: Eglob
Unfortunately, the documentation does not make clear to me
whether I can pass a specific top directory as an input
parameter to the function, to set the starting point
of the recursive search.
I defined two methods of the rdir function. One method takes a FilenameMatch pattern, which is defined by the Glob.jl package, can be constructed with fn"...", and is needed by my implementation because that's what Glob.jl implements occursin for. The other method, for convenience, takes a simple string pattern; it is implemented by simply converting your string to a FilenameMatch and calling the first method.
This way, you can pass either a string or a FilenameMatch to rdir. (The latter provides more options, e.g. there is an option to make it case-insensitive.)
The proposed function with the variable type "FilenameMatch" in combination with "occursin" has the drawback that "*.jl" works, but "string*.jl" does not. Another option is the variable type "Glob.GlobMatch" in combination with "readdir()"; this enables the use of the wildcard character (asterisk) "*" inside the search string:
function MyLib_RDir(s_dir::AbstractString, s_pat::Glob.GlobMatch)
    files_filtered = String[]
    for (root, dirs, files) in walkdir(s_dir)
        # readdir with a GlobMatch returns the matching paths below root
        append!(files_filtered, readdir(s_pat, root))
    end
    return files_filtered
end
# Next: add 2nd method to function "MyLib_RDir"
# https://docs.julialang.org/en/v1/manual/methods/
# purpose: convert "String"-type content into a
# "GlobMatch"-type pattern (defined by the Glob.jl package)
# and then call the first method of this function
MyLib_RDir(s_dir::AbstractString, s_pat::AbstractString) =
    MyLib_RDir(s_dir, Glob.GlobMatch(s_pat))
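A possible alternative that stays with FilenameMatch: my understanding is that "string*.jl" fails in the earlier rdir because the pattern is tested against the full path produced by joinpath, and a FilenameMatch has to match the whole string, so the path would have to begin with "string". The sketch below tests the pattern against the bare file name and builds the full path only afterwards. The name rdir_by_name is hypothetical, and the code assumes Glob.jl is installed.

```julia
import Glob

# Hypothetical variant of rdir: match the pattern against the file name only.
function rdir_by_name(dir::AbstractString, pat::Glob.FilenameMatch)
    result = String[]
    for (root, dirs, files) in walkdir(dir)
        for f in files
            # f is the bare file name, so "string*.jl" can match "string1.jl"
            occursin(pat, f) && push!(result, joinpath(root, f))
        end
    end
    return result
end

# Convenience method: accept a plain string and convert it to a FilenameMatch.
rdir_by_name(dir::AbstractString, pat::AbstractString) =
    rdir_by_name(dir, Glob.FilenameMatch(pat))
```

With this variant a call such as rdir_by_name(s_dir, "string*.jl") should return the full paths of all files whose names start with "string" and end in ".jl", in any sub-directory.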