What’s the best way to check whether a folder is empty when it might contain 10_000 files?
isempty(readdir(mydir))
The point is that I want to list all empty subfolders of a directory, but calling readdir on every non-empty folder builds its full file listing just to find out that it is not empty, which is not really performant in that case.
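To put a rough number on that cost, here is a minimal sketch that fills a temporary directory and measures the plain check with Base's `@allocated` and `@elapsed`. The helper names `make_big_dir` and `check` are made up for this illustration, and the figures will vary by machine and filesystem.

```julia
# Hypothetical helper: build a throwaway directory holding `n` empty files.
function make_big_dir(root::AbstractString, n::Integer)
    d = mkdir(joinpath(root, "big"))
    for i in 1:n
        touch(joinpath(d, "file_$i"))
    end
    return d
end

# The plain check from the question.
check(d) = isempty(readdir(d))

result = mktempdir() do root
    d = make_big_dir(root, 10_000)
    check(d)  # warm-up call so that compilation is not measured
    (is_empty = check(d), bytes = @allocated(check(d)), seconds = @elapsed(check(d)))
end

println(result)
```

The allocation count grows with the number of entries because readdir materializes every file name before isempty looks at the result.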
Palli
November 18, 2024, 12:39pm
2
Maybe this works, not sure:
empty_dir(d) = stat(d).size <= stat(d).blksize && isempty(readdir(d; sort=false))
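One way to gain some confidence in a heuristic like this is to compare it against the plain readdir check on a few temporary directories. The sketch below is only an assumption about what is worth testing: a fresh directory, a full one, and one that was filled and then emptied, since some filesystems may keep reporting a large size for a directory after its files are deleted. The name `reference` is introduced here for illustration.

```julia
# Heuristic from the post, repeated so this snippet is self-contained.
empty_dir(d) = stat(d).size <= stat(d).blksize && isempty(readdir(d; sort=false))

# Reference answer that the heuristic is compared against.
reference(d) = isempty(readdir(d))

results = mktempdir() do root
    fresh = mkdir(joinpath(root, "fresh"))      # never held any files
    full = mkdir(joinpath(root, "full"))        # holds many files
    foreach(i -> touch(joinpath(full, "f$i")), 1:1000)
    emptied = mkdir(joinpath(root, "emptied"))  # filled, then emptied again
    foreach(i -> touch(joinpath(emptied, "f$i")), 1:1000)
    foreach(rm, readdir(emptied; join=true))
    [(name = basename(d), heuristic = empty_dir(d), expected = reference(d))
     for d in (fresh, full, emptied)]
end

foreach(println, results)
```

Because of the `&&`, the heuristic can only return true when readdir really is empty. The case to watch for is the opposite one: an empty directory reported as non-empty because its size exceeds the block size.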
readdir calls _readdir internally, and you could simplify a copy of it so that it does not loop over every entry:
julia> function _readdir2(dir::AbstractString; return_objects::Bool=false, join::Bool=false, sort::Bool=true)
           # Allocate space for uv_fs_t struct
           req = Libc.malloc(Base._sizeof_uv_fs)
           try
               # uv_fs_scandir returns a negative error code on failure
               err = ccall(:uv_fs_scandir, Int32, (Ptr{Cvoid}, Ptr{Cvoid}, Cstring, Cint, Ptr{Cvoid}),
                           C_NULL, req, dir, 0, C_NULL)
               err < 0 && Base.Filesystem.uv_error("readdir($(repr(dir)))", err)
               # iterate the listing into entries
               entries = return_objects ? Base.Filesystem.DirEntry[] : String[]
               ent = Ref{Base.Filesystem.uv_dirent_t}()
               while Base.UV_EOF != ccall(:uv_fs_scandir_next, Cint, (Ptr{Cvoid}, Ptr{Base.Filesystem.uv_dirent_t}), req, ent)
                   name = unsafe_string(ent[].name)
                   if return_objects
                       rawtype = ent[].typ
                       push!(entries, Base.Filesystem.DirEntry(dir, name, rawtype))
                   else
                       push!(entries, join ? joinpath(dir, name) : name)
                   end
               end
               # Clean up the request string
               Base.Filesystem.uv_fs_req_cleanup(req)
               # sort entries unless opted out
               sort && sort!(entries)
               return entries
           finally
               Libc.free(req)
           end
       end
Dan
November 18, 2024, 1:02pm
4
There is a pending PR enabling a lazy readdir operation:
JuliaLang:master
← nlw0:nic/lazyreaddir
opened 08:50PM - 05 Oct 19 UTC
The original libuv readdir one day became scandir, what is today the basis for Julia's `readdir`. The way that function works means all directory contents must be held in memory and sorted before processing. By utilizing the new readdir function in libuv we can instead process directory contents in a streaming fashion. This patch implements a `lazyreaddir` method to do this, delivering the directory contents through a `Channel`.
That was really valuable feedback. Looking into the code, I came up with my own version:
function isemptydir(dir::AbstractString)
    # Allocate space for uv_fs_t struct
    req = Libc.malloc(Base.Filesystem._sizeof_uv_fs)
    ie = false
    try
        # uv_fs_scandir returns a negative error code on failure
        err = ccall(:uv_fs_scandir, Int32, (Ptr{Cvoid}, Ptr{Cvoid}, Cstring, Cint, Ptr{Cvoid}),
                    C_NULL, req, dir, 0, C_NULL)
        err < 0 && Base.Filesystem.uv_error("readdir($(repr(dir)))", err)
        # fetch only the first entry: UV_EOF right away means the directory is empty
        ent = Ref{Base.Filesystem.uv_dirent_t}()
        ie = Base.UV_EOF == ccall(:uv_fs_scandir_next, Cint, (Ptr{Cvoid}, Ptr{Base.Filesystem.uv_dirent_t}), req, ent)
        # Clean up the request string
        Base.Filesystem.uv_fs_req_cleanup(req)
    finally
        Libc.free(req)
    end
    return ie
end
With this I see a clear difference in memory allocations but only a small difference in run time on macOS, presumably because uv_fs_scandir still reads the whole directory before the first entry is returned:
julia> @btime isemptydir("temp/hh")
48.199 ms (0 allocations: 0 bytes)
false
julia> @btime isempty(readdir("temp/hh"))
50.958 ms (110028 allocations: 5.18 MiB)
false
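Coming back to the original goal, listing the empty subfolders could be sketched as follows. The helper `empty_subdirs` and its keyword `isemptycheck` are hypothetical names, it only looks at direct subfolders (an assumption about what is wanted), and the default check is the plain readdir one so that the snippet runs on its own.

```julia
# Hypothetical helper: return the direct subfolders of `root` for which
# `isemptycheck` returns true. The default is the plain readdir-based check.
function empty_subdirs(root::AbstractString;
                       isemptycheck = d -> isempty(readdir(d; sort=false)))
    subdirs = filter(isdir, readdir(root; join=true, sort=false))
    return sort!(filter(isemptycheck, subdirs))
end

found = mktempdir() do root
    mkdir(joinpath(root, "a"))              # empty
    mkdir(joinpath(root, "b"))              # gets one file below
    touch(joinpath(root, "b", "file.txt"))
    mkdir(joinpath(root, "c"))              # empty
    touch(joinpath(root, "loose_file"))     # plain file, skipped by isdir
    basename.(empty_subdirs(root))
end

println(found)
```

Passing `isemptycheck = isemptydir` would swap in the allocation-free check defined above without changing the rest of the code.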