This is a fascinating discussion! Thanks everyone for their input. I have some opinions, but won’t weigh in on most of the controversies because all of the suggestions seem better than the status quo, and other people have more informed opinions than me.
Two things I do want to mention. First, file extensions have come up a couple of times, and I just wanted to flag that I remember having issues in Python a decade or so ago when dealing with multiple extensions (e.g. my_file.csv.gz) or, less frequently but very annoyingly, when dots were used in file names (e.g. my.file.name.txt). I don't know if there are elegant solutions here; it's just an observation.
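To make the ambiguity concrete, here is a small sketch using Base's splitext, which only ever splits at the last dot. The split_all_exts helper is hypothetical (it is not part of Base or any package I know of) and just peels extensions off repeatedly, which shows why dots inside a name can't be told apart from stacked extensions:

```julia
# Base.splitext splits only at the last dot of the final path component.
name, ext = splitext("my_file.csv.gz")   # name == "my_file.csv", ext == ".gz"

# Hypothetical helper: keep peeling extensions until none are left.
function split_all_exts(filename::AbstractString)
    exts = String[]
    stem, ext = splitext(filename)
    while !isempty(ext)
        pushfirst!(exts, ext)            # keep extensions in their original order
        stem, ext = splitext(stem)
    end
    return stem, exts
end

split_all_exts("my_file.csv.gz")     # stem "my_file", extensions .csv and .gz
split_all_exts("my.file.name.txt")   # stem "my"; .file and .name are treated as extensions too
```

Neither behavior is right for both files, so any path API probably has to pick a convention or let the caller say how many extensions to expect.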
But I do have somewhat strong opinions regarding the relationship with strings. I really love the idea behind FilePaths.jl and have tried to use it many times, but I have run into way too many situations where I have to convert back and forth between paths and strings to make modifications, and I always end up abandoning it.
As an example, a situation that comes up routinely in practice is that I need to rename files from one form of annotation to another. For example:
# Map old sample identifiers to new ones
conversions = Dict(
    "sample1" => "db967",
    "sample2" => "db888",
    # etc
)
files = [
    "some/dir/sample1_1.fastq",
    "some/dir/sample1_2.fastq",
    "other/dir/sample1_processed.csv",
    "some/dir/sample2_1.fastq",
    "some/dir/sample2_2.fastq",
    "other/dir/sample2_processed.csv",
    # etc
]
for oldfile in files
    dir = dirname(oldfile)
    filename = basename(oldfile)
    # Pull the sample identifier off the front of the file name
    sample = match(r"^sample\d", filename).match
    newname = replace(filename, sample => conversions[sample])
    newfile = joinpath(dir, newname)
    mv(oldfile, newfile)
end
Because I need things like regex matching and string replacements, in order to use FilePaths.jl I was constantly needing to convert to strings and back to paths, so whatever benefit I was getting using explicit path types was swamped by boilerplate.
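Here is a rough sketch of the round trip I mean. StrictPath and rename_sample are made up for illustration; this is not the FilePaths.jl API, just a stand-in for any path type that doesn't accept string operations directly:

```julia
# Hypothetical strict path type (NOT FilePaths.jl): wraps a string but
# deliberately supports only path operations, not regex or replace.
struct StrictPath
    raw::String
end
Base.string(p::StrictPath) = p.raw
Base.basename(p::StrictPath) = StrictPath(basename(p.raw))
Base.dirname(p::StrictPath) = StrictPath(dirname(p.raw))
Base.joinpath(a::StrictPath, b::StrictPath) = StrictPath(joinpath(a.raw, b.raw))

function rename_sample(p::StrictPath, conversions)
    name = string(basename(p))                  # path -> string
    m = match(r"^sample\d", name)               # regex needs a string
    m === nothing && return p                   # leave unmatched files alone
    newname = replace(name, m.match => conversions[String(m.match)])
    return joinpath(dirname(p), StrictPath(newname))  # string -> path
end

conversions = Dict("sample1" => "db967")
newpath = rename_sample(StrictPath("some/dir/sample1_1.fastq"), conversions)
```

Every string operation in the middle forces an unwrap and a rewrap, and in a real script that pattern repeats for each manipulation.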
I think the idea of explicit path types is compelling, but whatever the merits of “a path is not a string”, enforcing that super rigidly is, in practice, annoying.