Recently there was an LWN news item pointing to an article on the Go blog about Traversal-resistant file APIs.
That’s a nice API. It’d also be nicer in Julia since we could use a `do` block instead of `defer`.
My two cents here: if the Paths API has some kind of abstract interface, then maybe it can be used consistently across packages like Tar.jl, ZipArchives.jl, ZipFile.jl (maybe with some constraints)?
Thanks for the link! Incidentally this touches on one of the tensions in the current design: my desire to break with select conventions in the name of consistency/safety.
For instance, currently `p"/some/absolute/path" * p"../../../../../../../../etc/passwd"` will raise an error once we try to get the parent directory of `p"/"`, however the “standard” behavior is for the parent of `/` to be `/`.
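For comparison, here is the “standard” behavior as it shows up in Python's standard library. This is only an illustration of the convention being discussed, not the proposed Julia API:

```python
import posixpath
from pathlib import PurePosixPath

# The parent of the root is the root itself, so climbing upwards never fails.
root = PurePosixPath("/")
print(root.parent == root)  # True

# Lexical normalization therefore silently clamps excess ".." at the root.
joined = posixpath.join("/some/absolute/path", "../../../../../../../../etc/passwd")
normalized = posixpath.normpath(joined)
print(normalized)  # /etc/passwd
```

Raising an error instead of clamping is the break with convention described above.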
I particularly enjoyed reading this section:
If the attacker controls part of the local filesystem, they may be able to use symbolic links to cause a program to access the wrong file:
    // Attacker links /home/user/.config to /home/otheruser/.config:
    err := os.WriteFile("/home/user/.config/foo", config, 0o666)
If the program defends against symlink traversal by first verifying that the intended file does not contain any symlinks, it may still be vulnerable to time-of-check/time-of-use (TOCTOU) races, where the attacker creates a symlink after the program’s check:
I enjoyed it because this is exactly the issue that is systematically eliminated by requiring the use of an FD-as-Path type, which I am now experimenting with (and, based on encouraging results, advocating for).
I don’t think it would be too hard to extend the proposal slightly and have rooted-ness be an extra flag that the FD-as-Path type can have, which is then automatically propagated to all path joining operations.
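To make the FD-as-Path idea concrete, here is a minimal Python sketch of the underlying OS mechanism (the `openat`-style `dir_fd` argument on Unix). The helper name `open_component` and the demo layout are hypothetical, and this is not the Julia implementation; it only shows why holding a directory handle avoids re-walking a path that an attacker might have changed:

```python
import os
import tempfile

def open_component(dir_fd, name, flags=os.O_RDONLY):
    """Open one path component relative to an already-open directory (Unix only)."""
    if "/" in name or name in ("", ".", ".."):
        raise ValueError(f"not a single path component: {name!r}")
    # O_NOFOLLOW makes the open fail if `name` itself is a symlink.
    return os.open(name, flags | os.O_NOFOLLOW, dir_fd=dir_fd)

def demo():
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "config"), "w") as f:
            f.write("ok")
        os.symlink("config", os.path.join(root, "evil"))
        # Acquire the directory handle once and reuse it for every lookup.
        root_fd = os.open(root, os.O_RDONLY | os.O_DIRECTORY)
        try:
            with os.fdopen(open_component(root_fd, "config")) as f:
                content = f.read()
            try:
                os.close(open_component(root_fd, "evil"))
                symlink_blocked = False
            except OSError:
                symlink_blocked = True
        finally:
            os.close(root_fd)
    return content, symlink_blocked

print(demo())  # ('ok', True)
```

A rooted-ness flag as suggested above would amount to carrying the "never leave this directory" rule along with the handle through every join.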
From reading that article, I think this proposal will give us a better story with this class of vulnerabilities than Go.
RAII would be nice here, but in the current experiment I’m just using a finaliser. I’m not sure that `do` is really what we want, since it’s good if the same handle is passed around more instead of being dropped and re-acquired.
Sure, why not? How does `abstractpaths.jl` (at main in tec/julia-basic-paths, on Code by TEC) look to you?
It’s slightly imprecise, but I’m thinking that maybe `PathHandle` is the name to go with here. I think having “handle” in the name is important, and while `FSHandle` etc. are most accurate, by putting `Path` in the name we get an explicit connection to the `Path` type and an indication of the relationship between the two. Appearing as a completion with `Path<tab>` is a nice bonus. I think together this more than makes up for the loss in accuracy.
So, with this, there are three kinds of system-native paths:
- `Path` as a “pure” path for the current system (an alias of `PosixPath` or `WindowsPath`)
- `PathHandle` for a file descriptor/filesystem handle produced from a `Path`
- `DirEntry` is a parent directory `PathHandle` combined with a relative `Path` and (optional) metadata.
All of these types can be converted between (e.g. `PathHandle` to a `Path`, `DirEntry` to a `PathHandle`, etc.) using the platform filesystem APIs. We can do this automatically where convenient, and require a particular type when we want to prod the user into behaving more safely (e.g. requiring them to acquire a handle, to encourage them to re-use it).
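As a rough analogy only, Python's standard library has a counterpart for each of these three kinds: `PurePosixPath` (pure path), a raw file descriptor (handle), and `os.DirEntry` (name relative to a parent plus cached metadata). The sketch below uses those, with hypothetical helper names; it makes no claim about how the Julia types are implemented:

```python
import os
import tempfile
from pathlib import PurePosixPath

def list_entries(dir_fd):
    """Return {name: is_regular_file} for the directory behind an open descriptor."""
    # Scanning via a descriptor yields entries named relative to that handle.
    with os.scandir(dir_fd) as entries:
        return {e.name: e.is_file(follow_symlinks=False) for e in entries}

def demo():
    # 1. Pure path: syntax only, no filesystem access.
    pure = PurePosixPath("project") / "notes.txt"
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        with open(os.path.join(root, "notes.txt"), "w") as f:
            f.write("hello")
        # 2. Handle: an open descriptor for the directory (Unix).
        dir_fd = os.open(root, os.O_RDONLY)
        try:
            # 3. Directory entries: handle + relative name + metadata.
            listing = list_entries(dir_fd)
        finally:
            os.close(dir_fd)
    return pure.name, listing

print(demo())
```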
Since @tecosaur explicitly requested further commenting in
I already expressed most of my views during the original discussion, in Designing a Paths Julep - #58 by goerz, but given the point I was trying to make in the “The strangeness (or not) of * as string concatenation” thread, the worry I took away from the discussion here is pretty much summed up with
One aspect of such “purity” is @jar1’s (and others’) stance
It’s not that I don’t understand this at some level, and it’s hard to argue about it completely objectively. But the experience of every other programming language that has `Path` objects has shown that there is no practical problem with using `/` for something completely unrelated to division, in the context of paths. These functionalities do not clash, in practice.
But beyond the purity of “division”:
I’d also want to strongly re-emphasize
But enough about the `joinpath` operator.
The other point @tecosaur was bringing up was
It’s dangerous because there’s untrusted input being used. People have to be aware of untrusted input, and sanitize it before letting it flow through the rest of their program. But that’s not your responsibility. Or rather, you’re in no position to “fix” this in a path library, and I’d be concerned that any well-meaning attempt to fix it is only going to mess up perfectly legitimate use cases. Stick to the solutions that have been found to work in other ecosystems, such as Python’s pathlib.
Constructing `Path('user_content/../../../../../../etc/passwd')` is absolutely something I might want to do, and yes, it should `resolve` to `/etc/passwd`. The way to sanitize `Path("user_content/" + untrusted_file)` is to `resolve` it and check that the result is “secure”. Maybe “secure” means it’s in a subfolder of `user_content`, or maybe that it’s in a subfolder of the current user’s home directory. The point is, you can’t know, so how could you possibly “fix” this? The CVEs are for libraries improperly using untrusted input, not for the `Path` library. The `Path` library isn’t the problem here, and it’s not the place to fix anything.
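A minimal sketch of that resolve-then-check approach with Python's pathlib, assuming “secure” means “stays inside a given base directory”. The helper name `resolve_within` is hypothetical, and the policy is deliberately left to the caller:

```python
import tempfile
from pathlib import Path

def resolve_within(base: Path, untrusted: str) -> Path:
    """Resolve base/untrusted and require the result to stay inside base."""
    base = base.resolve()
    candidate = (base / untrusted).resolve()
    if not candidate.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"{untrusted!r} escapes {base}")
    return candidate

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp).resolve()
        ok = resolve_within(base, "uploads/report.txt")
        results = [ok == base / "uploads" / "report.txt"]
        for bad in ("../../etc/passwd", "/etc/passwd"):
            try:
                resolve_within(base, bad)
                results.append(False)
            except ValueError:
                results.append(True)
    return results

print(demo())  # [True, True, True]
```

Note that this is a check followed by a later use, so on its own it does not address the TOCTOU race quoted earlier in the thread.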
I would strongly advise against changing the behavior of existing solutions for things like `relative_path / absolute_path -> absolute_path`. Even the existing string-based `joinpath` in Julia implements that behavior. Other pathlib implementations adopt this for good reason. Yes, it’s potentially dangerous with unsanitized input, but it’s important for common practical use cases, such as `join_path(cwd(), location)` not having to distinguish whether `location` is relative or absolute.
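In Python's pathlib, for instance, the behavior looks like this (the helper name `locate` is made up for the illustration):

```python
import posixpath
from pathlib import PurePosixPath

def locate(cwd: str, location: str) -> PurePosixPath:
    """Join a working directory with a location that may be relative or absolute."""
    return PurePosixPath(cwd) / location

# A relative location is appended to the working directory.
print(locate("/home/user/project", "data/in.csv"))       # /home/user/project/data/in.csv
# An absolute location replaces everything to its left.
print(locate("/home/user/project", "/srv/data/in.csv"))  # /srv/data/in.csv
# The string-based join follows the same rule.
print(posixpath.join("relative/dir", "/absolute"))       # /absolute
```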
Basically,

I think I would try to keep this very simple
As for some of the “ideas” for design goals floating around,

- rejecting invalid path segments at creation
- disallowing a root segment within a path’s segments
- disallowing joining a path with an absolute one
- clear handling of special `.` and `..` segments
No! Don’t do any of these things! Don’t fix it if it ain’t broken. All we need is a `Path` object that understands the “segments” that make up a file path, and functions like `parent`, `name`, `suffix`, `stem`, `relative_to`, etc. to manipulate them.
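For reference, this is what those segment-level operations look like in Python's pathlib, which is where the names above come from:

```python
from pathlib import PurePosixPath

p = PurePosixPath("/home/user/archive.tar.gz")

print(p.parts)                 # ('/', 'home', 'user', 'archive.tar.gz')
print(p.parent)                # /home/user
print(p.name)                  # archive.tar.gz
print(p.suffix)                # .gz
print(p.stem)                  # archive.tar
print(p.relative_to("/home"))  # user/archive.tar.gz
```

None of these calls touch the filesystem; they only manipulate the segments.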
Don’t get in the way of constructing “weird” paths, e.g. by “rejecting invalid path segments” or by applying any kind of normalization prematurely.
Then on top of that, you can normalize paths to deal with “handling of special `.` and `..` segments”. At that point, you still don’t want to access the actual file system: it should be possible to generate normalized Unix paths on a Windows system and vice versa.
Then lastly, you resolve paths on the actual file system, and that’s where things like symlinks come into play. (I’m not sure whether the name `resolve` is appropriate at the normalization stage or at the resolve-on-filesystem stage; again: look at the terminology used in existing implementations.) At that level, maybe you also want to add some functionality to verify that certain paths are “secure”, but I’d be pretty careful with that.
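The three layers can be sketched with Python's standard library as follows. The function name `resolve_demo` and the directory layout are made up for the example, and the symlink part assumes a Unix-like system:

```python
import ntpath
import os
import posixpath
import tempfile
from pathlib import Path, PurePosixPath, PureWindowsPath

# Stage 1: construction keeps ".." segments exactly as written.
raw = PurePosixPath("a/b/../c")
print(raw.parts)  # ('a', 'b', '..', 'c')

# Stage 2: lexical normalization, for either flavour, on any host system.
print(posixpath.normpath("a/b/../c"))              # a/c
print(ntpath.normpath(r"C:\a\b\..\c"))             # C:\a\c
print(PureWindowsPath(r"C:\a\b").parts[1:])        # ('a', 'b')

# Stage 3: resolving on the real filesystem, where symlinks matter.
def resolve_demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        (root / "real" / "sub").mkdir(parents=True)
        os.symlink(root / "real" / "sub", root / "link")
        target = root / "link" / ".." / "x"
        lexical = posixpath.normpath(str(target))  # ignores the symlink
        resolved = target.resolve()                # follows the symlink first
        return lexical == str(root / "x"), resolved == root / "real" / "x"

print(resolve_demo())  # (True, True)
```

The last stage shows why normalization and resolution cannot be merged: the same `..` leads to different places depending on whether the symlink is followed.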