Split a path (inverse of Base.Filesystem.joinpath())


#1

Is there a way to split a path "a/b/c.txt" into [ "a", "b", "c.txt" ]?
I could call Base.Filesystem.splitdir() in a loop until the first element is the empty string, but that seems wasteful. I could call split( "a/b/c.txt", Base.Filesystem.path_separator ), but Base.Filesystem.path_separator is undocumented.

In practice, I want to check if the first directory in the path is “…”.


#2

Here is the function I came up with:

"""
    splitpath( path::String ) -> Array{String}

Splits a path into an array of its path components. The inverse of `joinpath()`.
Calling `joinpath( splitpath( path )... )` should produce `path`
(possibly with the trailing slash removed).

```jldoctest
julia> splitpath("a/b/c") == ["a", "b", "c"]
true

julia> splitpath("/a/b/c") == ["/", "a", "b", "c"]
true
```
"""
function splitpath( path::String )
    result = String[]
    
    while path != ""
        head, tail = splitdir( path )

        ## If splitdir() makes no progress, path is the filesystem root
        ## (for example "/"): push the root marker and stop.
        if head == path
            push!( result, head )
            break
        end

        ## A trailing separator yields an empty last component; skip it
        ## instead of pushing the whole remaining path as one component.
        if tail != ""
            push!( result, tail )
        end
        path = head
    end
    
    reverse!( result )
    return result
end

#3

I am not sure that "..." is a valid path, but in any case you could also use string (or regex) matching.


#4

[The “…” is what happens when I don’t enclose ".." in code backticks on Discourse.]

I can’t use string matching or a regular expression because Base.Filesystem.path_separator is undocumented.


#5

I don’t see why you would need to use separators, unless validating paths. Wouldn’t

startswith(path, "..")

just work?


#6

There could be a file named ..notes. I don’t want to exclude it.
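To make the difference concrete, here is a rough sketch that compares the prefix test with a check on the first path component. It relies only on the documented `splitdir()`; `first_component` and `starts_in_parent` are hypothetical names made up for this example, not anything in Base.

```julia
# Hypothetical helper: peel components off the end with splitdir()
# until only the first component (or the root marker) is left.
function first_component(path::AbstractString)
    path = String(path)
    head, tail = splitdir(path)
    # Stop when nothing is left in front (head == "") or when
    # splitdir() makes no progress (head == path, i.e. the root).
    while head != "" && head != path
        path = head
        head, tail = splitdir(path)
    end
    return head == "" ? tail : head
end

# Hypothetical helper: true only if the first component is exactly "..".
starts_in_parent(path::AbstractString) = first_component(path) == ".."

starts_in_parent(joinpath("..", "a", "b.txt"))   # true
starts_in_parent("..notes")                      # false
startswith("..notes", "..")                      # true, the false positive
```

The loop still calls `splitdir()` once per component, so this is the iterative approach from the first post, just stopped at the first component.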


#7

Recently, I’ve been doing some work, where I want the folder name and the file from an absolute path. What I’ve come up with is:

last_two_parts(p) = joinpath( basename(dirname(p)), basename(p) )

That’s not bad, but the following looks cleaner:

last_two_parts(p) = joinpath( splitpath(p)[ end-1 : end ]... )

It’s also much more flexible syntactically. If there is a performant way to make splitpath, I think it would be a really useful function.


#8

The computationally efficient way to write it is split( "a/b/c.txt", Base.Filesystem.path_separator ). However, Base.Filesystem.path_separator is not documented. In contrast, Python exposes os.sep.
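A minimal sketch of that idea, assuming `Base.Filesystem.path_separator` keeps existing and keeps its current meaning (it is internal and undocumented, so this may break without notice). `split_on_separator` is a hypothetical name used only here.

```julia
# Assumption: Base.Filesystem.path_separator is an internal constant
# ("/" on Unix-like systems); it is not part of the documented API.
# Hypothetical helper: split a path on the platform separator.
split_on_separator(path::AbstractString) =
    String.(split(path, Base.Filesystem.path_separator))

# Build the input with joinpath() so the example does not hard-code "/".
parts = split_on_separator(joinpath("a", "b", "c.txt"))
parts == ["a", "b", "c.txt"]   # true

# Checking the first component then becomes a plain comparison.
first(split_on_separator(joinpath("..", "a"))) == ".."   # true
```

Unlike the `splitdir()` loop, a plain `split` returns an empty first element for a path that starts with the separator rather than a root marker, and it does not collapse repeated separators.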


#9

In practice, the performance of even the iterative approach is probably not bottlenecking anyone’s code.


#10

Changing documentation to add a reference feels like an easy fix. I’ll try to submit a PR for this soon.