Better handling of pathnames

This rule is broken by numbers. With Int64, (2^32+1)^2 wraps around to 2^33+1, but in floating point it is approximately 2^64.

Sorry, I don’t understand how that is related to this discussion. Your example is about overflow, which is not an issue for strings. * as a non-commutative operator has very little to do with * for numbers, and is always exact for ::AbstractString.

What I meant was that the fact that an operation is exact and has only one behavior now isn’t a strong argument against implementing a different behavior for a new subtype. If you consider paths as a monoid generated by file/folder names with * as their operation, it makes perfect sense for * to add the / or \ as appropriate. This has the added benefit of making it easy to write cross-platform code.
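To make the monoid idea concrete, here is a minimal sketch. The SimplePath type is hypothetical and exists only for illustration; it is not how FilePaths.jl is implemented. Paths are lists of segments, * concatenates them, and the separator is chosen only when the path is rendered, by Base's joinpath.

```julia
# Hypothetical toy type for illustration; not the FilePaths.jl implementation.
struct SimplePath
    segments::Vector{String}
end

# Identity element of the monoid: the empty path.
SimplePath() = SimplePath(String[])

# `*` concatenates segment lists; a bare string counts as a single segment.
Base.:*(a::SimplePath, b::SimplePath) = SimplePath(vcat(a.segments, b.segments))
Base.:*(a::SimplePath, b::AbstractString) = a * SimplePath([String(b)])

# Two paths are equal when their segments are equal.
Base.:(==)(a::SimplePath, b::SimplePath) = a.segments == b.segments

# The separator appears only when rendering; joinpath picks it for the host OS.
Base.string(p::SimplePath) = isempty(p.segments) ? "" : joinpath(p.segments...)

p = SimplePath(["usr"]) * "local" * "bin"
println(string(p))
```

Because the separator is never stored, the same code yields / on POSIX systems and \ on Windows, and the empty path acts as the identity for *.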

This might be a terrible idea, but if having PosixPath <: AbstractString is a problem, why not define an easy and nice way to convert them to strings (nicer than string())?

One could for example make instances of PosixPath callable:

julia> using FilePaths

       # Monkey-patch FilePaths for the sake of the example
julia> @eval FilePaths begin
           (path::PosixPath)() = string(path)
       end

julia> path = p"/tmp/foo.csv"
p"/tmp/foo.csv"

julia> path()
"/tmp/foo.csv"

For that reason, I was thinking about creating a new package with pathnames as a subtype of AbstractString, specifying a unique string representation for each pathname (like TCL does, what a nice language!). That way, one could deal with pathnames as structures and use them in methods that expect strings. Plus, it would be easy to extend this interface to, e.g., URLs, sockets, etc., and have them function appropriately.

That’s what FilePaths.jl originally did, but that tended to result in weird bugs if we didn’t fully implement the string API. I think having a filepath type be distinct from strings is the more correct approach as there are relatively few operations that overlap between strings and paths (conceptually).

Regarding the ecosystem, I think the ideal solution is for packages to support both types for some period of time. The FilePaths.@compat macro is intended to help with developing those interfaces correctly. A compromise may be to provide those definitions in FilePaths.jl (not FilePathsBase.jl) for packages with few dependencies.

Here’s a blog post on pathlib in python that had the same issue a few years ago.


Other things you can have if path is not a string are specialized getindex and iterate.
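As a rough sketch of what that could look like, assuming a hypothetical SegmentedPath type that stores its components in a vector (this is not the FilePathsBase.jl design), indexing and iteration can work on path segments rather than characters:

```julia
# Hypothetical type for illustration only.
struct SegmentedPath
    segments::Vector{String}
end

# Index and iterate over path components instead of characters.
Base.length(p::SegmentedPath) = length(p.segments)
Base.getindex(p::SegmentedPath, i::Integer) = p.segments[i]
Base.lastindex(p::SegmentedPath) = lastindex(p.segments)
Base.iterate(p::SegmentedPath, state...) = iterate(p.segments, state...)
Base.eltype(::Type{SegmentedPath}) = String

p = SegmentedPath(["tmp", "data", "foo.csv"])
println(p[end])        # last component
for segment in p       # one component per iteration
    println(segment)
end
```

With a string-based path, p[end] would return a single character, so segment-wise semantics are only available once the path is its own type.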

That’s a good idea, made an issue: getindex and iteration (Issue #70, rofinn/FilePathsBase.jl on GitHub).


Now, about this:

julia> programPath = path("C:\\Programs\\My_Program");
julia> execPath = programPath * "bin" * "prog.exe"
Path: C:\Programs\My_Program\bin\prog.exe

I honestly prefer to make it clear when a path is supposed to point to a directory instead of a file, i.e., it should end with a slash:

julia> programPath = path("C:\\Programs\\My_Program\\");
julia> execPath = programPath * "bin\\" * "prog.exe"
Path: C:\Programs\My_Program\bin\prog.exe

IMO, paths that refer to directories and paths that refer to files should be distinct; that allows a better understanding of the code and finer control over how it should work.
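A small sketch of how the trailing-slash convention could drive that distinction. The DirPath and FilePath types and the classify and describe helpers are hypothetical names used for illustration; isdirpath is the Base function that checks whether a path string ends like a directory.

```julia
# Hypothetical types; only isdirpath comes from Base.
abstract type AnyPath end

struct DirPath <: AnyPath
    name::String
end

struct FilePath <: AnyPath
    name::String
end

# A trailing separator marks a directory, anything else is treated as a file.
classify(s::AbstractString) = isdirpath(s) ? DirPath(s) : FilePath(s)

# Once the kinds are separate types, behavior can be chosen by dispatch.
describe(::DirPath) = "directory"
describe(::FilePath) = "file"

println(describe(classify("bin/")))
println(describe(classify("bin/prog.exe")))
```

Note that this only inspects the string; it does not check what actually exists on disk.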


Does FilePathsBase overload open? It seems like that should be a good 90% solution. There are very few places where you should be passing around a filepath rather than a stream. Overloading the functions in Base.Filesystem would probably get you to 95% if packages were good about using those utilities instead of shelling out.

Yes, open([f], ::SystemPath, args...; kwargs...) works the same way as in base and there is a fallback for open([f], ::AbstractPath, args...; kwargs...) that’ll return a FileBuffer with read and write (useful for remote filepaths like on S3).

The problem is that various package APIs do the following:

  1. read_thingy(path::AbstractString, args...) reads from a file at path, calling open and
  2. read_thingy(io::IO, args...) which does the same for a stream

open is in there somewhere, but a lot of code would need to be generalized before it is reached (this was pointed out above by @stevengj).

I am not saying that cleaning this up would not be worth the effort, but there is a cost. Maybe a tiny interface package could define

abstract type AbstractPath end
abstract type AbstractFilePath <: AbstractPath end
abstract type AbstractDirectoryPath <: AbstractPath end

Base.open(path::AbstractFilePath, args...; kwargs...) = 
    open(string(path), args...; kwargs...)
# and other variants, and similar methods

so that

  1. the API methods of various packages using paths would only have to modify their signatures to accept Union{AbstractString,AbstractFilePath}, as applicable (maybe a constant alias for this would be nice),

  2. packages that handle pathnames would only need to define Base.string to hook into this.
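To show how the two points fit together, here is a self-contained sketch. The abstract types repeat the proposal above; LocalFile, PathLike and the body of read_thingy are hypothetical stand-ins for what a path package and a consuming package might write, not existing APIs.

```julia
# The proposed interface package (names follow the proposal above).
abstract type AbstractPath end
abstract type AbstractFilePath <: AbstractPath end

Base.open(path::AbstractFilePath, args...; kwargs...) =
    open(string(path), args...; kwargs...)

# A constant alias for signatures that accept either kind of path.
const PathLike = Union{AbstractString, AbstractFilePath}

# Hypothetical path package: it only needs to define Base.string.
struct LocalFile <: AbstractFilePath
    name::String
end
Base.string(p::LocalFile) = p.name

# Hypothetical consumer package: only the signature of the path method changes.
read_thingy(io::IO) = read(io, String)
read_thingy(path::PathLike) = open(read_thingy, path)

# Usage: a plain string and a path object behave the same.
tmp = tempname()
write(tmp, "hello")
println(read_thingy(tmp) == read_thingy(LocalFile(tmp)))
```

The consumer never converts the path itself; the open method from the interface package does the string conversion in one place.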


I agree. It looks like the Julia default is without though. pwd(), for instance, doesn’t return ending slashes.

You may be interested in this issue then. This would make directories and files different types so you could dispatch on directories vs files. I have some technical issues/concerns, but if there is sufficient interest then I’d consider trying to address them.

https://github.com/rofinn/FilePathsBase.jl/issues/72


Anyone who is working with paths, use this function to make your path good.

using FilePathsBase
GoodPath(inp::String) = inp |> Path |> _GoodPath |> string
_GoodPath(path::WindowsPath) = PosixPath((path.drive, path.segments...))
_GoodPath(path) = path

The result of this can be passed to all of the Julia functions like cd on any platform, so it frees you from worrying about the platform. I use GoodPath in my functions that accept a path from the user, or on the output of functions that return a path.

julia> GoodPath("C:\\folder1\\folder2") 
"C:/folder1/folder2"

julia> GoodPath("folder1/foldee2")
"folder1/foldee2"

Might need a more concrete example of what issue you’re running into, but the fact that you just want strings anyways suggests that you should not use FilePaths. Functions like cd should work with AbstractPaths or strings fine regardless of platform, w/o any need to convert windows to posix paths. Also, your GoodPath function could be simplified with a call to replace:

julia> replace("C:\\folder1\\folder2", "\\" => "/")
"C:/folder1/folder2"

julia> replace("/folder1/folder2", "\\" => "/")
"/folder1/folder2"

To write generic code that works on all platforms I need a single method of writing paths. I chose the Posix method since Julia works well with it.

I used FilePathsBase methods to make a path from a string, and then dispatch on it based on the types that it detects, and then convert it back to string for my usage.

I do a lot of string interpolation, printing, etc., so I can’t use FilePaths for that (or it makes things complex for no reason), and I should use strings.

"""
a lot of non path stuff with other string interpolations

"$goodpath/somefolder"

a lot of non path stuff with other string interpolations
"""

If I let the goodpath be a normal Windows Path like C:\\folder, when it is printed, it is converted to C:\folder which is not what I want.

The benefit of using FilePathsBase is its path detection regardless of what is given inside:

julia> using FilePathsBase

julia> p = Path("C:\\folder1/folder2")
p"C:/folder1/folder2"

julia> typeof(p)
WindowsPath

Probably this can be replaced with a simpler function. That calls into question the existence/implementation of the FilePaths packages, since everything can be replaced with Julia’s functions that work on strings rather than tuples of strings…

If I let the goodpath be a normal Windows Path like C:\\folder, when it is printed, it is converted to C:\folder which is not what I want.

Hmm, that should be an easy fix… just need to have an extra print dispatch for WindowsPath that was missing. I’ll probably want David to review as he does more Windows stuff.

Probably this can be replaced with a simpler function. That calls into question the existence/implementation of the FilePaths packages, since everything can be replaced with Julia’s functions that work on strings rather than tuples of strings…

In the example you gave, you had a function (GoodPath) that took and returned a string. If you just wanted strings in the example then I see no reason to do the FilePath conversion. Use FilePaths when you want a type based solution to a problem (e.g., dispatching on path types, common operations on local and remote filesystems) and use strings if you just want to do some string manipulations. I don’t see how that questions the existence of the FilePaths packages though.