[Pre-ANN/RFC] XDG.jl: A cross platform implementation of the XDG Directory Spec


Knowing where to find/place files can be hard. So far in the Julia ecosystem I’ve seen two approaches:

  • Try a basic (sometimes incorrect) approach, as has been done in FreeTypeAbstraction.jl and DataDeps.jl.
  • Just shove any/every-thing under the Julia depot path, leading to a cluttering of .julia. From the packages I have installed, I see: conda, datadeps, makie, makiegallery, pluto_notebooks, symbolstorev2-lsp-julia mixing state, cache, user data, and other categories of information.

Neither of these approaches correctly determines appropriate locations for data. This has niggled me for a while, and so I’ve finally got around to producing:

XDG.jl A cross platform implementation of the XDG Directory Spec

It is essentially an implementation of the XDG (Cross-Desktop Group) directory specifications, with analogues for Windows and MacOS for cross-platform support. More specifically, this is a hybrid of:

This has taken longer than I expected to develop, because I did an investigation of other translations of the XDG spec to Windows/Mac (looking at Qt, Go, Rust, and Python libraries). You can find my collated results and conclusions here, and the approach I’m taking here.

Why does this matter

It may be easy to treat file paths haphazardly, but for the user in particular,
abiding by the standards/conventions of their platform has a number of major
benefits, such as:

  • Improved ease of backups, since it is easier to make rules for which folders need to be backed up.
  • Improved configuration portability, since it is easier to identify and share the relevant configuration files.
  • Ease of isolating application state: by containing state in a single directory, it is easy to avoid sharing it.
  • Decreased reliance on hard-coded paths, improving flexibility and composability.

It is worth noting that these considerations apply to both graphical and
command-line desktop applications.

Choosing the correct location

Along with this package, I’ve also produced an initial attempt at a flowchart to work out the appropriate directory (first pass, will likely be edited for clarity in the future).

[Image: flowchart for working out the appropriate directory]

Example usage

(@v1.8) pkg> add https://github.com/tecosaur/XDG.jl.git

julia> using XDG

julia> XDG.CONFIG_HOME[]
"/home/tec/.config"

julia> XDG.User.config()
"/home/tec/.config"

julia> XDG.User.config("sub", "dir/")
"/home/tec/.config/sub/dir/"

julia> XDG.User.config(XDG.Project("mything"), "config.conf", create=true)
"/home/tec/.config/mything/config.conf"

Feedback would be appreciated

Ideally, this would be taken up by the various Julia packages that need to think about where to look for/put files, to help them do so more idiomatically. For it to have the best chance of being adopted, this package needs to be:

  • As easy to use as reasonably possible
  • As idiomatic (on Windows/Mac) as reasonably possible

With this in mind, I’d be very appreciative of feedback on both the overall design choices and details of this package.

35 Likes

This looks great! Especially julia applications that need to store some configuration files can now store them in a safe place, consistent with the users’ system and without assuming a julia runtime directory is available (for the eventuality of static binaries - some day…!)

Wish I could :heart: this more than once :slight_smile:

4 Likes

This is also super useful for distributed environments where there are multiple users sharing one single julia depot that should not be changed. Going to start using this and recommend it to some of my colleagues! Great work!!!

1 Like

For anybody thinking of using this soon, once the design is settled (it currently is pending any feedback that makes me re-evaluate it), I plan on tagging v1.0 and registering it :slightly_smiling_face:.

Also thanks for all the kind words, it helps make going through the effort of putting this together feel worthwhile.

4 Likes

Of course! Also, I’d be curious what you found deficient in packages like DataDeps.jl. It seems like it conflicts with your notion that in some sense, the .julia depot should be “untouched” from what I gather in your initial post. Definitely know what you mean with regards to conda, pluto_notebooks, and datadeps – it gets messy fast.

1 Like

DataDeps.jl is a case of hard-coded defaults similar to FreeTypeAbstraction.jl. In the OP I’ve linked to the problematic snippets from both files. Here I’ll go through the DataDeps.jl one as an example:

DataDeps.jl/src/locations.jl#L7-L16
    @static if Sys.iswindows()
        vcat(get.(Ref(ENV),
           ["APPDATA", "LOCALAPPDATA",
            "ProgramData", "ALLUSERSPROFILE", # Probably the same, on all systems where both exist
            "PUBLIC", "USERPROFILE"], # Home Dirs ("USERPROFILE" is probably the same as homedir()
           [String[]])...)
    else
        ["/scratch", "/staging", # HPC common folders
         "/usr/share", "/usr/local/share"] # Unix Filestructure
    end], "datadeps")

On Unix

This is rather minor, but for user-data XDG_DATA_HOME and XDG_DATA_DIRS should be used over "/usr/share", "/usr/local/share".
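
As a rough illustration of what honouring those two variables involves, here is a minimal sketch. It assumes the defaults given in the XDG Base Directory spec (`$HOME/.local/share` for `XDG_DATA_HOME`, and `/usr/local/share/:/usr/share/` for `XDG_DATA_DIRS`, each used when the variable is unset or empty). The function name `xdg_data_paths` is hypothetical and is not how XDG.jl itself does this.

```julia
# Hypothetical helper (not part of XDG.jl): build the Unix data search path
# from XDG_DATA_HOME and XDG_DATA_DIRS, using the spec's defaults.
function xdg_data_paths(env::AbstractDict=ENV; home::AbstractString=homedir())
    # An unset or empty variable falls back to its default.
    data_home = get(env, "XDG_DATA_HOME", "")
    isempty(data_home) && (data_home = joinpath(home, ".local", "share"))
    data_dirs = get(env, "XDG_DATA_DIRS", "")
    isempty(data_dirs) && (data_dirs = "/usr/local/share/:/usr/share/")
    # The user directory comes first, followed by the system directories
    # in their listed order of preference. Empty entries are dropped.
    return String[data_home; String.(split(data_dirs, ':'; keepempty=false))]
end

# Example call with an explicit environment, so the result does not
# depend on the machine it runs on.
paths = xdg_data_paths(Dict("XDG_DATA_DIRS" => "/opt/share:/usr/share"); home="/home/alice")
```

Passing the environment in as an argument is just a convenience for testing; calling `xdg_data_paths()` with no arguments reads the real `ENV`.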

On Windows

This one is actually a bit of a pain, because the environment variables (ProgramData etc.) may not be enough. More specifically, you need to check for the relevant FolderID (in the case of ProgramData, AppDataProgramData) via the KnownFolder Win32 API. This ends up being a bit of a hassle; see XDG.jl/nt.jl at main · tecosaur/XDG.jl · GitHub for everything required, but here’s a snippet:

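function knownfolder(id::Symbol)
    # KNOWN_FOLDER_IDS and unsafe_utf16string are defined elsewhere in the linked file.
    guid = KNOWN_FOLDER_IDS[id]
    # Out-parameter that receives a pointer to the UTF-16 encoded path.
    ptr = Ref(Ptr{UInt16}())
    result =
        ccall((:SHGetKnownFolderPath, "shell32"), stdcall, UInt32,
              (UInt128, Cuint, Ptr{Nothing}, Ptr{Ptr{UInt16}}),
              guid, 0, C_NULL, ptr)
    # A zero result (S_OK) means success; otherwise the function returns nothing.
    if result == zero(UInt32)
        unsafe_utf16string(ptr[])
    end
end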

With XDG.jl

To get all the relevant system and user data dirs, all you need to do is:

julia> XDG.data()
5-element Vector{String}:
 "/home/tec/.local/share"
 "/home/tec/.local/share/flatpak/exports/share"
 "/var/lib/flatpak/exports/share"
 "/usr/local/share/"
 "/usr/share/"

For this to be scoped to a particular project, this is what you’d need:

julia> XDG.data(XDG.Project("DataDeps"))
5-element Vector{String}:
 "/home/tec/.local/share/datadeps/"
 "/home/tec/.local/share/flatpak/exports/share/datadeps/"
 "/var/lib/flatpak/exports/share/datadeps/"
 "/usr/local/share/datadeps/"
 "/usr/share/datadeps/"
3 Likes
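
The vectors returned by `XDG.data()` in the post above are ordered from the user directory to the system directories, so a typical way to consume such a list is to take the first directory that actually contains the file being looked for. A minimal sketch of that lookup follows; `find_in_dirs` is a hypothetical helper written for illustration and is not part of XDG.jl or DataDeps.jl.

```julia
# Hypothetical helper: search `dirs` in order and return the full path of the
# first existing entry called `name`, or `nothing` if none of them has it.
function find_in_dirs(dirs, name::AbstractString)
    for dir in dirs
        candidate = joinpath(dir, name)
        # ispath is true for existing files and directories alike.
        ispath(candidate) && return candidate
    end
    return nothing
end

# Usage with a hard-coded list standing in for a search path.
hit = find_in_dirs(["/home/tec/.local/share/datadeps/", "/usr/share/datadeps/"], "MNIST")
```

Because earlier entries win, a copy in the user's own data directory shadows a system-wide one.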

Interesting! @oxinabox, I wonder what you make of this package/perspective? This makes me curious if it would be worthwhile to upstream (e.g. me helping on this) a PR using XDG.jl to DataDeps that could make things less hard coded and more standardized around XDG. What do you think?

I like this a lot.

The paths DataDeps searches by default are more or less based on what I could find notes on at the time.

I am inclined not to make any changes to DataDeps.jl these days because it is very stable.
Plus @tecosaur has been working on a new package that might more or less supersede DataDeps.jl
(Hopefully that isn’t spoilers).
I suspect this was part of that work.

2 Likes

Thanks for the kind words :slightly_smiling_face:.

I’m not sure if anybody has actually taken a look at the API/design, but there hasn’t been any negative feedback so I’m inclined to tag 1.0 and register this in the coming days.

I suspect this was part of that work. [on a new package for managing data]

As it happens, this was more of a “the straw that broke the camel’s back” -type situation. In this case, the straw was finding out that Pluto’s default notebook location was also the Julia Depot :stuck_out_tongue:.

That said, on the todo list for the project you allude to is the idea of a central data store, and I might very well end up using XDG.jl with that.

2 Likes

The name will get rejected.
And I think correctly so.
I would probably go for XDGDirectories.jl or something?

1 Like

Hmm, I’m not sure. The XDG have put out other specifications, but when I search for “XDG” the entire first page of results is only about the base directory spec, and it seems to be a common choice of name for packages like this in other languages (Go, Python). I’m not sure if there’s anything else this could potentially be confused with?

I will also admit that I like how “clean” it looks to me to write XDG.User.config() vs. XDGDirectories.User.config(), even though this is a minor thing.

2 Likes

You can always have your package export an XDG object/submodule that has this interface - you’re not bound to use the package name.

2 Likes

That’s not a bad idea, I could essentially have a wrapper module which is just:

module XDGDirectories

include("./XDG.jl")

export XDG

end
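
To show how this would look from the caller's side, here is a self-contained sketch. The inner `XDG` and `User` modules and the one-line `config` function are placeholders invented for this example (the real package would `include` its actual source instead); only the export mechanism is the point.

```julia
module XDGDirectories

# Stand-in for the real package contents (hypothetical, for illustration only).
module XDG
module User
# Placeholder: join the given components under ~/.config.
config(parts...) = joinpath(homedir(), ".config", parts...)
end # module User
end # module XDG

# Exporting the submodule makes the short name available to users.
export XDG

end # module XDGDirectories

# The leading dot is needed here because the module is defined locally
# rather than installed as a package.
using .XDGDirectories

path = XDG.User.config("mything", "config.conf")
```

With an installed package the call sites would read the same, just with `using XDGDirectories` instead of the dotted form.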

Hmm, two other considerations.

  • I’ve been thinking of adding XDG Trash support in a 1.1 update, which means that I’m not just doing the XDG base directory spec any more (this is delayed due to the effort it will take to wrangle the Win32/Foundation APIs on NT/Darwin).
  • Looking at General, I see that three letter packages are being merged in, with manual review.
2 Likes

You can always give it a try with XDG.jl and see if it’s considered unambiguous enough. If it isn’t, you can still change it.

1 Like

In general I am opposed to three-letter package names.
They are simply too precious.
I include CSV.jl in names I am opposed to, but I overlook that because it is truly excellent.
(Not because CSV is such a clear name for this)

and I hate that

2 Likes

In general I am opposed to three-letter package names.
They are simply too precious.

I think I’m about 80% of the way with you. Three letter names are precious, but I’d not say that makes them outlawed, just only allowed when the package does exactly what one would expect it to, both being unambiguous and also doing that particular thing well enough that it seems unlikely to be supplanted.

I think CSV clears that bar, JSON doesn’t quite, but I think XDG also does.

I also think a similar bar should be applied to “common names”, for example (hypothetically) say somebody registered “Statistics” or “Analysis” for a package that just did a few stats things and eventually became orphaned.

2 Likes

What about XDGSpec.jl?

  1. Already taken
  2. More misleading than XDG (I think), since I think this could be interpreted as a collection of XDG specifications. As it happens, this is what the existing (but abandoned) XDGSpec.jl seems to be trying for.

I’ve been letting the naming question ruminate in the back of my mind. Arguably this is bikeshedding, but I personally don’t like regretting naming decisions.

I’m currently contemplating going with Directories / StandardDirectories and exporting Dirs, so XDG.User.config() would become Dirs.User.config().

I feel like this may do a good job of being unambiguous but also succinct. Dirs does give me some pause, but Directories.User.config() feels a bit long to me (I don’t plan on actually exporting any functions/variables, so having a short module name is more important than usual).

The potential trash integration in the future also gives me pause, but Dirs.trash(path) doesn’t seem too bad to me.

3 Likes