Best Practices for Package Resource Location

duncanam · February 25, 2020, 9:58pm

Hello,

I’m creating a package that will perform table lookups on some standardized chemical tables one can find online, and I’d like to inquire as to best practices for where and how to look for these resources once downloaded.

The current workflow will be:

User finds a table online (one for hydrogen, one for nitrogen, etc.) that they want to work with and downloads it to a location of their choice. Perhaps eventually I’ll migrate this step into the package itself.
User imports my package
User sets path to the library of various tables they have downloaded
User calls function from my package, and it grabs the table requested from the path where the user is storing these tables

Example usage:

using MyPackage

libpath = "/home/user/chemtables/"

value = MyPackage.hf("H2O",298.15) 
println("The enthalpy of formation of water at 298.15K is: $value")

I’d rather not always have the path as an input to the function that will access the table. This seems to suggest that my only option is to be playing with global variables, and have a reserved variable name that the package always accesses for the user-defined path (which I suppose would always be needed at the top of the script, or maybe have a backup default location if it can’t find it). Is there a better way to approach this?

Edit: for those bothered by inputting a temperature for a formation enthalpy, forgive me, I was thinking sensible enthalpy at the time.

stevengj · February 25, 2020, 10:07pm

Using global variables would make it difficult to use multiple tables in the same caller, no?

I would tend to suggest having some sort of data structure that wraps a table, e.g.

ct = ChemTable("/home/user/chemtables/datafile")

and then define functions that act on this, e.g. hf(ct, "H2O", 298.15). Or maybe have ct["H2O"] return a data structure about the molecule (similar to https://github.com/JuliaPhysics/PeriodicTable.jl) that you can operate on:

H2O = ct["H2O"]
hf(H2O, 298.15)

or even use dot overloading: ct["H2O"].hf(298.15).
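To make the three call styles above concrete, here is a minimal sketch. The `Species` type, its fields, and the linear interpolation in `hf` are all assumptions made for illustration, and the numbers are placeholders rather than real thermochemical data. Each species needs at least two tabulated points.

```julia
# Hypothetical record for one molecule; the field layout is an assumption.
struct Species
    name::String
    temperatures::Vector{Float64}  # K, sorted ascending, at least two entries
    values::Vector{Float64}        # tabulated property at each temperature
end

# Wrapper around a set of already-loaded species.
struct ChemTable
    species::Dict{String,Species}
end

# ct["H2O"] returns the Species record for that molecule.
Base.getindex(ct::ChemTable, name::AbstractString) = ct.species[name]

# Linear interpolation between neighbouring tabulated points (an assumed scheme).
function hf(s::Species, T::Real)
    Ts, vs = s.temperatures, s.values
    first(Ts) <= T <= last(Ts) ||
        throw(DomainError(T, "temperature outside tabulated range"))
    i = clamp(searchsortedlast(Ts, T), 1, length(Ts) - 1)
    t = (T - Ts[i]) / (Ts[i+1] - Ts[i])
    return (1 - t) * vs[i] + t * vs[i+1]
end

# Style 1: hf(ct, "H2O", T)
hf(ct::ChemTable, name::AbstractString, T::Real) = hf(ct[name], T)

# Style 3: dot overloading, so that ct["H2O"].hf(T) works.
# Any other property name falls through to the real fields.
function Base.getproperty(s::Species, sym::Symbol)
    sym === :hf ? (T -> hf(s, T)) : getfield(s, sym)
end

# Placeholder data, not real tabulated values.
water = Species("H2O", [200.0, 300.0, 400.0], [1.0, 2.0, 3.0])
ct = ChemTable(Dict("H2O" => water))

a = hf(ct, "H2O", 250.0)    # style 1
b = hf(ct["H2O"], 250.0)    # style 2
c = ct["H2O"].hf(250.0)     # style 3
```

All three calls go through the same `hf(::Species, ::Real)` method, so the choice between them is mostly a matter of taste. Because the table is an ordinary value, a caller can hold several `ChemTable`s at once, which a single global path would not allow.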

Partly it depends on what your data looks like. You may not want to hit the disk for every query — either load in all of the data when you construct ChemTable(...), or load it in lazily and cache it in a dictionary internally to ct.
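The lazy variant could look roughly like the sketch below. The on-disk layout is purely an assumption for illustration: one file per species named `<name>.txt` inside the table directory, with two whitespace-separated numeric columns (temperature and value). `LazyChemTable` and `load_species` are hypothetical names.

```julia
# Hypothetical lazy table: remembers the directory and caches what it has read.
struct LazyChemTable
    dir::String
    cache::Dict{String,Vector{Tuple{Float64,Float64}}}
end

# Construct with an empty cache; nothing is read from disk yet.
LazyChemTable(dir::AbstractString) =
    LazyChemTable(String(dir), Dict{String,Vector{Tuple{Float64,Float64}}}())

# Read one species file (assumed format: "T value" per line).
function load_species(path::AbstractString)
    rows = Tuple{Float64,Float64}[]
    for line in eachline(path)
        fields = split(line)
        length(fields) == 2 || continue  # skip lines without exactly two fields
        push!(rows, (parse(Float64, fields[1]), parse(Float64, fields[2])))
    end
    return rows
end

# ct["H2O"] reads the file on first access and reuses the cached rows afterwards.
function Base.getindex(ct::LazyChemTable, name::AbstractString)
    get!(ct.cache, String(name)) do
        load_species(joinpath(ct.dir, name * ".txt"))
    end
end
```

With this, `ct = LazyChemTable("/home/user/chemtables/")` touches the disk only when a species is first requested. For small tables, reading everything in the constructor is simpler and probably just as good.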

duncanam · February 25, 2020, 10:19pm

Good thoughts on the disk overhead each call, I hadn’t considered that. I’ll look more into a custom data structure of sorts that will just hold it there. The tables aren’t big, so that shouldn’t be an issue.
