I have a package that provides some functions that need data stored in files on the user's computer.
My plan was to require that these files be stored in the same folder as the file that imports the package (using package), so that I could define a global variable in the module
DATADIR = pwd()
and then use this variable anywhere in the package I need to access the data.
To test this I have a script located in mypath/test/test/run.jl (mypath is just the rest of the path). This script imports my package (using package) and tries to call some functions, but I get the error that there is no such file as: mypath/test/datafile.txt
The function in question tries to open the file using
filepath = joinpath(DATADIR, datafile)
data = open(readdlm, filepath)
The error is correct, since the data file is located in mypath/test/test/datafile.txt
I get the following in the REPL after the error occurs:
>DATADIR
"mypath\test"
>pwd()
"mypath\test\test"
It seems like one folder level is truncated from pwd() when storing in DATADIR. Hmm…?
I guess I have two questions:
Why is this not behaving as expected, i.e. where did the last part of the path (\test) go?
Is there a better way to reference the file location where the package was imported (or a better way of defining where data files should be located and accessed for that matter)?
If I understand your setup correctly*, then the issue will be that the current working directory pwd is simply a different concept from the directory of the current file. To get the latter, you can use @__DIR__.
For example, when I run a script.jl from a terminal, pwd() gives the terminal directory I launched Julia from, while @__DIR__ gives the directory of script.jl.
*Edit: I presumably do not, at least with regard to your first question, considering that a declaration DATADIR = pwd() should really mean that DATADIR == pwd(), regardless of whether this is the directory you really intend. You're not cd'ing at some point, by any chance?
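To make the difference concrete, here is a minimal sketch that writes a throwaway script.jl into one temporary directory and includes it while the working directory is a different one. The temporary directories and the variable names are only assumptions for the demonstration, not part of your setup.

```julia
# Create a throwaway script in one temporary directory (hypothetical location).
script_dir = mktempdir()
script = joinpath(script_dir, "script.jl")

# The script just reports both notions of "directory" as a named tuple.
write(script, "(cwd = pwd(), dir = @__DIR__)")

# Run it while the working directory is somewhere else entirely.
other_dir = mktempdir()
result = cd(other_dir) do
    include(script)   # include returns the value of the script's last expression
end

println("pwd()    inside script: ", result.cwd)  # follows the working directory
println("@__DIR__ inside script: ", result.dir)  # follows the script's location
```

Note that @__DIR__ is resolved for the file in which it is written, so inside a package it points at the package source, not at whoever loads the package.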
I later tried to uninstall the package from the Pkg environment (in mypath/test/test). I then reinstalled it, but this time I made sure that the current working directory was mypath/test/test when I ran Pkg.add. After doing this, there were no errors.
My theory is that the code in the module is evaluated while the package is being added or precompiled, and thus the value of the DATADIR variable is determined at that point and not when the package is used later.
My guess is that the last time I added the package the current working directory was mypath\test and I probably did Pkg> activate .\test (instead of Pkg> activate . which I did now).
Does this seem reasonable?
So the conclusion is that using pwd() is completely unsafe, because there is no control over what the working directory was when the user added the package.
But @__DIR__ seems to give the same problem: it will only refer to the folder where the package is installed, which can also be anywhere and is typically the Julia package depot.
Is it possible to determine, from within the package, the location of the script that imports (import/using) the package?
If not, it seems like the only way is to prompt/require the user to provide the location of the data files in the user's script, e.g. export a function set_datadir():
function set_datadir(path::String)
    # `global` is needed here; a plain assignment would only create a local variable
    global DATADIR = path
end
That’s how precompile caching of top-level code works, yes. If you want runtime values, then you need to define global variables in __init__, and it only runs automatically on the first load, not on further using/import statements. The same happens with a top-level rand(1:100): the number is drawn once during precompilation and the cached value is reused on every later load.
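A rough sketch of the difference, assuming a hypothetical module DataPkg (the names LOAD_TIME_DIR and datapath are also made up for illustration). Pasted into a script or the REPL, both values will agree, because nothing is cached; the difference only shows up once the module lives in a precompiled package.

```julia
module DataPkg  # hypothetical package name

# Top-level code: for an installed package this runs during precompilation,
# so the value is frozen into the cache at that moment.
const LOAD_TIME_DIR = pwd()

# A Ref is a typed, mutable container that __init__ can fill in later.
const DATADIR = Ref{String}()

# __init__ runs each time the module is loaded into a fresh Julia session,
# so this captures the working directory of the session that loads it.
function __init__()
    DATADIR[] = pwd()
end

# Build the path to a data file relative to the directory captured at load time.
datapath(name) = joinpath(DATADIR[], name)

end # module

println(DataPkg.datapath("datafile.txt"))
```

This still ties the package to the working directory at load time, which may or may not be where the data files are, so it only fixes the stale cached value and not the underlying design question.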
I don’t think an import can provide any information backwards to the module, even if it’s just the first time via __init__. If it were possible and multiple imports occurred from different modules and files, then you couldn’t be sure what the state of your package ends up being. You can sort of get the same effect with your set_datadir idea. Say I run 2 scripts that each call set_datadir(@__DIR__) before doing some work; that should be fine. Then I decide to rerun a specific function call from the 1st script to double check or demonstrate something, but I forget that set_datadir last directed the package to the 2nd script's directory. If I’m unlucky, the mistake goes unnoticed and causes problems later. Global state is called evil for good reason; independent inputs are far preferable.
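As one way to read "independent inputs", here is a minimal sketch in which the data directory is simply an argument of the function that needs it. The function name load_data is hypothetical, and the temporary directory and file contents only exist to make the example runnable.

```julia
using DelimitedFiles

# The caller says where the data lives on every call; the package keeps no state.
load_data(datadir::AbstractString, datafile::AbstractString) =
    readdlm(joinpath(datadir, datafile))

# Demonstration with a throwaway directory and a small made-up data file.
dir = mktempdir()
writedlm(joinpath(dir, "datafile.txt"), [1 2; 3 4])

data = load_data(dir, "datafile.txt")
println(data)
```

In a user's script the call would then look like load_data(@__DIR__, "datafile.txt"), so each script names its own directory and rerunning a call from another script cannot silently pick up the wrong location.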
Thanks for the insightful input @Benny. I think this whole problem was a good exercise to make me realise that I must exorcise the evil global state variables from my code.