This post may seem long, but it's a step-by-step walkthrough and should be easy to digest.
Let's say I have developed a package (really more of a scientific simulation model) that depends on a specific version of DataFrames, v0.20.2. The Project.toml now has
[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
and the Manifest.toml dictates the exact version I added:
[[DataFrames]]
deps = ["CategoricalArrays", "Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "Missings", "PooledArrays", "Printf", "REPL", "Reexport", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"]
git-tree-sha1 = "7d5bf815cc0b30253e3486e8ce2b93bf9d0faff6"
repo-rev = "v0.20.2"
repo-url = "https://github.com/JuliaData/DataFrames.jl.git"
uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
version = "0.20.2"
Now suppose a scientist wants to run my simulation model in their own project (call it sci-proj). They create a new folder, activate a new environment, and run the command (sci-proj) pkg> dev mypkg (note that I use dev for this example; ideally it would be add [github link]).
But look what it adds!
(sci-proj) pkg> dev mypkg
Path `/home/affans/.julia/dev/mypkg` exists and looks like the correct package. Using existing path.
Resolving package versions...
Updating `~/sci-proj/Project.toml`
[f88dc3b7] + mypkg v0.1.0 [`~/.julia/dev/mypkg`]
Updating `~/sci-proj/Manifest.toml`
[a93c6f00] + DataFrames v0.21.2
Why does it add DataFrames v0.21.2 when the Project/Manifest of the original package mypkg clearly dictates 0.20.2?
Even if I pin DataFrames in the original package, i.e.
(mypkg) pkg> pin DataFrames
Updating `~/.julia/dev/mypkg/Project.toml`
[a93c6f00] ~ DataFrames v0.20.2 #v0.20.2 (https://github.com/JuliaData/DataFrames.jl.git) ⇒ v0.20.2 #v0.20.2 (https://github.com/JuliaData/DataFrames.jl.git) ⚲
Updating `~/.julia/dev/mypkg/Manifest.toml`
[a93c6f00] ~ DataFrames v0.20.2 #v0.20.2 (https://github.com/JuliaData/DataFrames.jl.git) ⇒ v0.20.2 #v0.20.2 (https://github.com/JuliaData/DataFrames.jl.git) ⚲
I still get the newest version in my sci-proj?
(sci-proj) pkg> dev mypkg
Path `/home/affans/.julia/dev/mypkg` exists and looks like the correct package. Using existing path.
Resolving package versions...
Updating `~/sci-proj/Project.toml`
[f88dc3b7] + mypkg v0.1.0 [`~/.julia/dev/mypkg`]
Updating `~/sci-proj/Manifest.toml`
[a93c6f00] + DataFrames v0.21.2
The only fix I have found is to add a [compat] section. So in the Project.toml of mypkg, I add
[compat]
DataFrames = "0.20.2"
and now, finally, when I add this package to sci-proj, I get
(sci-proj) pkg> dev mypkg
Path `/home/affans/.julia/dev/mypkg` exists and looks like the correct package. Using existing path.
Resolving package versions...
Updating `~/sci-proj/Project.toml`
[f88dc3b7] + mypkg v0.1.0 [`~/.julia/dev/mypkg`]
Updating `~/sci-proj/Manifest.toml`
[a93c6f00] + DataFrames v0.20.2
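A side note on the [compat] entry above: as far as I understand Pkg's compat syntax, a bare version such as "0.20.2" is read as a caret specifier, so it permits any 0.20.x release at or above 0.20.2 and only rules out 0.21 and later. A minimal sketch of the two forms, assuming exact reproducibility is the goal (only one of the two DataFrames lines would be active in a real Project.toml, so the second is commented out here):

```toml
[compat]
# Caret semantics (the default): allows 0.20.2 up to, but not including, 0.21.0.
DataFrames = "0.20.2"
# Equality specifier: allows exactly 0.20.2 and nothing else.
# DataFrames = "=0.20.2"
```

The caret form happened to resolve to 0.20.2 in the output above, but it would accept a later 0.20.x patch release if one were available.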
I feel like add (or equivalently dev) only looks at Project.toml files? Since the Project.toml file says there is a dependency on DataFrames, it basically just pulls in the latest one. I can get around this by using [compat], but why can’t it look at the combination of Project/Manifest.toml?
I thought about this when publishing a recent paper with my model. I published the paper using DataFrames#v0.20.2 and, at the time of publication, I created a git tag. Users should be able to check out the tag in ten years and get DataFrames#v0.20.2 to reproduce the results. Note that since publication, I’ve updated the model, added new algorithms and functionality, and even updated my DataFrames dependency to the latest version. But still… the code at that specific tag works with 0.20.2 and has no guarantee of working with later versions.
How do people solve this problem? Do I have to add a [compat] line every time I wrap up a paper, check in and tag the updated Project.toml file, and then remove the [compat] entry to continue working? This seems … not right.
** I also just realized that my approach is flawed as well. Because I don’t commit or track my Manifest.toml file, anyone who adds my project in 10 years will get the latest version of DataFrames.
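One way to approach the reproducibility concern, sketched under the assumption that Manifest.toml is committed alongside Project.toml at the tagged commit. As far as I can tell, a package's manifest is only honored when that package's own directory is the active environment, not when the package is pulled in as a dependency via add or dev. The helper name instantiate_exact and its project_dir argument are hypothetical, introduced only for illustration; Pkg.activate and Pkg.instantiate are the standard Pkg functions.

```julia
using Pkg

# Hypothetical helper (not part of Pkg): `project_dir` is assumed to be a
# checkout of mypkg at the paper's git tag, containing both Project.toml
# and a committed Manifest.toml.
function instantiate_exact(project_dir::AbstractString)
    manifest = joinpath(project_dir, "Manifest.toml")
    # Without a manifest, Pkg would resolve fresh versions, which is exactly
    # the situation we are trying to avoid, so fail early.
    isfile(manifest) || error("no Manifest.toml found in $project_dir")

    # Make the checkout itself the active environment ...
    Pkg.activate(project_dir)
    # ... and install the exact versions recorded in its manifest.
    Pkg.instantiate()
end

# Example call (path is an assumption for illustration):
# instantiate_exact(joinpath(homedir(), "mypkg"))
```

Someone reproducing the paper would clone the repository, check out the tag, and call instantiate_exact on that directory. Running the model from that environment should then use the recorded DataFrames version rather than whatever is newest.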