Turn a module into a package locally

This topic has been asked many times. I have read many posts. Much of the information is incomplete or contradictory. I have read the package manual. It uses the word “project” in 2 very different senses. So, let’s not use that word–it is more of an informal notion. The manual also focuses on packages that are in a git repo. I use git locally for version management. I keep a sort of backup on github, but I am not systematic about dev vs. master branches. I realize that to publish the package in the Julia package directory I’d have to clear that up with discipline.

I just want to be able to do: using mything and have this work with Revise while I am still developing and testing the work.

It is a module in a Julia source file in a directory with other source files and some essential external files for loading essential parameters. The module and the julia source file have the same name (though the file has the .jl extension of course). It seems this is necessary; others say it is not. Confusing.

The module has using for 8 packages. The module has includes for 6 source files. The 6 source files are, naturally in the same directory with the source file containing the module definition.

This works: push!(LOAD_PATH, "/Users/lewis/.julia-local-packages/Covid") in my startup.jl file. Note that the directory is called Covid and the module is called CovidSim and the source file is called CovidSim.jl. It is symlinked from the directory where I work on it (this is what I and most IDEs would call a project–a place where all the stuff is, with or without extra metadata describing/defining the stuff).
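A minimal sketch of why this works, assuming a throwaway temporary directory standing in for /Users/lewis/.julia-local-packages/Covid and a hypothetical module named CovidSimDemo (neither comes from the original setup). For a plain directory entry in LOAD_PATH, Julia matches the name after using against a file name inside that directory, so the directory itself can be called anything:

```julia
# Hypothetical stand-in for ~/.julia-local-packages/Covid (a temp dir, for illustration only)
dir = joinpath(mktempdir(), "Covid")
mkpath(dir)

# The file name and the module name match; the directory name does not need to.
write(joinpath(dir, "CovidSimDemo.jl"), """
module CovidSimDemo
hello() = "loaded from LOAD_PATH"
end
""")

# What the line in startup.jl does, with a guard against pushing the same path twice
dir in LOAD_PATH || push!(LOAD_PATH, dir)

using CovidSimDemo
CovidSimDemo.hello()
```

The guard is optional; it only keeps LOAD_PATH tidy if the snippet runs more than once in a session.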

I was told I could just do:

] dev <directory containing the source file for the module>

This, of course, doesn’t work. It is missing steps like generate and activate. I see that I could do add, which might work. But, Revise says that I mustn’t do that because Revise will not see changes to the package in that case. And I want Revise to work. The problem is that I have a module and some source files. I do not have a package.

The first step, then, is to create a package with a Manifest.toml file and a source directory, all in some special place. While I am still developing the “package” it would be inconvenient if I needed to copy to that special place. If it is special (like in a specific place with a specific name like “source”), that’s easy: I’ll just symlink it.

It is fine if the answer is just stick to pushing to LOAD_PATH. That has zero overhead and does everything I want.

When it is time to really make a package (if ever), then I’ll build it by hand carefully following the manual. I understand and accept that much of the package overhead is there for good reasons: enabling widespread distribution of packages RELIABLY and securely; providing dependency management; and providing users with super simple access to packages and management of packages (add, rm, up, etc.). All of this works really well; has enabled bootstrapping a large Julia ecosystem quickly; and has avoided lots of headaches that other language ecosystems have had. I just need a little workflow clarification.


Please don’t post in so many threads about this. Since you have decided to make a new thread, I am reposting my reply here:

With regards to the definition of project, I agree with other posters, a project is just an environment.

It sounds like a frustrating experience. Let me give you my workflow to see if it helps you. As I understand it, you have two pieces of code:

  1. A Package, which you want to re-use across many projects
  2. A Project, here meaning a set of code which has its own environment but is not meant to be re-used.

At the end of the day, I think you should have a folder structure like this.

Documents
 ├── Packages
 |      └── MyPackage
 |              ├── src
 |              └── Project.toml
 ├── Projects
 |       └── MyProject
 |              ├── src
 |              └── Project.toml

To do that, go into your Packages folder. This can be in Documents or wherever you want. It does not need to be in your .julia folder. Do ] generate MyPackage.

Next, close out of Julia. Go to your Projects folder and run the same command. You still want to ] generate, so you do ] generate MyProject.

Now you want to let your MyProject know that it can use code from MyPackage. To do this, go into your MyProject folder and do ] activate . (the trailing dot means the current directory). This activates the environment defined by the Project.toml that generate just created. I think this might be where you are running into trouble.

Now your package repl should look like

(MyProject) pkg>

What you want to do next is

] dev ~/Documents/Packages/MyPackage

Now using MyPackage should work and you can work on both in tandem.
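The same steps can be sketched with the Pkg API instead of the pkg> prompt. This is only an illustration under stated assumptions: a temporary directory stands in for ~/Documents, and MyPackage and MyProject are the placeholder names used above.

```julia
using Pkg

# Hypothetical stand-in for ~/Documents; a temp dir keeps the sketch self-contained.
root = mktempdir()
mkpath(joinpath(root, "Packages"))
mkpath(joinpath(root, "Projects"))

# Same as `] generate MyPackage` run inside Packages/ ...
cd(() -> Pkg.generate("MyPackage"), joinpath(root, "Packages"))
# ... and `] generate MyProject` run inside Projects/.
cd(() -> Pkg.generate("MyProject"), joinpath(root, "Projects"))

# Same as `] activate .` inside MyProject, then `] dev <path to MyPackage>`.
Pkg.activate(joinpath(root, "Projects", "MyProject"))
Pkg.develop(path=joinpath(root, "Packages", "MyPackage"))

using MyPackage

# pathof reports where the loaded code lives: the dev'd folder, not a copy under ~/.julia/packages
loaded_from = pathof(MyPackage)
```

Because dev records a path rather than a snapshot, edits made under Packages/MyPackage/src are what the next using MyPackage (or Revise) will see.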


Additionally, it is not clear what exactly you want to edit using Revise. Would you like to edit MyProject or edit MyPackage?


It’s definitely necessary, and I don’t know who would have told you otherwise. If you can find a link, then we can correct whatever misinformation is out there.

This is easy to verify:

  • Create an empty folder and cd into it.
  • Create a single Foo.jl file containing module Foo:
$ echo "module Foo; end" >> Foo.jl
  • Start Julia, add the current directory to the LOAD_PATH, and then load the module:
julia> push!(LOAD_PATH, @__DIR__)
4-element Array{String,1}:
 "@"
 "@v#.#"
 "@stdlib"
 "/home/user/folder"

julia> using Foo
[ Info: Precompiling Foo [top-level]
  • Now rename Foo.jl to NotFoo.jl, so the module name does not match the file:
$ mv Foo.jl NotFoo.jl
  • And try again:
julia> push!(LOAD_PATH, @__DIR__)
4-element Array{String,1}:
 "@"
 "@v#.#"
 "@stdlib"
 "/home/user/folder"

julia> using NotFoo
[ Info: Precompiling NotFoo [top-level]
ERROR: KeyError: key NotFoo [top-level] not found
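The lookup half of this can also be checked without loading anything. The sketch below assumes a temporary directory and uses Base.find_package, which is an unexported Base function, so treat it as a diagnostic aid rather than a stable API:

```julia
# Temp dir standing in for the empty folder created in the steps above
dir = mktempdir()
write(joinpath(dir, "Foo.jl"), "module Foo; end\n")
dir in LOAD_PATH || push!(LOAD_PATH, dir)

# Lookup goes by file name: Foo resolves to Foo.jl in the temp dir ...
found = Base.find_package("Foo")

# ... while NotFoo resolves to nothing, because no NotFoo.jl exists on the LOAD_PATH.
not_found = Base.find_package("NotFoo")
```

In the renamed case above the file is found but the module inside it has the wrong name, which is why the failure only shows up after precompilation starts.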

Perhaps someone at some point told you that you can have nested modules whose names are not the same as the top-level module? That’s true–you can have Foo.jl with:

module Foo

  module InnerModule
  ...
  end
end
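A runnable variant of that layout, with hypothetical function names (inner, outer) added purely for illustration. Only the outer module has to match the file name; the nested one is reached through the outer module or with a relative using:

```julia
module Foo

module InnerModule
    inner() = "from InnerModule"
end

# The leading dot means "relative to the current module", not a package lookup.
using .InnerModule: inner

outer() = inner()

end # module Foo

Foo.outer()
Foo.InnerModule.inner()
```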

Thanks.

It was a discourse comment. I think the point is missing altogether in documentation. I’ll re-read and post an edit if it is really missing.

The source code…

Thanks for all the suggestions.

Maybe my question wasn’t clear and we should just forget about “project”. It seems to mean the environment for one’s Julia installation. Or maybe the environment notion that Pkg permits so that dependencies can be collected in one place just for one bit of work, like a virtualenv in Python. It’s not as clear as it might be, but it’s not a problem. And, I can see that it is needed when working on different “projects” (in the informal sense) with different dependencies.

I want to make a local package and keep working on the code IN THE PACKAGE. One day, it’s really a bit self-contained and seems to fit the description of “application”. Too many words for too many concepts; let’s not go there. It’s just about 3000 lines of code that needs to be split across source files, and a module in a package seems like the easiest way to treat that as one coherent whole. A module would be fine, but there are issues with including a module; a package seems cleaner.

You just need to do the first bit of what @pdeffebach said then (or what I said in the other thread). Do that, it will generate some files/folders. Inside the /src folder that gets generated, that’s where you put your code. It can take pretty much any form except you do need to have a .jl file that has the same name as the package, which defines the module itself. Suppose you’ve stuffed your code into this folder - now your package is created.

You still haven’t told the Julia package manager anything about this package (even though you used Julia to create it). So you need to

] dev ~/Documents/Packages/MyPackage

like @pdeffebach said. This tells Julia that you are currently working on revising some part of this package.

Now when you want to actually do some development (with Revise because why not do it the easy way)

using Revise
using MyPackage
# test your functionality, make changes, etc

That’s really clear. Thanks.

A note on my above post. I don’t think I realized that you want to keep all the code in one folder tree. This should be very easy to do with Revise. I have a folder MyProject that meets the qualifications of a Package, i.e. it has a Project.toml and inside src there is a file called MyProject.jl. This means that using MyProject works when I start Julia with julia --project (or call Pkg.activate(".")). My folder structure looks like:

shell> tree
.
├── Project.toml
└── src
    ├── Mod2.jl
    └── MyProject.jl

1 directory, 3 files

MyProject.jl looks like this:

module MyProject

export Mod2

greet() = print("Hello World!")

include("Mod2.jl")

end # module

Notice that nowhere do we write using Mod2. However we can access it from REPL after using MyProject.

Importantly, Revise works with this folder structure. If we do edit("src/MyProject.jl") and make changes to Mod2.jl these changes will show up.

If you think that Mod2 doesn’t contain functionality that is needed outside of MyProject and you don’t think Mod2 needs its own dependency management, this is a perfectly good workflow.

I think what was tripping you up was that you expected using Mod2 to work. using looks for things in LOAD_PATH and your Project.toml etc. Neither of those things know about Mod2, which is fine. And nor should they since it’s just a name-space used in MyProject.
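To make the distinction concrete, here is a small sketch that writes the two files to a temporary directory and loads them with include. The names MyProjectDemo, double and quadruple are hypothetical, and loading via include rather than as a package is a shortcut taken only to keep the example self-contained:

```julia
src = mktempdir()  # stand-in for the src/ folder

write(joinpath(src, "Mod2.jl"), """
module Mod2
double(x) = 2x
end
""")

write(joinpath(src, "MyProjectDemo.jl"), """
module MyProjectDemo
export Mod2
include("Mod2.jl")   # resolved relative to this file, no LOAD_PATH lookup involved
quadruple(x) = Mod2.double(Mod2.double(x))
end
""")

include(joinpath(src, "MyProjectDemo.jl"))
using .MyProjectDemo      # brings the exported submodule Mod2 into scope

Mod2.double(3)
MyProjectDemo.quadruple(3)
```

Mod2 is reachable here only because MyProjectDemo exports it; a bare using Mod2 would still fail, since nothing on the LOAD_PATH is named Mod2.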

If you have many Mod2.jl-type files scattered throughout your system, there is nothing stopping you from include-ing them. But, as other people have mentioned, this is a recipe for lots of bugs and dependency hell.

In conclusion

  1. Everything in 1 repo called MyName? Make that repo a Package, meaning it has the src/MyName.jl file. You can organize your code through many modules in the code.
  2. Multiple modules you want to use each without a full package structure? As a stopgap, you can include them in src/MyName.jl but it’s better to bite the bullet and make all those Mod.jl files packages in their own right.

3 years later and I am still tripping up with this.

I don’t understand what a project is or why the word is even used. It seems it is just a directory somewhere that is a git repo (local or with a remote on github) and contains some julia code in files. What makes the directory a “project” is the existence of a Project.toml file. I have a valid Project.toml file. It mentions all of the dependencies of the module, which also happens to be a package with a valid uuid.

I understand the concept of environment but not the physical existence of an “environment”. It is also just a directory containing some julia files collected into one or more modules. In my case, it is the very same as the project. What makes it an “environment” is having a Manifest.toml file. I have a valid one of those, too. I am not quite sure which command I used in Julia to generate it, but it is there. Is it Pkg.activate()? See my confusion: none of the Pkg commands say anything about project or environment–these are notional, not concrete. And they seem to completely overlap. Can one exist without the other?
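One way to see the two files as concrete things is to generate a scratch package and look at what is on disk. This is a sketch under assumptions: EnvDemo is a hypothetical name and the package lives in a temporary directory.

```julia
using Pkg

root = mktempdir()
cd(() -> Pkg.generate("EnvDemo"), root)   # hypothetical package name
pkgdir = joinpath(root, "EnvDemo")

# generate writes a project file and src/EnvDemo.jl, but no manifest
has_project  = isfile(joinpath(pkgdir, "Project.toml"))
has_manifest = isfile(joinpath(pkgdir, "Manifest.toml"))

# Activating only selects which project file Pkg commands will read and write
Pkg.activate(pkgdir)
active = Base.active_project()
```

So a project file can exist without a manifest. The manifest is written later, by Pkg operations that resolve dependencies in that environment.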

I “test” my package in a jupyter notebook. I make changes. Sometimes, after I restart the Julia kernel, I see changes I just made. Sometimes the changes will not appear.

The first cell in the jupyter notebook is: using mypackage.

The problem is I don’t get the latest state of the source code. Since it is all local I don’t understand why not.

Is there something I must do every session to get changes in the working set (git terminology) of the directories to become “known” to the package, even though the package is nothing more than the current state of the directories? Do I need to git commit the latest changes for them to be “in” the package? I just tried that and it doesn’t work.

I have tried to ] up mypackage. Stuff is updated, probably in my packages folder of ~/.julia.

Now, I get error messages referring to line numbers that are wrong. The “package” I am using is simply not the same as the package’s source code. Hence, my problems.

I have read the documentation and I am just dumb and find it impenetrable with lots of conflicting concepts (environment, project, package) that I can’t grasp as concrete realities and arbitrary procedures to create environments.

I think my problem, other than sheer stupidity, is that I think of Pkg as a way to manage packages. But, it also serves as a tool to help develop and distribute packages. Several of the commands change the package metadata. But, it is not clear to me where these changes occur. Finally, when I do ] st mypackage I see what appears to be a commit identifier, but it doesn’t match any commit — Ah, I see that it is the beginning of the package uuid.

After doing “up”, I now find I have 2 versions of the package. Why? Which one am I “using”? Which one am I developing? Can I delete the older one? (It might mess up my julia config, but it won’t lose my work…)

Do I need to do Pkg.develop? It seems so. But what I don’t need is a second copy of the package. It only lives on my machine (yes, there is a github repo, but it is NOT registered in the Julia package repository). So, should the argument to develop be the package name or the directory of the package source? I think the latter. If I do get a second copy, how do I get changes back to the “real” location?

I am sorry for this but I can’t seem to make it work and the careful instructions provided above don’t seem to correspond to what I am doing.

Sorry for the whining. I really, really don’t understand how Pkg works.

Here are some concrete things that may help you help me.

Here is project.toml:

name = "CovidSim_ilm"
uuid = "c70e2dfc-8e06-4fce-bc57-ad774aa1cb0a"
authors = ["Lewis Levin <lewis@neilson-levin.org>"]
version = "0.8.0"
repo = "https://github.com/lewisl/CovidSim_ilm.git"

[compat]
julia = "1"

[deps]
DelimitedFiles = "8bb1440f-4735-579b-a4ab-409b98df4dab"
DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
PrettyPrint = "8162dcfd-2161-5ef2-ae6c-7681170c5f98"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
PlotThemes = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a"
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
YAML = "ddb6d928-2868-570f-bddf-ab3f9cf99eb6"
LazyTables = "3f3958f0-9dd7-4ba3-9331-125492819f7d"
TypedTables = "9d95f2ec-7b3d-5a63-8d20-e2491e220bb9"
Interpolations = "a98d9a8b-a2ab-59e6-89dd-64a1c18fca59"

[extras]


[targets]

That github repo really does exist but I don’t push all the time because I want the local changes to be tested and work–so it goes out of date by a few weeks.

Here is how the package is added locally to my Julia environment. I run only one environment. I don’t want to use specific versions of packages as dependencies. I tend to use the latest “good” builds of other packages across all my work in Julia.

(@v1.9) pkg> st CovidSim_ilm
Status `~/.julia/environments/v1.9/Project.toml`
  [c70e2dfc] CovidSim_ilm v0.8.0 `/Users/lewis/Library/CloudStorage/Dropbox/Covid Modeling/Covid-ILM#main`
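The bracketed prefix in that status line is just the first eight characters of the uuid from the project file, not a git commit. A quick sketch with the TOML standard library (available since Julia 1.6), using an abridged copy of the project file shown above:

```julia
using TOML

# Abridged copy of the project file shown earlier; only two of the deps are kept
project = TOML.parse("""
name = "CovidSim_ilm"
uuid = "c70e2dfc-8e06-4fce-bc57-ad774aa1cb0a"
version = "0.8.0"

[deps]
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
YAML = "ddb6d928-2868-570f-bddf-ab3f9cf99eb6"
""")

short_id = first(project["uuid"], 8)          # the prefix that `] st` prints in brackets
dep_names = sort(collect(keys(project["deps"])))
```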

Now, that is interesting. Note that the local package refers to the main branch. I did not add that explicitly. I add the package using its local path and Pkg adds the branch. The work I just did was on a feature branch and that clearly is not what “using” retrieves. So, it appears that for a local package, the package manager accesses what it finds in the local git repo, not the working set–which was pointed to a different branch.

And Pkg.update() expects to go to the location of the package and the applicable branch. So, I never was loading the code of my “feature” branch. I suppose it is doing just what it is supposed to do.

So, my question is: can I use feature branches in git, which is the sensible thing to do? Can I run the code that exists in my directories? To do that, I probably should just cd to the right folder and load the module, ignoring what is defined as a package.

It would be nice if there were a better way to do this that is closer to what people working on packages do–but you guys probably fork the repo of the package and generate a pull request of your changes–even if you are the same person that will accept the pull request and merge it.

I would like to keep using git and have a package, but with a little less complexity. Perhaps if I use Pkg.develop(path="<path to the code>"). What I don’t understand is where I’d put this command. Not in the REPL, because that is not how I test various scenarios. I do that in a notebook. So I am guessing I would do Pkg.develop(“”) as the first cell of the notebook and NOT using mypackage.

I can see what’s going wrong, but not how to fix it.

The version of the package source code where I work on it, with the git repo with a remote at github, is not the same as the code in the packages directory at ~/.julia.

This makes sense. There is no way it would automagically change. Pkg.update() isn’t going to change it in a good way: that will go out to the repo in project.toml. That would only work if I have pushed the latest changes to origin/main.

Is there a way to update the local copy of the package? I can force it by removing the package and re-adding it from the local directory.

Is that the best or only way? …doesn’t feel like it should be…

Found a nice article at: jkrumbiegel.com - Pkg.jl and Julia Environments for Beginners

This is much clearer than the Pkg doc and shows what the key commands actually do.

So, the answer is to simply do ] up. This will also pick up upgrades from the central repository. But, because I added my local package at its local directory, Pkg updates it from that location and ignores the git repo in my package’s project.toml.

I sort of overlooked the obvious: even though my package is local, by actually adding it to my @v1.9 environment (my default environment) it goes into Project.toml and Manifest.toml there and it “downloads” the code of the package just as Pkg must do for a remotely obtained package–simply by copying it from the local repo to ~/.julia/packages. I incorrectly expected that there would only be one copy of the code, but of course there are two.

The other thing I have to realize is that I must commit any changes to the local repo (the origin remote doesn’t matter… …please verify) because Pkg doesn’t go to the working set (local directories where I make changes) but only to the repo. This also makes sense because that is how Pkg works with packages we download from the central repository (or any other remote repository). I think of my package as local, but to Pkg it’s “remote”–just that remote is nearby at a directory and not a url.

So, I finally am close to understanding this but not sure I am using the appropriate workflow.

Final questions:

  • should I start the notebook (my experiment and testing playground) with Pkg.develop()?
  • Will this actually pick up changes from the working set even without committing?
  • Will this avoid making a copy of the package (as it would for a remote package repo) since I, in effect, already have made a copy of the package?

Sorry I didn’t read the whole very long thread, but are you aware of Revise?

I suggest you create a new, separate thread, with a standalone question, if you want help. And try to be more concise. When I see such walls of text that seem hard to follow I just choose not to read them.

I use Revise with Julia and IJulia.

Won’t solve this because a package is being edited while also being used.

But that is exactly what Revise is for? You do using Revise; using MyPkg, and then when you edit the code in MyPkg those changes automatically get reflected in your session.


With the caveat that I mostly skimmed the thread, I think a key issue here is that you seem to have used add to add your local package to the environment you’re working in (which in your case is the base @v1.9 environment). As you’ve observed, that will make a copy of the current state of your repo and not track updates. You need to use dev for that.

Try something like the following:

(@v1.9) pkg> rm CovidSim_ilm

(@v1.9) pkg> dev /path/to/CovidSim_ilm

Now start your notebook session afresh and see if you can make changes to CovidSim_ilm and have them picked up by your notebook (provided you did using Revise in your notebook).
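If it is unclear afterwards whether a package ended up added or dev'd, Pkg.dependencies() reports how each entry is tracked. The sketch below uses a scratch environment and a hypothetical package DevDemo in a temporary directory so that it does not touch @v1.9:

```julia
using Pkg

root = mktempdir()
cd(() -> Pkg.generate("DevDemo"), root)      # hypothetical local package

Pkg.activate(mktempdir())                    # scratch environment, not the default one
Pkg.develop(path=joinpath(root, "DevDemo"))  # same as `pkg> dev /path/to/DevDemo`

info = only(p for p in values(Pkg.dependencies()) if p.name == "DevDemo")

tracks_path = info.is_tracking_path   # set for dev'd packages: code is read from the folder itself
tracks_repo = info.is_tracking_repo   # set instead when a git repo was added with `add`
source_dir  = info.source             # directory the code is loaded from
```

For the status output shown earlier, the `#main` suffix after the path is the visible sign of repo tracking; a dev'd package is listed with its path only.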


Eventually, you might want to migrate your notebooks to their own separate environment rather than using the base @v1.9 environment. In that case, you should dev your package there instead. Structure-wise, if it makes sense to you to bundle these notebooks with the package for testing/showcasing its functionality, you can place them inside the package directory in a separate folder, like /path/to/CovidSim_ilm/demo or /path/to/CovidSim_ilm/notebooks or similar. This makes the dev call exceedingly simple: (demo) pkg> dev .. (make sure to activate the local environment inside the demo folder first). On the other hand, if the notebooks form an independent project that just happens to use your package, place them and their environment in a separate directory and use (notebooks) pkg> dev /path/to/CovidSim_ilm as above (once again after having activated the local environment in that folder).


Only if you dev the package into your current environment or are working in the package’s environment.

Final answers:

  • I didn’t actually read about your notebook and testing playground, but I think Pkg.develop is what you want for live testing of your code.
  • Yes.
  • Pkg.develop loads the package files exactly where they sit on your file system. No copies are made (unlike with Pkg.add).

I had some of the same confusions. Maybe the answers on my post will help you:
