I am interested in creating a GitHub repository for part of my PhD. The repository will be mostly Julia code that can be shipped as a package, but it will also contain a latex directory for writing a paper related to the code (as well as some other folders that are not needed by someone who just wants to install the package). My question is: must the Julia package structure be at the root of the GitHub repository (so that the ‘latex’ folder is copied whenever someone installs the package), or can the package folder (the one that contains src, test, etc.) sit at an arbitrary level inside the GitHub repository, with no problem for shipping the package later (just by giving the GitHub URL path to the specific folder)?
A package can be located at any folder in the repo.
Are you planning for others to install and use your package, or is it just for documenting analysis code? If the former, I would eventually separate that repo from your thesis repo and register it as a standalone. While you’re first working on it, it could make sense to keep it all together so you don’t have to worry about it, but eventually you’ll want to have a stable version of the code for everything in the thesis, and using a Project and Manifest makes documenting which version of code was used easy.
If the latter, it’s easy to just keep it all together as a single project.
Ideally I want any person to be able to install the code module as a package, even if using the git URL to the specific folder instead of having a package registered in the General Registry.
This comment makes no sense to me. I want to know if (1) I should have an entire GitHub repo just for the code or (2) I can have a repo for both the code and related materials, keep all the package structure inside a folder (instead of the repository root), and pass the folder URL to anyone interested in installing the package. I already have a Project and Manifest, and I have the package structure inside a folder. One thing does not conflict with the other; at no moment did I consider not having a Project or Manifest.
Most of my work is like this. A package for reuse and a project that depends on the package for preparing a paper.
If you want people to use your package, it has to be in its own git repo, which shouldn’t have other random stuff in it. It shouldn’t have a manifest, just a Project.toml.
Put your thesis project in another git repo, and depend on the package in its Project.toml and Manifest.toml.
For my dissertation I used Julia for my articles. First, you should decide whether the project is a package or an application.
- Project is an umbrella term: packages and applications are kinds of projects.
- Packages should have UUIDs; applications can have UUIDs but don’t need them.
- Applications can provide global configuration, whereas packages cannot.
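As a rough illustration of the distinction in the list above, the sketch below parses two hypothetical `Project.toml` files with Julia's `TOML` standard library (assuming Julia 1.6 or newer, where it ships with the language) and checks for the `name` and `uuid` entries a package carries. The file contents, the package name, its UUID, and the helper `looks_like_package` are all made up for this example; the helper is not part of Pkg.

```julia
using TOML

# Hypothetical Project.toml of a package: it declares a name and a UUID.
package_toml = """
name = "MyThesisCode"
uuid = "12345678-1234-5678-1234-567812345678"
version = "0.1.0"

[deps]
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
"""

# Hypothetical Project.toml of an application: only dependencies are listed.
application_toml = """
[deps]
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
"""

# Illustrative helper (an assumption of this sketch, not a Pkg function):
# treat a project as a package when it has both a name and a UUID.
looks_like_package(project::AbstractDict) =
    haskey(project, "name") && haskey(project, "uuid")

pkg = TOML.parse(package_toml)
app = TOML.parse(application_toml)

println(looks_like_package(pkg))  # true
println(looks_like_package(app))  # false
```

Either kind of project can still ship a Manifest.toml next to its Project.toml to pin the exact versions used.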
Two of my articles were applications. Those were studies which used Julia, and the code is packaged together with them to allow for reproducibility and transparency. The third chapter was a package. It is currently being reviewed for JuliaCon proceedings, so the paper is in the repository (i.e., paper/paper.tex). For applications, I usually have the project under code/ApplicationName/Project.toml.
So what are the benefits/drawbacks? For an application, the code and paper should most likely be together in the same repository (e.g., figures are generated from the code and placed in the relevant directory). For a package, it depends. CRAN, R’s registry, usually has vignettes and manuals saved and downloaded when a package is installed, since that is beneficial for the package users. The paper serves an additional documentation role. If the paper does not serve a documentation role, or it is not complementary to the code, it could be hosted in a separate repository with a link to the code repository.
A few things on having the paper and code together:
- it will likely change the main programming language GitHub detects for the repository (i.e., instead of a Julia repository it might be shown as a TeX repository; might not be the best for convincing GitHub to treat Julia better)
- the additional files will be downloaded by users with every installation (there isn’t a convenience function for accessing the paper, so most users will not find it and it will just take up space). That would change if Julia develops a convenience utility for accessing those materials, as other languages have, but currently users would likely have to go to the remote to access it directly.
@StefanKarpinski Everyone else seems to be saying that I need the whole repo for the package (which is the default). To have the package inside a folder of the repo, do I just need to have the structure inside the folder instead of the root? Will it play nicely with the tools made to install and work with packages?
A package version is just a tree hash which can be anywhere in a repo. But a lot of the tooling doesn’t handle that yet, so it will be a bit rough for now, but it’s a use case that we will support.
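A minimal sketch of what installing from a subfolder can look like, assuming a Julia version whose Pkg accepts the `subdir` keyword (to my understanding Julia 1.5 or newer, so more recent than the tooling described above). The repository URL and the folder name are placeholders, not a real package, which is why the actual install call is left commented out.

```julia
using Pkg

# Placeholder values: replace with the real repository URL and package folder.
repo_url = "https://github.com/USER/REPO"  # hypothetical repository
package_dir = "julia"                      # folder holding Project.toml and src/

# Describe the package by repository URL plus the subdirectory it lives in.
spec = Pkg.PackageSpec(url = repo_url, subdir = package_dir)

# Uncomment to install; this needs network access and a real repository:
# Pkg.add(spec)
```

With this layout, folders outside the package directory (such as a latex folder) are not part of the package tree that gets loaded, although Pkg may still clone the whole repository to obtain it.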
Thank you for explaining that. This is a bit of information that is easier to get from a human being than by searching alone.
You can have either. For packages that need more intricate documentation using equations, I keep a subfolder in docs/ for \LaTeX code, and occasionally even compile it for CI.
I would just start things in the same repository, you can always split later.
@Henrique_Becker if you are not aware of it already, this package generates LaTeX from Julia objects:
That’s super useful.
I tend to use git submodules to keep a project’s code/results together with the LaTeX documents. That way, you can treat the project as one whole when that makes sense and as separate entities when it does not.
Edit: a better git submodules link.
I have a project that should include packages for several different languages (back end code), along with some common artifacts (front end things HTML, JS, and CSS).
@StefanKarpinski, is there a way to handle this in Julia? Ideally I’d like a directory structure like this:
Git-repository
|
|-- html/ (artifacts used by packages in various languages)
|-- julia/ (julia code in here, along with Project.toml, Manifest.toml, and Artifacts.toml)
+-- (directories for other languages)
I don’t see a way to put a git-tree reference to html/ in the current package (wherever it happens to be hosted) into Artifacts.toml. Do you have to hard-code the repository URL to do this?
It seems that even if there were a way to refer to a git-tree in the current package, we would still have to wait for PackageSpec to allow subdirectories and for Registrator.jl to handle subdirectories.
It might be worth starting a new post with this question, especially considering the last post on this topic was 4 months ago.
Separately, is there a reason you tagged Stefan? It’s usually considered bad form to tag people unless they’re really the only person that can answer your question.
Huh, I’ve never heard that it’s considered bad form. I’ve been tagged many, many times in discussion groups when I was obviously not the only person who could answer a question.
I made a new post here.
Fair enough. I don’t think anyone will hold it against you. And it’s possible I’m speaking out of turn here, but I think that’s been discussed here before.
Especially for the language creators and core devs - they have a lot of demands on their time (and are pretty generous about spending their time on the community), so adding to their cognitive load isn’t great.