Citations in docs

I can’t really say how this should be implemented because I am not really that familiar with the code of DocStringExtensions. But as far as I know, you can basically register callbacks to Julia’s docsystem. You’d have to read DocStringExtensions code to figure out how all that works.

From the design point of view, DocStringExtensions relies on the dollar-sign interpolation. So it could look something like this:

using DocStringExtensions
import DocStringExtensions: Citations

# Adding citations in source
Citations.add_reference(:fisher,
    title="...",
    author="...",
    year=1999
)

# .. or importing from a bibtex file
Citations.bibtex("references.bib") # relative to @__FILE__

"""
This sentence has a citation $(CITE fisher).

$(REFERENCES)
"""
function foo end

And the docstring should be expanded to the following Markdown:

This sentence has a citation [Fisher 1999].

# References

* Fisher, A. (1999) Title. Etc. [doi:xyz](https://dx.doi.org/xyz)

$(REFERENCES) would gather up all the $(CITE)s in a docstring. I don’t have a preference on the exact citation format and it should be easy to support different styles (Citations.setstyle() or something like that, with a reasonable default).
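
To make the idea a bit more concrete, here is a minimal sketch of the expansion step. Everything in it is hypothetical (`Reference`, `add_reference`, `expand_citations` and the output format are made up for illustration; none of this exists in DocStringExtensions), and it works on a plain string instead of hooking into the docsystem:

```julia
# Hypothetical sketch: these names are illustrative, not part of DocStringExtensions.
struct Reference
    author::String
    year::Int
    title::String
end

const REFS = Dict{Symbol,Reference}()
const CITE_RE = r"\$\(CITE (\w+)\)"   # matches the literal text $(CITE key)

add_reference(key::Symbol; author, year, title) =
    (REFS[key] = Reference(author, year, title))

# Inline label built from the family name (the text before the first comma) and the year.
label(r::Reference) = "[$(strip(first(split(r.author, ',')))) $(r.year)]"

# One bullet of the reference list.
entry(r::Reference) = "* $(r.author) ($(r.year)) $(r.title)."

function expand_citations(text::AbstractString)
    # Keys in order of first appearance, without duplicates.
    cited = unique(Symbol(m[1]) for m in eachmatch(CITE_RE, text))
    # Replace every $(CITE key) marker with its inline label.
    body = replace(text, CITE_RE => s -> label(REFS[Symbol(match(CITE_RE, s)[1])]))
    # Replace $(REFERENCES) with a list of the entries that were actually cited.
    entries = join((entry(REFS[k]) for k in cited), "\n")
    return replace(body, "\$(REFERENCES)" => "# References\n\n" * entries)
end

add_reference(:fisher; author="Fisher, A.", year=1999, title="Title")
# raw"..." keeps the dollar signs literal instead of interpolating them.
doc = raw"This sentence has a citation $(CITE fisher)." * "\n\n" * raw"$(REFERENCES)"
println(expand_citations(doc))
```

A real implementation would do this through whatever callback mechanism DocStringExtensions uses, and the label and entry functions are where a configurable citation style would plug in.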

Not sure how one should handle modules / packages exactly. The reference database should probably be shared within a package, but packages should not pollute each other’s databases. But from a technical side I’m not sure how easy it is to figure out what package a particular docstring lives in.
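
One possible way to keep the databases apart, sketched under the assumption that the module a docstring lives in is known at expansion time (for example, a macro receives it as `__module__`). The names `DATABASES`, `rootmodule` and `database` are hypothetical:

```julia
# Hypothetical sketch: one reference database per top-level module (package).
const DATABASES = Dict{Module,Dict{Symbol,String}}()

# Walk up the module tree until we reach a module that is its own parent
# or that sits directly under Main; treat that one as "the package".
function rootmodule(m::Module)
    while true
        p = parentmodule(m)
        (p === m || p === Main) && return m
        m = p
    end
end

# Fetch (or lazily create) the database shared by everything inside one package.
database(m::Module) = get!(() -> Dict{Symbol,String}(), DATABASES, rootmodule(m))

# Submodules of the same package share entries; other packages do not see them.
database(Base.Iterators)[:fisher] = "Fisher (1999)"
println(haskey(database(Base.Math), :fisher))
println(haskey(database(Core), :fisher))
```

Here `Base.Iterators` and `Base.Math` stand in for two submodules of one package and `Core` for an unrelated package.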

Since this would make DocStringExtensions depend on the BibTeX parser, it might be better to have this in a separate package, one which would depend on DocStringExtensions and the parser (DocStringCitations?). DocStringExtensions could contain all the code to facilitate the development of these types of extensions.

Thanks @mortenpi.

Does anyone want to take a stab at this? If not, I’m happy to work on it (as time permits).

I would say that the next step, before we worry about integration with docstrings, would be a working citation formatter: https://github.com/bramtayl/BibTeX.jl/issues/3

Good call. Thanks for pointing that out.

Sorry to bring the topic back (it might be slightly off topic …).

Last week I stupidly decided to write a BibTeX parser for a personal project without checking whether something was already in progress (I did check the registered repositories, though).
I used Automa.jl to write the parser in about a hundred lines. I had never done any parsing before and was new to Automa.jl, so it is pretty easy to use.
It seems to be pretty efficient once the grammar automata are precompiled. As most bibliography formats have fairly simple grammars, it should be quick and easy to produce automata for other bibliography formats.

For my project, which handles citation networks, I am planning to use an internal Julia type to convert from one format to another. So far it only converts from and to BibTeX. However, there is one interesting feature that might make sense for citations in docs.
My internal citation format uses a rule system to check the format of a citation. The rules check the existence (and eventually the formatting) of the different fields in bibliography entries. For instance, the basic set of rules for converting from or to BibTeX style checks that the relevant fields (required, optional, forbidden) are provided (or not).
Wouldn’t this rule system be good for documentation? As long as the provided entries (as the internal bibliography type) follow the rules needed for the documentation, those entries are converted to the format the documentation requires.
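
To illustrate what I mean by rules, here is a minimal sketch. The `Rule` type, the `RULES` table and the field lists are made up for this post and are not the actual BibInternal.jl implementation:

```julia
# Hypothetical rule set: the field lists are illustrative only.
struct Rule
    required::Vector{Symbol}
    forbidden::Vector{Symbol}
end

const RULES = Dict(
    :article => Rule([:author, :title, :journal, :year], Symbol[]),
    :book    => Rule([:author, :title, :publisher, :year], [:journal]),
)

# Return a list of human-readable violations; an empty list means the entry conforms.
function check(kind::Symbol, fields::Dict{Symbol,String})
    rule = RULES[kind]
    problems = String[]
    for f in rule.required
        haskey(fields, f) || push!(problems, "missing required field: $f")
    end
    for f in rule.forbidden
        haskey(fields, f) && push!(problems, "forbidden field present: $f")
    end
    return problems
end

entry = Dict(:author => "Fisher, A.", :title => "Title")
foreach(println, check(:article, entry))
```

A documentation tool could ship its own rule set (say, requiring only author, title and year) and accept any entry that passes it, whatever format the entry was parsed from.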

I will upload both the bibtexparser and the bibliography internal type and rule system through two packages this week, so I can keep you updated if interested.

OK, after giving it some time, I organized the bibliography packages system as follows:

  • A package to handle an internal type for bibliography items in Julia: BibInternal.jl. It also includes the rule system mentioned above (work in progress, though).
  • A set of parsers from any citation language to the BibInternal.jl format. Currently only BibTeXParser.jl is compatible with BibInternal.jl. If other parsers are or become available, it should be pretty straightforward to make them compatible with BibInternal.jl.
  • A top package to aggregate BibInternal.jl and any parser: Bibliography.jl. Importing from (parsing) and exporting to (file or string) BibTeX is less than 40 lines, thanks to multiple dispatch. I think it is pretty straightforward to extend it for the documentation now.

Please note this is just a proposal, and there are a couple of parsers for bibtex or other bibliography languages that are available in Julia (so my parser is not necessarily the best choice).

I will use those packages for my own project (unless something better/smoother pops up), so if there is interest in this organization of packages, I can register (some of) them. Let me know.

The discussion here might be relevant for you too: ANN: Our Julia paper in SIAM Review

Thank you, that is quite interesting. And a nice application of having a bibliography package system.
Along with getting all articles related to a package, it would also be nice to generate a citation for the package itself (and maybe related packages) as an online resource: getting all the contributors, the owner, the URL, the current tag version, etc., all formatted and then exported to your favourite bibliography format (for instance BibTeX). This could be automated quite easily, I suppose.
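
A rough sketch of what that automation could look like, assuming the name, authors and version are read from the package's Project.toml and that the URL and year are supplied by hand. `package_bibtex` is a hypothetical helper, and the sample metadata is invented:

```julia
using TOML  # standard library since Julia 1.6

# Hypothetical helper: build a BibTeX @misc entry from the text of a Project.toml.
function package_bibtex(project_toml::AbstractString; url, year)
    meta = TOML.parse(project_toml)
    name = meta["name"]
    authors = join(meta["authors"], " and ")
    version = meta["version"]
    key = lowercase(name) * string(year)
    return """
    @misc{$key,
      author = {$authors},
      title  = {$(name).jl},
      year   = {$year},
      note   = {Version $version},
      url    = {$url}
    }"""
end

# Invented metadata, for illustration only.
project = """
name = "MyPackage"
version = "0.1.0"
authors = ["Jane Doe", "John Roe"]
"""
println(package_bibtex(project; url="https://example.org/MyPackage.jl", year=2020))
```

Contributors beyond the `authors` field would have to come from somewhere else (for instance the repository history), which this sketch does not attempt.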

Any news on BibTeX/references within docstrings? I would be very happy to use them to cite several of my algorithms correctly and, best case, to use them within Documenter.jl for the documentation in general.

Somewhat tangentially related: I am currently creating a package that will allow people to maintain a database of LaTeX notes, and the Julia package will automatically be able to parse this database of LaTeX notes to extract citations, references, and it will allow authors to generate graph diagrams depicting the inter-relationship and dependencies of definitions, theorems, calculations, references, and results. It might be possible to substitute it in place of a docs system in the future, but for now it’s a prototype concept for maintaining a body of research and citations and making it into a computational graph database.

My hope is to have it fully functional by October, but the prototype is moving along nicely, 0.7+ only.

That sounds great! It might be a little too much for my needs (I would just like to add literature to the documentation of algorithms), but I am still looking forward to that package, since it might also help documentation (Documenter.jl) introduce LaTeX commands for nicer writing.

So after some time trying to port BibTeX parsers, and then some time trying to port Citeproc parsers, I’ve decided to give up. But I still have a thought:

  1. put bindings to pandoc into an external package (should already be in Weave.jl I think)
  2. Using Requires.jl, add some optional functionality to Documenter based on pandoc

Does that seem reasonable?

Citeproc is actually one thing I got quite used to for my homepages (Jekyll and Jekyll-Scholar), which directly works with markdown. I haven’t tried to write something myself (since I am mainly writing my own package on something different), but Weave might be a starting point.

Citeproc is relatively easy to parse if you want something bare-bones, which may be sufficient for in-doc references. References are stored as JSON or YAML, so there wouldn’t be a need to rely on pandoc to generate anything.

Thank you for the response. Can you provide a little more detail? For example: How would I get Documenter.jl to use such JSON or YAML? How would I invoke citeproc onto the markdown files of Documenter (the /docs/src-folder) such that all inline docstrings are also parsed?
That’s why, for now, I am looking more for an extension to Documenter. For me, building that from scratch seems to be a never-ending story of reading and including quite a few packages and finding a way for them to interact.

My VerTeX.jl package (GitHub: chakravala/VerTeX.jl), which I announced above, uses TOML instead of YAML or JSON, by the way. I’ve made some progress on it and will be considering the citation issue soon. I’m definitely going to try making a parser for it, and perhaps I will make it a separate package; we’ll see.

I think YAML and JSON are reasonable formats because that’s what citeproc uses. You can e.g. export from Zotero directly in this format (it’s also supported by pandoc).

I am adding citation support that doesn’t rely on Pandoc to Weave.jl: “Add support for citations” by mpastell (Pull Request #185, JunoLab/Weave.jl).

I have extended Julia Markdown to parse citations from text, and I am using “Update for Julia 1.x” by mpastell (Pull Request #1, JuliaTeX/BibTeX.jl) to parse BibTeX.

The current state is that LaTeX output works, and HTML works apart from formatting the references. If BibTeXFormat.jl gets updated, then it would be easy to get nice HTML output as well.
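
For anyone curious what parsing citations from text can look like, here is a minimal standalone sketch. It assumes pandoc-style markers such as `[@key]` and uses plain regular expressions; it is not the code from the pull request, and the names `CITATION`, `cited_keys` and `to_latex` are made up:

```julia
# Assumption: citations are written pandoc-style as [@key]; the real syntax may differ.
const CITATION = r"\[@([\w:-]+)\]"

# Collect the cited keys in order of first appearance, without duplicates.
cited_keys(text) = unique(String(m[1]) for m in eachmatch(CITATION, text))

# Rewrite each marker as a LaTeX \cite command.
to_latex(text) = replace(text, CITATION => s -> "\\cite{" * match(CITATION, s)[1] * "}")

txt = "See [@fisher1999] and [@doe:2001], and again [@fisher1999]."
println(cited_keys(txt))
println(to_latex(txt))
```

The list of keys is what would be looked up in the parsed BibTeX data to build the reference list; HTML output needs a formatter for those entries, which is the missing piece mentioned above.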

Has there been any update on this? I don’t see citation information in either Documenter or Markdown.

Yes, you can use:
