State of the Literature/References

I recently ran into the idea of using references within my docstrings and within the documentation. Currently I do that using footnotes, which causes a small problem if one reuses a literature reference in two docstrings that are later put on the same page by Documenter. There are some first approaches, see https://github.com/JuliaDocs/Documenter.jl/issues/379 and the more recent https://github.com/ali-ramadhan/DocumenterCitations.jl .

Now I could use some references within a newly planned Franklin page (thanks @tlienart for providing this nice package), where I have currently switched to storing my literature in a YAML file.

All this brings me to the question: Are there efforts to provide literature, see

which both seem to be somewhat active.

Can we maybe find a common base/starting point? Currently I am missing, for example, support for the BibLaTeX flavor of .bib files, and sometimes small things like capitalized keywords (which seem to be the default in some tools) not being accepted, as well as

  • CSL support for displaying
  • tools for notably reading but also writing bib-files?
  • employ biber to help parsing stuff
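
On the capitalized keywords: BibTeX treats entry types and field names case-insensitively, so `@ARTICLE` and `@article` should mean the same thing. A minimal sketch of normalizing them before parsing could look as follows. The helper names `normalize_entry_type` and `normalize_field_name` are hypothetical and not part of BibTeX.jl, BibParser.jl, or Bibliography.jl; the sketch also assumes one header or field per line.

```julia
# Hypothetical helpers, not taken from any existing package.

# Lowercase the entry type in a header line such as "@ARTICLE{Key2020,".
# The citation key after the brace is left untouched.
normalize_entry_type(line::AbstractString) =
    replace(line, r"^\s*@[A-Za-z]+" => lowercase)

# Lowercase a field name at the start of a line such as "  TITLE = {A Title},".
# The lookahead only matches names that are followed by an equals sign.
normalize_field_name(line::AbstractString) =
    replace(line, r"^\s*[A-Za-z]+(?=\s*=)" => lowercase)

header = normalize_entry_type("@ARTICLE{Key2020,")
field = normalize_field_name("  TITLE = {A Title},")
```

Only the keywords are lowercased here; field values and keys keep their original capitalization.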

Maybe there are other projects around?
I would like to collect ideas on how to get this a little more mature and usable (for example within Franklin), maybe compared to what jekyll-scholar does (though one cannot actually hook into the rendering there, for example to enhance authors with their DOI).


BibTeX.jl isn’t really active: it was mostly written by @bramtayl and @stevengj, I just registered it and added docs. Bibliography.jl uses BibParser.jl, which uses the same underlying code (BibTeX.jl wasn’t registered at that point), and so has the same problems with parsing (e.g. BibParser.jl#5 and BibTeX.jl#12).

If someone wants to work on this, I would be very appreciative!


Thanks for the information. I am not yet sure whether I will find time to really take on the project myself (since I already work on Manifolds.jl and Manopt.jl), but maybe collecting things here and getting to a nice roadmap would help (also concerning CSL afterwards).

Bibliography.jl is active, but as I am the only maintainer it is dependent on my available time. The long-term plan is to rewrite a full Bib(La)TeX parser from scratch inspired by Tokenize.jl, or to go back to my original plan (below).

The original plan was to use a nice grammar and Automa.jl, but I am stuck on some practical issue with that package.

So, using the code of BibTeX.jl is actually a fix.

If the current state can wait until the end of the month, I will go back to it then. Let me know if it is urgent.


Cool, that’s great! Sorry, I wasn’t 100% sure how active it is; and since I do not have the capacity to start a whole project myself, I’d happily check how I can help within Bibliography then.

Currently there is no emergency. For my website project I am currently using a YAML file, and the long-term plan is to replace that with a .bib file. For my docs (for example Symmetric positive semidefinite fixed rank · Manifolds.jl ) I use footnotes (see above, no repeated references), which somewhat works and is also not urgent.

What are the next steps, or things you could use some help with? Maybe for something not that large I can do a project between Christmas and New Year.

I just found some people from BioJulia who work on Automa.jl on Slack, so I will ask for help to see if I can fix my current troubles with Automa. That would be the best for maintenance (and portability to other bibliography languages), as you simply have to write a grammar. And the grammar can be easily modified on request (cf. the current state: bibtex_automa.jl).

If I cannot manage, we can either do short fixes to the code from BibTeX.jl, and/or start a more stable parser (using Tokenize.jl as a model).

After my upcoming deadline (on Friday), I should have some time for it.


That sounds great! First of all, good luck with your deadline. This topic is surely not urgent; I am just interested in a nicer way to handle my bibliography.

I might take a look at “the other end”, i.e. formatting/producing the output ( https://citationstyles.org ), of course after first looking at what you are doing in data representation and printing.

CombinedParsers.jl looks like it might be useful as well.


It looks really nice. I will definitely try next week :slight_smile:

Hello. Would you have any news on this topic?

I am sorry for being so late. I can’t find time for it (as an indirect consequence of covid ><)

I will have a quick look today, try to evaluate the amount of work involved, and see if I can make it work in my little free time.


Quick update: CombinedParsers seems pretty easy to use. Building on the grammar I wrote for an Automa.jl parser last summer, I can probably make something usable in a week or two (I can only work on it in my free time).
I will notify people here once it is done.


Thank you for this! Will be a great development for Julia.

After trying different generic parsers and such, I finally handcrafted something for BibTeX … (handling Unicode was a pain, but at least it is done)

On the current master branch of BibParser.jl, any valid bibtex entry should be parsed correctly.
Errors are not raised yet, though (the row/column indications are there, I just had no time to write the different error messages).

I think it is an ugly but working solution, and I hope to update it once we get something on the level of Yacc and Bison in Julia!

I will try to wrap up the error messages this week and tag a new version.

Meanwhile, I would be glad to have samples of BibTeX files from interested users to test in Bibliography.jl. That way I can have more resilient tests when I modify the parser.

I have just registered new versions of Bibliography.jl and BibParser.jl. The new BibTeX parser is handcrafted, which should be enough as BibTeX is not a language evolving much …

A few cautions about things I hope to fix soon (= when I have some free time again …):

  • Errors while parsing are not handled yet. Consequently, some invalid BibTeX strings might be parsed, or fail unexpectedly.
  • Most valid BibTeX syntax will be accepted. The only known case that is currently parsed inaccurately is when the file contains new LaTeX commands in a preamble entry. All entries relying on those commands will fail (silently …)
  • LaTeX commands within braces are not currently transformed to their respective Unicode characters when possible
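
To illustrate the last point, here is a minimal sketch of what such a transformation could look like with a plain lookup table. This is not how BibParser.jl works or plans to work; the table `LATEX_TO_UNICODE` and the function `latex_to_unicode` are hypothetical names, and the table covers only a handful of accent commands written in the braced form `{\'e}`.

```julia
# Illustrative table only: a real converter would need far more entries
# and would also have to handle forms like \'{e} or {\' e}.
const LATEX_TO_UNICODE = [
    raw"{\'e}" => "é",
    raw"{\`a}" => "à",
    raw"{\^o}" => "ô",
    raw"{\~n}" => "ñ",
    "{\\\"o}"  => "ö",   # regular string, since a quote must be escaped anyway
    raw"{\ss}" => "ß",
]

# Replace every known braced LaTeX command by its Unicode character.
# Unknown commands are left untouched.
function latex_to_unicode(s::AbstractString)
    for (cmd, char) in LATEX_TO_UNICODE
        s = replace(s, cmd => char)
    end
    return s
end

name = latex_to_unicode("G{\\\"o}del")
```

A table like this stays easy to extend, but it does nothing for user-defined commands from a preamble entry, which is the harder case mentioned above.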

Please give it a try @simonbyrne @EvoArt @kellertuer @ignace and anyone else interested :wink:


Thanks for the update!

I checked my example from the issue I opened. Apart from a typo in my MWE (so my fault), it worked once I fixed said typo.

I was just wondering about the current data representation


julia> imported_bib["A"]
BibInternal.Entry(BibInternal.Access("", "", ""), BibInternal.Name[BibInternal.Name("", "A", "", "B.", "")], "", BibInternal.Date("", "", "2020"), BibInternal.Name[BibInternal.Name("", "", "", "", "")], BibInternal.Eprint("", "", ""), "A", BibInternal.In("", "", "", "", "A Journal", "", "", "", "", "", "", ""), Dict{String, String}(), "A Title", "article")

While some fields, like names, have to be stored differently (I think it is something like surname prefix, surname, surname suffix, first name?), the internal storage (or just this printout) is not that nicely readable, and I would have expected something more like a dictionary of values.
I am just asking out of curiosity, since, if I find time at some point, I would maybe like to start a CSL parser that provides pretty printing of bibliographies; hence knowledge about how the data is stored (or at least a good conversion) would be nice.
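
As a rough idea of the kind of conversion I mean, here is a sketch that turns a struct into a dictionary of its non-empty fields. The types `DemoName` and `DemoEntry` are hypothetical, simplified stand-ins that I made up for this example; they are not the actual BibInternal types, whose field names I have not checked.

```julia
# Hypothetical, simplified stand-ins for illustration only.
struct DemoName
    last::String
    first::String
end

struct DemoEntry
    id::String
    kind::String
    title::String
    booktitle::String
    authors::Vector{DemoName}
    year::String
end

# Generic conversion: one dictionary entry per struct field,
# skipping fields that hold an empty string.
function to_dict(x)
    d = Dict{Symbol,Any}()
    for f in fieldnames(typeof(x))
        v = getfield(x, f)
        (v isa AbstractString && isempty(v)) && continue
        d[f] = v
    end
    return d
end

entry = DemoEntry("A", "article", "A Title", "", [DemoName("A", "B.")], "2020")
d = to_dict(entry)
```

Since `to_dict` only relies on `fieldnames` and `getfield`, it would work on any struct, so a CSL layer could start from such a dictionary without knowing the internal layout.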

Another point might be to also cover BibLaTeX at some point (for example, year changes to date and there are more fields available), but that might be something for later.

As it has been quite some time since I did that (and I was still catching up from Julia v0.3 to v1.4 …), I am not 100% sure of my choice. I kind of remember I did it that way to handle printing in bibliography formats such as APA. The result is visible in the StaticWebPages.jl package (example here).
I am open to any change, though. It is also the reason why the bibliography packages are separated into 3.

CSL would be really nice. I started a little a year ago, and I lost the code in a computer crash (my bad, in the age of the cloud …)

I guess it wouldn’t be too hard, I don’t have a lot of time now, though. I just hope that we could have an equivalent to Yacc and Bison in Julia.

So I will put CSL on my list as one of my next evening projects, when my evenings allow for that again :slight_smile:

And sure, BibLaTeX should not be that complicated: some field names change, some types and field names are added…
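
For the renamed fields, a sketch of the mapping could be as simple as a lookup table. The four renames below are the ones I am sure of from BibLaTeX (which still accepts the old names as aliases); the names `BIBTEX_TO_BIBLATEX` and `to_biblatex` are hypothetical, and the sketch assumes the fields are already available as a plain `Dict{String,String}`. It also ignores that a `month` field would have to be merged into `date`.

```julia
# BibTeX field name => BibLaTeX field name (a small, non-exhaustive selection)
const BIBTEX_TO_BIBLATEX = Dict(
    "year"    => "date",
    "journal" => "journaltitle",
    "address" => "location",
    "school"  => "institution",
)

# Rename known fields and keep all other fields as they are.
to_biblatex(fields::AbstractDict{String,String}) =
    Dict(get(BIBTEX_TO_BIBLATEX, k, k) => v for (k, v) in fields)

converted = to_biblatex(Dict("year" => "2020", "journal" => "A Journal", "title" => "A Title"))
```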


I must admit I was not familiar with BibLaTeX/Biber, but after a brief check, I think the only difference on the parser side is whether or not to accept Unicode characters in field names and the entry key.

Now, on the storage side in BibInternal, I use a set of rules for BibTeX that is very restrictive by nature.
It seems BibLaTeX is more open on this side, so I would have to add a set of relevant rules.

On the Bibliography.jl side, I would have to add an export to BibLaTeX.

It doesn’t seem too hard. I am more worried about mapping between Unicode and LaTeX code ^^

With XeLaTeX or the right inputenc/fontenc, the LaTeX (and BibTeX) files should be able to just be UTF-8 anyway. That was one of the reasons I switched to BibLaTeX: having everything in UTF-8 without mappings/conversions.
