[ANN] New Package: PlutoBook.jl

I’ve had a long (and possibly excessive) interest in how to generate PDF files programmatically. There aren’t many options in this space: paged rendering is a pretty hard thing to do well. The state-of-the-art methods today need either a full TeX distribution or a full Chromium distribution, both of which are very large bundles. If you are distributing your own software, many hundreds of megabytes of additional dependencies can often be a problem. We’ve certainly had issues with the size of our container images due to these dependencies.

So I was glad to see a new C++ project, PlutoBook, which converts HTML files into PDFs or PNGs (or Cairo surfaces, for that matter). It implements its own purpose-built HTML and CSS renderer. I liked what I saw, and have now wrapped it into a Julia package: PlutoBook.jl.

Julia Repo: GitHub - aviks/PlutoBook.jl: Paged HTML Rendering Library
Documentation: Home · PlutoBook.jl

Hopefully this will be of interest to a few other people as well. Feedback at the usual places much appreciated.

[PS: IMO the best method of generating high-quality PDFs dynamically is a Java library, but that’s a conversation for some other time and place :smiley: ]

Can it also convert HTML files to EPUB? EPUB works better on e-readers such as Kindle.

Isn’t EPUB essentially just HTML (with images and CSS zipped up)?

In general, EPUB is reflowable content, not paged content. There are quite a lot of tools that convert from Markdown or HTML to EPUB. For use from within Julia, Pandoc.jl is one option.

PlutoBook.jl is for converting from HTML to PNG/PDF.
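
To make the "zipped-up HTML" point about EPUB concrete, here is a rough Python sketch (standard library only) that packs one XHTML chapter into an EPUB-style container. It is an illustration of the layout, not something taken from PlutoBook or Pandoc: the file names under `OEBPS/`, the placeholder identifier and date, and the `build_epub` helper are all made up for the example, and the result has not been run through an EPUB validator.

```python
import io
import zipfile

# Points the reading system at the package file (path is our own choice).
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package document: metadata, manifest of files, and reading order (spine).
# The identifier and date below are placeholders, not real values.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2025-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc"><ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol></nav>
  </body>
</html>
"""

CHAPTER_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body>{body}</body>
</html>
"""


def build_epub(body_html: str) -> bytes:
    """Return the bytes of a minimal EPUB-style zip containing one chapter."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry goes first and is stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        deflate = zipfile.ZIP_DEFLATED
        zf.writestr("META-INF/container.xml", CONTAINER_XML, compress_type=deflate)
        zf.writestr("OEBPS/content.opf", CONTENT_OPF, compress_type=deflate)
        zf.writestr("OEBPS/nav.xhtml", NAV_XHTML, compress_type=deflate)
        zf.writestr("OEBPS/chapter1.xhtml",
                    CHAPTER_TEMPLATE.format(body=body_html),
                    compress_type=deflate)
    return buf.getvalue()
```

Writing the returned bytes to a file ending in `.epub` gives something you can open with any zip tool to see the structure. Nothing in it fixes page boundaries, which is the reflowable-versus-paged distinction made above.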

I could not get a good EPUB file for the latest Julia documentation.

Interesting. The name is quite a bit puzzling, as one would expect this package to be able to generate well-formatted PDFs from Pluto.jl files. But as the name comes from the plutobook C++ project, it appears to be legitimate. I think this should be explained in the readme (I am contemplating a PR on this).
Nevertheless, it is interesting to see how this can work together with Pluto.jl. I tried:

  • HTML files generated directly by Pluto.jl render as empty pages.
  • HTML files created with PlutoStaticHTML.jl appear to render well. LaTeX formulas are rendered as raw LaTeX; the reason, of course, is that plutobook does not support JavaScript. However, the first todo item in the plutobook README is the addition of a lightweight JavaScript engine. If this supported KaTeX, we could get a reasonable workflow with Pluto.jl notebooks.

Will put this onto my watch list.

The Typstry.jl Julia Typst wrapper probably deserves a mention here - much more lightweight than a LaTeX installation and high-quality rendering.

Pandoc.jl wraps the Swiss Army Knife of format converters.

But now I want to know what your thoughts are on this as someone also obsessed with dynamically preparing documents…

Thanks a lot for this package!

I created two issues.

[quote=“j-fu, post:5, topic:132641”]
I think this should be explained in the readme (I contemplate a PR on this).
[/quote]

This is mentioned in a note at the end of the package readme.

Yes, indeed. There is also Tectonic.jl

Indeed, Pandoc is an amazing piece of software, but natively it writes .tex files. You then need a TeX implementation to actually create the PDF. Pandoc can call it, but you need to install it separately. Admittedly, this probably gives you the best fidelity in creating documents.

[quote]
Yes, indeed. There is also Tectonic.jl
[/quote]

Thanks, I didn’t know that one yet.

This could also be nice for visual reference tests of HTML content in CI, since it can render to PNG. You can always string-compare the HTML, but you might make changes that are not supposed to change the visual output, and this way you could verify that they don’t.
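
A rough sketch of what such a reference test could look like, in Python with only the standard library. The rendering step is deliberately left out: the PNG bytes are assumed to come from whatever renderer you use, and `check_against_reference` and the `.sha256` reference file are hypothetical names for this example. It also assumes the renderer's output is byte-for-byte deterministic; if it is not, comparing decoded pixels with a tolerance would be the more forgiving approach.

```python
import hashlib
from pathlib import Path


def png_digest(png_bytes: bytes) -> str:
    """Hex SHA-256 digest of the rendered PNG bytes."""
    return hashlib.sha256(png_bytes).hexdigest()


def check_against_reference(png_bytes: bytes, reference: Path,
                            update: bool = False) -> bool:
    """Compare a rendering against a stored digest.

    If the reference file is missing, or update=True, the current digest
    is written out and the check passes. Otherwise the check passes only
    when the digest is identical to the stored one.
    """
    digest = png_digest(png_bytes)
    if update or not reference.exists():
        reference.write_text(digest + "\n")
        return True
    return reference.read_text().strip() == digest
```

In CI you would commit the reference files, render each page, and fail the job when the check returns `False`; an intentional visual change is then accepted by rerunning with `update=True`.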

Yeah, full TeX is a monster. You can ameliorate this with TinyTeX, which provides the PDF engines without all the packages.