I’ve had a long (and possibly excessive) interest in how to generate PDF files programmatically. There aren’t too many options in this space: paged rendering is a pretty hard thing to do well. The state-of-the-art methods today need either a full TeX distribution or a full Chromium distribution, both of which are very large bundles. Especially if you are distributing your own software, many hundreds of megabytes of additional dependencies can be a problem. We’ve certainly had issues with the size of our container images due to these dependencies.
So I was glad to see a new C++ project, PlutoBook, which converts HTML files into PDFs or PNGs (or Cairo surfaces, for that matter). It implements its own purpose-built HTML and CSS renderer. I liked what I saw, and have now wrapped it into a Julia package: PlutoBook.jl.
Isn’t EPUB essentially just html (with images and css zipped up)?
In general, EPUB is reflowable content, not paged content. There are quite a lot of tools that convert from Markdown or HTML to EPUB. For use from within Julia, Pandoc.jl is one option.
PlutoBook.jl is for converting from HTML to PNG/PDF.
Interesting. The name is quite a bit puzzling, as one would expect this package to be able to generate well-formatted PDFs from Pluto.jl files. But as the name comes from the plutobook C++ project, it appears to be legitimate. I think this should be explained in the readme (I contemplate a PR on this).
Nevertheless, it is interesting how this can work together with Pluto.jl, I tried:
HTML files generated directly by Pluto.jl render as empty pages.
HTML files created with PlutoStaticHTML.jl appear to render well. LaTeX formulas are rendered just as raw LaTeX; the reason for this, of course, is that plutobook does not support JavaScript. However, the first todo item in the plutobook README is the addition of some lightweight JavaScript engine. If that supported KaTeX, we could get a reasonable workflow with Pluto.jl notebooks.
[quote="j-fu, post:5, topic:132641"]
I think this should be explained in the readme (I contemplate a PR on this).
[/quote]
This is mentioned in a note at the end of the package readme.
Yes, indeed. There is also Tectonic.jl.
Indeed, Pandoc is an amazing piece of software, but natively it writes .tex files. You then need a TeX implementation to actually create the PDF. Pandoc can call it, but you need to install it separately. Admittedly, this probably gives you the best fidelity in creating documents.
This could also be nice for visual reference tests of HTML content in CI, as it can render to PNG. You can always string-compare the HTML, but you might make changes that are not supposed to change the visual output, and this way you could verify that they don't.
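To make the reference-test idea concrete, here is a minimal sketch in Python. It is an illustration only: `render_png` is a hypothetical stand-in for whatever turns an HTML string into PNG bytes (it is not PlutoBook's or PlutoBook.jl's API), and `matches_reference` is a name made up for this example. It also assumes the renderer is deterministic, so that identical input yields byte-identical PNG output.

```python
import hashlib
from pathlib import Path
from typing import Callable

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def matches_reference(
    render_png: Callable[[str], bytes],  # hypothetical: HTML string -> PNG bytes
    html: str,
    reference: Path,
    update: bool = False,
) -> bool:
    """Render `html` and compare the PNG bytes with the stored reference."""
    png = render_png(html)
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("renderer did not return PNG data")
    if update or not reference.exists():
        # First run, or an intentional visual change: record a new baseline.
        reference.write_bytes(png)
        return True
    # Byte-for-byte comparison via digests; assumes deterministic rendering.
    new_digest = hashlib.sha256(png).digest()
    old_digest = hashlib.sha256(reference.read_bytes()).digest()
    return new_digest == old_digest
```

In CI, a test would call this once per page and fail when it returns `False`. A byte-level comparison is deliberately strict; tolerating tiny pixel differences would need an image library to decode and diff the PNGs, which this sketch leaves out.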