[ANN] CommonMark.jl

CommonMark.jl is a markdown parser that is fully compliant with the CommonMark 0.29 spec. In addition to the standard features, it includes several extensions such as inline and display maths, admonitions, footnotes, and tables. Additional extensions are planned, as well as a public interface for third-party extensions.

The package is registered in General, so it can be installed with:

pkg> add CommonMark

Please report any problems you come across in the issue tracker. That said, the package passes all the tests required by the CommonMark spec, so it can be considered usable in that regard. Feature requests are welcome as well.

Compared to the built-in markdown parser shipped with Julia, this package provides inline HTML, correct handling of lazy paragraph continuations, and link reference definitions. There may be other differences as well, but those are the main ones.
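
For readers who want to try the parser, here is a minimal sketch of parsing a string and rendering it. The names used (`Parser`, `enable!`, `TableRule`, `AdmonitionRule`, `FootnoteRule`, `html`, `latex`) follow the package README for later releases, so treat them as assumptions relative to the version announced in this post.

```julia
using CommonMark

# A fresh parser handles core CommonMark; extensions are opt-in.
parser = Parser()
enable!(parser, TableRule())
enable!(parser, AdmonitionRule())
enable!(parser, FootnoteRule())

# Calling the parser object on a string returns the document AST.
ast = parser("Hello *world*")

# The same AST can be rendered to several output formats.
print(html(ast))
print(latex(ast))
```

Extensions are enabled one at a time, so a parser with none of them enabled behaves as a plain CommonMark parser.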

65 Likes

Welcome back!

10 Likes

Thanks @cormullion!

4 Likes

Awesome :+1: I look forward to replacing my crappy hacks in Franklin to work around the base Markdown parser & use your package :slight_smile:

12 Likes

This is awesome! Does Documenter have the ability to swap out markdown parsers?

That’s some heroic hacking you’ve managed in Franklin! I’d be happy to discuss merging your syntax extensions into the package, most of them shouldn’t be too difficult to achieve.

7 Likes

Not currently, Documenter reaches into some internals of the current Markdown parser so it’s not exactly straightforward to switch out right now. I’d hope that eventually we could move towards using it ecosystem-wide since it matches the behavior of other markdown dialects a little bit better, but that will require some stress testing to make sure it’s up to the job first.

5 Likes

I actually have a branch that allows for swapping out the parser. The only requirement is that your custom parser produces the Markdown standard library AST.

On a related note, Documenter has the Markdown2 module, which aims to be a cleaner & stricter version of the standard library Markdown AST. I wonder if it would make sense to have a common lightweight interface package for the AST that both parsers and consumers could depend on?

4 Likes

This is great! It would be very good to replace the internal markdown parser with a snapshot of this external one. Or maybe ship with a version of your markdown parser but allow it to be overridden (instead of baking it into the system image).

4 Likes

common lightweight interface package for the AST that both parsers and consumers could depend on?

The AST used to represent the parsed documents could probably be factored out I’d think.

1 Like

Thanks @StefanKarpinski.

It would be very good to replace the internal markdown parser with a snapshot of this external one. Or maybe ship with a version of your markdown parser but allow it to be overridden (instead of baking it into the system image).

I assume in a similar way to how Pkg is loaded these days?

Pkg is baked in, I’m afraid. We don’t really have a good model for how to do this yet. Needs to be figured out soon though.

3 Likes

To be honest, as frustrating as it can be for certain extensions of markdown to not work, I’ve come to appreciate the rigorous adherence to the standard in Base. I wouldn’t necessarily complain if something more permissive replaced it, but if the ecosystem (especially Documenter and Franklin) allows me to plug in whatever parser I want, that’s a better solution.

2 Likes

Version 0.2.0 has now been registered. The release notes summarize the major changes since 0.1.0, and the README.md has a more complete overview of the currently available features.


An aside: the new Pkg and surrounding infrastructure are great! Big thanks to everyone involved in those efforts.

10 Likes

Version 0.3.0 has now been registered. A smaller release compared to 0.2.0 including:

  • a raw literals extension for passing through arbitrary text,
  • markdown output (could be used to auto-format markdown documents),
  • Jupyter notebook output (no evaluation of code cells is performed),
  • and some fixes for LaTeX output.

7 Likes
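
The markdown output mentioned in the 0.3.0 list above can be sketched as a simple round trip: parse a document, then write it back out as markdown. This assumes the `markdown` writer function named in the package README; the input string is a made-up example.

```julia
using CommonMark

parser = Parser()

# Hypothetical, loosely formatted input document.
source = "Some *emphasis* here.\n\n* item one\n* item two\n"

# Parse to an AST, then write the AST back out as markdown text.
ast = parser(source)
formatted = markdown(ast)
print(formatted)
```

Since the output is regenerated from the AST, the exact whitespace and list markers may differ from the input, which is what makes this usable as an auto-formatter.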

Version 0.4.0 is out. Quite a large release, with the following new features:

12 Likes

Version 0.5.0 is out. New features include:

  • AutoIdentifierRule extension for Pandoc-style automatic IDs for headings.
  • Allow passing a Parser to open to parse files directly.
  • Better round-tripping for markdown writer.
  • Non-strict column alignment in tables.

8 Likes
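
Two of the 0.5.0 features listed above, `AutoIdentifierRule` and passing a `Parser` to `open`, can be combined in a short sketch. The file name and contents below are hypothetical, and the example assumes that heading IDs show up as attributes in the HTML output.

```julia
using CommonMark

parser = Parser()
enable!(parser, AutoIdentifierRule())

# Write a small throwaway document to a temporary file (hypothetical content).
path = tempname() * ".md"
write(path, "# My Heading\n\nSome *text*.\n")

# Passing the parser to `open` parses the file directly into an AST.
ast = open(parser, path)
print(html(ast))
```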

It’s been a while since any new features were added to the package, but interpolation support, along with a @cm_str macro, was added recently, which is probably worth announcing here:

using CommonMark
word = "Interpolation"
cm"***$(uppercase(word))!***"

This was one of the last remaining missing features when comparing the package to the Markdown standard library.
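
As a rough follow-up to the snippet above, the value returned by the macro should be an ordinary AST that can be passed to the usual writers. This sketch assumes the `html` writer handles interpolated values; the exact markup it emits around them is not shown here, since it may vary between versions.

```julia
using CommonMark

word = "Interpolation"

# The expression inside $(...) is evaluated and stored in the AST.
ast = cm"***$(uppercase(word))!***"

# Render the AST, including the interpolated value, to HTML.
print(html(ast))
```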

13 Likes