Does a project like this already exist outside of Julia?
It seems that the current level of integration among Julia packages for machine learning, mechanistic modeling, and data provides a unique opportunity to develop a markup standard, and to provide a Julia-based reference implementation of it (see (2) below), that can aid the reproducibility of scholarly works. I mean reproducibility in the scientific, language-agnostic sense, e.g. this standard, which unfortunately neglects parameterization techniques and anything related to machine learning.
Even in domains like biology, where standards like the one above already existed back when models were essentially parameterized by hand by experts, the level of adoption has been really low. (In that setting a paper could be annotated comprehensively without the issue we currently face: how to annotate the learning of model parameters, or even of model structure, in addition to annotating the data and the model itself.) Some reasons may include:
1. The high cost (in mental effort and hours) that an author faces to curate a single publication using an existing standard, like the one above.
2. In general, standards organizations don't publish reference implementations to automate (1), probably because they are made up of people who program in a bunch of different languages.
3. Annotation of even a primarily mathematical entity like a parameterized ODE model may somehow be easier from a domain-specific perspective, which hinders the development of interdisciplinary markup standards. Based on a very brief exploration, I could find only one repository of such a curation effort, and it is entirely focused on biology (also, despite the availability of namespaces to support identifiable parameter units).
4. An interdisciplinary repository would require some group of people, each aware of the ontologies of their respective domains, to work together to ensure consistency.
I have no idea how to address (3-4), but my intuition is that (1-2) could be essentially solved by a team working in a single language via a docstring-like approach, so that implementing the model for a paper (learning structural and parameter unknowns, simulating, etc.) and curating the annotation for that paper could take place at the same time. The annotation could even be pushed automatically to a repository like the one above via something resembling package registration.
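To make the docstring-like idea a bit more concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: `ParamAnnotation`, `ModelAnnotation`, `annotate!`, `export_annotation`, and the plain-text output are names and formats I made up for illustration, not part of any existing package or standard. The only point is that the annotation lives right next to the model code, so writing one and curating the other happen together.

```julia
# Hypothetical sketch: none of these names come from an existing package or standard.

"Machine-readable annotation for one model parameter."
struct ParamAnnotation
    name::Symbol
    unit::String          # free-text unit; a real standard would point at an ontology term
    description::String
end

"Annotation for a whole model: its parameters plus how they were obtained."
struct ModelAnnotation
    title::String
    parameters::Vector{ParamAnnotation}
    fitting::String       # e.g. "hand-tuned" or "least squares"; free text in this sketch
end

# Maps a model function to its annotation (a local stand-in for a shared repository).
const ANNOTATIONS = IdDict{Any,ModelAnnotation}()

"Attach an annotation to a model function and return the function."
function annotate!(f, ann::ModelAnnotation)
    ANNOTATIONS[f] = ann
    return f
end

# Right-hand side of logistic growth, du/dt = r*u*(1 - u/K).
logistic(u, p, t) = p.r * u * (1 - u / p.K)

annotate!(logistic, ModelAnnotation(
    "Logistic growth",
    [ParamAnnotation(:r, "1/day", "intrinsic growth rate"),
     ParamAnnotation(:K, "cells", "carrying capacity")],
    "hand-tuned",
))

"Render an annotation as plain text, a stand-in for a real markup format."
function export_annotation(f)
    ann = ANNOTATIONS[f]
    lines = ["model: $(ann.title)", "fitting: $(ann.fitting)"]
    for p in ann.parameters
        push!(lines, "param $(p.name) [$(p.unit)]: $(p.description)")
    end
    return join(lines, "\n")
end

println(export_annotation(logistic))
```

In this sketch the `fitting` field is where information about learned parameters or learned structure would go, which is the part the existing standard leaves out. Pushing the exported text to a shared repository would be a separate step, something like registering a package, and is not shown here.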