Description
TLDR: Can we create a list of LaTeX commands that generate all elements described by the core spec?
The goal of the Wikimedia community group math is to improve the display of mathematical expressions in Wikipedia. To that end, browser-based MathML rendering is a desirable way to deliver high-quality formulae. The new MathML core specification seems promising, as it appears to be detailed enough to implement and evaluate MathML rendering engines against the spec. There are therefore good reasons to be optimistic: once the spec is final and the rendering engines have been implemented, reasonable MathML markup will lead to appealing rendering results that the community will appreciate.
However, the de facto standard in 2022 for authoring and rendering mathematical formulae is the TeX family of formats. Therefore, I suggest a deeper investigation of the conversion process from TeX-like input formats to MathML. For our new MathML 4 standard to become a success story, we need conversion tools that generate the intended MathML 4 output from TeX-like input. In 2018, we evaluated several TeX2MathML conversion tools, including those listed on our tool page. At that point, we created a manual gold standard dataset for Presentation and Content MathML. However, the quality of that gold standard dataset might not be optimal, as it was influenced by LaTeXML: we used LaTeXML to generate the initial version of the MathML output and fixed only those problems we happened to spot in that output.
Therefore, I suggest creating a non-normative document describing how to convert TeX expressions to the corresponding MathML core expressions. While this task is open-ended, I recommend stopping once every element described in the MathML core spec has at least one corresponding LaTeX input.
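The stopping criterion above could be checked mechanically. The following is only a minimal sketch in Python, under stated assumptions: `CORE_ELEMENTS` is a deliberately abbreviated, illustrative subset of element names (the real list would have to be taken from the MathML core spec), and the LaTeX/MathML pairs in `EXAMPLES` are hand-written placeholders, not the output of any particular converter.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# ASSUMPTION: abbreviated, illustrative subset of element names.
# The complete list should be taken from the MathML core spec.
CORE_ELEMENTS = {"math", "mi", "mn", "mo", "mrow", "mfrac", "msqrt", "msup", "msub"}

NS = 'xmlns="http://www.w3.org/1998/Math/MathML"'

# ASSUMPTION: hypothetical hand-written LaTeX -> MathML pairs for illustration.
EXAMPLES = {
    r"\frac{a}{b}": f"<math {NS}><mfrac><mi>a</mi><mi>b</mi></mfrac></math>",
    r"x^2": f"<math {NS}><msup><mi>x</mi><mn>2</mn></msup></math>",
    r"\sqrt{x+1}": f"<math {NS}><msqrt><mi>x</mi><mo>+</mo><mn>1</mn></msqrt></math>",
}


def elements_used(mathml: str) -> set[str]:
    """Return the local names of all elements in a MathML string."""
    root = ET.fromstring(mathml)
    # ElementTree reports namespaced tags as "{namespace}name"; keep the name only.
    return {el.tag.split("}")[-1] for el in root.iter()}


def uncovered(examples: dict[str, str], core: set[str] = CORE_ELEMENTS) -> set[str]:
    """Return the elements in `core` that no example produces yet."""
    covered: set[str] = set()
    for mathml in examples.values():
        covered |= elements_used(mathml)
    return core - covered


if __name__ == "__main__":
    # Elements still lacking a LaTeX input in this toy collection.
    print(sorted(uncovered(EXAMPLES)))  # prints ['mrow', 'msub']
```

With such a check, the collection would count as complete for the purposes of this proposal once `uncovered` returns an empty set for the full element list.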
If we still have enthusiasm once that is completed, we could extend the exercise from core to intent. Here one could stop, for example, after having covered all symbols with the planned custom style tag annotations and their corresponding Content MathML representations.
Disclaimer: I am currently considering implementing a texvc-to-MathML converter in PHP. For a TDD workflow, it would therefore be good to be able to generate meaningful test cases.