Extract math from Word
It looks like there are two main ways of embedding math into .docx files (other than plain text):
- Using the built-in equation editor. This stores equations as XML tags (Office Math Markup Language, OMML) inline in the document, with no binaries.
- Using MathType, the most common math add-on for Word, which stores equations as binaries that need to be extracted.
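To get a feel for how much of each kind a given document contains, here is a minimal Python sketch that opens a .docx as a zip archive and looks for both. It assumes the usual package layout (`word/document.xml` for the body, `word/embeddings/` for embedded objects). The function name `find_math` is made up for illustration. Note that an embedded `.bin` file can be any OLE object, not only MathType, so the second list is a set of candidates rather than confirmed equations.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the XML written by Word's built-in equation editor (OMML).
OMML_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"


def find_math(docx):
    """Return (omml_equations, ole_binaries) for a .docx path or file object.

    Hypothetical helper for illustration; assumes the standard package layout.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
        # Each built-in equation is an <m:oMath> element in the body XML.
        equations = root.findall(f".//{{{OMML_NS}}}oMath")
        # Embedded OLE objects (MathType among them) are stored as .bin files.
        binaries = [
            name
            for name in archive.namelist()
            if name.startswith("word/embeddings/") and name.endswith(".bin")
        ]
    return equations, binaries
```

Calling `find_math("paper.docx")` (the file name is a placeholder) would return the list of `<m:oMath>` elements and the names of the candidate binaries to hand to a MathType converter.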
For both of these, we should represent the math as MathML, the standard for math in HTML5. For the first option, it looks like we will have to define the mapping from Word's equation XML to MathML ourselves, which could be pretty time-consuming. For MathType, we'll need to convert the binaries. @jure has made a Ruby gem that converts from MathType to MathML. We may need to rewrite it before we can use it, but it could be a helpful resource.
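To give a sense of what defining the mapping for the built-in editor involves, here is a rough Python sketch that converts three OMML constructs (text runs, fractions, superscripts) to MathML. It is only an illustration under stated assumptions: the function names are made up, the token classification is deliberately crude, and everything else in OMML (radicals, matrices, delimiters, n-ary operators and so on) is simply flattened to its text content.

```python
import xml.etree.ElementTree as ET

M = "{http://schemas.openxmlformats.org/officeDocument/2006/math}"


def convert_children(node):
    """Convert every child of an OMML node and return a flat list of MathML nodes."""
    out = []
    for child in node:
        out.extend(convert(child))
    return out


def argument(node, name):
    """Convert the OMML child called `name` into exactly one MathML element."""
    child = node.find(M + name)
    nodes = convert_children(child) if child is not None else []
    if len(nodes) == 1:
        return nodes[0]
    row = ET.Element("mrow")  # group zero or several nodes into one argument
    row.extend(nodes)
    return row


def convert(node):
    """Map one OMML element to a list of MathML elements (illustrative subset)."""
    if node.tag == M + "r":  # text run
        text = "".join(t.text or "" for t in node.findall(M + "t"))
        if not text:
            return []
        # Crude assumption: digits are numbers, other alphanumerics are
        # identifiers, everything else is an operator.
        name = "mn" if text.isdigit() else "mi" if text.isalnum() else "mo"
        el = ET.Element(name)
        el.text = text
        return [el]
    if node.tag == M + "f":  # fraction: numerator over denominator
        el = ET.Element("mfrac")
        el.append(argument(node, "num"))
        el.append(argument(node, "den"))
        return [el]
    if node.tag == M + "sSup":  # superscript: base and exponent
        el = ET.Element("msup")
        el.append(argument(node, "e"))
        el.append(argument(node, "sup"))
        return [el]
    # Unhandled construct: keep descending so the text inside is not lost.
    return convert_children(node)


def omml_to_mathml(omath):
    """Convert an <m:oMath> element to a MathML string."""
    math = ET.Element("math", xmlns="http://www.w3.org/1998/Math/MathML")
    math.extend(convert_children(omath))
    return ET.tostring(math, encoding="unicode")
```

Even this small subset shows where the time would go: each OMML construct needs its own rule, and runs that mix identifiers and operators in one text node would need proper tokenizing rather than the shortcut used here.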