fix(service): fixes memory issue, adds math processing behind flag
This MR does two things:

- This implements the `exec`/`spawn` fix suggested in #4 in `DOCXToHTMLSyncHandler`, which makes one of our problematic documents pass through correctly.
- Behind a flag, I've put in some math fixes. If (and only if) there's a `useMath` in the form-data, I run a function on `<math-display>` and `<math-inline>` that reduces the number of escaped backslashes.

Here's what you see (in Postman) without this:
This is what the service does now. I've highlighted a `<math-display>` element; in it, you can see that the LaTeX has 4 backslashes (`\\\\frac`). If I check the `useMath` setting, here's what I get:
There are 2 backslashes (`\\frac`), which is better. You can see that JSON also escapes the quotation marks in `<p class="paragraph">`, and when all of this is unescaped we should end up with what we want, 1 backslash (`\frac`).
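To make the backslash counts above concrete, here is a minimal sketch of the idea, assuming the service runs on Node.js. `reduceMathBackslashes` is a hypothetical name and this is not the actual function in this MR; it only illustrates halving doubled backslashes inside the two math elements and how JSON escaping then affects what Postman displays.

```javascript
// Hypothetical helper, not the code in this MR: halve doubled backslashes,
// but only inside <math-display> and <math-inline> elements.
function reduceMathBackslashes(html) {
  const mathElement = /<(math-display|math-inline)(\s[^>]*)?>([\s\S]*?)<\/\1>/g;
  return html.replace(mathElement, (_match, tag, attrs = "", body) => {
    // Two literal backslashes become one; everything else is left alone.
    const fixed = body.replace(/\\\\/g, "\\");
    return `<${tag}${attrs}>${fixed}</${tag}>`;
  });
}

// String.raw keeps the backslashes exactly as typed (two here).
const before = String.raw`<math-display>\\frac{1}{2}</math-display>`;
const after = reduceMathBackslashes(before);

// JSON.stringify doubles every backslash again, which is what Postman shows.
console.log(JSON.stringify({ html: before })); // raw text contains \\\\frac
console.log(JSON.stringify({ html: after }));  // raw text contains \\frac

// Parsing the response gets the client back to a single backslash.
console.log(JSON.parse(JSON.stringify({ html: after })).html); // contains \frac
```

One caveat with this sketch: it assumes every backslash in the math arrives doubled. A legitimate LaTeX `\\` line break that was not doubled on the way in would be halved too.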
I don't know if this is a wonderful fix, though it will solve some of the problems we've been seeing. LaTeX embedded in HTML is inherently problematic: there are `<` and `>` and `\` and `&`, which easily get busted, and JSON's and HTML's escaping aren't helping (to say nothing of whatever's happening with the command-line script). It might be smarter to turn it into base-64 to send it back?
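A rough sketch of the base-64 idea, assuming Node.js and its built-in `Buffer`. `encodeLatex` and `decodeLatex` are hypothetical names, and where the encoded string would actually live in the response is left open.

```javascript
// Hypothetical helpers: carry LaTeX as base-64 so nothing in it needs escaping.
function encodeLatex(latex) {
  return Buffer.from(latex, "utf8").toString("base64");
}

function decodeLatex(encoded) {
  return Buffer.from(encoded, "base64").toString("utf8");
}

// A sample with every character that tends to get mangled: \ < > &
const latex = String.raw`\frac{a}{b} < \alpha & \beta > 0`;
const encoded = encodeLatex(latex);

// The base-64 alphabet is A-Z, a-z, 0-9, "+", "/" and "=": none of these
// need escaping in a JSON string or in HTML text.
console.log(/^[A-Za-z0-9+\/=]*$/.test(encoded)); // true
console.log(decodeLatex(encoded) === latex);     // true
```

The trade-off is that the math is no longer readable in the raw response, and every consumer has to decode it before rendering.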