All paragraphs come in as H2s
The attached DOCX is a Kotahi test file; it comes in with all of its paragraphs as H2s (see the screenshot of the source in Kotahi). The same thing happens if I run it through http://pdf2html.cloud68.co/, which makes me think that this is an XSweet issue: something about the file (whose origin I don't know) is encoded incorrectly.
I don't have MS Word on my computer, but opening it up in Mac TextEdit shows some weirdness – all paragraphs are right-aligned, which is clearly incorrect. If I open it in Apple Pages, it looks more or less how I would expect it to.
This particular file isn't very important, but because Kotahi is processing a lot of Word docs coming from strange sources, we sometimes run into bugs that feel similar. (Most recently: display math is incorrectly coming in as H4s.) I don't know what was done to the DOCX to make it behave this way, but it would be nice if we could handle it.
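
One way to see what is actually encoded in the file without MS Word is to read the paragraph properties straight out of the DOCX package. The sketch below is only a diagnostic aid, not a description of how XSweet decides what a heading is (I don't know which properties it keys on). It assumes the file is a standard OOXML package with its body in `word/document.xml`, and the function names `summarize_paragraphs` and `summarize_docx` are made up for this example. It counts how many paragraphs share each combination of style id (`w:pStyle`), outline level (`w:outlineLvl`) and alignment (`w:jc`).

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _prop(ppr, tag):
    """Return the w:val attribute of <w:tag> inside a <w:pPr>, or None if absent."""
    if ppr is None:
        return None
    el = ppr.find(f"w:{tag}", NS)
    return None if el is None else el.get(f"{{{W}}}val")


def summarize_paragraphs(document_xml):
    """Count (style id, outline level, alignment) combinations over all paragraphs.

    Only properties set directly on each paragraph are reported; values
    inherited from the style definition are not resolved here.
    """
    root = ET.fromstring(document_xml)
    counts = Counter()
    for p in root.iter(f"{{{W}}}p"):
        ppr = p.find("w:pPr", NS)
        key = (_prop(ppr, "pStyle"), _prop(ppr, "outlineLvl"), _prop(ppr, "jc"))
        counts[key] += 1
    return counts


def summarize_docx(path):
    """Open a .docx (a zip archive) and summarize its main document part."""
    with zipfile.ZipFile(path) as package:
        return summarize_paragraphs(package.read("word/document.xml"))
```

Calling `summarize_docx("test.docx").most_common()` (with `test.docx` standing in for the attached file) lists the most frequent combinations first. If nearly every paragraph shares one style id or carries an explicit outline level or a right-aligned `w:jc`, that would narrow down what is unusual about the file. Because a style can also set an outline level or alignment in `word/styles.xml`, any style ids that turn up would need to be looked up there as well.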