Factoring out CSS styles into classes
So far we have quite shamelessly -- and successfully -- exploited HTML @style attributes, prolifically, as a way of representing the formatting information presented to us in the .docx.
However, this is (to say the least) likely to be controversial, and probably problematic too, as soon as we hit the HTML environment. Indeed, wouldn't an early goal of any reasonable HTML conversion have to be getting element-level @style out of there (whatever warrants and justifications we have for it)?
How to do that? Well actually an XSLT could take all instances of
<p class="SomewhatParagraph" style="text-align:center; font-size:14pt">
and faithfully render them as
<p class="SomewhatParagraph size14pt-centered">
with, elsewhere in the HTML (once for the set),
<style type="text/css">
.size14pt-centered { text-align:center; font-size:14pt }
</style>
This would probably be considered a vast improvement by almost anyone -- and can be done with a single filter in XSLT. The filter would implement a little language casting property settings into @class values. Presumably it only needs to cover the subset of CSS we expect to see in our own output.
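To make the "little language" idea concrete, here is a minimal sketch of the mapping, written in Python rather than XSLT purely for illustration. Everything named in it (parse_style, class_name_for, factor_styles, and the naming scheme itself) is hypothetical and not part of our pipeline. It also assumes well-formed XHTML input and simple "property:value" declarations. The generated names are mechanical (font-size-14pt_text-align-center), not the friendlier size14pt-centered shown above, which would need a hand-tuned vocabulary.

```python
import re
import xml.etree.ElementTree as ET


def parse_style(style):
    """Split a style attribute into normalized (property, value) pairs."""
    pairs = []
    for decl in style.split(";"):
        if ":" not in decl:
            continue  # skip empty or malformed declarations
        prop, value = decl.split(":", 1)
        pairs.append((prop.strip().lower(), " ".join(value.split())))
    return sorted(pairs)  # sorting makes declaration order irrelevant


def class_name_for(pairs):
    """Hypothetical naming rule: one 'property-value' token per setting."""
    tokens = []
    for prop, value in pairs:
        token = re.sub(r"[^A-Za-z0-9]+", "-", prop + "-" + value).strip("-")
        tokens.append(token)
    return "_".join(tokens)


def factor_styles(root):
    """Move every @style into @class; return {class name: declarations}."""
    rules = {}
    for el in root.iter():
        style = el.attrib.pop("style", None)
        if style is None:
            continue
        pairs = parse_style(style)
        if not pairs:
            continue  # nothing usable in the attribute
        name = class_name_for(pairs)
        rules[name] = "; ".join(p + ":" + v for p, v in pairs)
        existing = el.get("class")
        el.set("class", (existing + " " + name) if existing else name)
    return rules


def style_block(rules):
    """Render the collected rules as the body of a single <style> element."""
    return "\n".join("." + n + " { " + d + " }" for n, d in rules.items())
```

Because settings are sorted before naming, two elements carrying the same properties in a different order end up sharing one class, so the rule is emitted only once for the set. An XSLT filter could presumably apply the same normalization when grouping.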
I propose we may wish to do this preemptively, as I can easily see aching in bellies, heads and even hearts if we proceed to deliver raw @style attributes -- serviceable as they have been so far for their purposes (and however easy and straightforward this transformation is for those in command of the right tools). Especially since we also have a way to do the processing called for, so neatly.