Scrub step should drop formatting for single spaces, just like it does for tabs
The scrub step should recognize when whitespace-only elements are wrapped in useless tags, remove the wrappers, and throw away the formatting on them. This is exactly what happens to tabs wrapped in spans, but not to spaces wrapped in spans. We should do the same thing for both.
In this example, you can see that tabs and spaces are handled differently:
Extracted:
<p class="Body">
<span style="font-size: 12pt">Some paragraph text</span>
<span style="font-size: 12pt"><tab/></span>
<span style="font-size: 12pt"> </span>
</p>
Scrubbed:
<p class="Body">
<span style="font-size: 12pt">Some paragraph text</span>
<tab/>
<span style="font-size: 12pt"> </span>
</p>
We should give spaces the same treatment as tabs, to end up with this instead:
<p class="Body"><span style="font-size: 12pt">Some paragraph text</span><tab/> </p>
Or, if there's a reason it's problematic to remove the spans that wrap the space, then we should at least throw away the formatting on those spans.
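To make the request concrete, here is a minimal sketch of the unwrapping rule in Python with the standard library's xml.etree.ElementTree. It is not the actual scrub code: the function names (is_blank_span, unwrap_child, scrub_paragraph) are hypothetical, it only looks at spans that are direct children of the paragraph, and it treats any whitespace-only text like a single space, which is an assumption.

```python
import xml.etree.ElementTree as ET


def is_blank_span(span):
    """True if the span holds only whitespace text and/or <tab/> children."""
    if (span.text or "").strip():
        return False
    for child in span:
        if child.tag != "tab" or (child.tail or "").strip():
            return False
    return True


def append_text_before(parent, index, text):
    """Attach text to whatever precedes position `index` in parent."""
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def unwrap_child(parent, index):
    """Replace parent[index] with its own content, dropping its attributes."""
    span = parent[index]
    kids = list(span)
    append_text_before(parent, index, span.text or "")
    parent.remove(span)
    for offset, kid in enumerate(kids):
        parent.insert(index + offset, kid)
    # The span's tail now follows the last spliced-in child, if any.
    append_text_before(parent, index + len(kids), span.tail or "")


def scrub_paragraph(paragraph):
    """Unwrap every blank span that is a direct child of the paragraph."""
    index = 0
    while index < len(paragraph):
        child = paragraph[index]
        if child.tag == "span" and is_blank_span(child):
            spliced = len(child)
            unwrap_child(paragraph, index)
            index += spliced  # skip over the children we just spliced in
        else:
            index += 1


p = ET.fromstring(
    '<p class="Body">'
    '<span style="font-size: 12pt">Some paragraph text</span>'
    '<span style="font-size: 12pt"><tab/></span>'
    '<span style="font-size: 12pt"> </span>'
    '</p>'
)
scrub_paragraph(p)
print(ET.tostring(p, encoding="unicode"))
```

With this rule the tab and the space both end up as bare content of the paragraph, and only the span around the real text keeps its formatting.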
Here's another example from the same doc. A paragraph containing a single space wrapped in a formatted span looks like this all the way through the joined step:
<p class="Body"><span style="font-size: 12pt"> </span></p>
Then, in the "collapse" step, the style info is moved to the paragraph level, and persists all the way into the final rinsed html:
<p class="Body" style="font-size: 12pt"> </p>
So the formatting really should be dropped in the scrub step.
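If removing the wrapper spans turns out to be problematic, the fallback of only discarding their formatting could look roughly like the sketch below. It is an illustration in Python with xml.etree.ElementTree, not the real scrub step; the function name strip_formatting_from_blank_spans is hypothetical, and it assumes the formatting lives only in the style attribute.

```python
import xml.etree.ElementTree as ET


def strip_formatting_from_blank_spans(root):
    """Drop the style attribute from spans that contain only whitespace or <tab/>."""
    for span in root.iter("span"):
        has_text = bool((span.text or "").strip())
        has_other_content = any(
            child.tag != "tab" or (child.tail or "").strip() for child in span
        )
        if not has_text and not has_other_content:
            # Assumption: formatting is carried only by the style attribute.
            span.attrib.pop("style", None)


p = ET.fromstring('<p class="Body"><span style="font-size: 12pt"> </span></p>')
strip_formatting_from_blank_spans(p)
print(ET.tostring(p, encoding="unicode"))
```

After this pass the span no longer carries any style, so a later step has nothing to move up to the paragraph level.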