Incorrect fonts in html - coming from w:rFonts attributes?
Brinton Ch 8 has some incorrect fonts coming through into the html. The following text is all Times in Word:
Even though ‘ulama’ like Qaradawi assume that images...
However, it comes through in the rinsed HTML in three different fonts: Times, Menlo Regular, and Helvetica. It looks like it has to do with the w:rFonts attributes w:cs, w:eastAsia, w:ascii, and w:hAnsi. These specify the font to use for certain character types.
The word "Qaradawi" is extracted as Helvetica:
<w:r w:rsidRPr="009337E2">
<w:rPr>
<w:rFonts w:eastAsia="Helvetica"/>
</w:rPr>
<w:t>Qaradawi</w:t>
</w:r>
And " assume that" is extracted as Menlo Regular:
<w:r w:rsidRPr="009337E2">
<w:rPr>
<w:rFonts w:eastAsia="Helvetica" w:cs="Menlo Regular"/>
</w:rPr>
<w:t xml:space="preserve"> assume that</w:t>
</w:r>
The HTML doesn't specify different fonts for different character types in the same way. How does XSweet handle these w:rFonts attributes? Since this displays in the original Word document as all Times, I am guessing that Word doesn't consider any of the characters in these runs to be of the type w:eastAsia or w:cs, but I'm not sure how it decides what kind of character it's looking at. Do you have a better idea what's going on here?
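To make the guess concrete, here is a minimal Python sketch of the behavior I'm imagining: each character is assigned to one w:rFonts slot by its Unicode code point, and only the font named for that slot applies. The code point ranges are a simplified illustration, not the full classification, and `font_slot`, `fonts_used`, and the `default_font` fallback (standing in for whatever font is inherited from the styles) are hypothetical names of my own, not anything taken from XSweet or Word.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Illustrative ranges only (assumption): a real classifier needs the full tables.
COMPLEX_SCRIPT_RANGES = [(0x0590, 0x074F), (0x0780, 0x07BF), (0x0900, 0x109F)]
EAST_ASIAN_RANGES = [(0x2E80, 0x9FFF), (0xAC00, 0xD7AF), (0xF900, 0xFAFF), (0xFF00, 0xFFEF)]


def font_slot(ch):
    """Return the w:rFonts attribute name assumed to govern this character."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ascii"
    if any(lo <= cp <= hi for lo, hi in COMPLEX_SCRIPT_RANGES):
        return "cs"
    if any(lo <= cp <= hi for lo, hi in EAST_ASIAN_RANGES):
        return "eastAsia"
    return "hAnsi"


def fonts_used(run_xml, default_font="Times"):
    """Collect the fonts that would apply to the text of a single w:r element."""
    run = ET.fromstring(run_xml)
    rfonts = run.find(f"{{{W_NS}}}rPr/{{{W_NS}}}rFonts")
    text = "".join(t.text or "" for t in run.iter(f"{{{W_NS}}}t"))
    used = set()
    for ch in text:
        slot = font_slot(ch)
        # Only the attribute for this character's slot matters; others are ignored.
        font = rfonts.get(f"{{{W_NS}}}{slot}") if rfonts is not None else None
        used.add(font or default_font)
    return used


# The " assume that" run from this paragraph, with the w: namespace declared.
SAMPLE_RUN = (
    f'<w:r xmlns:w="{W_NS}"><w:rPr>'
    '<w:rFonts w:eastAsia="Helvetica" w:cs="Menlo Regular"/>'
    '</w:rPr><w:t xml:space="preserve"> assume that</w:t></w:r>'
)

print(fonts_used(SAMPLE_RUN))
```

Under these assumptions the sample run resolves to the inherited font only, because every character in " assume that" falls in the ascii slot and the run sets no w:ascii font. That would match what Word displays, and would suggest the extraction should not read w:cs or w:eastAsia as the font for Latin text.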
Here's the full XML:
<w:p w14:paraId="3E8B35BD" w14:textId="77777777" w:rsidR="00DE7EE7" w:rsidRPr="009337E2" w:rsidRDefault="00DE7EE7" w:rsidP="00DE7EE7">
<w:pPr><w:widowControl w:val="0"/>
<w:tabs><w:tab w:val="left" w:pos="560"/><w:tab w:val="left" w:pos="1120"/><w:tab w:val="left" w:pos="1680"/><w:tab w:val="left" w:pos="2240"/><w:tab w:val="left" w:pos="2800"/><w:tab w:val="left" w:pos="3360"/><w:tab w:val="left" w:pos="3920"/><w:tab w:val="left" w:pos="4480"/><w:tab w:val="left" w:pos="5040"/><w:tab w:val="left" w:pos="5600"/><w:tab w:val="left" w:pos="6160"/><w:tab w:val="left" w:pos="6720"/></w:tabs><w:autoSpaceDE w:val="0"/><w:autoSpaceDN w:val="0"/><w:adjustRightInd w:val="0"/><w:spacing w:line="480" w:lineRule="auto"/>
<w:rPr><w:rFonts w:cs="Times"/></w:rPr>
</w:pPr>
<w:r w:rsidRPr="009337E2">
<w:rPr>
<w:rFonts w:eastAsia="Helvetica" w:cs="Times New Roman"/>
<w:color w:val="000000"/>
<w:szCs w:val="20"/>
</w:rPr>
<w:tab/>
</w:r>
<w:r w:rsidRPr="009337E2">
<w:rPr>
<w:rFonts w:cs="Times"/>
</w:rPr>
<w:t xml:space="preserve">Even though</w:t>
</w:r>
<w:r w:rsidR="00BA3E1D">
<w:rPr>
<w:rFonts w:cs="Times"/>
</w:rPr>
<w:t>‘ulama’</w:t>
</w:r>
<w:r w:rsidRPr="009337E2">
<w:rPr>
<w:rFonts w:cs="Times"/>
</w:rPr>
<w:t xml:space="preserve"> like </w:t>
</w:r>
<w:r w:rsidRPr="009337E2">
<w:rPr>
<w:rFonts w:eastAsia="Helvetica"/>
</w:rPr>
<w:t>Qaradawi</w:t>
</w:r>
<w:r w:rsidRPr="009337E2">
<w:rPr>
<w:rFonts w:eastAsia="Helvetica" w:cs="Menlo Regular"/>
</w:rPr>
<w:t xml:space="preserve"> assume that</w:t>
</w:r>
<w:r w:rsidRPr="009337E2">
<w:rPr>
<w:rFonts w:cs="Times"/>
</w:rPr>
<w:t xml:space="preserve"> images of certain objects </w:t>
</w:r>
Here's how it's extracted:
<p>
<span style="font-family: Times New Roman"><tab/></span>
<span style="font-family: Times">Even though </span>
<span style="font-family: Times">‘ulama’</span>
<span style="font-family: Times"> like </span>
<span style="font-family: Helvetica">Qaradawi</span>
<span style="font-family: Menlo Regular"> assume that</span>
<span style="font-family: Times"> images of certain objects </span>
And here's the final HTML:
<p><span class="tab"><!-- tab --></span>
<span style="font-family: Times">Even though ‘ulama’ like </span>
<span style="font-family: Helvetica">Qaradawi</span>
<span style="font-family: Menlo Regular"> assume that</span>
<span style="font-family: Times"> images of certain objects