Promote headers marked only by underlining otherwise normal text
In Bakker ch 1: b_02_ch_1_Bakker.docx
Conversion output: output_b02_ch_1_Bakker.zip
The original Word doc has 6 headings at the same level. They all look identical: underlined text in the same font and size as the body content. Because only 2 of them use the Word style "section heading" and the other 4 do not, only those 2 are promoted (Word style "section heading" is on the list of styles to pay attention to, right?).
These 2 were promoted to h3s:
- Research Methods and Data Collection
- Navigating the chapters to come
These 4 were not:
- Migration, remittances, and development: three vignettes
- Introduction
- Contextualizing the remittances-to-development agenda
- Notes
Two things could help here. One is described below; the other is in issue #57 (moved).
Setting aside the Word styles, the header promotion should be able to find and promote these on its own, with no help from styles. Underlining text that otherwise matches the content is surely a common way for authors to mark their headings. The authors could just as easily have done this by italicizing or bolding. Could you implement something to catch this? The biggest clues I see are that the heading is:
- A piece of text between line breaks that is not too long (see issue #58 (closed))
- Formatted differently from the surrounding text, with underlining, bold, and/or italics
- Maybe also: formatted consistently across whatever's being considered for promotion, i.e. there are no formatting changes within the candidate
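To make the clues above concrete, here is a rough sketch of how the check might look. It is not taken from the converter's code: `Run`, `is_heading_candidate`, and `MAX_HEADING_CHARS` are hypothetical names, the 100-character limit is a placeholder for whatever issue #58 settled on, and it treats the "consistent formatting" clue as a hard requirement, which is only one way to read it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Run:
    """Hypothetical stand-in for one formatted run of text in a paragraph."""
    text: str
    bold: bool = False
    italic: bool = False
    underline: bool = False

    @property
    def fmt(self) -> Tuple[bool, bool, bool]:
        return (self.bold, self.italic, self.underline)


# Assumed placeholder; the real limit would come from issue #58.
MAX_HEADING_CHARS = 100


def is_heading_candidate(
    runs: List[Run],
    body_fmt: Tuple[bool, bool, bool] = (False, False, False),
    max_chars: int = MAX_HEADING_CHARS,
) -> bool:
    """Return True if a paragraph looks like an author-formatted heading."""
    # Ignore whitespace-only runs, which often carry stray formatting.
    visible = [r for r in runs if r.text.strip()]
    if not visible:
        return False

    # Clue 1: the paragraph is not too long.
    text = "".join(r.text for r in runs).strip()
    if len(text) > max_chars:
        return False

    # Clue 3: no formatting changes within the candidate.
    formats = {r.fmt for r in visible}
    if len(formats) != 1:
        return False

    # Clue 2: uses bold, italic, and/or underline, and differs from the body.
    fmt = next(iter(formats))
    return any(fmt) and fmt != body_fmt
```

With this sketch, a paragraph such as `[Run("Introduction", underline=True)]` would qualify, while a body paragraph containing one underlined word would not, because its formatting changes partway through.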
What do you think? Curious to hear if you think that would be a good fix for this particular document, and a helpful rule in general.
The 2nd issue (looking for formatting similar to what's in the Word styles we're paying attention to) is here: #57 (moved)