Book TOC is invalid XML because rid attributes on xref elements reference IDs that are not in the TOC XML
Expected behaviour
The TOC XML that the BCMS generates should be valid.
Current behaviour
In this example book, which is a chapter-processed book, publishing the TOC fails with an error because the TOC XML is invalid.
The error reported is: TOC.xml: Invalid XML.
-:36: parser warning
: xmlParsePITarget: invalid name prefix 'xml'
^
-:26: element xref: validity error :
IDREFS attribute rid references an unknown ID "consstemcellmet.fn1"
If you go to the chapter called 'Bone Marrow or Blood Stem Cell Transplants in Children With Certain Rare Inherited Metabolic Diseases*' (https://ncbi.cloud68.co/organizations/6b3576c3-eb69-42a2-8f60-bfbca9df9dcc/bookmanager/585d921f-13be-4513-9a8a-600f87ac71a9/part/undefined/112d658f-34f8-4d63-ad20-481d7edd59c8), you will see that the converted XML, consstemcellmet.xml, has a footnote reference in the chapter title, like this:
<title>Bone Marrow or Blood Stem Cell Transplants in Children With Certain Rare Inherited Metabolic Diseases<xref ref-type="fn" rid="consstemcellmet.fn1">*</xref>
When you look at the preview or published version of that chapter, the footnote displays correctly. We can't strip the footnote from the chapter XML because the preview needs it to display correctly. However, the footnote causes a problem when we try to publish the book's TOC: the TOC XML that we write includes the chapter title together with its footnote xref, but the footnote it refers to is not present in the TOC (only the chapter title is there, and the referenced footnote can't be included). The rid therefore points to an ID that does not exist in the TOC XML, and the error shown above occurs.
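To illustrate the failure, here is a minimal sketch in Python of a check that lists rid values with no matching id in the same document. It is not BCMS code: the function name dangling_rids and the simplified TOC fragment are assumptions made for illustration, and it treats rid as a space-separated list because the validator reports it as an IDREFS attribute.

```python
import xml.etree.ElementTree as ET


def dangling_rids(xml_text):
    """Return rid values that do not match any id attribute in the document.

    Hypothetical helper for illustration only; it is not part of the BCMS.
    """
    root = ET.fromstring(xml_text)
    # Collect every id declared anywhere in the document.
    known_ids = {el.get("id") for el in root.iter() if el.get("id")}
    missing = []
    for el in root.iter():
        # rid is IDREFS, so one attribute may hold several space-separated ids.
        for rid in (el.get("rid") or "").split():
            if rid not in known_ids and rid not in missing:
                missing.append(rid)
    return missing


# Simplified, assumed TOC fragment: the title carries the xref,
# but the footnote itself lives only in the chapter XML.
toc_fragment = (
    '<toc><title>Chapter title'
    '<xref ref-type="fn" rid="consstemcellmet.fn1">*</xref></title></toc>'
)
print(dangling_rids(toc_fragment))
```

A check like this could be run on the generated TOC XML before publishing, so that a dangling reference is caught earlier than the validation step.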
Solution
We can't strip the footnote out of the chapter title everywhere, because it is valid in the chapter and is needed for the chapter preview, so it must remain in the converted chapter XML. But when we build the TOC XML, we need to strip the footnote out of the chapter title completely (when the chapter title has one). We also need to strip all <xref> elements and ref-type attributes when building the TOC, in case there are other references that can cause invalid TOC XML.
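The stripping step could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the function names strip_xrefs and _remove_xrefs are hypothetical, the BCMS may be written in another language, and the sketch removes each xref together with its marker text (the "*") while keeping the text around it. The chapter XML itself is not touched; only the copy of the title that goes into the TOC is cleaned.

```python
import xml.etree.ElementTree as ET


def _remove_xrefs(parent):
    """Remove xref children in place, keeping the text that follows them."""
    prev = None  # last child that was kept
    for child in list(parent):
        if child.tag == "xref":
            # Text after the xref (its tail) must be re-attached, otherwise
            # removing the element would also drop that text.
            tail = child.tail or ""
            if prev is None:
                parent.text = (parent.text or "") + tail
            else:
                prev.tail = (prev.tail or "") + tail
            parent.remove(child)
        else:
            _remove_xrefs(child)  # handle xrefs nested in inline markup
            prev = child


def strip_xrefs(title_xml):
    """Return the title element as a string with every xref removed."""
    title = ET.fromstring(title_xml)
    _remove_xrefs(title)
    return ET.tostring(title, encoding="unicode")


chapter_title = (
    "<title>Bone Marrow or Blood Stem Cell Transplants in Children With "
    "Certain Rare Inherited Metabolic Diseases"
    '<xref ref-type="fn" rid="consstemcellmet.fn1">*</xref></title>'
)
print(strip_xrefs(chapter_title))
```

Removing the whole element, rather than only its rid attribute, also removes ref-type and the footnote marker, so nothing in the TOC title points at content that exists only in the chapter.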
Steps to reproduce
- Create a chapter-processed PDF book and upload this chapter: consstemcellmet.xml
- Publish the chapter and make sure it publishes successfully (you'll need to fill out the required metadata fields first)
- Once the chapter is published, try to publish the TOC and see the error that appears.
Priority
Y
QA Steps
- Create a chapter-processed PDF book of 'full text' content type
- Upload the source: consstemcellmet.pdf
- Upload the converted file: consstemcellmet.xml
- Fill out the required metadata fields
- Publish the chapter and make sure it publishes successfully
- Publish the TOC and check that the TOC is published successfully
- For existing books, the file should be submitted or reloaded again so that the converted file is updated before the TOC is published again.