Books processed at the book-level in PDF and XML workflows
Hi @lathrops1
In the PDF and XML workflows, there are two ways books can be processed:
- At the chapter-level (as developed for the Word workflow)
- At the book-level (as discussed here)
I'm describing the use case as I understand it.
Design
This wireframe shows the Book manager page.
- There are no individual book components; instead, the tabs 'preview', 'files' and 'errors' relate to the whole book.
- There will be some changes to the 'files' tab (to be discussed in a separate issue)
- The tabs 'metadata' and 'manage team' are not relevant for this use case, since the team and metadata are only applicable to the whole book.
- In the PDF workflow there is an additional tab for the PDF2XML vendor issues (to be discussed in a separate issue)
Versioning
Similar versioning principles apply, except at the book level:
- The book can have more than one version, each of which corresponds to a published version of the book, although as I understand it this isn't common.
- Each book version can have multiple associated file versions.
- When a book version is published, it replaces the previously published version on Bookshelf. In other words, there is no concept of multiple published versions of the book on Bookshelf (unlike the chapters for some books in the Word workflow).
- When the book is updated sufficiently to be considered a new edition, this edition is a new book record in the BCMS with a unique book ID. Previous editions should be archived; however, there needs to be a way to link the book records for previous editions to the current book record.
Looking back at the data model issue (#55), we said that:
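The versioning rules above can be sketched as a small data model. This is only an illustration under my own assumptions: the class, field and function names (`Book`, `BookVersion`, `previous_edition_id`, `publish`, `new_edition`) are hypothetical and are not taken from the BCMS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BookVersion:
    number: int
    # Each book version can have multiple associated file versions.
    file_versions: List[str] = field(default_factory=list)
    published: bool = False


@dataclass
class Book:
    book_id: str
    # Hypothetical link from a new edition back to the archived edition's record.
    previous_edition_id: Optional[str] = None
    archived: bool = False
    versions: List[BookVersion] = field(default_factory=list)

    def publish(self, number: int) -> None:
        """Publish one version; it replaces any previously published version."""
        if not any(v.number == number for v in self.versions):
            raise ValueError(f"unknown version {number}")
        for v in self.versions:
            v.published = (v.number == number)

    def published_version(self) -> Optional[BookVersion]:
        """Return the single published version, or None if nothing is published."""
        return next((v for v in self.versions if v.published), None)


def new_edition(old: Book, new_book_id: str) -> Book:
    """Archive the old edition and create a new book record linked to it."""
    old.archived = True
    return Book(book_id=new_book_id, previous_edition_id=old.book_id)
```

In this sketch, publishing a version clears the published flag on every other version, which mirrors the rule that only one published version of the book exists on Bookshelf at a time.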
The discussion of processing and publishing books as a whole was put on hold by NCBI so that the needs of book component publishing can be dealt with first. However, we think that the current data model supports this use case already. Since book components are versioned, we can consider the whole book as one component. For every edition of the book, there can be a number of versions. A version is comprised of the book (one book component in this case), its TOC and its metadata. These go through the system as one unit and is published as one unit. Jure suggested that we could test this, if necessary. We would need an example book from NCBI.
@John.kopanas Can you confirm whether the data model supports this, or whether you need to do some testing and additional development?
XML structure and rendering
In terms of the XML structure and rendering on Bookshelf, books can be tagged in two ways, as described here.
- The Standard <book> structure is used for books that will have a TOC. There are multiple <book-part> nodes in the <book-body>, each of which includes <book-part metadata>. The toc.xml file is supplied by the PDF2XML vendor or the organization submitting source XML.
In FTP submissions, book files of this kind will have the file type book in the manifest file, and submission-type="book" in the meta.xml.
- The Alternative <book> structure is used for books that will not have a TOC. There is one <book-part> node in the <book-body> without <book-part metadata>.
In FTP submissions, book files of this kind will have the file type document in the manifest file, and submission-type="book" in the meta.xml.
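As a rough illustration of the two structures, the check below counts the <book-part> nodes directly under <book-body> and returns the manifest file type that would be expected. It is a sketch under my own assumptions: the function name `expected_file_type` is hypothetical, the XML is assumed to have a <book> root and no namespaces, and a single <book-part> is always treated as the Alternative structure (it does not check for the presence of metadata).

```python
import xml.etree.ElementTree as ET


def expected_file_type(book_xml: str) -> str:
    """Return the manifest file type implied by the <book-body> structure."""
    root = ET.fromstring(book_xml)
    body = root.find("book-body")
    if body is None:
        raise ValueError("no <book-body> found")
    # Count only direct <book-part> children of <book-body>.
    parts = body.findall("book-part")
    if len(parts) > 1:
        return "book"      # Standard structure: book with a TOC
    if len(parts) == 1:
        return "document"  # Alternative structure: no TOC
    raise ValueError("no <book-part> found in <book-body>")
```

In both cases the submission-type in the meta.xml would still be "book"; only the file type in the manifest differs.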
Book settings
There are no TOC settings required.
Please let me know if I've misunderstood anything here.