Support for Book Versions
Context
The current Bookshelf CMS supports Book Versions. Testing of the new BCMS in development, both internally and with stakeholders, has made it clear that there are use cases where Book Versions will also be necessary in the new BCMS. They are needed to associate a set of related files accurately with its version content type, the version number for that content type, and its workflow, and to map the submitter's BCMS accession ID to other unique content IDs. At a minimum, we will agree on the data model architecture and on the two values to be passed to PMC at each ingest, consistently for both book and chapter versions.
Proposal
Scope
This proposal is limited to ‘wholebooks’. This includes:
- Books in the XML workflow that are processed by NCBI systems as ‘wholebooks’.
- Books in the PDF workflow that are processed by the PDF2XML vendor (Apex) and NCBI systems as ‘wholebooks’.
NCBI has confirmed that Book versions should also apply to ‘One Doc’ books in the Word workflow, so that their processing can change to ‘wholebooks’ in the PDF or XML workflow. This requires a separate proposal so that all related open issues, especially the design improvement requested in #1138 (closed), can be considered together. (See issues linked to #215 (closed)).
Additionally, NCBI expects that in the future users will need book versions to support changing from wholebook-processing to chapter-processing. This case is not included in this proposal.
Use cases
NCBI provides three use cases for book versioning support:
- PDF First Match to Later Full Text Version — see #1151 (closed)
- Maintaining each scientifically or partially updated version of a book — see #1154 (closed)
- Resolving book creation errors and supporting workflow and submission type changes — see #783 (closed)
What is a Book Version in the BCMS?
All versions of the same book have the same domain name. Each version has a BCMS ID in the format `{ID.version-number}`.
Example for new book:
Book V1
- BCMS ID = bcms10002833.1
- domain name = bcms10002833
- BCMS Database ID = (unique per version)
Book V2
- BCMS ID = bcms10002833.2
- domain name = bcms10002833
- BCMS Database ID = (unique per version)
Book V3
- BCMS ID = bcms10002833.3
- domain name = bcms10002833
- BCMS Database ID = (unique per version)
Migrated book example:
Book V1
- BCMS ID = bcms10002833.1
- domain name = vasleepap
- BCMS Database ID = (unique per version)
Book V2
- BCMS ID = bcms10002833.2
- domain name = vasleepap
- BCMS Database ID = (unique per version)
Book V3
- BCMS ID = bcms10002833.3
- domain name = vasleepap
- BCMS Database ID = (unique per version)
Each Book version has one or more file versions. (See the `file_versions` table in the db model.) Metadata is versioned because it is contained in these files.
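The ID format above can be illustrated with a small helper. This is only a sketch: the names `BcmsId`, `parse_bcms_id` and `format_bcms_id` are hypothetical, and the regular expression assumes that the base ID is alphanumeric and that version numbers are positive integers.

```python
import re
from typing import NamedTuple

# Assumed shape: an alphanumeric base ID, a dot, then a positive integer.
BCMS_ID_PATTERN = re.compile(r"^(?P<base>[A-Za-z0-9]+)\.(?P<version>[1-9][0-9]*)$")


class BcmsId(NamedTuple):
    base_id: str
    version_number: int


def parse_bcms_id(bcms_id: str) -> BcmsId:
    """Split a versioned BCMS ID such as 'bcms10002833.2' into its two parts."""
    match = BCMS_ID_PATTERN.match(bcms_id)
    if match is None:
        raise ValueError(f"Not a versioned BCMS ID: {bcms_id!r}")
    return BcmsId(match.group("base"), int(match.group("version")))


def format_bcms_id(base_id: str, version_number: int) -> str:
    """Build the {ID.version-number} form from its parts."""
    return f"{base_id}.{version_number}"
```

Note that the domain name should be stored separately rather than derived from the ID, because migrated books keep a domain such as vasleepap that differs from the base ID.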
How and when does the BCMS create a Book Version?
The BCMS allows source submissions by FTP uploads (FTP path) and BCMS user uploads (BCMS). Both paths are described below.
FTP Path: create or update Book Version
Currently the BCMS only supports FTP uploads for the XML workflow, but we should expect the same support for PDF uploads in the future, so this proposal considers both. In Coko-NCBI design sprints we decided to use a `version-name` and a `version-number` to support all the use cases.
This requires the following update to the FTP submission specification:
In the meta.xml file the user should provide a `version-name`. The current controlled values are:
- manuscript
- prepub
- published-pdf
- final-full-text
Related update to BCMS design: the first three values map to the current options in the ‘funded content type’ setting in the New Book and Book Settings UIs, and are only relevant to PDF workflow submissions. The fourth value, final-full-text, should be added to the ‘funded content type’ dropdown. It is the default content type for the Word and XML workflows, so users do not need to choose it in the UI.
Since it’s possible to have more than one version of each funded content type, users should also provide a `version-number` (integer only).
meta.xml file example for a PDF source submission:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book-submit SYSTEM "books-bulk-pdf.dtd">
<book-submit book-submit-id="10.1080/15588742.2015.1017687"
workflow="pdf"
submission-type="book"
version-name="published-pdf"
version-number="1">
</book-submit>
meta.xml file example for an XML source submission:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book-submit SYSTEM "books-submit-xml.dtd">
<book-submit book-submit-id="10.1080/15588742.2015.1017687"
workflow="xml"
submission-type="book"
version-name="final-full-text"
version-number="1">
</book-submit>
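As a rough illustration of how the two new attributes might be read from a meta.xml file, the sketch below uses Python's standard `xml.etree.ElementTree`. The function name `read_version_fields` is hypothetical, and treating `version-number` as a positive integer is an assumption; the proposal only says "integer only".

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Controlled list of version names taken from this proposal.
ALLOWED_VERSION_NAMES = {"manuscript", "prepub", "published-pdf", "final-full-text"}

SAMPLE_META_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book-submit SYSTEM "books-bulk-pdf.dtd">
<book-submit book-submit-id="10.1080/15588742.2015.1017687"
    workflow="pdf"
    submission-type="book"
    version-name="published-pdf"
    version-number="1">
</book-submit>"""


def read_version_fields(meta_xml: bytes) -> Dict[str, Optional[str]]:
    """Return the version attributes of a meta.xml, with None for absent ones."""
    root = ET.fromstring(meta_xml)  # the external DTD is not fetched
    if root.tag != "book-submit":
        raise ValueError(f"Unexpected root element: {root.tag!r}")

    version_name = root.get("version-name")
    if version_name is not None and version_name not in ALLOWED_VERSION_NAMES:
        raise ValueError(f"Unknown version-name: {version_name!r}")

    # Assumption: version numbers are positive integers written in ASCII digits.
    version_number = root.get("version-number")
    if version_number is not None and not re.fullmatch(r"[1-9][0-9]*", version_number):
        raise ValueError(f"version-number must be a positive integer: {version_number!r}")

    return {
        "book-submit-id": root.get("book-submit-id"),
        "workflow": root.get("workflow"),
        "version-name": version_name,
        "version-number": version_number,
    }
```

Absent attributes are returned as `None` so that the caller can decide how to apply defaults.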
FTP submission rules:
- Create book version 1: If the `book-submit-id` does not exist in the BCMS for that organisation, create the first Book version and name it `{version-name}.{version-number}`, for example `published-pdf.1`. If no `version-number` is provided, default to `1`.
- Update existing book version: If the `book-submit-id` + `version-name` + `version-number` exists, update the Book version as specified in the meta.xml (i.e. increase the file version of the Book version).
- Create additional Book versions: If the `book-submit-id` exists in the BCMS but the `version-name` + `version-number` specified in the meta.xml does not exist in the BCMS, create this as a new Book version.
- Use default values: If `version-name` and `version-number` are not provided, use `final-full-text` and `1` as the defaults.
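The four rules above can be read as a single decision function. The sketch below is one possible reading, not the BCMS implementation: `resolve_ftp_action` and the action labels are hypothetical, `existing_versions` is assumed to be already scoped to one organisation, and applying each default independently (for example when only the `version-name` is missing) is an assumption the rules do not spell out. Error handling is left out.

```python
from typing import Dict, Optional, Set, Tuple

DEFAULT_VERSION_NAME = "final-full-text"
DEFAULT_VERSION_NUMBER = 1


def resolve_ftp_action(
    book_submit_id: str,
    version_name: Optional[str],
    version_number: Optional[int],
    existing_versions: Dict[str, Set[Tuple[str, int]]],
) -> Tuple[str, str]:
    """Return (action, book version label) for one FTP submission.

    existing_versions maps each known book-submit-id to the set of
    (version-name, version-number) pairs already stored for it.
    """
    # Rule 4 (and the default in rule 1): fall back to the default values.
    name = version_name if version_name is not None else DEFAULT_VERSION_NAME
    number = version_number if version_number is not None else DEFAULT_VERSION_NUMBER

    known = existing_versions.get(book_submit_id)
    if not known:
        action = "create_first_version"       # rule 1
    elif (name, number) in known:
        action = "update_existing_version"    # rule 2: new file version
    else:
        action = "create_additional_version"  # rule 3
    return action, f"{name}.{number}"
```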
Possible FTP submission errors:
- If the `version-name` is unknown (only the values listed above are accepted), create a submission error.
- If the `book-submit-id` and `version-name` exist in the BCMS and there is more than one book version, create an error for any submission that does not include a `version-number`.
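The two error conditions could be checked along the following lines. This is a sketch under assumptions: the function name and the error codes are hypothetical, and "more than one book version" is read here as more than one existing version with the submitted `version-name`, which the proposal does not state explicitly.

```python
from typing import List, Optional, Set, Tuple

# Controlled list of version names taken from this proposal.
ALLOWED_VERSION_NAMES = {"manuscript", "prepub", "published-pdf", "final-full-text"}


def find_submission_errors(
    version_name: Optional[str],
    version_number: Optional[int],
    existing_versions: Set[Tuple[str, int]],
) -> List[str]:
    """Return error codes for one submission against one book-submit-id.

    existing_versions holds the (version-name, version-number) pairs
    already stored for that book-submit-id.
    """
    errors: List[str] = []

    # Error 1: only names from the controlled list are accepted.
    if version_name is not None and version_name not in ALLOWED_VERSION_NAMES:
        errors.append("unknown-version-name")

    # Error 2: without a version-number the target version is ambiguous
    # when several versions share the submitted version-name (assumed reading).
    same_name = [number for name, number in existing_versions if name == version_name]
    if version_number is None and len(same_name) > 1:
        errors.append("missing-version-number")

    return errors
```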
BCMS Path: create or update Book Version
For all use cases above, creating a new book version becomes relevant after the current version is published. This is in line with how chapter versions are currently created in the BCMS.
User experience in the BCMS:
- When the current Book version is “Published”, the “New version” button is active in the UI.
- In the “New version” UI the users set the parameters of the new version. The options include:
  - Change the workflow from PDF to XML (or vice versa).
  - If the current version has a funded content type, select the updated version name (or leave as is if relevant).
  - If the current version has no funded content type, use the default value `final-full-text` as the `version-name`. (In this case we only show the `version-number` in the UI.)
  - Select the version number (integer starting from 2).
- Select the button “Create new version”.
- The user is redirected to the new version. The Status is ‘New version’. The next step would be to upload the files.
- The user can go back to any previous version to update its files and republish.
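The "New version" flow described above might be modelled as follows. This is a sketch only: the `BookVersion` class, its field names and `create_new_version` are hypothetical, and the status strings are taken from the wording in this section.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_VERSION_NAME = "final-full-text"


@dataclass(frozen=True)
class BookVersion:
    workflow: str                       # "pdf" or "xml"
    funded_content_type: Optional[str]  # None when the book has none
    version_number: int
    status: str

    @property
    def version_name(self) -> str:
        # Without a funded content type the default version name applies.
        return self.funded_content_type or DEFAULT_VERSION_NAME


def create_new_version(
    current: BookVersion,
    version_number: int,
    workflow: Optional[str] = None,
    funded_content_type: Optional[str] = None,
) -> BookVersion:
    """Create the next version from the parameters chosen in the UI."""
    if current.status != "Published":
        raise ValueError("A new version can only be created from a published version")
    if version_number < 2:
        raise ValueError("New version numbers start from 2")

    if current.funded_content_type is None:
        new_type = None  # only the version number is shown in the UI
    else:
        new_type = funded_content_type or current.funded_content_type

    return BookVersion(
        workflow=workflow or current.workflow,
        funded_content_type=new_type,
        version_number=version_number,
        status="New version",
    )
```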
NCBI notes the following use cases for the future:
- An Investigator (future BCMS role) tries to submit a manuscript, prepub or published pdf version when the final full text version already exists. In this case, whether the final full text is published or not, the BCMS does not create a new version. Instead the required metadata is collected from the user.
- A publisher tries to submit a published pdf version when the prepub version already exists but has not been published yet. This is highly unlikely but NCBI should consider what result they would expect for this case. (For example: block submission? Delete current version?).
We don’t think that either case is incompatible with the workflow outlined in this proposal.
How do Book versions affect Book Files?
- The files display `{book-version.file-version}` for all relevant sections of the files tab.
- When the BCMS sends files as a wholebook package to Load to PMC, the json file includes the `version-name` and `version-number` as shown below.
- The zip file created when downloading files changes from `{bcms-id}-{book-component-version}-{timestamp of user}` to `{bcms-id}-{timestamp of user}`.
- The converted file name must remain stable for the same Book version. It’s expected that the converted file name can change between Book versions.
Update to json sent in wholebook Load to PMC packages:
"package_id": 1234567890,
"domain": "vasleepap",
"version-name": "prepub", // Book version name
"version-number": "2", // Book version number
"main_xml": "vasleepap.bxml",
"thumb": "vasleepap.png",
"package": "vasleepap.123457890.2020_05_15-09_30_19.zip",
"target_database": "prod",
"release": true,
"notification_recipients":
Sending files to load to PMC
Feedback from NCBI on current support:
Stacy: In my meeting with Jeff and Evgeny, NCBI agreed to the minimal data model requirement: for both book and chapter versioning, the BCMS will always pass values for two separate fields at every ingest. These are the version content type (from an agreed controlled list, already defined in this document) and the version number (at present an integer only).
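A minimal sketch of adding the two agreed fields to a Load to PMC payload is shown below. The helper name `add_version_fields` is hypothetical. The version number is written as a string because that is how it appears in the json example in this proposal; whether PMC expects a string or a number is not confirmed here.

```python
import json
from typing import Any, Dict

# Controlled list of version names taken from this proposal.
ALLOWED_VERSION_NAMES = {"manuscript", "prepub", "published-pdf", "final-full-text"}


def add_version_fields(
    package: Dict[str, Any], version_name: str, version_number: int
) -> Dict[str, Any]:
    """Return a copy of the payload with both version fields set."""
    if version_name not in ALLOWED_VERSION_NAMES:
        raise ValueError(f"Unknown version-name: {version_name!r}")
    updated = dict(package)
    updated["version-name"] = version_name
    updated["version-number"] = str(version_number)
    return updated


payload = add_version_fields(
    {"package_id": 1234567890, "domain": "vasleepap", "main_xml": "vasleepap.bxml"},
    version_name="prepub",
    version_number=2,
)
print(json.dumps(payload, indent=2))
```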
How do Book Versions affect Book Settings?
Currently Book Settings are maintained in one template (see the `book_settings_template` table in the db model). There's a template type per processing type (chapter-processed and wholebook). This template includes all book settings, whether they are domain settings (NCBI-systems) or bcms-specific settings. Books inherit these settings at creation point.
To support book versioning, we have two options:
The recommended approach is to separate those settings that should remain stable for all Book versions (the domain settings) from those that we should be able to change. This results in something like this:
Book
- settings: remain stable for all versions
  - domain
Book version
- settings: change per version
  - workflow
  - submission type
  - funded content type
  - all other bcms-specific settings (Unclear at this stage, but these apply to chapter-processed books which are not included in this proposal.)
This is a significant change that requires extensive refactoring.
The workaround approach is to version all settings, but only allow the settings of the current version to be edited. Note: this excludes the settings ‘workflow’ and ‘submission type’ – a new book version is created in order to change these settings. From the UI perspective that means that the form in the Settings modal moves to a tab per version. NCBI has confirmed we should take the workaround approach.
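The editing rule of the workaround approach could be enforced roughly as follows. This is a sketch: the setting keys, the in-memory dictionary and the function name are hypothetical stand-ins for whatever the db model uses.

```python
from typing import Any, Dict

# Settings that can only change by creating a new book version.
LOCKED_SETTINGS = {"workflow", "submission_type"}


def update_version_setting(
    settings_by_version: Dict[int, Dict[str, Any]],
    current_version: int,
    target_version: int,
    key: str,
    value: Any,
) -> None:
    """Edit one setting, allowing changes to the current version only."""
    if target_version != current_version:
        raise PermissionError("Only the settings of the current version can be edited")
    if key in LOCKED_SETTINGS:
        raise PermissionError(f"Create a new book version to change {key!r}")
    settings_by_version[target_version][key] = value
```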
How do Book Versions affect Book Metadata?
Book metadata is versioned because it is contained in the converted xml files. From the UI perspective that means that the form in the Metadata modal moves to a tab per version. There is some minor backend work.
How do Book Versions affect Book Team?
Currently only the role Editor is supported at the Book team level. There’s no use case at this stage to support the ability to have different team members per book version. This may be necessary in future iterations when the Investigator role is supported.
For deployment:
- From the user’s perspective there is no change here.
- Teams remain the same across all versions of a book.
- There is some work on the backend.
How does Book Version affect Chapter Versions?
This proposal does not include Book versions for chapter-processed books. NCBI has also confirmed that when this support is required, this will not include support for a book that has both book versions and chapter versions.
However, the current implementation for chapter versions should be brought in line with this proposal. This means the following changes:
- All versions of a chapter should have a BCMS ID in the format `{ID.version-number}`.
- The current value `version` in json files changes to `version-number`.
- Chapter versions should support `version-name` and `version-number` in the json file for all workflows. For Word and XML workflow chapters, the default `version-name` is sent. For the PDF workflow, the value provided in the FTP or BCMS submission is sent.
- In the UI, chapter versions in a PDF workflow display the `version-name`.
- The BCMS sends both values in the json file for all chapter ingest sessions (Load to PMC), even though PMC does not support both at this time.
  - version-name (ignored by NCBI-systems)
  - version-number (read by NCBI-systems)
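The json changes for chapters listed above might look like this in code. It is a sketch only: `upgrade_chapter_payload` is a hypothetical name, the workflow labels "word", "xml" and "pdf" are assumed, and `final-full-text` is assumed to be the default `version-name` because this proposal names it as the default content type for the Word and XML workflows.

```python
from typing import Any, Dict, Optional

DEFAULT_VERSION_NAME = "final-full-text"


def upgrade_chapter_payload(
    payload: Dict[str, Any],
    workflow: str,
    submitted_version_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Rename 'version' to 'version-number' and add 'version-name'."""
    upgraded = {key: value for key, value in payload.items() if key != "version"}
    if "version" in payload:
        upgraded["version-number"] = payload["version"]

    if workflow == "pdf":
        # PDF workflow: send the value given in the FTP or BCMS submission.
        if submitted_version_name is None:
            raise ValueError("PDF workflow chapters need a submitted version-name")
        upgraded["version-name"] = submitted_version_name
    else:
        # Word and XML workflows: send the default version name.
        upgraded["version-name"] = DEFAULT_VERSION_NAME
    return upgraded
```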
BCMS Design updates
Dashboard
The book version name and number are shown in the book row. The Dashboard always displays the current version.
Search: The user can search to find previous versions quickly. For example:
- Search by the base BCMS ID `bcms1234` (this shows the current version).
- Enter `.book-version` to show a specific version. See `bcms1234.1` in the first example below.
- Repeat point 1, then filter by the version name. See the second example below.
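The three search behaviours can be sketched as one lookup function. Everything here is illustrative: the function name, the record fields `bcms_id`, `version_name` and `is_current`, and the in-memory list are hypothetical and stand in for the real search backend.

```python
from typing import Any, Dict, List, Optional


def search_book_versions(
    query: str,
    versions: List[Dict[str, Any]],
    version_name: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Find book versions by base BCMS ID, full versioned ID, or version name."""
    base_id, _, version_part = query.partition(".")
    same_book = [v for v in versions if v["bcms_id"].split(".")[0] == base_id]

    if version_part:
        # Query such as 'bcms1234.1': show that specific version.
        return [v for v in same_book if v["bcms_id"] == query]
    if version_name is not None:
        # Base ID plus a version-name filter.
        return [v for v in same_book if v["version_name"] == version_name]
    # Base ID only: show the current version.
    return [v for v in same_book if v["is_current"]]
```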
Collection manager
The design and functionality of the Dashboard also apply to the Collection manager page.
Wholebook book manager
- Metadata and Settings move to tabs so they can be viewed per version.
- A New version button (shown when status is “Published”) opens a page to create the new version.
- Users can change between versions.
New version (new UI)
The UI opens from the “New version” button. Users choose the settings for the new version.
Implementation tasks
Dev Task | Notes | Weight | Time estimate (days) | Issue |
---|---|---|---|---|
Updating the data model: add book_version to books table and relate books table and file versions table. |  | 3 | 3d | 1191 |
Updating requests to the database to include book_version |  | 2 | 4d | 1192 |
Replace version paginator component with dropdown (can display version name and number, or only number) | See new design; NCBI is fine with dropdown only | 1 | 2d |  |
Extending the wholebooks book manager UI |  | 3 | 3d |  |
Creating a new UI to set the parameters of the new version |  | 2 | 3d |  |
Dashboard UI updates |  | 2 | 2d |  |
Collection manager UI updates |  | 2 | 2d |  |
Search and filter functionality update | TBC after NCBI wireframe review |  |  |  |
“New version” status at book level (should be done with any other changes to statuses) | See 1156: if we go with the suggestion of matching book/collection status to TOC, then the "New book/New collection" statuses are redundant | 2 | 3d | 1193 |
Changing status for Book not fully inherited from Chapters status |  | 2 | 2d | 1048 |
Book settings: Recommended approach as above | Going with that approach opens a whole new cycle of development and changes regarding updating the domain | 5 | 5d |  |
Book settings: workaround approach as above |  | 2 | 2d | 1194 |
Backend: Change from Word complete book (chapter processed) to PDF or XML (wholebook processed) |  | 5 | 2d | 1195 |
Book-id relationship with user-ids (because of teams) |  | 2 | 1d | 1196 |
Remove the existence of one default chapter for the whole book case | This refactor is needed because the default chapter causes problems once a book has versions | 5 | 5d | 1197 |
Refactor the way we retrieve the notifications for the whole book case and the way we update the metadata | This is a result of removing the one default chapter from the whole book case. | 5 | 3d | 972 |
Update/refactor divisions table because we have many books for divisions |  | 2 | 2d | 1198 |
Update bcms_id relation from books, bookComponents, Collections, to errors, and notification Messages. |  | 3 | 4d | 1199 |
Pass version_number and version_name on Load to PMC. |  | 2 | 2d | 1200 |
Writing migrations for the books that exist |  | 2 | 2d | 1201 |
Development time required
The total days of development for the tasks above = 47. The split is roughly 80% backend / 20% frontend. Our team requires 4 cycles (8 weeks) to implement book versioning support. This includes time for the developers to write unit tests.
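The 47-day total can be reproduced from the task table. The reading used below is an inference from the numbers, not something the proposal states: the recommended Book settings approach (5d) is left out because NCBI chose the workaround, and the Search and filter task is not counted because its estimate is still TBC. The task labels are shortened for readability.

```python
# Time estimates in days, copied from the implementation tasks table.
ESTIMATES_DAYS = {
    "Data model update": 3,
    "Database requests include book_version": 4,
    "Version dropdown replaces paginator": 2,
    "Wholebook book manager UI": 3,
    "New version UI": 3,
    "Dashboard UI updates": 2,
    "Collection manager UI updates": 2,
    "New version status at book level": 3,
    "Book status not fully inherited from chapters": 2,
    "Book settings: workaround approach": 2,
    "Backend: Word complete book to PDF or XML": 2,
    "Book-id relationship with user-ids": 1,
    "Remove default chapter for whole books": 5,
    "Refactor notifications and metadata updates": 3,
    "Divisions table refactor": 2,
    "bcms_id relation updates": 4,
    "Pass version fields on Load to PMC": 2,
    "Migrations for existing books": 2,
}

total_days = sum(ESTIMATES_DAYS.values())
print(f"Total development days: {total_days}")
```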
Open issues (if applicable)
Related issues: