Metadata Documentation
Updated 20211115 - see strikeouts for things that will be postponed to Phase 2, see changes based on those reductions in bold
MAIN USER STORY
Bookshelf relies on machine-readable, well-structured book and book part metadata that is tagged and stored in a standard archival format, both for indexing and display on its site and for downstream use on other NCBI and external sites via programmatic and other processes. To ensure this metadata can be indexed and displayed, Bookshelf relies on its participants to submit Bookshelf-supported metadata according to supported, documented setup and production workflows. As part of the setup and production process, Bookshelf uses various templates and data quality checks to ensure the complete and accurate indexing and display of metadata.
ACCEPTANCE CRITERIA
Metadata Setup / Templates
- System Admin can create PUBLISHER and COLLECTION metadata templates according to book vs chapter processing and source submission type workflows. The templates permit adding and updating the following information for cloning to components created within or linked to that publisher or collection:
  - NCBI-use-only metadata fields
  - NCBI-standardized metadata fields with NCBI-only rules for stylizing and tagging
- System Admin can see which metadata fields in the template are required vs optional for population by submitters by means of their workflow rules.
- Users can style their added / modified content in metadata template fields according to agreed supported styles, and these styled fields are inherited and stored accordingly in any collection | book metadata UI table field.
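The template criteria above can be illustrated with a minimal sketch. It assumes a template is a set of default field values plus a set of required field names, and that values already entered on a component take precedence over inherited ones. The class name, field names, and precedence rule are all hypothetical and are not taken from the BCMS.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataTemplate:
    """Hypothetical publisher or collection metadata template."""

    values: dict[str, str] = field(default_factory=dict)
    required: set[str] = field(default_factory=set)

    def clone_into(self, component: dict[str, str]) -> dict[str, str]:
        """Return component metadata with template values filled in.

        Non-empty values already set on the component win over template
        values (an assumption made for this sketch).
        """
        merged = dict(self.values)
        merged.update({k: v for k, v in component.items() if v})
        return merged

    def missing_required(self, metadata: dict[str, str]) -> list[str]:
        """List required fields that are absent or empty, sorted by name."""
        return sorted(f for f in self.required if not metadata.get(f))


# Styled values are stored as-is so the styling is inherited unchanged.
template = MetadataTemplate(
    values={"publisher_name": "Example Press", "license": "<i>CC BY</i> 4.0"},
    required={"publisher_name", "title"},
)
book = template.clone_into({"title": ""})
missing = template.missing_required(book)  # the title is still empty
```

Keeping the required-field check separate from cloning lets the same template serve workflows with different rules about when a field must be populated.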
Chapter-Processed Metadata
BOOK:
- In Phase 1, all required book metadata for chapter-processed content not automatically inherited from a metadata template must be added and updated by a System Admin | Org Admin | Editor.
- In Phase 1, a notification will be sent to System Admin | Org Admin | Editor whenever a book metadata table for chapter-processed content is modified. -- Not included in 22 Dec release
- In Phase 1, all book metadata for chapter-processed content is written into all chapter-processed components (front matter, body chapters and parts, back matter, TOC) sent to Load to PMC for ingest whenever any metadata field value in the book metadata table is modified AND a System Admin reloads it to PMC / republishes it as required to all live chapters.
- Users can style their metadata fields when inputting and updating their book metadata according to agreed supported styles.
- All book metadata fields currently supported by Bookshelf at point of migration for chapter-processed books are supported via UI fields or, at minimum for metadata fields not required to build a citation, to meet participant requirements, or to preview / review critical data, by upload via a supported structured file type that the BCMS reads and adds into the metadata it writes into chapter-processed components.
- All book metadata written by the BCMS is valid, meets Bookshelf tagging guidelines, and displays accurately on Bookshelf site(s).
- Per agreed rules, BCMS sends to PMCBook Database all metadata fields required as part of the Domain and Domain Settings attributes by the PMCBook Domain Service.
- When chapter-processed book metadata is written into converted files, Coko notifies NCBI to remove any fields they are currently writing into the data to permit processing. -- Feature to be spec'ed in early 2022
Priority Phase 2: Support book metadata being added via source files for chapter-processed content and for automated loading rules.
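As a rough illustration of the BOOK criterion that book metadata is written into every chapter-processed component when the book metadata table changes, the sketch below copies the current book metadata into each component and reports which components changed and therefore need reloading. The function name, the dictionary layout, and the `id` / `book_metadata` keys are assumptions for illustration only.

```python
def apply_book_metadata(book_metadata, components):
    """Copy book metadata into each component.

    Returns the IDs of components whose stored copy changed, i.e. the
    components that would need to be reloaded or republished.
    """
    changed = []
    for component in components:
        if component.get("book_metadata") != book_metadata:
            # Store a copy so later edits to the table do not leak in silently.
            component["book_metadata"] = dict(book_metadata)
            changed.append(component["id"])
    return changed


components = [
    {"id": "front-matter"},
    {"id": "chapter-1"},
    {"id": "toc"},
]
table = {"title": "Example Book", "edition": "2nd"}

first_pass = apply_book_metadata(table, components)   # every component changes
second_pass = apply_book_metadata(table, components)  # nothing changed since
table["edition"] = "3rd"
third_pass = apply_book_metadata(table, components)   # every component again
```

Returning the list of changed components gives a natural hook for the notification and reload steps, which in Phase 1 remain a manual System Admin action.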
CHAPTER:
- In Phase 1, all chapter-processed book part metadata other than funding information will be supported by source files (and support note.txt files if necessary for PDF submissions).
- Book part metadata fields required for building a citation / TOC or to meet participant requirements and to preview / review critical data display in the BCMS chapter metadata UI pages.
Priority Phase 2: Permit some book part metadata fields to be updated / added via UI to support author and funded investigator / contractor submission workflows yet to be determined.
Wholebook Metadata
XML:
- In Phase 1, all required book metadata for XML wholebook processed content that is not automatically inherited from a metadata template, or that must be manually reviewed and updated from a metadata template, must be provided in source XML files according to documented file submission specifications.
- In Phase 1, supported book metadata inherited / created by metadata templates and book metadata fields are passed to NCBI for XML conversion according to provided XML conversion integration specifications.
- In Phase 1, BCMS displays the minimal book metadata fields in the UI necessary for review and modification from a template for accurate processing and display on Bookshelf.
- In Phase 1, all metadata fields in the BCMS are read-only, locked to all users, to prevent modification in post-conversion statuses (conversion, loading to pmc, previewing, publishing, published, loading / publishing errors).
- Per agreed rules, BCMS sends to PMCBook Database any updated metadata fields required as part of the Domain and Domain Settings attributes by the PMCBook Domain Service.
PDF:
- In Phase 1, all required book metadata for PDF wholebook processed content that is not automatically inherited from a metadata template, or that must be manually reviewed and updated from a metadata template, must be provided in source PDF files AND/OR by upload via a supported structured file type according to documented file submission specifications.
- In Phase 1, supported book metadata inherited / created by metadata templates and book metadata fields AND/OR by uploaded supported structured file types are passed to PDF2XML taggers for tagging according to provided PDF2XML integration specifications.
- In Phase 1, BCMS displays the minimal book metadata fields in the UI necessary for review and modification from a template for accurate processing and display on Bookshelf.
- In Phase 1, all metadata fields in the BCMS are read-only, locked to all users, to prevent modification in post-tagging statuses (tagging, loading to pmc, previewing, publishing, published, loading / publishing errors).
- Per agreed rules, BCMS sends to PMCBook Database any updated metadata fields required as part of the Domain and Domain Settings attributes by the PMCBook Domain Service.
Phase 2 priority: support post-tagging metadata updates by PDF source content providers who cannot edit the converted XML but have requested the ability to make post-tagging metadata updates and corrections.
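The read-only rule for wholebook metadata in post-conversion and post-tagging statuses could be expressed as a simple status lookup. The sketch below is an assumption-laden illustration: the status labels follow the lists in the XML and PDF criteria above, but the exact strings, the "xml" / "pdf" workflow keys, and the function name are hypothetical.

```python
# Statuses shared by both wholebook workflows once processing has started.
# "loading / publishing errors" is modeled here as two separate labels.
_SHARED_LOCKED = {
    "loading to pmc",
    "previewing",
    "publishing",
    "published",
    "loading error",
    "publishing error",
}

LOCKED_STATUSES = {
    "xml": _SHARED_LOCKED | {"conversion"},
    "pdf": _SHARED_LOCKED | {"tagging"},
}


def is_metadata_editable(workflow: str, status: str) -> bool:
    """Return True if metadata fields may still be modified.

    Raises KeyError for an unknown workflow rather than guessing.
    """
    return status.strip().lower() not in LOCKED_STATUSES[workflow]


xml_new = is_metadata_editable("xml", "new upload")     # True: not yet converted
xml_conv = is_metadata_editable("xml", "Conversion")    # False: locked
pdf_tag = is_metadata_editable("pdf", "tagging")        # False: locked
pdf_conv = is_metadata_editable("pdf", "conversion")    # True: not a PDF status
```

Checking the lock in one place, rather than per field, keeps the rule consistent across every metadata field in the UI.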
Collection Metadata
Refer to BCMS collection meta-sheet for specific fields to write per use case
- All required collection metadata not automatically inherited from a metadata template must be added and updated by a System Admin | Org Admin | Editor.
- Users can style their collection metadata fields when inputting and updating their collection metadata according to agreed supported styles.
- All collection metadata fields currently supported by Bookshelf at point of migration for chapter- and wholebook-processed books are supported via UI fields or, at minimum for metadata fields not required to build a citation, to meet participant requirements, or to preview / review critical data, by upload via a supported structured file type that the BCMS reads and adds into the metadata it writes into chapter-processed components or, in the PDF wholebook case, passes to taggers to write into the XML they tag.
- In Phase 1, all collection metadata for chapter-processed content is written into all chapter-processed components (front matter, body chapters and parts, back matter, TOC) sent to Load to PMC for ingest whenever any metadata field value in the book metadata table is modified AND a System Admin reloads it to PMC / republishes it as required to all live chapters and books upon receipt of a notification from BCMS that collection metadata has been modified.
- In Phase 1, a notification will be sent to System Admin | Org Admin | Editor whenever a collection metadata table is modified. -- Not included in 22 Dec release
- All collection metadata written by the BCMS is valid, meets Bookshelf tagging guidelines, and displays accurately on Bookshelf site(s).
- All collection metadata for XML wholebook content is passed to NCBI for XML conversion according to provided XML conversion integration specifications.
- All collection metadata for PDF wholebook content is passed to PDF2XML taggers for PDF2XML tagging according to provided PDF tagging integration specifications.
- Per agreed rules, BCMS sends to PMCBook Database all metadata fields required as part of the Domain and Domain Settings attributes by the PMCBook Domain Service.
Metadata Processing, Managing, Tracking
- All metadata can be previewed in either a Books preview / published tab AND/OR via a metadata field UI tab.
- Metadata is processed according to the agreed, documented acceptance criteria noted above, by processing and conversion / tagging type.
- For tracking, managing, and resolution, all metadata errors are recorded either at point of input in the UI metadata fields or, if they result from supported processing, in an errors tab and statuses.
- Metadata errors identified during supported processing are sent to submitters via email notifications.
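One way to picture the error tracking and email notification criteria is to group recorded processing errors by submitter and build one message per submitter. This is only a sketch: the error record layout (`submitter`, `field`, `message`), the sender address, and the function name are hypothetical, and the messages are built with Python's standard `email.message.EmailMessage` but not sent.

```python
from collections import defaultdict
from email.message import EmailMessage


def build_error_notifications(errors, sender="bcms@example.org"):
    """Group metadata errors by submitter and build one email per submitter."""
    grouped = defaultdict(list)
    for error in errors:
        grouped[error["submitter"]].append(error)

    messages = []
    for submitter, items in sorted(grouped.items()):
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = submitter
        msg["Subject"] = f"{len(items)} metadata error(s) found during processing"
        # One line per error: the affected field, then the recorded message.
        msg.set_content("\n".join(f"{e['field']}: {e['message']}" for e in items))
        messages.append(msg)
    return messages


errors = [
    {"submitter": "a@example.org", "field": "isbn", "message": "invalid checksum"},
    {"submitter": "b@example.org", "field": "title", "message": "missing value"},
    {"submitter": "a@example.org", "field": "pub_date", "message": "not a date"},
]
messages = build_error_notifications(errors)
```

Batching errors per submitter avoids sending one email per field, which matters when a single structured-file upload fails several checks at once.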
RELATED TECHNICAL SPECIFICATIONS DOCUMENT
Specific metadata UI rules are recorded and updated here for development and testing: https://docs.google.com/spreadsheets/d/1dKIoTy_b_OSRQEj0kVc8DY647gNGoRYD7-vFl3Mb0dA/edit#gid=1467930499
NOTES, CHANGE HISTORY, RELATED DOCUMENTATION
- All funding metadata documentation and acceptance criteria are documented here: