Book component ID requirements for the Word workflow
Hi @lathrops1
This is to confirm what we have already discussed.
FYI @John.kopanas and @yannis
For the Word workflow, NCBI creates the book-part id. This is generated during conversion using the filename of the source doc, for example: <book-part id="deutetrabenazine" book-part-type="chapter">.
Coko extracts the id from the XML file. This id is displayed in the metadata tab for each component and is not editable. Authors can copy this id to create chapter cross-references in the Word source docs. In the NCBI database, this id is translated to a chapter accession id (NBK id).
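To make the extraction step concrete, here is a minimal sketch of how the id could be read from the converted XML. The function name extract_book_part_id is hypothetical, and the sketch assumes the book-part element carries no XML namespace; it is not Coko's actual code.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def extract_book_part_id(xml_text: str) -> Optional[str]:
    """Return the id attribute of the first <book-part> element, or None.

    Hypothetical helper: assumes <book-part> has no XML namespace.
    """
    root = ET.fromstring(xml_text)
    # The root itself may be the <book-part>; otherwise search its descendants.
    if root.tag == "book-part":
        return root.get("id")
    part = root.find(".//book-part")
    return part.get("id") if part is not None else None


sample = '<book-part id="deutetrabenazine" book-part-type="chapter"></book-part>'
print(extract_book_part_id(sample))
```

The returned value is what would be shown, read-only, in the metadata tab.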
Since the id is created from the filename, your conversion script requires that the filename always remains the same. Users must therefore upload every new version of a file under the same filename.
We spoke about an option that would allow users to submit files with any filename. This isn't a possibility, because you're not able to change your conversion process and you expect the system to create new file versions in bulk, where the only way the system can match incoming chapters to existing records is by filename. The option was as follows:
- Let's imagine the user uploads a new chapter file called abacavir.docx. When the file is converted, the id abacavir is created.
- When the user uploads the next version of the file, abacavir-v2-20200519.docx, we could send the file and the existing id abacavir to NCBI conversion.
- NCBI conversion can then check: does id abacavir exist? If yes, then don't create an id from the filename; use abacavir as the id in the XML you send back to us.

This would require a change to NCBI's conversion process.
This would mean:
- If Coko gives NCBI file + id, then use this id in the xml
- If Coko gives NCBI file only, then generate id from the file name.
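
The two rules above could be sketched roughly like this. The function names are hypothetical, and the assumption that the generated id is simply the filename without its extension is mine; NCBI's actual conversion script may derive it differently.

```python
from pathlib import PurePath
from typing import Optional


def id_from_filename(filename: str) -> str:
    # Assumption: the id is the filename minus its extension,
    # e.g. "abacavir.docx" gives "abacavir".
    return PurePath(filename).stem


def resolve_book_part_id(filename: str, existing_id: Optional[str] = None) -> str:
    """Pick the book-part id to write into the converted XML."""
    if existing_id:
        # Coko sent file + id: reuse the id, whatever the file is called.
        return existing_id
    # Coko sent the file only: generate the id from the filename.
    return id_from_filename(filename)


# First upload: no id exists yet, so it comes from the filename.
print(resolve_book_part_id("abacavir.docx"))
# Later version under a different filename: the existing id is kept.
print(resolve_book_part_id("abacavir-v2-20200519.docx", existing_id="abacavir"))
```

With this approach the id stays stable across versions, so existing chapter cross-references would keep working even when the filename changes.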