Normalize file names—replace ' ', '(', ')' with underscores.
I know in #624 (closed) we found a different solution for this, but it turns out the NCBI requires the filenames we ultimately send to them to be normalized—no whitespace, parentheses, or special characters. Can you normalize all filenames on upload?
EDIT:
Actually, I think I can do something similar to what the old system did, and only change the file names when they are sent to tagging, sent to the NCBI, and referenced in the XML. Only supplemental files are a problem; vendor-provided figure filenames already follow a standard.
In all of the following instances, supplemental file names should be changed to EMSID-type-normalized_label.ext:
- Change file names in package sent to Taggers
- Change file names/fix matching in file list for XSL to make HTML
- Change file names to send to the NCBI
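A rough sketch of the renaming, to make the target pattern concrete. This is not the existing code: the function names, the `EMS12345` ID, and the `supplement` type value are made up for illustration. It also assumes that every run of non-alphanumeric characters in the label collapses to a single underscore, and that the extension is taken from the uploaded file name. A one-for-one replacement of only ' ', '(' and ')' would be a small change to the regex.

```python
import re
from pathlib import PurePosixPath


def normalize_label(label: str) -> str:
    """Collapse each run of non-alphanumeric characters into one underscore.

    Assumption: hyphens are replaced too, so the '-' separators in the
    final name stay unambiguous.
    """
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", label)
    # Drop underscores left over at either end, e.g. from a trailing ')'.
    return cleaned.strip("_")


def supplemental_filename(emsid: str, file_type: str, label: str,
                          original_name: str) -> str:
    """Build EMSID-type-normalized_label.ext for one supplemental file.

    All parameter names are hypothetical; the extension is copied from
    the original upload name unchanged.
    """
    ext = PurePosixPath(original_name).suffix
    return f"{emsid}-{file_type}-{normalize_label(label)}{ext}"


if __name__ == "__main__":
    print(supplemental_filename(
        "EMS12345", "supplement", "Supp Table (1)", "my data (final).xlsx"
    ))
```

The same function could be called in each of the three places listed above, so the names in the tagging package, the XSL file list, and the NCBI package stay in sync.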
We don't have to be concerned with the PDF conversion, since supplemental files aren't included.
The end result is that files associated with the XML will have normalized filenames.