Refactor document storage

The current way manuscripts and their versions (plus reviews, decisions, etc.) are stored in the DB is inflexible and poorly suited to storing document history or adding other document types, such as author proofing feedback. It also involves unnecessary duplication of code. I would like a more generic and flexible schema that permits different collections of documents and more fine-grained versioning:

The DocSet is the top-level container: the thing that appears in the list on the Manuscripts page.
It contains multiple separate Docs, which include different major versions of the manuscript/preprint, plus reviews, summaries, decisions and other related items. One of these is the mainDoc, usually the latest version of the manuscript/preprint.
We keep a set of DocRelations that tell us e.g. doc B is a review of doc A, doc C is also a review of A, and doc D is a summary of B and C. The context can hold further information that may be of use for generating DOCMAPs etc.
A Doc has multiple DocVersions: these are not versions of record, but more fine-grained: the version I've just edited versus the version someone else edited 5 minutes ago. If I'm editing a document such as a review that I haven't yet submitted, isPendingSubmission will be true, and this version will remain private to me. Until a first version has been submitted, the doc won't exist as far as the client is concerned.
The JSON data in the DocVersion will contain everything we currently store in manuscript.meta and manuscript.submission. We'll just keep a very basic title at the DocSet level for convenience.
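
As a rough illustration of the structure described above, here is a minimal TypeScript sketch. Only DocSet, Doc, DocRelation, DocVersion, mainDoc, context and isPendingSubmission come from the proposal; every other field name, the relation type names and the `relatedDocIds` helper are assumptions made for the example.

```typescript
// Hypothetical shapes: field names not mentioned in the proposal are assumptions.
type DocRelationType = "reviewOf" | "summaryOf";

interface DocVersion {
  id: string;
  editorId: string; // user who made these edits (assumed field)
  created: Date;
  isPendingSubmission: boolean;
  // Would hold everything currently in manuscript.meta and manuscript.submission
  data: Record<string, unknown>;
}

interface Doc {
  id: string;
  versions: DocVersion[];
}

interface DocRelation {
  fromDocId: string; // e.g. the review
  toDocId: string; // e.g. the manuscript being reviewed
  type: DocRelationType;
  context?: Record<string, unknown>; // extra info, e.g. for DOCMAPs
}

interface DocSet {
  id: string;
  title: string; // basic title kept at DocSet level for convenience
  mainDocId: string;
  docs: Doc[];
  relations: DocRelation[];
}

// Find all docs that stand in the given relation to a target doc.
function relatedDocIds(
  set: DocSet,
  targetDocId: string,
  type: DocRelationType
): string[] {
  return set.relations
    .filter((r) => r.toDocId === targetDocId && r.type === type)
    .map((r) => r.fromDocId);
}

// The example from the text: B and C review A; D summarises B and C.
const exampleSet: DocSet = {
  id: "set-1",
  title: "Example manuscript",
  mainDocId: "A",
  docs: ["A", "B", "C", "D"].map((id) => ({ id, versions: [] })),
  relations: [
    { fromDocId: "B", toDocId: "A", type: "reviewOf" },
    { fromDocId: "C", toDocId: "A", type: "reviewOf" },
    { fromDocId: "D", toDocId: "B", type: "summaryOf" },
    { fromDocId: "D", toDocId: "C", type: "summaryOf" },
  ],
};
```

With this data, `relatedDocIds(exampleSet, "A", "reviewOf")` yields the ids of the two reviews, B and C.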

For versioning, I'm thinking we should create a new version whenever edits are made by a different user from the one who last edited, or when 5 minutes have elapsed since the previous edit (and isPendingSubmission isn't true).
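
The versioning rule could be sketched as a small predicate. This is one possible reading, not a settled design: it assumes the isPendingSubmission exception applies only to the 5-minute rule, and the `LastEdit` shape and function name are hypothetical.

```typescript
const VERSION_WINDOW_MS = 5 * 60 * 1000; // the 5-minute threshold from the proposal

// Hypothetical summary of the most recent stored version.
interface LastEdit {
  editorId: string;
  editedAt: Date;
  isPendingSubmission: boolean;
}

// Returns true if the incoming edit should start a new DocVersion,
// false if it should update the latest one in place.
function shouldCreateNewVersion(
  previous: LastEdit | null,
  editorId: string,
  now: Date
): boolean {
  if (previous === null) return true; // no version exists yet
  if (previous.editorId !== editorId) return true; // a different user is editing
  // Assumption: an unsubmitted draft keeps accumulating edits in one version.
  if (previous.isPendingSubmission) return false;
  return now.getTime() - previous.editedAt.getTime() >= VERSION_WINDOW_MS;
}
```

Whether the threshold should be measured from the previous edit (as here) or from the creation of the current version is a design choice the proposal leaves open.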

Refactoring steps

(optional) We would benefit from more extensive integration tests covering the full standard workflows.
Make the submission form work the same as the decision and review forms: all form data is stored in the submission object, and the SupplementaryFiles and VisualAbstract fields store file references within the form data.
I propose creating the GraphQL API for this structure before we change the database schema: it will obtain data from the existing schema and restructure it to mimic the new schema. We can keep both old and new APIs running in parallel and gradually port client code to use the new structure.
Once the old API is completely phased out, we can refactor the back-end and DB.
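
To illustrate the interim step where the new API reads from the existing schema, here is a minimal adapter sketch. Only `meta` and `submission` are taken from the description above; the other legacy fields, the `DocVersionView` shape, the id scheme and the function name are assumptions for the example, and real resolvers would need to handle more than this.

```typescript
// Hypothetical slice of an existing manuscript record.
interface LegacyManuscript {
  id: string;
  created: string; // ISO timestamp (assumed)
  submitterId: string; // assumed field
  meta: Record<string, unknown>;
  submission: Record<string, unknown>;
}

// Hypothetical shape the new API would return for a version.
interface DocVersionView {
  id: string;
  docId: string;
  editorId: string;
  created: Date;
  isPendingSubmission: boolean;
  data: Record<string, unknown>;
}

// Restructure a legacy record to mimic the new schema without touching the DB.
function legacyToDocVersion(m: LegacyManuscript): DocVersionView {
  return {
    id: `${m.id}-v1`, // synthetic id; the old schema has no fine-grained versions
    docId: m.id,
    editorId: m.submitterId,
    created: new Date(m.created),
    isPendingSubmission: false, // assumption: stored manuscripts count as submitted
    // Assumption: on a key collision, the submission value wins over meta.
    data: { ...m.meta, ...m.submission },
  };
}
```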

Benefits

Submissions, reviews and decisions are treated the same, meaning we need only a single set of code to deal with all of these.
New artifacts/document types, such as author proofing feedback, will be very easy to add.
Publishing can become much simpler and more flexible, using simple mappings rather than specialised code to publish the various artifacts.
Helps rationalise the confusing and error-prone system of manuscript versioning we currently have. Querying to get a doc will always return the most recent version, while prior versions of record can be obtained with a subquery.
It distinguishes between submitted and unsubmitted data. Currently we do this very poorly for decisions and submissions.
It keeps all version information needed for auditing, diffing and rollback.
Simplifies production of rich DOCMAPs.
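
The "always return the most recent version" behaviour, combined with pending versions staying private to their editor, could be sketched as below. The `VersionRecord` shape and function name are hypothetical, and returning null is just one way to model a doc that does not yet exist for a given viewer.

```typescript
// Hypothetical minimal view of a stored version.
interface VersionRecord {
  id: string;
  editorId: string;
  created: Date;
  isPendingSubmission: boolean;
}

// Most recent version the viewer may see, or null if none is visible
// (e.g. someone else's review that has not been submitted yet).
function latestVisibleVersion(
  versions: VersionRecord[],
  viewerId: string
): VersionRecord | null {
  const visible = versions.filter(
    (v) => !v.isPendingSubmission || v.editorId === viewerId
  );
  if (visible.length === 0) return null;
  return visible.reduce((a, b) =>
    b.created.getTime() > a.created.getTime() ? b : a
  );
}
```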

Edited Sep 19, 2023 by Ben Whitmore