RFC: Modular data model - An extendable and composable data model
TL;DR: We propose to enable an extendable and composable data model by letting components extend existing models (in addition to adding new models), with each extension uniquely namespaced by its npm package name.
The PubSweet data model currently consists of Collections, Fragments, Users and Teams. The relationship between these is used in multiple ways across apps:
- in Editoria
  - a Book is a Collection
  - a Chapter is a Fragment
- in xPub (Faraday and Collabra)
  - a Manuscript is a Collection
  - a ManuscriptVersion is a Fragment
- in xPub (eLife)
  - a Manuscript is a Fragment
- in PubSweet Starter (a.k.a. Science Blogger)
  - a Blog is a Collection
  - a BlogPost is a Fragment
As a result of the constrained data model, we're seeing reuse of its concepts, but this reuse does not go deeper than the lowest level concept (is this a collection? is this a fragment of a collection?). What ends up being stored in these Collections and Fragments is usually completely custom to the application, which means that components that deal with this data must also be completely custom. Completely custom components almost always result in little to no reuse across apps.
Manuscript and SpecialIssue model
I'll try to paint a picture of where we'd like to be instead. Let's say we agree to store Manuscripts in the Manuscript model, and let's assume that we agree on what data should be stored in there (https://gitlab.coko.foundation/xpub/shared-data-model#manuscript):
journalId UUID Link to Journal
versions [UUID] Link to Versions
This is a trivial example, but the model would live in an xpub-model-manuscript component. An xpub-journal-dashboard component could depend on xpub-model-manuscript, and could therefore rely on:
- the relevant GraphQL mutations that xpub-model-manuscript brings in
- the migrations for the Manuscripts table, also present in xpub-model-manuscript.
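As a rough sketch of what such a model component might export, assuming a shape this RFC does not define: the `typeDefs` and `migrations` fields, the `createManuscript` mutation, the SQL and the `collectModels` helper are all made up for illustration and are not an existing PubSweet API.

```javascript
// Hypothetical shape of a model component; none of these export names
// are an existing PubSweet API.
const xpubModelManuscript = {
  name: 'xpub-model-manuscript',
  model: {
    table: 'Manuscripts',
    columns: {
      journalId: 'UUID', // link to Journal
      versions: '[UUID]', // links to Versions
    },
  },
  // GraphQL schema fragment the component brings in (illustrative only).
  typeDefs: `
    type Manuscript { id: ID!, journalId: ID, versions: [ID] }
    extend type Mutation { createManuscript(journalId: ID): Manuscript }
  `,
  // Ordered migrations for the Manuscripts table (illustrative SQL).
  migrations: [
    'CREATE TABLE "Manuscripts" (id UUID PRIMARY KEY, "journalId" UUID, versions UUID[])',
  ],
}

// A dependent component such as xpub-journal-dashboard could gather
// everything it relies on from its model dependencies.
function collectModels(components) {
  return {
    tables: components.map(c => c.model.table),
    typeDefs: components.map(c => c.typeDefs).join('\n'),
    migrations: components.flatMap(c => c.migrations),
  }
}

const dashboardDeps = collectModels([xpubModelManuscript])
console.log(dashboardDeps.tables)
```

The point of the sketch is only that the dashboard never declares the Manuscripts table itself; it gets the table, the mutations and the migrations by depending on the model component.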
Now let's say that our journal application would like to have a special issue feature, where previously published manuscripts are collected based on a certain topic. Let's say that this would equate to an xpub-special-issues-dashboard component. xpub-special-issues-dashboard wants to reuse xpub-model-manuscript, but it also wants to represent a link to a SpecialIssue model with a specialIssueId property on Manuscript. There are a few ways it could get to that state:
- Use an xpub-model-special-issue dependency to create a new table SpecialIssues, where each row would be a SpecialIssue and would have a manuscripts property that's an array of UUIDs, each pointing to a Manuscript.
This seems fine, but maybe sub-optimal, since any lookup to determine whether a Manuscript is part of a SpecialIssue would have to go through the SpecialIssues table. Additionally, it's unclear where any metadata about each Manuscript, as it pertains to the SpecialIssue, is kept.
- Use an xpub-model-special-issue dependency to create two new tables, a SpecialIssueManuscripts table and a SpecialIssues table, where a SpecialIssueManuscript is a link to the SpecialIssue, a link to the Manuscript, plus the metadata that needs to be represented about this relationship.
This again seems fine, but again sub-optimal, since there are now suddenly two tables to manage for a single concept.
- The third way, and our proposal: use an xpub-model-special-issue dependency to extend the existing Manuscripts table with columns of its own, e.g. specialIssueId and metadata, and create a new table for SpecialIssues.
This seems better, but there's an issue with conflicting extensions, e.g. if multiple components would like to extend the Manuscripts table with metadata columns. So we propose that extensions to an existing model be namespaced under a unique namespace. For this unique namespace, we propose the published package's name on npm. In practice, this would mean that e.g. the Manuscript model becomes extended to:
journalId UUID Link to Journal
versions [UUID] Link to Versions
xpub-model-special-issue-specialIssueId UUID Link to SpecialIssue
xpub-model-special-issue-metadata JSON Metadata about Manuscript in this Special Issue
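The namespacing rule above can be sketched as a small helper. This is a minimal illustration under stated assumptions: `extendModel` is a hypothetical function, the `-` separator simply mirrors the column names shown above, and `xpub-model-example` is a made-up second extension.

```javascript
// Hypothetical helper: prefix every column an extension adds with the
// name of the npm package that adds it.
function extendModel(baseColumns, packageName, extraColumns) {
  const extended = { ...baseColumns }
  for (const [column, type] of Object.entries(extraColumns)) {
    const namespaced = `${packageName}-${column}`
    if (Object.prototype.hasOwnProperty.call(extended, namespaced)) {
      throw new Error(`Column ${namespaced} already exists`)
    }
    extended[namespaced] = type
  }
  return extended
}

const manuscriptColumns = { journalId: 'UUID', versions: '[UUID]' }

const withSpecialIssue = extendModel(
  manuscriptColumns,
  'xpub-model-special-issue',
  { specialIssueId: 'UUID', metadata: 'JSON' },
)

// A second, made-up extension can add its own metadata column
// without clashing with the first one.
const withBoth = extendModel(withSpecialIssue, 'xpub-model-example', {
  metadata: 'JSON',
})

console.log(Object.keys(withBoth))
```

Because npm package names are unique in the registry, two published extensions cannot produce the same prefix, so a clash can only arise when the same package is applied twice.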
This is of course very verbose, but we're proposing to add some syntactic sugar to hide it, so that within the context of xpub-model-special-issue, you'd automagically access namespaced properties in extended models, and in the context of xpub-special-issue-dashboard (which depends on xpub-model-special-issue), you would perhaps have a way of specifying the namespace context once, and then accessing namespaced properties automatically.
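One way such sugar could work is a thin wrapper that maps plain property names onto their namespaced columns. The sketch below uses the standard JavaScript `Proxy` object; `inNamespace` and the sample record values are hypothetical and only illustrate the idea, they are not part of the proposal.

```javascript
// Hypothetical sugar: wrap a record so that plain property names are
// transparently mapped to `<namespace>-<property>`.
function inNamespace(record, namespace) {
  // String() keeps symbol-keyed lookups from throwing in the template.
  const key = prop => `${namespace}-${String(prop)}`
  return new Proxy(record, {
    get: (target, prop) => target[key(prop)],
    set: (target, prop, value) => {
      target[key(prop)] = value
      return true
    },
    has: (target, prop) => key(prop) in target,
  })
}

const manuscriptRecord = {
  journalId: 'journal-1',
  'xpub-model-special-issue-specialIssueId': 'issue-7',
  'xpub-model-special-issue-metadata': { order: 3 },
}

// Specify the namespace context once...
const specialIssueView = inNamespace(manuscriptRecord, 'xpub-model-special-issue')

// ...then read and write namespaced properties by their short names.
console.log(specialIssueView.specialIssueId)
specialIssueView.metadata = { order: 1 }
console.log(manuscriptRecord['xpub-model-special-issue-metadata'])
```

Note that in this sketch the wrapper only exposes the extension's own columns; base properties such as journalId would still be read from the unwrapped record.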
User model and OAuth
Another example is extending the User model with OAuth information (token, refresh token, scope, etc). Here's an example from xpub-eLife's ORCID integration:
orcid: Joi.string(),
oauth: Joi.object({
  accessToken: Joi.string(),
  refreshToken: Joi.string(),
}),
As soon as you'd want two different OAuth providers, you'd have a conflict in the oauth property. With the mechanism proposed above, a User model extended by e.g. pubsweet-oauth-orcid and pubsweet-oauth-google components would look something like this:
User {
  id
  type
  username
  email
  passwordHash
  admin
  passwordResetToken
  passwordResetTimestamp
  pubsweet-oauth-orcid-id
  pubsweet-oauth-orcid-tokens
  pubsweet-oauth-google-id
  pubsweet-oauth-google-tokens
}
Any component depending on both pubsweet-oauth-orcid and pubsweet-oauth-google would then be able to rely on the existence of these columns on User (e.g. an admin component that needs to know which services the user has OAuthed to).
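As an illustration of the admin use case, here is a minimal sketch that reports which providers a user has connected. The `connectedServices` helper, the sample user values and the assumption that each `-tokens` column is null until the user has authorised that provider are all hypothetical.

```javascript
// Hypothetical helper for an admin component: report which OAuth
// extensions hold tokens for a user, based on the namespaced columns.
const oauthProviders = ['pubsweet-oauth-orcid', 'pubsweet-oauth-google']

function connectedServices(user) {
  // Assumption: the tokens column stays null/undefined until the user
  // has completed the OAuth flow for that provider.
  return oauthProviders.filter(ns => user[`${ns}-tokens`] != null)
}

const sampleUser = {
  id: 'user-1',
  username: 'jane',
  'pubsweet-oauth-orcid-id': 'orcid-placeholder-id',
  'pubsweet-oauth-orcid-tokens': { accessToken: 'a', refreshToken: 'r' },
  'pubsweet-oauth-google-id': null,
  'pubsweet-oauth-google-tokens': null,
}

console.log(connectedServices(sampleUser)) // [ 'pubsweet-oauth-orcid' ]
```

Because each provider writes only to its own namespaced columns, adding a third provider would not change how the existing two are read.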