[NCRC] Update Kotahi database with all research objects currently captured via the Shiny app DB (spreadsheet)
Description: the following spreadsheets contain articles that should not be imported into Kotahi when pulling content from the bioRxiv/PubMed API.
- feb 2021: https://docs.google.com/spreadsheets/d/1zQPXGIy2xsfQmri-_PLA-5YP3nPefJkpYaYY8rCGde8/edit
- sept 2020: https://docs.google.com/spreadsheets/d/1G8stDQUmDbI8qY0NK4vBru2A55j7wcPs956sBcfv3hY/edit
- pre sept 2020: https://docs.google.com/spreadsheets/d/1Y0juNsPpMa_fddk82dPGYfjJbL_BFHoOHmI3HBLJYk8/edit#gid=0
Articles should be in the system, but should not be displayed in the UI.
Acceptance criteria:
- Add all existing article data to the Kotahi database so that previously considered articles do not re-enter the system for triage.
- Delete all current articles from the Manuscripts view.
Notes from the client: this R code reads the spreadsheets and combines them into a single data frame of all considered papers. Note that we froze on R version 3.6.3 since the googlesheets4 package changed a lot in R v4.0.
library(readr)  # cols(), col_integer(), col_character()
library(dplyr)  # %>% and mutate()

## latest paper
existing_papers = googlesheets4::sheets_speedread(
  ss = "1zQPXGIy2xsfQmri-_PLA-5YP3nPefJkpYaYY8rCGde8",
  sheet = "papers",
  col_types = readr::cols(
    pubmed_id = col_integer(),
    date = col_character(),
    secondary_reviewer_initial = col_character(),
    subtopic = col_character())
) %>%
  dplyr::mutate(title_journal = paste0(title, " (", journal, ")"))
## fall back to "title (journal)" as the identifier when the DOI is missing
missing_doi = is.na(existing_papers$doi)
existing_papers$doi[missing_doi] = existing_papers$title_journal[missing_doi]
## hard-code previously considered papers
existing_papers_older = googlesheets4::sheets_speedread(
  ss = "https://docs.google.com/spreadsheets/d/1Y0juNsPpMa_fddk82dPGYfjJbL_BFHoOHmI3HBLJYk8/edit#gid=0",
  sheet = "papers",
  col_types = readr::cols(
    pubmed_id = col_integer(),
    date = col_character(),
    secondary_reviewer_initial = col_character(),
    subtopic = col_character())) %>%
  dplyr::mutate(title_journal = paste0(title, " (", journal, ")"))
existing_papers_old = googlesheets4::sheets_speedread(
  ss = "https://docs.google.com/spreadsheets/d/1G8stDQUmDbI8qY0NK4vBru2A55j7wcPs956sBcfv3hY/edit#gid=0",
  sheet = "papers",
  col_types = readr::cols(
    pubmed_id = col_integer(),
    date = col_character(),
    secondary_reviewer_initial = col_character(),
    subtopic = col_character())) %>%
  dplyr::mutate(title_journal = paste0(title, " (", journal, ")"))
## and join
existing_papers = rbind(existing_papers,
                        existing_papers_old,
                        existing_papers_older)
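
As a rough illustration of how the combined data frame could act as an exclusion list during import, the sketch below filters a batch of incoming records against identifiers that have already been considered. The `normalise_doi` and `filter_new_articles` helpers, the toy `existing` and `incoming` data frames, and the DOIs in them are all hypothetical; none of this is part of Kotahi or the client's code. It only assumes a `doi` column like the one in the R code above.

```r
# Hypothetical helper: make trivially different DOIs comparable
normalise_doi <- function(x) tolower(trimws(x))

# Hypothetical helper: keep only incoming rows whose DOI has not been seen
filter_new_articles <- function(incoming, existing) {
  # Drop missing identifiers before building the lookup set
  seen <- unique(normalise_doi(existing$doi[!is.na(existing$doi)]))
  # Rows without a DOI are never treated as already seen
  is_seen <- !is.na(incoming$doi) & normalise_doi(incoming$doi) %in% seen
  incoming[!is_seen, , drop = FALSE]
}

# Toy data standing in for existing_papers and an API batch (made-up DOIs)
existing <- data.frame(doi = c("10.1101/aaa", "10.1101/BBB", NA),
                       stringsAsFactors = FALSE)
incoming <- data.frame(doi = c("10.1101/bbb ", "10.1101/ccc"),
                       title = c("Seen before", "New paper"),
                       stringsAsFactors = FALSE)

new_articles <- filter_new_articles(incoming, existing)
print(new_articles$title)
```

Matching on a normalised DOI is only one possible choice. Because the code above substitutes "title (journal)" for missing DOIs in the latest sheet, a real import would probably need a second comparison on title for records without a DOI.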