[CoLab] Integrate with Semantic Scholar
Get recommended articles from the Semantic Scholar API by sending the DOIs of the 100 most recently selected articles.
Notes from Daniel Ecer:
The Semantic Scholar Recommendation API is documented here: https://api.semanticscholar.org/api-docs/recommendations#tag/Paper-Recommendations
For the "paper ids" you can use DOIs by prefixing them with DOI:, e.g. DOI:10.1101/2020.02.20.958025.
The negativePaperIds field is optional.
You can select the fields to return. externalIds would include the DOI.
For example, you can POST to https://api.semanticscholar.org/recommendations/v1/papers?limit=500&fields=paperId,externalIds,title,venue
the following JSON:
{
"positivePaperIds": [
"DOI:10.1101/2020.02.20.958025",
"DOI:10.1101/2020.11.04.367797"
],
"negativePaperIds": [
]
}
You could get something like this:
{'recommendedPapers':
[{'paperId': '52cdb6ed946dfed25113bd194d5e2bb843c66331',
'externalIds': {'PubMedCentral': '9096565',
'DOI': '10.3389/fcell.2022.866491',
'CorpusId': 248408522,
'PubMed': '35573695'},
'title': 'Dissection of the microRNA Network Regulating Hedgehog Signaling in Drosophila',
'venue': 'Frontiers in Cell and Developmental Biology'},
{'paperId': 'aad65af66c76d57a50ce9d3bf2686e798ff9f324',
'externalIds': {'DOI': '10.1101/2022.03.25.483021', 'CorpusId': 247749289},
'title': 'Acute manipulation and real-time visualization of membrane trafficking and exocytosis in Drosophila',
'venue': 'bioRxiv'},
{'paperId': 'af678fcd24058154f798462f2f90d89c545b2ab3',
'externalIds': {'PubMedCentral': '9104307',
'DOI': '10.3390/ijms23094543',
'CorpusId': 248333822,
'PubMed': '35562934'},
'title': 'The Tbx6 Transcription Factor Dorsocross Mediates Dpp Signaling to Regulate Drosophila Thorax Closure',
'venue': 'International journal of molecular sciences'},
...
]
}
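The request above could be sent from Python roughly like this. It is only a minimal sketch using the standard library: the helper names (build_payload, build_url, fetch_recommendations, extract_dois) are made up for illustration, the response shape is assumed to match the example above, and rate limiting and retries are not handled.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.semanticscholar.org/recommendations/v1/papers"


def build_payload(positive_dois, negative_dois=()):
    """Build the request body, prefixing each DOI with 'DOI:'."""
    return {
        "positivePaperIds": [f"DOI:{doi}" for doi in positive_dois],
        "negativePaperIds": [f"DOI:{doi}" for doi in negative_dois],
    }


def build_url(limit=500, fields=("paperId", "externalIds", "title", "venue")):
    """Build the endpoint URL with the limit and the fields to return."""
    query = urllib.parse.urlencode(
        {"limit": limit, "fields": ",".join(fields)},
        safe=",",  # keep the commas in the field list readable
    )
    return f"{API_URL}?{query}"


def fetch_recommendations(positive_dois, negative_dois=(), timeout=30):
    """POST the payload and return the parsed JSON response (needs network)."""
    body = json.dumps(build_payload(positive_dois, negative_dois)).encode("utf-8")
    request = urllib.request.Request(
        build_url(),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def extract_dois(response_json):
    """Collect the DOI of each recommended paper, skipping papers without one."""
    dois = []
    for paper in response_json.get("recommendedPapers", []):
        doi = (paper.get("externalIds") or {}).get("DOI")
        if doi:
            dois.append(doi)
    return dois
```

Calling fetch_recommendations(["10.1101/2020.02.20.958025", "10.1101/2020.11.04.367797"]) would send the same request as the example; extract_dois then pulls the DOIs out of externalIds.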
The venue field isn't very reliable. You could use it to filter for bioRxiv, but filtering by DOI prefix (or using data from other sources) might be more accurate.
(I did raise some issues with it.)
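Filtering by DOI prefix could look something like the sketch below. The prefix 10.1101/ is taken from the bioRxiv DOIs in these notes; as far as I know it is registered to Cold Spring Harbor Laboratory, so it would also match medRxiv and other content under that prefix. The function names are hypothetical.

```python
# Assumption: bioRxiv DOIs start with this prefix (as in the examples above).
# The prefix is not exclusive to bioRxiv, so this is an approximation.
BIORXIV_DOI_PREFIX = "10.1101/"


def has_doi_prefix(paper, prefix=BIORXIV_DOI_PREFIX):
    """Return True if the paper's DOI (from externalIds) starts with the prefix."""
    doi = (paper.get("externalIds") or {}).get("DOI") or ""
    return doi.lower().startswith(prefix.lower())


def filter_by_doi_prefix(papers, prefix=BIORXIV_DOI_PREFIX):
    """Keep only the recommended papers whose DOI starts with the prefix."""
    return [paper for paper in papers if has_doi_prefix(paper, prefix)]
```

Applied to the three papers in the example response, this would keep only the one with DOI 10.1101/2022.03.25.483021.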
As mentioned in the call, positivePaperIds and negativePaperIds are limited to 100 papers.
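To stay within that limit, the DOIs could be cut down to the 100 most recently selected before sending. A minimal sketch, assuming the selected articles are available as (doi, selected_at) pairs with sortable timestamps (e.g. ISO 8601 strings); that input shape and the function name are assumptions, not something the API or Sciety prescribes.

```python
MAX_PAPER_IDS = 100  # limit mentioned in the notes above


def most_recent_dois(selected_articles, limit=MAX_PAPER_IDS):
    """Return up to `limit` distinct DOIs, most recently selected first.

    selected_articles: iterable of (doi, selected_at) pairs, where
    selected_at is any sortable timestamp (hypothetical input shape).
    """
    newest_first = sorted(selected_articles, key=lambda item: item[1], reverse=True)
    seen = set()
    dois = []
    for doi, _selected_at in newest_first:
        if len(dois) >= limit:
            break
        key = doi.lower()  # DOIs are case-insensitive
        if key in seen:
            continue
        seen.add(key)
        dois.append(doi)
    return dois
```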
If Semantic Scholar doesn't know a DOI, then it will be silently ignored (at least it was like that; I raised this with them to get some feedback about it in the response).
If it doesn't have any of the DOIs you provided in positivePaperIds then it will error as if you hadn't provided any positive examples.
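A caller would probably want to treat that error as "no recommendations" rather than crash. The sketch below assumes the error arrives as an HTTP error status raised by urllib; the notes don't say which status code is used, so it catches any HTTPError and hands the code back to the caller. The fetch argument is a hypothetical callable that sends the request.

```python
import urllib.error


def recommend_or_empty(fetch, positive_dois):
    """Return (recommended_papers, error_code).

    fetch: callable taking a list of DOIs and returning the parsed JSON
    response (hypothetical; pass in whatever sends the request).
    error_code is None on success, or the HTTP status code on failure.
    """
    if not positive_dois:
        # Nothing to send; the API would reject an empty positive list anyway.
        return [], None
    try:
        response_json = fetch(positive_dois)
    except urllib.error.HTTPError as error:
        # Assumption: unknown DOIs in positivePaperIds surface as an HTTP error.
        # The status code is returned so the caller can decide what to do.
        return [], error.code
    return response_json.get("recommendedPapers", []), None
```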
I am doing some quantitative evaluation based on whether the articles that Semantic Scholar recommended include the article that a Sciety user saved. But due to the 60 day window, I am limited to recently saved articles. We are also sharing a Google Data Studio dashboard with some individuals or groups for qualitative feedback.
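One simple way to express that kind of check is a hit rate: the fraction of saved articles that show up among the recommendations. This is only an illustrative sketch with a made-up input shape, not necessarily the metric used in the actual evaluation.

```python
def hit_rate(cases):
    """Fraction of cases where the saved DOI is among the recommended DOIs.

    cases: iterable of (saved_doi, recommended_dois) pairs (hypothetical shape).
    Returns 0.0 when there are no cases.
    """
    cases = list(cases)
    if not cases:
        return 0.0
    hits = 0
    for saved_doi, recommended_dois in cases:
        recommended = {doi.lower() for doi in recommended_dois}
        if saved_doi.lower() in recommended:
            hits += 1
    return hits / len(cases)
```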