1. Collection of citations and extraction of ADE pairs
ADE reports (citations in PubMed) were collected and the corresponding ADE pairs were extracted [1].
A citation indexed with a MeSH term followed by the subheading “adverse effects” (a drug MeSH term) indicates that the drug caused some adverse event (AE). A citation indexed with a MeSH term followed by the subheading “chemically induced” (an AE MeSH term) indicates that the AE was caused by some drug. A citation indexed with both a drug MeSH term and an AE MeSH term is taken to indicate that the drug caused the AE (an ADE report). From each such citation, ADE pairs are extracted as all combinations of its drug MeSH terms and AE MeSH terms, and each pair is recorded together with its citation (an ADE pair - citation combination). This process is run daily, and the resulting ADE pair - citation combinations are listed.
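The extraction rule above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `extract_ade_pairs`, the input layout (a list of descriptor/subheading tuples), and the placeholder descriptors are all assumptions made for the example.

```python
from itertools import product

DRUG_SUBHEADING = "adverse effects"
AE_SUBHEADING = "chemically induced"


def extract_ade_pairs(mesh_headings):
    """Return all (drug, AE) combinations for one citation.

    mesh_headings: list of (descriptor, set of subheadings) tuples.
    This input layout is assumed for illustration only.
    """
    # Descriptors carrying "adverse effects" are treated as drug MeSH terms.
    drugs = sorted({d for d, subs in mesh_headings if DRUG_SUBHEADING in subs})
    # Descriptors carrying "chemically induced" are treated as AE MeSH terms.
    events = sorted({d for d, subs in mesh_headings if AE_SUBHEADING in subs})
    # Every drug is paired with every AE found in the same citation.
    return list(product(drugs, events))


# Placeholder descriptors, not real indexing data.
citation = [
    ("Drug A", {"adverse effects", "therapeutic use"}),
    ("Drug B", {"adverse effects"}),
    ("Event X", {"chemically induced"}),
    ("Humans", set()),
]
pairs = extract_ade_pairs(citation)
print(pairs)
```

A citation with no drug MeSH term or no AE MeSH term yields an empty list, so it contributes no ADE pair - citation combination.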
2. Classification by MeSH novelty score
To find first reports of ADE pairs, we devised the MeSH novelty score. New ADE pair - citation combinations are identified by taking the difference between a day’s list and the previous day’s list, and each ADE pair is then classified by its MeSH terms. If the ADE pair has previously been reported, the MeSH novelty score is 0; if the ADE pair has previously been reported only when the subheadings (“adverse effects” and/or “chemically induced”) are disregarded, the score is 1; and if the ADE pair has not been reported at all, the score is 2.
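One possible reading of this scoring rule is sketched below. It assumes two sets of previously seen pairs are maintained: pairs seen with the required subheadings, and pairs whose two MeSH terms were seen together regardless of subheadings. The function names, the set-based representation, and the placeholder pairs and citation identifiers are assumptions for illustration.

```python
def new_combinations(today, yesterday):
    """ADE pair - citation combinations present today but not yesterday."""
    return today - yesterday


def mesh_novelty_score(pair, seen_with_subheadings, seen_any_subheading):
    """Score one ADE pair against previously reported pairs.

    seen_with_subheadings: pairs already reported with the subheadings.
    seen_any_subheading: pairs whose terms co-occurred in earlier citations
        regardless of subheadings (assumed to include the first set).
    """
    if pair in seen_with_subheadings:
        return 0  # already reported as an ADE pair
    if pair in seen_any_subheading:
        return 1  # terms co-occurred before, but without the subheadings
    return 2      # never reported at all


# Placeholder data, not real indexing results.
yesterday = {(("Drug A", "Event X"), "citation-1")}
today = yesterday | {
    (("Drug B", "Event Y"), "citation-2"),
    (("Drug C", "Event Z"), "citation-3"),
}
seen_with = {("Drug A", "Event X")}
seen_any = seen_with | {("Drug B", "Event Y")}

scores = {
    pair: mesh_novelty_score(pair, seen_with, seen_any)
    for pair, _citation in new_combinations(today, yesterday)
}
print(scores)
```

Under these assumptions, only the pair that appears in neither set receives score 2 and is treated as a candidate first report.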
3. Classification by predicted author novelty score
To find first reports of ADE pairs, we also devised the author novelty score. The author novelty score of an ADE pair is 0 if the authors did not claim that the report was novel. If the authors claimed that the ADE was reported for the first time, the score is 2 when the ADE pair is the same as the one described in the title and abstract of the citation, and 1 otherwise.
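The three-level rule can be written as a small decision function. The two boolean inputs stand for judgments made by a human curator; the function and parameter names are hypothetical.

```python
def author_novelty_score(claims_first_report: bool, pair_matches_text: bool) -> int:
    """Manually curated author novelty score of an ADE pair.

    claims_first_report: the authors state that the ADE is reported
        for the first time.
    pair_matches_text: the extracted ADE pair is the one described in
        the title and abstract of the citation.
    """
    if not claims_first_report:
        return 0
    return 2 if pair_matches_text else 1


print(author_novelty_score(True, True))
```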
To replace laborious manual curation of titles and abstracts with simple word matching when evaluating the author novelty score, we compiled a positive list of phrases indicating that a citation reports the first case of an ADE pair, and constructed a model to predict the author novelty score of an ADE pair as follows. 1) From the citations reporting ADE pairs with MeSH novelty score 2 that were evaluated as author novelty score 2, we extracted the short passages in which the authors claimed novelty (positive phrases), which together form the positive phrase set. 2) If at least one of the positive phrases is found in the title or abstract of a citation that reports an ADE pair, the author novelty score of the pair is predicted to be >= 1; otherwise it is predicted to be 0.
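Step 2 amounts to a phrase lookup, which might be sketched as below. The phrases listed are invented examples and are not the positive phrase set built in this study; case-insensitive substring matching is also an assumption, since the matching details are not specified here.

```python
# Hypothetical example phrases, NOT the study's positive phrase set.
EXAMPLE_POSITIVE_PHRASES = [
    "first case",
    "first report",
    "not been previously reported",
]


def predicts_author_novelty(title, abstract, phrases=EXAMPLE_POSITIVE_PHRASES):
    """Return True if the predicted author novelty score is >= 1.

    A citation is predicted positive when at least one positive phrase
    occurs in its title or abstract (case-insensitive substring match).
    """
    text = f"{title} {abstract}".lower()
    return any(phrase.lower() in text for phrase in phrases)


# Placeholder citation text.
positive = predicts_author_novelty(
    "Event X associated with Drug A",
    "To our knowledge, this is the first report of Event X caused by Drug A.",
)
negative = predicts_author_novelty(
    "Event X associated with Drug A",
    "We describe a patient who developed Event X during treatment.",
)
print(positive, negative)
```

The prediction only separates score 0 from scores >= 1; distinguishing 1 from 2 would still require checking that the extracted pair is the one the authors describe.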
[1] R. Winnenburg, A. Sorbello, A. Ripple, R. Harpaz, J. Tonning, A. Szarfman, H. Francis, O. Bodenreider, Leveraging MEDLINE indexing for pharmacovigilance - Inherent limitations and mitigation strategies, Journal of Biomedical Informatics. 57 (2015) 425-435. https://doi.org/10.1016/J.JBI.2015.08.022.