Reliability and Efficiency of the CAPRI-3 Metastatic Prostate Cancer Registry Driven by Artificial Intelligence
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Patient Identification
- Identification of patients with a urology and/or internal medicine DBC code (in Dutch: Diagnose Behandeling Combinatie, a code used for reimbursement);
- Identification of patients based on mHSPC- and CRPC-specific systemic treatments supplied by clinical pharmacies (i.e., cabazitaxel, abiraterone, enzalutamide, radium-223);
- Identification of patients based on textual mentions (including synonyms, abbreviations, and common typos) of “metastatic prostate cancer” or “castration-resistant prostate cancer” in the EHR;
- Identification of patients based on chemical castration treatments supplied by clinical pharmacies (i.e., LHRH agonists and antagonists), textual mentions (including synonyms, abbreviations, and common typos) of chemical castration treatments in the EHR, and/or a record of surgical castration.
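The text-based criteria above can be illustrated with a minimal sketch. This is not the registry's software, whose internals are not described here: the term list, the similarity threshold, and the function name `mentions_term` are assumptions made for illustration, and real clinical notes would be in Dutch. The sketch uses approximate string matching from Python's standard library so that small typos still count as a match.

```python
import re
from difflib import SequenceMatcher

# Hypothetical term list; the registry's actual synonym and typo list is not given.
TERMS = [
    "metastatic prostate cancer",
    "castration-resistant prostate cancer",
    "mcrpc",
    "crpc",
]


def mentions_term(note: str, terms=TERMS, threshold: float = 0.9) -> bool:
    """Return True if any term occurs in the note, tolerating small typos."""
    words = re.findall(r"[a-z0-9\-]+", note.lower())
    for term in terms:
        n = len(term.split())
        # Slide a window of the same word count as the term over the note.
        for i in range(len(words) - n + 1):
            window = " ".join(words[i:i + n])
            if SequenceMatcher(None, window, term).ratio() >= threshold:
                return True
    return False
```

A sketch like this flags a note for review; it does not handle negation (for example "no evidence of metastatic prostate cancer"), so flagged notes would still need checking.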
2.2. Data Extraction and Storage
2.3. Quality Control
5. Automatic checks are programmed into the eCRF to ensure that data meet the required formats and ranges, maximizing accuracy;
6. Based on the results of the pilot, automatically extracted data fields found to be unreliable are manually validated by trained data managers. The remaining unanswered eCRF questions are completed manually from the EHR. In particular, scanned forms and PDF files, which are inaccessible to the software, are examined;
7. During the periodic quality checks, the coordinating researchers analyze the manually completed data for discrepancies and/or missing values. Any (manual) errors found are checked against the source and corrected accordingly. All activities and alterations are logged in Castor [23].
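The automatic format and range checks mentioned in the quality-control steps could look roughly like the sketch below. It is an illustration only and does not use the eCRF system's own configuration: the field names, units, and plausibility ranges are hypothetical.

```python
from datetime import date

# Hypothetical plausibility ranges; the registry's real limits are not stated.
RANGES = {
    "psa_ug_per_l": (0.0, 10000.0),
    "hb_mmol_per_l": (2.0, 12.0),
    "weight_kg": (30.0, 250.0),
}


def check_record(record: dict) -> list:
    """Return a list of human-readable problems found in one eCRF record."""
    problems = []
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is None:
            continue  # missing values are left for manual collection
        if not isinstance(value, (int, float)) or not (low <= value <= high):
            problems.append(f"{field}: {value!r} outside [{low}, {high}]")

    diagnosis = record.get("date_of_diagnosis")
    if diagnosis is not None:
        try:
            parsed = date.fromisoformat(diagnosis)  # expects YYYY-MM-DD
        except (TypeError, ValueError):
            problems.append(f"date_of_diagnosis: {diagnosis!r} is not YYYY-MM-DD")
        else:
            if parsed > date.today():
                problems.append("date_of_diagnosis: lies in the future")
    return problems
```

Returning a list of problems rather than raising on the first one lets a reviewer see every flagged field of a record at once.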
2.4. Outcomes
- Negative predictive value (NPV) of the patient-identification algorithm;
- Accuracy (i.e., percentage of extracted data free from error) and completeness (i.e., percentage of the total amount of available data that was extracted) of the automated data extraction.
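As a rough illustration of these two outcome measures, the sketch below computes completeness and accuracy for a single data field. It assumes that manual extraction serves as the reference, that completeness is the number of automatically extracted values divided by the number of manually found values, and that accuracy is the share of automatically extracted values that agree with the manual ones. The function name and data layout are hypothetical.

```python
from typing import Dict, Hashable, Optional, Tuple


def completeness_and_accuracy(
    manual: Dict[Hashable, Optional[str]],
    automated: Dict[Hashable, Optional[str]],
) -> Tuple[Optional[float], Optional[float]]:
    """Both inputs map a patient id to the extracted value (None if missing).

    Returns (completeness %, accuracy %); an entry is None when its
    denominator is zero.
    """
    n_manual = sum(value is not None for value in manual.values())
    extracted = {pid: v for pid, v in automated.items() if v is not None}
    # An extracted value is correct when it equals the manual reference value.
    n_correct = sum(manual.get(pid) == v for pid, v in extracted.items())

    completeness = 100 * len(extracted) / n_manual if n_manual else None
    accuracy = 100 * n_correct / len(extracted) if extracted else None
    return completeness, accuracy
```

Under this definition completeness can exceed 100% when the automated extraction retrieves values that the manual review did not record.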
3. Results
3.1. Patient Identification
3.2. Data Extraction
3.3. Quality Control
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Posdzich, P.; Darr, C.; Hilser, T.; Wahl, M.; Herrmann, K.; Hadaschik, B.; Grunwald, V. Metastatic Prostate Cancer—A Review of Current Treatment Options and Promising New Approaches. Cancers 2023, 15, 461.
2. Sayegh, N.; Swami, U.; Agarwal, N. Recent Advances in the Management of Metastatic Prostate Cancer. JCO Oncol. Pract. 2022, 18, 45–55.
3. Ng, K.; Smith, S.; Shamash, J. Metastatic Hormone-Sensitive Prostate Cancer (mHSPC): Advances and Treatment Strategies in the First-Line Setting. Oncol. Ther. 2020, 8, 209–230.
4. Cleophas, T.J.; Zwinderman, A.H. Limitations of randomized clinical trials. Proposed alternative designs. Clin. Chem. Lab. Med. 2000, 38, 1217–1223.
5. Kostis, J.B.; Dobrzynski, J.M. Limitations of Randomized Clinical Trials. Am. J. Cardiol. 2020, 129, 109–115.
6. APC Society Structure and Definitions. Available online: https://www.apccc.org/about-apc-who-are-we/structure-and-definitions.html (accessed on 14 April 2023).
7. Del Pino-Sedeno, T.; Infante-Ventura, D.; de Armas Castellano, A.; de Pablos-Rodriguez, P.; Rueda-Dominguez, A.; Serrano-Aguilar, P.; Trujillo-Martin, M.M. Molecular Biomarkers for the Detection of Clinically Significant Prostate Cancer: A Systematic Review and Meta-analysis. Eur. Urol. Open Sci. 2022, 46, 105–127.
8. Ikeda, S.; Elkin, S.K.; Tomson, B.N.; Carter, J.L.; Kurzrock, R. Next-generation sequencing of prostate cancer: Genomic and pathway alterations, potential actionability patterns, and relative rate of use of clinical-grade testing. Cancer Biol. Ther. 2019, 20, 219–226.
9. Wang, Y.; Galante, J.R.; Haroon, A.; Wan, S.; Afaq, A.; Payne, H.; Bomanji, J.; Adeleke, S.; Kasivisvanathan, V. The future of PSMA PET and WB MRI as next-generation imaging tools in prostate cancer. Nat. Rev. Urol. 2022, 19, 475–493.
10. Porten, S.P.; Cooperberg, M.R.; Konety, B.R.; Carroll, P.R. The example of CaPSURE: Lessons learned from a national disease registry. World J. Urol. 2011, 29, 265–271.
11. Tian, Q.; Liu, M.; Min, L.; An, J.; Lu, X.; Duan, H. An automated data verification approach for improving data quality in a clinical registry. Comput. Methods Programs Biomed. 2019, 181, 104840.
12. Sun, W.; Cai, Z.; Li, Y.; Liu, F.; Fang, S.; Wang, G. Data Processing and Text Mining Technologies on Electronic Medical Records: A Review. J. Health Eng. 2018, 2018, 4302425.
13. CTcue: Making Electronic Health Records More Searchable with Elastic. Available online: https://ctcue.com/ (accessed on 15 April 2023).
14. van Laar, S.A.; Gombert-Handoko, K.B.; Guchelaar, H.J.; Zwaveling, J. An Electronic Health Record Text Mining Tool to Collect Real-World Drug Treatment Outcomes: A Validation Study in Patients with Metastatic Renal Cell Carcinoma. Clin. Pharmacol. Ther. 2020, 108, 644–652.
15. Westgeest, H.M.; Uyl-de Groot, C.A.; van Moorselaar, R.J.A.; de Wit, R.; van den Bergh, A.C.M.; Coenen, J.; Beerlage, H.P.; Hendriks, M.P.; Bos, M.; van den Berg, P.; et al. Differences in Trial and Real-world Populations in the Dutch Castration-resistant Prostate Cancer Registry. Eur. Urol. Focus 2018, 4, 694–701.
16. Westgeest, H.M.; Kuppen, M.C.P.; van den Eertwegh, A.J.M.; de Wit, R.; Coenen, J.; van den Berg, H.P.P.; Mehra, N.; van Oort, I.M.; Fossion, L.; Hendriks, M.P.; et al. Second-Line Cabazitaxel Treatment in Castration-Resistant Prostate Cancer Clinical Trials Compared to Standard of Care in CAPRI: Observational Study in the Netherlands. Clin. Genitourin. Cancer 2019, 17, e946–e956.
17. Kuppen, M.C.; Westgeest, H.M.; van der Doelen, M.J.; van den Eertwegh, A.J.; Coenen, J.L.; Aben, K.K.; van den Bergh, A.C.; Bergman, A.M.; den Bosch, J.V.; Celik, F.; et al. Real-world outcomes of radium-223 dichloride for metastatic castration resistant prostate cancer. Future Oncol. 2020, 16, 1371–1384.
18. Kuppen, M.C.P.; Westgeest, H.M.; van den Eertwegh, A.J.M.; van Moorselaar, R.J.A.; van Oort, I.M.; Coenen, J.; van den Bergh, A.; Mehra, N.; Somford, D.M.; Bergman, A.M.; et al. Real-world Outcomes of Sequential Androgen-receptor Targeting Therapies with or without Interposed Life-prolonging Drugs in Metastatic Castration-resistant Prostate Cancer: Results from the Dutch Castration-resistant Prostate Cancer Registry. Eur. Urol. Oncol. 2021, 4, 618–627.
19. Westgeest, H.M.; Kuppen, M.C.P.; van den Eertwegh, A.J.M.; de Wit, R.; Bergman, A.M.; van Moorselaar, R.J.A.; Coenen, J.; van den Bergh, A.C.M.; Somford, D.M.; Mehra, N.; et al. The effects of new life-prolonging drugs for metastatic castration-resistant prostate cancer (mCRPC) patients in a real-world population. Prostate Cancer Prostatic Dis. 2021, 24, 871–879.
20. Westgeest, H.M.; Kuppen, M.C.P.; van den Eertwegh, F.; van Oort, I.M.; Coenen, J.; van Moorselaar, J.; Aben, K.K.H.; Bergman, A.M.; Huinink, D.T.B.; van den Bosch, J.; et al. High-Intensity Care in the End-of-Life Phase of Castration-Resistant Prostate Cancer Patients: Results from the Dutch CAPRI-Registry. J. Palliat. Med. 2021, 24, 1789–1797.
21. Kuppen, M.C.P.; Westgeest, H.M.; van den Eertwegh, A.J.M.; van Moorselaar, R.J.A.; van Oort, I.M.; Tascilar, M.; Mehra, N.; Lavalaye, J.; Somford, D.M.; Aben, K.K.H.; et al. Symptomatic Skeletal Events and the Use of Bone Health Agents in a Real-World Treated Metastatic Castration Resistant Prostate Cancer Population: Results from the CAPRI-Study in the Netherlands. Clin. Genitourin. Cancer 2022, 20, 43–52.
22. EAU Guidelines Prostate Cancer. Available online: https://uroweb.org/guidelines/prostate-cancer/chapter/treatment (accessed on 14 April 2023).
23. Castor EDC. Available online: https://www.castoredc.com/electronic-data-capture-system/ (accessed on 27 January 2021).
24. van Dijk, W.B.; Fiolet, A.T.L.; Schuit, E.; Sammani, A.; Groenhof, T.K.J.; van der Graaf, R.; de Vries, M.C.; Alings, M.; Schaap, J.; Asselbergs, F.W.; et al. Text-mining in electronic healthcare records can be used as efficient tool for screening and data collection in cardiovascular trials: A multicenter validation study. J. Clin. Epidemiol. 2021, 132, 97–105.
25. Jonnalagadda, S.R.; Adupa, A.K.; Garg, R.P.; Corona-Cox, J.; Shah, S.J. Text Mining of the Electronic Health Record: An Information Extraction Approach for Automated Identification and Subphenotyping of HFpEF Patients for Clinical Trials. J. Cardiovasc. Transl. Res. 2017, 10, 313–321.
26. Ni, Y.; Kennebeck, S.; Dexheimer, J.W.; McAneney, C.M.; Tang, H.; Lingren, T.; Li, Q.; Zhai, H.; Solti, I. Automated clinical trial eligibility prescreening: Increasing the efficiency of patient identification for clinical trials in the emergency department. J. Am. Med. Inf. Assoc. 2015, 22, 166–178.
27. CESPHN—Data Extraction Tools. Available online: https://cesphn.org.au/general-practice/practice-support-and-development/data-extraction-polar (accessed on 14 July 2023).
28. Laique, S.N.; Hayat, U.; Sarvepalli, S.; Vaughn, B.; Ibrahim, M.; McMichael, J.; Qaiser, K.N.; Burke, C.; Bhatt, A.; Rhodes, C.; et al. Application of optical character recognition with natural language processing for large-scale quality metric data extraction in colonoscopy reports. Gastrointest. Endosc. 2021, 93, 750–757.
29. Yu, A.Y.X.; Liu, Z.A.; Pou-Prom, C.; Lopes, K.; Kapral, M.K.; Aviv, R.I.; Mamdani, M. Automating Stroke Data Extraction from Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study. JMIR Med. Inf. 2021, 9, e24381.
30. Jackson, R.G.; Patel, R.; Jayatilleke, N.; Kolliakou, A.; Ball, M.; Gorrell, G.; Roberts, A.; Dobson, R.J.; Stewart, R. Natural language processing to extract symptoms of severe mental illness from clinical text: The Clinical Record Interactive Search Comprehensive Data Extraction (CRIS-CODE) project. BMJ Open 2017, 7, e012012.
31. Nath, C.; Albaghdadi, M.S.; Jonnalagadda, S.R. A Natural Language Processing Tool for Large-Scale Data Extraction from Echocardiography Reports. PLoS ONE 2016, 11, e0153749.
32. Fonferko-Shadrach, B.; Lacey, A.S.; Roberts, A.; Akbari, A.; Thompson, S.; Ford, D.V.; Lyons, R.A.; Rees, M.I.; Pickrell, W.O. Using natural language processing to extract structured epilepsy data from unstructured clinic letters: Development and validation of the ExECT (extraction of epilepsy clinical text) system. BMJ Open 2019, 9, e023232.
33. Janiesch, C.; Zschech, P.; Heinrich, K. Machine learning and deep learning. Electron. Mark. 2021, 31, 685–695.
34. Davenport, T.; Kalakota, R. The potential for artificial intelligence in healthcare. Future Healthc. J. 2019, 6, 94–98.
35. Naylor, C.D. On the Prospects for a (Deep) Learning Health Care System. JAMA 2018, 320, 1099–1100.
36. Echle, A.; Rindtorff, N.T.; Brinker, T.J.; Luedde, T.; Pearson, A.T.; Kather, J.N. Deep learning in cancer pathology: A new generation of clinical biomarkers. Br. J. Cancer 2021, 124, 686–696.
37. Tran, K.A.; Kondrashova, O.; Bradley, A.; Williams, E.D.; Pearson, J.V.; Waddell, N. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med. 2021, 13, 152.
38. Gravina, M.; Spirito, L.; Celentano, G.; Capece, M.; Creta, M.; Califano, G.; Colla Ruvolo, C.; Morra, S.; Imbriaco, M.; Di Bello, F.; et al. Machine Learning and Clinical-Radiological Characteristics for the Classification of Prostate Cancer in PI-RADS 3 Lesions. Diagnostics 2022, 12, 1565.
39. Roest, C.; Kwee, T.C.; Saha, A.; Futterer, J.J.; Yakar, D.; Huisman, H. AI-assisted biparametric MRI surveillance of prostate cancer: Feasibility study. Eur. Radiol. 2023, 33, 89–96.
40. Qiao, X.M.; Hu, C.H.; Hu, S.; Hu, C.H.; Wang, X.M.; Shen, J.K.; Ji, L.B.; Song, Y.; Bao, J. The value of machine learning models based on biparametric MRI for diagnosis of prostate cancer and clinically significant prostate cancer. Zhonghua Yi Xue Za Zhi 2023, 103, 1446–1454.
41. Gresser, E.; Schachtner, B.; Stuber, A.T.; Solyanik, O.; Schreier, A.; Huber, T.; Froelich, M.F.; Magistro, G.; Kretschmer, A.; Stief, C.; et al. Performance variability of radiomics machine learning models for the detection of clinically significant prostate cancer in heterogeneous MRI datasets. Quant. Imaging Med. Surg. 2022, 12, 4990–5003.
42. Xie, F.; Yuan, H.; Ning, Y.; Ong, M.E.H.; Feng, M.; Hsu, W.; Chakraborty, B.; Liu, N. Deep learning for temporal data representation in electronic health records: A systematic review of challenges and methodologies. J. Biomed. Inf. 2022, 126, 103980.
43. Leyh-Bannurah, S.R.; Tian, Z.; Karakiewicz, P.I.; Wolffgang, U.; Sauter, G.; Fisch, M.; Pehrke, D.; Huland, H.; Graefen, M.; Budaus, L. Deep Learning for Natural Language Processing in Urology: State-of-the-Art Automated Extraction of Detailed Pathologic Prostate Cancer Data from Narratively Written Electronic Health Records. JCO Clin. Cancer Inf. 2018, 2, 1–9.
44. Xie, T.; Zhen, Y.; Tavakoli, M.; Hundley, G.; Ge, Y. A deep-learning based system for accurate extraction of blood pressure data in clinical narratives. AMIA Jt. Summits Transl. Sci. Proc. 2020, 2020, 703–709.
45. Zhao, B. Clinical Data Extraction and Normalization of Cyrillic Electronic Health Records via Deep-Learning Natural Language Processing. JCO Clin. Cancer Inf. 2019, 3, 1–9.
46. Gunter, D.; Puac-Polanco, P.; Miguel, O.; Thornhill, R.E.; Yu, A.Y.X.; Liu, Z.A.; Mamdani, M.; Pou-Prom, C.; Aviv, R.I. Rule-based natural language processing for automation of stroke data extraction: A validation study. Neuroradiology 2022, 64, 2357–2362.
47. Gliklich, R.E.; Dreyer, N.A.; Leavy, M.B. 11 Data Collection and Quality Assurance. In Registries for Evaluating Patient Outcomes: A User’s Guide, 3rd ed.; Gliklich, R.E., Dreyer, N.A., Leavy, M.B., Eds.; Agency for Healthcare Research and Quality: Rockville, MD, USA, 2014; Volume 1, pp. 251–277.
Category | Pilot 1 (2019) | Pilot 2 (2022) |
---|---|---|
Identified inclusions by algorithm, n (%) | 1229 (60.5) | 4431 (46.8) |
True inclusions, n (valid %) | 1040/1229 (84.6) | 2018/4431 (45.5) |
False inclusions, n (valid %) | 189/1229 (15.4) | 2413/4431 (54.5) |
Identified exclusions by algorithm, n (%) | 452 (22.3) | NR |
True exclusions, n (valid %) | 447/452 (98.9) | |
False exclusions, n (valid %) | 5/452 (1.1) | |
Remaining identified subjects by algorithm, n (%) | 349 (17.2) | 5033 (53.2) |
True exclusions, n (valid %) | 180/349 (51.6) | 4772/5033 (94.8) |
False exclusions, n (valid %) | 169/349 (48.4) | 261/5033 (5.2) |
Metric | Pilot 1 (2019) | Pilot 2 (2022) |
---|---|---|
Sensitivity | 85.7% | 88.5% |
Specificity | 76.8% | 66.4% |
Positive predictive value (PPV) | 84.6% | 45.5% |
Negative predictive value (NPV) | 78.3% | 94.8% |
Accuracy | 82.1% | 71.7% |
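
The diagnostic metrics reported for the patient-identification algorithm can be reproduced from the counts in the preceding table. The sketch below does so under one assumption that is not stated explicitly in the tables: for Pilot 1, false exclusions from both the "identified exclusions" group and the "remaining identified subjects" group are pooled as false negatives, and true exclusions from both groups are pooled as true negatives. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # true inclusions
    fp: int  # false inclusions
    fn: int  # false exclusions
    tn: int  # true exclusions


def metrics(c: Counts) -> dict:
    """Return the standard 2x2 diagnostic metrics as percentages (1 decimal)."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": round(100 * c.tp / (c.tp + c.fn), 1),
        "specificity": round(100 * c.tn / (c.tn + c.fp), 1),
        "ppv": round(100 * c.tp / (c.tp + c.fp), 1),
        "npv": round(100 * c.tn / (c.tn + c.fn), 1),
        "accuracy": round(100 * (c.tp + c.tn) / total, 1),
    }


# Pilot 1: exclusions pooled across both non-included groups (assumption).
pilot1 = Counts(tp=1040, fp=189, fn=5 + 169, tn=447 + 180)
# Pilot 2: only the "remaining" group is reported for non-included subjects.
pilot2 = Counts(tp=2018, fp=2413, fn=261, tn=4772)

print(metrics(pilot1))
print(metrics(pilot2))
```

With this pooling, the computed values match the reported sensitivity, specificity, PPV, NPV, and accuracy for both pilots to one decimal place.
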
Variable | Manual (n = 20) | Automated (n = 20) | Completeness | Accuracy |
---|---|---|---|---|
Date of initial diagnosis, n (%) | 20/20 (100) | 20/20 (100) | 20/20 (100) | 2/20 (10); 20/20 (100) A |
Type of tumor, n (%) | 18/20 (90) | 18/20 (90) | 18/18 (100) | 18/18 (100) |
Adenocarcinoma | 18/20 (90) | 18/20 (90) | | |
Unknown | 2/20 (10) | 2/20 (10) | | |
Gleason score, n (%) | 18/20 (90) | 17/20 (85) | 17/18 (94.4) B | 16/17 (94.1) C; 17/17 (100) A |
6–7 | 10/20 (50) | 8/20 (40) | | |
8–10 | 8/20 (40) | 9/20 (45) | | |
Unknown | 2/20 (10) | 3/20 (15) | | |
Weight, n (%) | 1/20 (5) | 5/20 (25) | 5/1 (500) | 5/5 (100) |
ECOG PS, n (%) | 0/20 (0) | 1/20 (5) | 1/1 (100) | - |
PSA, n (%) | 20/20 (100) | 17/20 (85); 20/20 (100) D | 17/20 (85); 20/20 (100) D | 17/17 (100) |
Hb, n (%) | 14/20 (70) | 13/20 (65); 20/20 (100) D | 13/14 (92.9); 20/14 (142.9) D | 13/13 (100) |
MDT-date, n (%) | NR | 13/20 (65) | - | 12/13 (92.3) |
Drug treatment, n (%) | | | | |
Docetaxel | 8/20 (40) | 6/20 (30) | 6/8 (75) | 6/6 (100) |
Abiraterone | 14/20 (70) | 13/20 (65) | 13/14 (92.9) | 13/13 (100) |
Enzalutamide | 6/20 (30) | 5/20 (25) | 5/6 (83.3) | 5/5 (100) |
Cabazitaxel | 2/20 (10) | 1/20 (5) | 1/2 (50) | 1/1 (100) |
Radium-223 | 2/20 (10) | 2/20 (10) | 2/2 (100) | 2/2 (100) |
Total | | | 27/32 (84.4) E | 27/27 (100) |
Start date of drug, n (valid %) | | | | |
Docetaxel | NR | 6/20 (30) | 6/6 (100) | 6/6 (100) |
Abiraterone | NR | 13/20 (65) | 13/13 (100) | 12/13 (92.3) F |
Enzalutamide | NR | 5/20 (25) | 5/5 (100) | 3/5 (60) G |
Cabazitaxel | NR | 1/20 (5) | 1/1 (100) | 1/1 (100) |
Radium-223 | NR | 2/20 (10) | 2/2 (100) | 2/2 (100) |
Total | | | 27/27 (100) | 24/27 (88.9) |
Dose of drug, n (valid %) | | | | |
Docetaxel | NR | 0/20 (0) | 0/NR (NR) | - |
Abiraterone | NR | 13/20 (65) | 13/NR (NR) | 13/13 (100) |
Enzalutamide | NR | 5/20 (25) | 5/NR (NR) | 5/5 (100) |
Cabazitaxel | NR | 0/20 (0) | 0/NR (NR) | - |
Radium-223 | NR | 0/20 (0) | 2/NR (NR) | 2/2 (100) D |
Total | | | 20/NR (NR) | 20/20 (100) |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Bosch, D.; Kuppen, M.C.P.; Tascilar, M.; Smilde, T.J.; Mulders, P.F.A.; Uyl-de Groot, C.A.; van Oort, I.M. Reliability and Efficiency of the CAPRI-3 Metastatic Prostate Cancer Registry Driven by Artificial Intelligence. Cancers 2023, 15, 3808. https://doi.org/10.3390/cancers15153808