Alper, B. S. (2023). Reflections on defining a standard for computable expression of scientific knowledge: What teach us Yoda can. Learn Health Syst, 7(1), e10312. https://doi.org/10.1002/lrh2.10312
Andersen, K. M., Bates, B. A., Rashidi, E. S., Olex, A. L., Mannon, R. B., Patel, R. C., Singh, J., Sun, J., Auwaerter, P. G., Ng, D. K., Segal, J. B., Garibaldi, B. T., Mehta, H. B., Alexander, G. C., & National COVID Cohort Collaborative Consortium. (2022). Long-term use of immunosuppressive medicines and in-hospital COVID-19 outcomes: A retrospective cohort study using data from the National COVID Cohort Collaborative. Lancet Rheumatol, 4(1), e33–e41. https://doi.org/10.1016/S2665-9913(21)00325-8
Ankan, A., Wortel, I. M. N., & Textor, J. (2021). Testing graphical causal models using the R package "dagitty". Curr Protoc, 1(2), e45. https://doi.org/10.1002/cpz1.45
Anzalone, A. J., Horswell, R., Hendricks, B. M., Chu, S., Hillegass, W. B., Beasley, W. H., Harper, J. R., Kimble, W., Rosen, C. J., Miele, L., et al. (2023). Higher hospitalization and mortality rates among SARS-CoV-2-infected persons in rural America. The Journal of Rural Health, 39(1), 39–54. https://doi.org/10.1111/jrh.12689
Benchimol, E. I., Smeeth, L., Guttmann, A., Harron, K., Moher, D., Petersen, I., Sorensen, H. T., von Elm, E., Langan, S. M., & RECORD Working Committee. (2015). The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med, 12(10), e1001885. https://doi.org/10.1371/journal.pmed.1001885
Bradwell, K. R., Wooldridge, J. T., Amor, B., Bennett, T. D., Anand, A., Bremer, C., Yoo, Y. J., Qian, Z., Johnson, S. G., Pfaff, E. R., et al. (2022). Harmonizing units and values of quantitative data elements in a very large nationally pooled electronic health record (EHR) dataset. Journal of the American Medical Informatics Association, 29(7), 1172–1182. https://doi.org/10.1093/jamia/ocac054
Casiraghi, E., Malchiodi, D., Trucco, G., Frasca, M., Cappelletti, L., Fontana, T., Esposito, A. A., Avola, E., Jachetti, A., Reese, J., Rizzi, A., Robinson, P. N., & Valentini, G. (2020). Explainable machine learning for early assessment of COVID-19 risk prediction in emergency departments. IEEE Access, 8, 196299–196325. https://doi.org/10.1109/access.2020.3034032
Casiraghi, E., Wong, R., Hall, M., Coleman, B., Notaro, M., Evans, M. D., Tronieri, J. S., Blau, H., Laraway, B., Callahan, T. J., Chan, L. E., Bramante, C. T., Buse, J. B., Moffitt, R. A., Stürmer, T., Johnson, S. G., Shao, Y. R., Reese, J., Robinson, P. N., … Wilkins, K. J. (2023). A method for comparing multiple imputation techniques: A case study on the U.S. National COVID Cohort Collaborative. Journal of Biomedical Informatics, 139, 104295. https://doi.org/10.1016/j.jbi.2023.104295
Caton, S., & Haas, C. (2020). Fairness in machine learning: A survey. arXiv. https://doi.org/10.48550/arXiv.2010.04053
Charlson, M. E., Pompei, P., Ales, K. L., & MacKenzie, C. R. (1987). A new method of classifying prognostic comorbidity in longitudinal studies: Development and validation. Journal of Chronic Diseases, 40(5), 373–383. https://doi.org/10.1016/0021-9681(87)90171-8
Chollet, F. (2021). Deep learning with Python. Simon & Schuster.
Cutter, S., Ash, K., & Emrich, C. T. (2014). The geographies of community disaster resilience. Global Environmental Change, 29, 65–77. https://doi.org/10.1016/j.gloenvcha.2014.08.005
Dong, X., Li, J., Soysal, E., Bian, J., DuVall, S. L., Hanchrow, E., Liu, H., Lynch, K. E., Matheny, M., Natarajan, K., et al. (2020). COVID-19 TestNorm: A tool to normalize COVID-19 testing names to LOINC codes. Journal of the American Medical Informatics Association, 27(9), 1437–1442. https://doi.org/10.1093/jamia/ocaa145
von Elm, E., Altman, D. G., Egger, M., Pocock, S. J., Gotzsche, P. C., Vandenbroucke, J. P., & STROBE Initiative. (2014). The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. Int J Surg, 12(12), 1495–1499. https://doi.org/10.1016/j.ijsu.2014.07.013
Franklin, J. M., Lin, K. J., Gatto, N. M., Rassen, J. A., Glynn, R. J., & Schneeweiss, S. (2021). Real-world evidence for assessing pharmaceutical treatments in the context of COVID-19. Clin Pharmacol Ther, 109(4), 816–828. https://doi.org/10.1002/cpt.2185
Franklin, J. M., Platt, R., Dreyer, N. A., London, A. J., Simon, G. E., Watanabe, J. H., Horberg, M., Hernandez, A., & Califf, R. M. (2022). When can nonrandomized studies support valid inference regarding effectiveness or safety of new medical treatments? Clin Pharmacol Ther, 111(1), 108–115. https://doi.org/10.1002/cpt.2255
Fu, S., Leung, L. Y., Raulli, A.-O., Kallmes, D. F., Kinsman, K. A., Nelson, K. B., Clark, M. S., Luetmer, P. H., Kingsbury, P. R., Kent, D. M., & Liu, H. (2020). Assessment of the impact of EHR heterogeneity for clinical research through a case study of silent brain infarction. BMC Medical Informatics and Decision Making, 20(1). https://doi.org/10.1186/s12911-020-1072-9
Gold, S., Batch, A., McClure, R., Jiang, G., Kharrazi, H., Saripalle, R., Huser, V., Weng, C., Roderer, N., Szarfman, A., et al. (2018). Clinical concept value sets and interoperability in health data analytics. AMIA Annual Symposium Proceedings, 2018, 480. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6371254
Gold, S., Lehmann, H., Schilling, L., & Lutters, W. (2021). Practices, norms, and aspirations regarding the construction, validation, and reuse of code sets in the analysis of real-world data. medRxiv, 2021–2010. https://doi.org/10.1101/2021.10.14.21264917
Griffith, G. J., Morris, T. T., Tudball, M. J., Herbert, A., Mancano, G., Pike, L., Sharp, G. C., Sterne, J., Palmer, T. M., Davey Smith, G., Tilling, K., Zuccolo, L., Davies, N. M., & Hemani, G. (2020). Collider bias undermines our understanding of COVID-19 disease risk and severity. Nature Communications, 11(1), 5749. https://doi.org/10.1038/s41467-020-19478-2
Haendel, M. A., Chute, C. G., Bennett, T. D., Eichmann, D. A., Guinney, J., Kibbe, W. A., Payne, P. R. O., Pfaff, E. R., Robinson, P. N., Saltz, J. H., Spratt, H., Suver, C., Wilbanks, J., Wilcox, A. B., Williams, A. E., Wu, C., Blacketer, C., Bradford, R. L., Cimino, J. J., … the N3C Consortium. (2020). The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment. Journal of the American Medical Informatics Association, 28(3), 427–443. https://doi.org/10.1093/jamia/ocaa196
Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer.
Hernan, M. A., & Robins, J. M. (2016). Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol, 183(8), 758–764. https://doi.org/10.1093/aje/kwv254
Islam, J. Y., Madhira, V., Sun, J., Olex, A., Franceschini, N., Kirk, G., & Patel, R. (2022). Racial disparities in COVID-19 test positivity among people living with HIV in the United States. Int J STD AIDS, 33(5), 462–466. https://doi.org/10.1177/09564624221074468
Kharrazi, H., Chi, W., Chang, H.-Y., Richards, T. M., Gallagher, J. M., Knudson, S. M., & Weiner, J. P. (2017). Comparing population-based risk-stratification model performance using demographic, diagnosis and medication data extracted from outpatient electronic health records versus administrative claims. Medical Care, 55(8), 789–796. https://doi.org/10.1097/MLR.0000000000000754
Klein, J. T. (1996). Crossing boundaries: Knowledge, disciplinarities, and interdisciplinarities. University Press of Virginia. https://www.google.com/books/edition/Crossing_Boundaries/bNJvYf3ROPAC
Kleinberg, J. M., Mullainathan, S., & Raghavan, M. (2016). Inherent trade-offs in the fair determination of risk scores. CoRR, abs/1609.05807. https://doi.org/10.48550/arXiv.1609.05807
Kuehne, F., Jahn, B., Conrads-Frank, A., Bundo, M., Arvandi, M., Endel, F., Popper, N., Endel, G., Urach, C., Gyimesi, M., Murray, E. J., Danaei, G., Gaziano, T. A., Pandya, A., & Siebert, U. (2019). Guidance for a causal comparative effectiveness analysis emulating a target trial based on big real world evidence: When to start statin treatment. J Comp Eff Res, 8(12), 1013–1025. https://doi.org/10.2217/cer-2018-0103
Li, C., Alsheikh, A. M., Robinson, K. A., & Lehmann, H. P. (2023). Use of recommended real-world methods for electronic health record data analysis has not improved over 10 years. medRxiv. https://doi.org/10.1101/2023.06.21.23291706
Lundberg, S. M., Erion, G. G., & Lee, S.-I. (2018). Consistent individualized feature attribution for tree ensembles. arXiv. https://doi.org/10.48550/ARXIV.1802.03888
Madlock-Brown, C., Wilkens, K., Weiskopf, N., Cesare, N., Bhattacharyya, S., Riches, N. O., Espinoza, J., Dorr, D., Goetz, K., Phuong, J., Sule, A., Kharrazi, H., Liu, F., Lemon, C., & Adams, W. G. (2022a). Clinical, social, and policy factors in COVID-19 cases and deaths: Methodological considerations for feature selection and modeling in county-level analyses. BMC Public Health, 22(1), 747. https://doi.org/10.1186/s12889-022-13168-y
Madlock-Brown, C., Wilkens, K., Weiskopf, N., Cesare, N., Bhattacharyya, S., Riches, N. O., Espinoza, J., Dorr, D., Goetz, K., Phuong, J., Sule, A., Kharrazi, H., Liu, F., Lemon, C., & Adams, W. G. (2022b). Correction: Clinical, social, and policy factors in COVID-19 cases and deaths: Methodological considerations for feature selection and modeling in county-level analyses. BMC Public Health, 22(1), 1250. https://doi.org/10.1186/s12889-022-13562-6
Mehta, H. B., An, H., Andersen, K. M., Mansour, O., Madhira, V., Rashidi, E. S., Bates, B., Setoguchi, S., Joseph, C., Kocis, P. T., Moffitt, R., Bennett, T. D., Chute, C. G., Garibaldi, B. T., & Alexander, G. C. (2021). Use of hydroxychloroquine, remdesivir, and dexamethasone among adults hospitalized with COVID-19 in the United States: A retrospective cohort study. Annals of Internal Medicine, 174(10), 1395–1403. https://doi.org/10.7326/M21-0857
Mitra, R., McGough, S. F., Chakraborti, T., Holmes, C., Copping, R., Hagenbuch, N., Biedermann, S., Noonan, J., Lehmann, B., Shenvi, A., et al. (2023). Learning from data with structured missingness. Nature Machine Intelligence, 5(1), 13–23. https://doi.org/10.1038/s42256-022-00596-z
Morgan, R. L., Whaley, P., Thayer, K. A., & Schunemann, H. J. (2018). Identifying the PECO: A framework for formulating good questions to explore the association of environmental and other exposures with health outcomes. Environ Int, 121(Pt 1), 1027–1031. https://doi.org/10.1016/j.envint.2018.07.015
Narrett, J. A., Mallawaarachchi, I., Aldridge, C. M., Assefa, E. D., Patel, A., Loomba, J. J., Ratcliffe, S., Sadan, O., Monteith, T., Worrall, B. B., Brown, D. E., Johnston, K. C., Southerland, A. M., & N3C Consortium. (2023). Increased stroke severity and mortality in patients with SARS-CoV-2 infection: An analysis from the N3C database. J Stroke Cerebrovasc Dis, 32(3), 106987. https://doi.org/10.1016/j.jstrokecerebrovasdis.2023.106987
OHDSI. (2019). The book of OHDSI: Observational health data sciences and informatics. OHDSI. https://ohdsi.github.io/TheBookOfOhdsi/
Palantir. (2023). Documentation: Code repositories overview. https://www.palantir.com/docs/foundry/code-repositories/overview/
Ali, P. J. M., & Faraj, R. H. (2014). Data normalization and standardization: A technical report. https://doi.org/10.13140/RG.2.2.28948.04489
Pfaff, E. R., Girvin, A. T., Bennett, T. D., Bhatia, A., Brooks, I. M., Deer, R. R., Dekermanjian, J. P., Jolley, S. E., Kahn, M. G., Kostka, K., McMurry, J. A., Moffitt, R., Walden, A., Chute, C. G., Haendel, M. A., Bramante, C., Dorr, D., Morris, M., Parker, A. M., … Niehaus, E. (2022). Identifying who has long COVID in the USA: A machine learning approach using N3C data. Lancet Digit Health, 4(7), e532–e541. https://doi.org/10.1016/S2589-7500(22)00048-6
Pfaff, E. R., Girvin, A. T., Gabriel, D. L., Kostka, K., Morris, M., Palchuk, M. B., Lehmann, H. P., Amor, B., Bissell, M., Bradwell, K. R., et al. (2022). Synergies between centralized and federated approaches to data quality: A report from the National COVID Cohort Collaborative. Journal of the American Medical Informatics Association, 29(4), 609–618. https://doi.org/10.1093/jamia/ocab217
Pfaff, E. R., Madlock-Brown, C., Baratta, J. M., Bhatia, A., Davis, H., Girvin, A., Hill, E., Kelly, E., Kostka, K., Loomba, J., et al. (2023). Coding long COVID: Characterizing a new disease through an ICD-10 lens. BMC Medicine, 21(1), 1–13. https://doi.org/10.1186/s12916-023-02737-6
Redelmeier, D. A., Wang, J., & Thiruchelvam, D. (2023). COVID vaccine hesitancy and risk of a traffic crash. Am J Med, 136(2), 153–162.e5. https://doi.org/10.1016/j.amjmed.2022.11.002
Reese, J. T., Blau, H., Casiraghi, E., Bergquist, T., Loomba, J. J., Callahan, T. J., Laraway, B., Antonescu, C., Coleman, B., Gargano, M., et al. (2023). Generalisable long COVID subtypes: Findings from the NIH N3C and RECOVER programmes. EBioMedicine, 87. https://doi.org/10.1016/j.ebiom.2022.104413
Richesson, R. L., Hammond, W. E., Nahm, M., Wixted, D., Simon, G. E., Robinson, J. G., Bauck, A. E., Cifelli, D., Smerek, M. M., Dickerson, J., et al. (2013). Electronic health records based phenotyping in next-generation clinical trials: A perspective from the NIH health care systems collaboratory. Journal of the American Medical Informatics Association, 20(e2), e226–e231. https://doi.org/10.1136/amiajnl-2013-001926
Roberts, M., Driggs, D., Thorpe, M., Gilbey, J., Yeung, M., Ursprung, S., Aviles-Rivero, A. I., Etmann, C., McCague, C., Beer, L., Weir-McCall, J. R., Teng, Z., Gkrania-Klotsas, E., Ruggiero, A., Korhonen, A., Jefferson, E., Ako, E., Langs, G., Gozaliasl, G., … Schönlieb, C.-B. (2021). Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nature Machine Intelligence, 3(3), 199–217. https://doi.org/10.1038/s42256-021-00307-0
Sahner, D., & Spellmeyer, D. C. (2020). Artificial intelligence: Emerging applications in biotechnology and pharma. In Biotechnology entrepreneurship (pp. 399–417). Elsevier. https://doi.org/10.1016/b978-0-12-815585-1.00028-0
Schneeweiss, S., Rassen, J. A., Brown, J. S., Rothman, K. J., Happe, L., Arlett, P., Dal Pan, G., Goettsch, W., Murk, W., & Wang, S. V. (2019). Graphical depiction of longitudinal study designs in health care databases. Ann Intern Med, 170(6), 398–406. https://doi.org/10.7326/M18-3079
Schuemie, M. J., Ryan, P. B., Hripcsak, G., Madigan, D., & Suchard, M. A. (2018). Improving reproducibility by using high-throughput observational studies with empirical calibration. Philos Trans A Math Phys Eng Sci, 376(2128). https://doi.org/10.1098/rsta.2017.0356
Schuemie, M. J., Ryan, P. B., Pratt, N., Chen, R., You, S. C., Krumholz, H. M., Madigan, D., Hripcsak, G., & Suchard, M. A. (2020). Large-scale evidence generation and evaluation across a network of databases (LEGEND): Assessing validity using hypertension as a case study. J Am Med Inform Assoc, 27(8), 1268–1277. https://doi.org/10.1093/jamia/ocaa124
Shapley, L. S. (1953). 17. A value for n-person games. In Contributions to the theory of games (AM-28), volume II (pp. 307–318). Princeton University Press. https://doi.org/10.1515/9781400881970-018
Sharafeldin, N., Bates, B., Song, Q., Madhira, V., Yan, Y., Dong, S., Lee, E., Kuhrt, N., Shao, Y. R., Liu, F., Bergquist, T., Guinney, J., Su, J., & Topaloglu, U. (2021). Outcomes of COVID-19 in patients with cancer: Report from the National COVID Cohort Collaborative (N3C). Journal of Clinical Oncology, 39(20), 2232–2246. https://doi.org/10.1200/JCO.21.01074
Sidky, H., Young, J. C., Girvin, A. T., Lee, E., Shao, Y. R., Hotaling, N., Michael, S., Wilkins, K. J., Setoguchi, S., Funk, M. J., & N3C Consortium. (2023). Data quality considerations for evaluating COVID-19 treatments using real world data: Learnings from the National COVID Cohort Collaborative (N3C). BMC Med Res Methodol, 23(1), 46. https://doi.org/10.1186/s12874-023-01839-2
Stoudt, S., Vasquez, V. N., & Martinez, C. C. (2021). Principles for data analysis workflows. PLoS Comput Biol, 17(3), e1008770. https://doi.org/10.1371/journal.pcbi.1008770
Sun, J., Zheng, Q., Madhira, V., Olex, A. L., Anzalone, A. J., Vinson, A., Singh, J. A., French, E., Abraham, A. G., Mathew, J., Safdar, N., Agarwal, G., Fitzgerald, K. C., Singh, N., Topaloglu, U., Chute, C. G., Mannon, R. B., Kirk, G. D., & Patel, R. C. (2022). Association between immune dysfunction and COVID-19 breakthrough infection after SARS-CoV-2 vaccination in the US. JAMA Internal Medicine, 182(2), 153–162. https://doi.org/10.1001/jamainternmed.2021.7024
Tan, A. L. M., Getzen, E. J., Hutch, M. R., Strasser, Z. H., Gutierrez-Sacristan, A., Le, T. T., Dagliati, A., Morris, M., Hanauer, D. A., Moal, B., Bonzel, C. L., Yuan, W., Chiudinelli, L., Das, P., Zhang, H. G., Aronow, B. J., Avillach, P., Brat, G. A., Cai, T., … Holmes, J. H. (2023). Informative missingness: What can we learn from patterns in missing laboratory data in the electronic health record? J Biomed Inform, 139, 104306. https://doi.org/10.1016/j.jbi.2023.104306
U.S. Food and Drug Administration. (2017). Software as a medical device (SaMD): Clinical evaluation. Guidance for industry and Food and Drug Administration staff [Web Page]. FDA. https://www.fda.gov/media/100714/download
U.S. Food and Drug Administration. (2023). Considerations for the design and conduct of externally controlled trials for drug and biological products: Guidance for industry [Report]. Food and Drug Administration. https://www.fda.gov/media/164960/download
U.S. Food and Drug Administration and the Duke-Margolis Center for Health Policy. (2019). Developing real-world data and evidence to support regulatory decision-making [Online Multimedia]. https://www.youtube.com/watch?v=-G6ltatA71I
U.S. Food and Drug Administration, Health Canada, and the United Kingdom’s Medicines and Healthcare products Regulatory Agency (MHRA). (2021). Good machine learning practice for medical device development: Guiding principles [Web Page]. https://www.fda.gov/medical-devices/software-medical-device-samd/good-machine-learning-practice-medical-device-development-guiding-principles
Walonoski, J., Klaus, S., Granger, E., Hall, D., Gregorowicz, A., Neyarapally, G., Watson, A., & Eastman, J. (2020). Synthea™ novel coronavirus (COVID-19) model and synthetic data set. Intelligence-Based Medicine, 1-2, 100007. https://doi.org/10.1016/j.ibmed.2020.100007
Wang, S. V., Pinheiro, S., Hua, W., Arlett, P., Uyama, Y., Berlin, J. A., Bartels, D. B., Kahler, K. H., Bessette, L. G., & Schneeweiss, S. (2021). STaRT-RWE: Structured template for planning and reporting on the implementation of real world evidence studies. BMJ, 372, m4856. https://doi.org/10.1136/bmj.m4856
Weiskopf, N. G., Dorr, D. A., Jackson, C., Lehmann, H. P., & Thompson, C. A. (2023). Healthcare utilization is a collider: An introduction to collider bias in EHR data reuse. J Am Med Inform Assoc. https://doi.org/10.1093/jamia/ocad013
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., Silva Santos, L. B. da, Bourne, P. E., et al. (2016). The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3. https://doi.org/10.1038/sdata.2016.18
Yang, X., Sun, J., Patel, R. C., Zhang, J., Guo, S., Zheng, Q., Olex, A. L., Olatosi, B., Weissman, S. B., Islam, J. Y., et al. (2021). Associations between HIV infection and clinical spectrum of COVID-19: A population level analysis based on US National COVID Cohort Collaborative (N3C) data. The Lancet HIV, 8(11), 690–700. https://doi.org/10.1016/S2352-3018(21)00239-3
Zhou, R., Johnson, K. E., Rousseau, J. F., Rathouz, P. J., & N3C Consortium. (2022). Comparative effectiveness of dexamethasone in treatment of hospitalized COVID-19 patients during the first year of the pandemic: The N3C data repository. medRxiv. https://doi.org/10.1101/2022.10.22.22281373