CHAPTER 2.2

Estimating eligibility in real-world data (RWD) before study commencement facilitates optimization between internal validity and generalizability and improves trial efficiency. In this study, we aimed to assess and compare the influence of the most commonly used eligibility criteria for trials in heart failure with reduced ejection fraction (HFrEF) on eligibility in two patient populations: a European and an Asian registry cohort. As a secondary objective, we assessed the theoretical impact of the gradual addition of common inclusion and exclusion criteria on overall trial eligibility.

METHODS

Selection of heart failure trials

Clinical study registration data as of 31 December 2021 were downloaded from Aggregate Analysis of ClinicalTrials.gov,13 a daily updated trial registration database.14 Relevant studies were identified by a ‘condition or disease’ of heart failure (HF) or its equivalent terms (Supplementary table 1). We characterized all interventional studies for HF and then focused the analysis on the eligibility criteria of phase 3 trials for HF with reduced ejection fraction (HFrEF). HFrEF trials were defined as those that enrolled patients with an upper limit of left ventricular ejection fraction (LVEF) of 40% or below.

The primary outcome variable was the trial eligibility criteria. This information is entered by investigators as free text; it therefore first had to be converted by text analysis into a structured data format. Other trial-related variables were available in structured formats and were analysed as potential predictors of trial eligibility. These were study start year, anticipated sample size and intervention type. In addition, we defined a study’s primary funder as follows: industry-funded if its lead sponsor or a collaborator was industry; NIH/other government agency if such an agency was the lead sponsor or a collaborator of a study without industry sponsorship; and healthcare or academic institution or other in all remaining cases.
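The primary funder definition is a simple precedence rule, and a short sketch may make the ordering explicit. The Python below is illustrative only and is not the code used in the study: the function name `classify_primary_funder` and the sponsor-class labels (`"industry"`, `"nih"`, `"other_government"`) are assumptions chosen for the example, and the registry's actual coding of sponsor classes may differ.

```python
from typing import Iterable

# Hypothetical sponsor-class labels; the registry's real coding may differ.
INDUSTRY = "industry"
GOVERNMENT = {"nih", "other_government"}


def classify_primary_funder(
    lead_class: str, collaborator_classes: Iterable[str] = ()
) -> str:
    """Assign a primary funder category from lead and collaborator classes."""
    # Pool the lead sponsor and all collaborators, ignoring case and spacing.
    classes = {lead_class.strip().lower()}
    classes.update(c.strip().lower() for c in collaborator_classes)

    # Rule 1: any industry involvement makes the study industry-funded.
    if INDUSTRY in classes:
        return "Industry"
    # Rule 2: otherwise, any government agency involvement.
    if classes & GOVERNMENT:
        return "NIH/other government agency"
    # Rule 3: everything else.
    return "Healthcare or academic institution or other"


if __name__ == "__main__":
    print(classify_primary_funder("nih", ["industry"]))
    print(classify_primary_funder("other", ["nih"]))
    print(classify_primary_funder("other"))
```

Checking industry first reflects the wording of the definition, in which a government agency counts as primary funder only for studies without industry sponsorship.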
Text analysis of trial eligibility criteria

For text analysis, we combined two methods to capture all relevant clinical entities in the eligibility criteria. First, we trained a machine-learning (ML) algorithm to recognize named entities using a sample of manually annotated criteria combined with a standardized dictionary from the Unified Medical Language System.15 Second, we identified the remaining unmarked entities using scripts defined by Apache