When Recruiting Patients is Not Enough: How Synthetic Control Arms Can Enhance Clinical Trial Design


Clinical trials are a crucial component of present-day clinical research, providing the data that regulatory agencies rely on to make decisions about the safety and efficacy of new drugs and treatments. However, traditional clinical trials are often slow, expensive, and may not always reflect the diversity of patients in the real world. For this reason, the past few years have witnessed a surge in interest in novel trial designs that can tackle these limitations.

We sat down with Professor Krishnarajah Nirantharakumar, MD, of Dexter, a cutting-edge startup out of the University of Birmingham and a Globant partner, to discuss key concepts in clinical research and the synthetic control arm, an emerging approach to study design that is transforming the field.

Randomized Controlled Trials

Randomized controlled trials (RCTs) are typically designed with two main groups of patients: the intervention (or exposure) arm, which receives the new drug or treatment being tested, and the control arm, which receives either a placebo or a standard-of-care treatment. The control group serves as a reference point against which researchers compare the effects of the intervention. Controls help determine whether the observed effects are in fact due to the intervention, and they minimize unintended bias and unforeseen confounding effects.

Recruiting a large number of participants for a clinical trial can be time-consuming and resource-intensive. Challenges include finding qualifying patients, recruiting a diverse pool of participants, narrow eligibility criteria that reduce the number of qualifying candidates, a lack of understanding of clinical trials among the general population, and a level of mistrust associated with clinical research, especially among minority groups. The number of participants directly affects the statistical power of the study, that is, its ability to detect a true treatment effect of a given size at a given level of significance. Increasing the power reduces the likelihood of a type II error, in which a treatment effect exists but goes undetected, for example because the sample is too small. Therefore, an inability to recruit enough patients, further complicated by often insufficient patient retention during a trial, is one of the factors behind the high failure rate of clinical trials.
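To make the link between sample size and power concrete, here is a minimal sketch in Python. It uses the textbook normal approximation for a two-sided comparison of two arm means. The function names are hypothetical, and the inputs (a standardized effect size of 0.5, a 5% significance level, 80% target power) are illustrative assumptions, not figures from any particular trial.

```python
from math import ceil, sqrt
from statistics import NormalDist


def power_two_arm(n_per_arm, effect_size, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two arm means.

    effect_size is the standardized mean difference between the arms.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_arm / 2)
    # Probability that the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def n_per_arm_for_power(effect_size, power=0.80, alpha=0.05):
    """Smallest per-arm sample size reaching the target power (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Illustrative assumption: a moderate standardized effect of 0.5.
    print("patients needed per arm:", n_per_arm_for_power(0.5))
    for n in (20, 40, 60, 80):
        print(n, "per arm -> power", round(power_two_arm(n, 0.5), 2))
```

Under these assumptions, the calculation asks for 63 patients in each arm. A trial that can only enroll half that number per arm has roughly even odds of missing a real effect of this size, which is the recruitment problem described above.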

KN: “In certain circumstances, [recruiting enough participants] can be quite problematic. And this is where the so-called external controls or synthetic controls come into play. The most common application is in oncology, and the second most common in rare diseases. In oncology, when new treatments are studied, there is a lot of interest and most cancer patients consent to take part in such studies with a hope they will be receiving the treatment. No one wants to be put in the placebo group. So in such circumstances, we have two issues. One is that we might not have enough cancer patients to randomize them into two groups, so we don’t have the [statistical] power. The second is that it might not be ethical to not give the treatment. In those cases, we may try to find what we call external controls [or synthetic controls]. We can find a set of cancer patients who are in some other place, not receiving this treatment, but are getting a similar kind of management otherwise.”

Synthetic Control Arms

Synthetic control arms (SCAs), a type of external control arm, are an innovative approach in which researchers use existing data to construct a virtual or synthetic control arm, rather than recruiting new patients for a control group. Building an SCA consists of using patient data found in pre-existing datasets, such as electronic health records (EHRs), that are de-identified, that is, stripped of any personally identifiable information (PII). Such controls are used to mimic real patients who would otherwise be recruited to the trial.

KN: “If randomization has worked well in an RCT, the characteristics of the people you find in the intervention arm will be similar to those in the control arm. You will find that the ages are very similar, the sex distribution is similar, even their biomarkers and parameters may be similar. With external controls, we try to mimic this control population. We try to find [in pre-existing datasets] the type of people who are very similar to those receiving the intervention. And because we’re trying to mimic the characteristics of real people with pre-existing data, that’s why we call this type of control group synthetic controls. The datasets where we look for those data come from various places. They can come from cancer registries, from electronic health records, from previous health surveys for common diseases, and they can come from other sources as well. [To generate a synthetic arm, we have to make sure that the data we use to represent a control population is very similar to the population that will receive the intervention]. There are statistical methods to [test comparability of the two groups]. The most common one that we use is propensity score matching, but there are other methods such as exact matching, inverse probability weighting, etc. There are a number of ways we ensure the two arms are very similar.”
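As an illustration of the propensity score matching mentioned above, the following Python sketch fits a small logistic regression that predicts trial membership from two covariates, then pairs each trial patient with the nearest unused external patient on the logit of that score. Everything here is an assumption made for the example: the data are simulated, the covariates (age and a generic biomarker) and function names are hypothetical, and the caliper of 0.2 standard deviations of the logit is a commonly cited rule of thumb. A real analysis would typically rely on a vetted statistics package.

```python
import math
import random
from statistics import mean, pstdev


def standardize(rows):
    """Scale each covariate column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def logit_score(w, x):
    """Linear predictor, i.e. the logit of the propensity score."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = 1.0 / (1.0 + math.exp(-logit_score(w, xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def match_nearest(treated, pool, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement."""
    free = set(range(len(pool)))
    pairs = []
    for t, score in enumerate(treated):
        if not free:
            break
        c = min(free, key=lambda i: abs(pool[i] - score))
        if abs(pool[c] - score) <= caliper:  # otherwise leave this patient unmatched
            pairs.append((t, c))
            free.remove(c)
    return pairs


# Simulated data (assumption): [age, biomarker] for trial and external patients.
rng = random.Random(0)
trial = [[rng.gauss(62, 8), rng.gauss(1.4, 0.3)] for _ in range(60)]
external = [[rng.gauss(55, 12), rng.gauss(1.1, 0.4)] for _ in range(500)]

X = standardize(trial + external)
y = [1] * len(trial) + [0] * len(external)
weights = fit_logistic(X, y)
scores = [logit_score(weights, x) for x in X]
caliper = 0.2 * pstdev(scores)
pairs = match_nearest(scores[:len(trial)], scores[len(trial):], caliper)

print("matched pairs:", len(pairs), "of", len(trial))
print("mean age, trial vs all external:",
      round(mean(r[0] for r in trial), 1), round(mean(r[0] for r in external), 1))
print("mean age, matched trial vs matched external:",
      round(mean(trial[t][0] for t, _ in pairs), 1),
      round(mean(external[c][0] for _, c in pairs), 1))
```

Comparing covariate means before and after matching, as the last lines do, is one simple way to check whether the two arms have become more alike. Trial patients with no external patient inside the caliper are left unmatched, which is a trade-off between closeness of the match and the number of patients retained.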

Generating an SCA is a methodical process that requires careful consideration of several factors. First, the patient population for the intervention arm needs to be defined, including its demographic and health characteristics. Next, the goal is to find patients with similar characteristics in EHRs or other real-world data (RWD) sources to use as controls. These patients should ideally come from the same time period and geographic location as those in the exposure arm, and should be representative of patients receiving the most current available treatment, if possible. The standard of care can vary significantly by geographic location and can change over time. It is therefore essential to ensure that the SCA patients are receiving the most recently recommended treatment, so that the comparison with the intervention arm is fair. The selection of controls also depends on the outcome being measured and the proposed mechanism of the drug.

KN: “Obviously we make the decision early if we’re going to use a synthetic control. The next thing we need to know is what the outcome is and if this outcome is available in the synthetic controls. If it is available then we need to know how the outcome is measured, so that we can judge if the process of outcome measurement is accurate and if this is applicable in our trial. We will then identify patients with a distribution of characteristics that’s similar [to the intervention arm]. We also need to think about the eligibility criteria very early on. For example, we might want to exclude people with liver disease. Liver disease may not be terribly well-coded in electronic health records. So the [synthetic] control arm may actually hold early onset liver disease that is not detected. Whereas in the treatment arm we may screen and exclude patients with early onset liver disease. A number of other things have to be considered. You need to think carefully about what a good comparator is and about any trade-off between what we want to measure [in the intervention arm], what is measured [in the synthetic control arm], and how to make those comparisons.”
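The screening steps described above (same region and time period, current standard of care, an outcome that is actually recorded, and eligibility exclusions) can be sketched as a simple filter over candidate records. The record fields, region names, and dates below are hypothetical and do not reflect any real EHR schema. The sketch only illustrates the order of the checks.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical, de-identified record from a real-world data source."""
    patient_id: str
    region: str
    diagnosis_date: date
    on_current_standard_of_care: bool
    outcome_recorded: bool
    has_liver_disease_code: bool


def select_candidates(records, regions, window_start, window_end):
    """Keep records that pass the basic comparability and eligibility checks."""
    selected = []
    for r in records:
        if r.region not in regions:
            continue  # different geography, possibly a different standard of care
        if not (window_start <= r.diagnosis_date <= window_end):
            continue  # outside the trial's time period
        if not r.on_current_standard_of_care:
            continue  # not a fair comparator for the intervention arm
        if not r.outcome_recorded:
            continue  # the trial outcome cannot be assessed for this patient
        # Caveat from the interview: a missing code does not prove absence of
        # disease, so this exclusion is weaker than screening in the trial arm.
        if r.has_liver_disease_code:
            continue
        selected.append(r)
    return selected


records = [
    PatientRecord("ext-001", "West Midlands", date(2022, 3, 10), True, True, False),
    PatientRecord("ext-002", "West Midlands", date(2015, 6, 1), True, True, False),
    PatientRecord("ext-003", "Ontario", date(2022, 5, 20), True, True, False),
    PatientRecord("ext-004", "West Midlands", date(2022, 8, 2), True, True, True),
    PatientRecord("ext-005", "West Midlands", date(2023, 1, 15), True, False, False),
]

selected = select_candidates(
    records,
    regions={"West Midlands"},
    window_start=date(2021, 1, 1),
    window_end=date(2023, 12, 31),
)
print([r.patient_id for r in selected])
```

Patients who pass this kind of screen are only candidates. They would still need to be matched or weighted against the intervention arm before any comparison is made.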

The Critical Role of EHR Data with Ethical and Regulatory Considerations 

The generation of SCAs relies on the availability of high-quality RWD, which is essential for accurate modeling of control groups. EHR systems are an example of an important RWD source and their evolution is therefore critical. As EHR data become more comprehensive, the quality of SCAs will improve, leading to better outcomes in clinical trials. Additionally, the use of machine learning (ML) algorithms can help identify patterns and relationships within EHR data, which can further improve the accuracy and effectiveness of SCAs. 

There are ethical considerations that must be taken into account when utilizing EHR data. The informed consent process for using EHR data in research must be carefully managed if the data used for research is not anonymized, and patients must be provided with clear information about how their data will be used and what risks may be associated with its use. Finally, the use of EHR data for synthetic controls must be carried out in accordance with relevant regulations and guidelines to ensure that research is conducted ethically and with due regard for patient safety and welfare. Regulatory considerations may vary from country to country. 

KN: “If data are anonymized, they can be used as long as they are not identifiable and as long as there is a mechanism to ensure patient confidentiality. The key thing is that [the use of patient data] has to be communicated. Information should be available very clearly for those people whose data we are making good use of. People should have the right to withdraw their data from the electronic health records for research. If we are not using anonymized data, then we need to get patient consent for those data to be used as synthetic controls.”

To support regulatory review in drug development programs, the FDA requires sponsors to consider two key factors. First, sponsors must communicate with the relevant FDA review division early in the program to determine whether an externally controlled trial is suitable. Sponsors should provide detailed information about the study design, the proposed data sources for the external control arm, the planned statistical analyses, and their plans to meet the FDA’s data submission expectations. Second, sponsors must include patient-level data for both the treatment and external control arms. If they do not own the data for the external control arm, they must ensure that agreements with the data owners allow patient-level data to be provided to the FDA.


The use of SCAs in clinical trials has numerous benefits, such as increasing the precision of some studies, reducing the cost of clinical research, and shortening time-to-market for new drugs. SCAs can help address the challenge of recruiting a sufficient number of patients in rare disease or cancer studies. Additionally, they offer an appealing alternative in cases where withholding the experimental treatment by assigning patients to a control group is unethical.

However, it is also important to acknowledge the limitations of SCAs. They rely on assumptions and modeling, which may introduce bias or uncertainty into the results. Limited data availability, poor representativeness of the available data, imperfect matching, or an inability to account for unmeasured confounding factors may all affect the outcomes of clinical trials that use SCAs.

The use of SCAs has the potential to transform clinical trial design. As technology continues to evolve, EHRs become increasingly prevalent, and the availability and quality of real-world data improve, synthetic control arms become more feasible and reliable, and the potential benefits of this approach increase. The use of artificial intelligence and ML algorithms may enhance the ability to identify appropriate synthetic controls, optimize the selection of covariates, and even create new datasets. By reducing the cost and time required for clinical trials, synthetic control arms could expedite the development of treatments for rare diseases and cancers, leading to improved outcomes and quality of life for people around the globe. The future of clinical trial design will continue to be shaped by innovative approaches such as this one, with considerable potential impact on public health.


