They are the best comparative design available when the investigator wants to evaluate an intervention that operates at a group level, manipulates the social or physical environment, or cannot be delivered to individual participants without substantial risk of contamination.

Potential for Confounding

Parallel GRTs often involve a limited number of groups randomized to each study condition. When the number of groups per condition is limited, randomization alone may not distribute potential confounders evenly across conditions. As a result, a priori matching and a priori stratification are widely recommended to help ensure balance across the study conditions on potential confounders (Campbell and Walters; Donner and Klar; Hayes and Moulton; Murray). More recently, constrained randomization has been recommended as another option for parallel GRTs (Li et al.).

Intraclass Correlation

The more challenging feature of parallel GRTs is that members of the same group usually share some physical, geographic, social, or other connection.

Solutions

The recommended solution to these challenges is to employ a priori matching, a priori stratification, or constrained randomization to balance potential confounders; to reflect the hierarchical structure of the design in the analytic plan; and to estimate the sample size for the GRT based on realistic and data-based estimates of the ICC and the other parameters indicated by the analytic plan.
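To make the role of the ICC in sample size estimation concrete, the sketch below applies the familiar design effect, 1 + (m - 1) × ICC, to a sample size computed for an individually randomized trial. It is a large-sample approximation only: it assumes equal group sizes and ignores the limited df of the group-level test, and the function names and input values are hypothetical illustrations, not figures from any trial discussed here.

```python
import math


def design_effect(members_per_group, icc):
    """Variance inflation for equal-sized groups: 1 + (m - 1) * ICC."""
    return 1.0 + (members_per_group - 1) * icc


def groups_per_condition(n_individual, members_per_group, icc):
    """Groups needed per condition, given the per-condition sample size
    that an individually randomized trial would require (assumed input)."""
    inflated_n = n_individual * design_effect(members_per_group, icc)
    return math.ceil(inflated_n / members_per_group)


# Hypothetical inputs: 200 participants per condition would suffice without
# clustering; groups have 50 members each; the ICC estimate is 0.02.
deff = design_effect(50, 0.02)
groups = groups_per_condition(200, 50, 0.02)
print(f"design effect = {deff:.2f}, groups per condition = {groups}")
```

Even a small ICC nearly doubles the required sample size here because the groups are large, which is why a realistic, data-based ICC estimate matters so much in planning.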
What is the difference between a pragmatic trial and a parallel group- or cluster-randomized trial?
What are some important references on the design and analysis of parallel GRTs?
Which one should be used?
If I randomize blocks of time, rather than groups of people, is it still a parallel group- or cluster-randomized trial?
In longer trials, it is common for participants to change groups over time. Is this a problem?
What is the impact of variation in the size of the groups or clusters that are randomized, or through which participants receive their intervention?
When families or spouses are randomized, the ICCs are often large. Why does that happen?
What is wrong with that approach?
Is there any way to avoid having to include the groups in the analysis as a random effect?
This is because nested factors must be modeled as random effects (Zucker).
Many studies seem to pick an ICC value arbitrarily for use in their power or sample size calculations. What criteria should be used for selecting an ICC for such calculations?
Can a priori matching or stratification, or constrained randomization, improve power in a parallel GRT?
What is the minimum number of groups per condition in a parallel GRT?
What is the minimum number of members per group in a parallel GRT?
Many people say that if you match or stratify a priori, you must use a matched or stratified analysis.
Some have suggested that if the ICC is small, e. Is that true?
Some have suggested testing the ICC or group-component of variance and ignoring it if it is not significantly greater than zero. Is this ok?
What is the best analytical model for a pretest-posttest parallel GRT?
First, one could analyze the posttest data, ignoring the pretest data altogether.
What about parallel GRTs that include multiple time points? How should those be analyzed?
The material on this website focuses on model-based methods. What about randomization tests? Or generalized estimating equations?

Consort statement: Extension to cluster randomised trials.
A tutorial on sample size calculation for multiple-period cluster randomized parallel, cross-over and stepped-wedge trials using the Shiny CRT Calculator.
Int J Epidemiol.
Essential ingredients and innovations in the design and analysis of group-randomized trials. Annu Rev Public Health.
How to design efficient cluster randomised trials.
An evaluation of constrained randomization for the design and analysis of group-randomized trials with binary outcomes. Statistics in Medicine.
An evaluation of constrained randomization for the design and analysis of group-randomized trials.
Review of recent methodological developments in group-randomized trials: Part 1-Design. Am J Public Health.
Review of recent methodological developments in group-randomized trials: Part 2-Analysis.
Crespi CM. Improved designs for cluster randomized trials.
Recommendations for choosing an analysis method that controls Type I error for unbalanced cluster sample designs with Gaussian outcomes.
The merits of breaking the matches: A cautionary tale.
A comparison of permutation and mixed-model regression methods for the analysis of simulated data in the context of a group-randomized trial.
Design and analysis of group-randomized trials: A review of recent methodological developments. American Journal of Public Health.
On design considerations and randomization-based inference for community intervention trials.
Zucker DM. An analysis of variance pitfall: The fixed effects analysis in a nested design. Educational and Psychological Measurement.
Randomization by cluster: Sample size requirements and analysis. American Journal of Epidemiology.
Cornfield J. Randomization by group: A formal analysis.
Methodological review showed that time-to-event outcomes are often inadequately handled in cluster randomized trials. J Clin Epidemiol.
Design and analysis of group-randomized trials in cancer: A review of current practices. Preventive Medicine.
Reporting and methodological quality of sample size calculations in cluster randomized trials could be improved: A review.
Are missing data adequately handled in cluster randomised trials? A systematic review and guidelines. Clin Trials.
Cluster randomized trials of cancer screening interventions: Are appropriate statistical methods being used? Contemporary Clinical Trials.
Impact of CONSORT extension for cluster randomised trials on quality of reporting and study methodology: Review of random sample of trials,
Internal and external validity of cluster randomised trials: Systematic review of recent trials.
Design and analysis of group-randomized trials: A review of recent practices.
Accounting for cluster randomization: A review of Primary Prevention Trials, through
A methodological review of non-therapeutic intervention trials employing cluster randomization,
Sample size estimation for stratified individual and cluster randomized trials with binary outcomes.
Stat Med.
Li J, Jung SH.

It is true that this approach will improve the fidelity of implementation. In a parallel GRT, the groups are the units of assignment and are nested within study conditions, with different groups in each condition. In an IRGT, the groups are created in the intervention condition to facilitate delivery of the intervention; those groups may be defined by their instructor or facilitator, surgeon, therapist, or other interventionist, or they may be virtual groups.
So long as the groups are nested within study conditions, they must be included in the analysis as levels of a random effect; ignoring them, or including them as levels of a fixed effect, will result in an inflated type 1 error rate. This is because nested factors must be modeled as random effects.
This explanation also offers a potential solution — if the investigator can avoid nesting groups within study conditions, the requirement to model those groups as levels of a random effect disappears. The alternative to nesting is crossing, so if it is possible to cross the levels of the grouping factor with study conditions, then the grouping factor becomes a stratification factor and the investigator is free to model the grouping factor as a random effect, as a fixed effect, or to ignore the grouping factor in the analysis.
For example, if schools are randomized to study conditions, the study is a GRT. But if students within schools are randomized to study conditions, the schools will be crossed with study conditions and we have a stratified RCT; the investigator can model the schools as a random effect, as a fixed effect, or ignore it in the analysis. As another example, if the therapists used to deliver the intervention in an IRGT also deliver an alternative intervention in the control condition, the therapists will be crossed with study condition and the investigator can model therapist as a random effect, as a fixed effect, or ignore therapist in the analysis.
In either example, the choice between modeling the grouping factor as random, as fixed, or ignoring it will depend on factors like power and generalizability.
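The inflation in the type 1 error rate that results from ignoring nested groups can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: there is no true intervention effect, group sizes are equal, the ICC is 0.10, and all of the settings (10 groups per condition, 20 members per group, 1000 replications) are hypothetical. It compares a naive member-level t-test with a t-test on group means, which is one simple analysis that respects the nesting in a balanced design.

```python
import random

Z_CRIT = 1.96         # approximate two-sided 5% critical value, member-level test
T_CRIT_GROUP = 2.101  # two-sided 5% critical value of t with 18 df (2 * (10 - 1))


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def simulate_arm(rng, n_groups, n_members, icc):
    """One condition: each group shares a random effect; total variance is 1."""
    sd_between, sd_within = icc ** 0.5, (1 - icc) ** 0.5
    arm = []
    for _ in range(n_groups):
        group_effect = rng.gauss(0, sd_between)
        arm.append([group_effect + rng.gauss(0, sd_within) for _ in range(n_members)])
    return arm


def type1_rates(reps=1000, n_groups=10, n_members=20, icc=0.10, seed=1):
    rng = random.Random(seed)
    naive_rejections = group_rejections = 0
    for _ in range(reps):
        arm0 = simulate_arm(rng, n_groups, n_members, icc)
        arm1 = simulate_arm(rng, n_groups, n_members, icc)
        # Naive analysis: pool all members and ignore the groups.
        flat0 = [y for g in arm0 for y in g]
        flat1 = [y for g in arm1 for y in g]
        naive_rejections += abs(two_sample_t(flat0, flat1)) > Z_CRIT
        # Group-level analysis: one mean per group, df based on groups.
        means0 = [mean(g) for g in arm0]
        means1 = [mean(g) for g in arm1]
        group_rejections += abs(two_sample_t(means0, means1)) > T_CRIT_GROUP
    return naive_rejections / reps, group_rejections / reps


naive_rate, group_rate = type1_rates()
print(f"naive type 1 error: {naive_rate:.3f}, group-level: {group_rate:.3f}")
```

Under these assumptions the naive test should reject far more often than the nominal 5%, while the group-level test should stay close to it. A mixed model with a random effect for group would be the more general analysis; the group-mean test is used here only because it needs no external libraries.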
The best estimate for the ICC will reflect the circumstances for the trial being planned. That estimate will be from the same target population, so that it reflects the appropriate groups or clusters. That estimate will derive from data collected for the same outcome using the same measurement methods to be used for the primary outcome in the trial being planned.
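As a minimal sketch of how such an estimate might be computed from pilot or prior data, the function below uses the one-way ANOVA estimator of the ICC for equal-sized groups, (MSB - MSW) / (MSB + (m - 1) MSW). The function name and the toy data are hypothetical; real data would usually have unequal group sizes and would be better handled with a mixed model.

```python
def anova_icc(groups):
    """One-way ANOVA estimate of the ICC for equal-sized groups.

    groups: list of lists, one inner list of outcome values per group.
    """
    k = len(groups)
    m = len(groups[0])
    if k < 2 or m < 2 or any(len(g) != m for g in groups):
        raise ValueError("need at least 2 groups of equal size >= 2")
    grand_mean = sum(sum(g) for g in groups) / (k * m)
    group_means = [sum(g) / m for g in groups]
    # Mean square between groups and mean square within groups.
    msb = m * sum((gm - grand_mean) ** 2 for gm in group_means) / (k - 1)
    msw = sum(
        (y - gm) ** 2 for g, gm in zip(groups, group_means) for y in g
    ) / (k * (m - 1))
    return (msb - msw) / (msb + (m - 1) * msw)


# Toy data (hypothetical): three groups of three members each.
icc = anova_icc([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"estimated ICC = {icc:.3f}")
```

The toy groups differ far more from one another than members differ within a group, so the estimate is large; ICCs in real trials are typically much smaller. Note that this estimator can return a negative value when groups are more alike than chance would predict.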
For example, if planning a trial to improve servings of fruits and vegetables in inner-city third graders, it would be important to get an ICC estimate for servings of fruits and vegetables, measured in the same way as servings would be measured in the trial being planned, from third-graders in inner-city schools like the schools that would be recruited for the trial being planned. As such, it is important to choose covariates carefully.
The best covariates will be related to the outcome and unevenly distributed between the study conditions or among the groups or clusters randomized to the study conditions.
When a priori matching or stratification is used for balance, the matching or stratification factor may be included in the analysis of intervention effects, but that is not required, and it may be inefficient to do so. It is not required because the type 1 error rate is unaffected when the matching or stratification factor is ignored in the analysis of intervention effects (Diehr et al.).
Both procedures reduce the df available for the test of the intervention effect, and if the number of df is limited, the unmatched or unstratified analysis may be more powerful than the matched or stratified analysis.
The choice of whether to include the matching or stratification factor in the analysis should be made a priori based on sample size calculations comparing the matched or stratified analysis to the unmatched or unstratified analysis. The choice between a priori matching and a priori stratification for balance should be guided by whether the investigator anticipates doing analyses that do not involve intervention effects.
Donner et al. Stratification with strata of size four avoids this problem and improves efficiency almost as much as matching. For this reason, stratification with strata of size four is a prudent strategy for balancing potential confounders across study conditions because it is almost as efficient as matching, and it does not limit the range of analyses that can be applied to the data.
These studies are individually randomized group treatment trials (IRGTs), sometimes called partially clustered designs. While these trials are common, most investigators do not recognize the implications of this design. There may or may not be a similar structure in the control condition, depending on the nature of the control condition.
Whether it exists in one or both study conditions, the hierarchical structure requires that the positive ICC expected in the data be accounted for in the sample size estimation and in the data analysis.
Any analysis that ignores the positive ICC, or what may be limited df, will have an inflated type 1 error rate (Baldwin et al.). The recommended solution to these challenges is like the solution recommended for GRTs. It is important to employ a priori matching or stratification to balance potential confounders if the number of assignment units is limited, to reflect the hierarchical or partially hierarchical structure of the design in the analytic plan, and to estimate the sample size for the IRGT based on realistic and data-based estimates of the ICC and the other parameters indicated by the analytic plan.
In public health and medicine, ICCs in group- or cluster-randomized trials are often small, usually ranging from 0. The prudent course is to reflect all nested factors as random effects and to plan the study to have sufficient power given a proper analysis.
That is another tempting strategy that can risk an inflated type I error rate. The standard error for the variance component is not well estimated when the value is close to zero, and if the df are limited, the power will be limited. As such, it is likely that the result will suggest that the ICC or variance component is negligible, even though ignoring it will inflate the type I error rate.

Features and Uses

Groups or Common Interventionist or Facilitator

An IRGT is a randomized trial in which participants in one or more study conditions receive at least some of their treatment in groups or through a common interventionist or facilitator.
This design is common in surgical trials, where each surgeon operates on multiple patients (Cook et al.).

Appropriate Uses

IRGTs can be employed in a wide variety of settings and populations to address a wide variety of research questions. They are an appropriate design when the investigator wants to evaluate an intervention that involves at least one component delivered in a group format, when it is necessary to use a limited number of interventionists or facilitators so that each one interacts with multiple participants, or when it is necessary to have participants interact with one another in a virtual environment.
Solutions

The recommended solution to these challenges is like the solution proposed for GRTs.

What are some important references on the design and analysis of IRGTs?
Which one should be used?
In longer trials, it is common for participants to change groups over time. Is this a problem?
What is the impact of variation in the size of the groups or clusters that are randomized, or through which participants receive their intervention?
If I randomize individuals to conditions, but they receive their treatments in small groups led by a trained instructor or facilitator, how many such small groups do I need?
What is wrong with that approach?
Is there any way to avoid having to include the groups in the analysis as a random effect?
This is because nested factors must be modeled as random effects (Zucker).
Many studies seem to pick an ICC value arbitrarily for use in their power or sample size calculations.
In order to demonstrate the postulated positive effect of the combination therapy, women were enrolled in the trial. In trials with randomized and controlled design e. The patients in the control group receive either another treatment or a placebo.
The ALIFE trial is a three-armed parallel group study to establish whether the combination treatment or the monotherapy improves the live birth rate compared with placebo.
The use of placebos in clinical trials is ethically justified provided that no standard treatment is available. If comparison with placebo is indispensable for methodological reasons, it can be justified as long as patients will not be harmed. That is the case, for example, if the study is of only short duration or if the severity of disease permits postponement or interruption of treatment.
As in any study of human subjects, the study population of an RCT must be clearly defined. Precise inclusion and exclusion criteria are elaborated to ensure that only eligible patients are recruited. The study participants must be homogeneous with regard to demographic characteristics, disease state, and possibly even comorbidity and comedication. This can be achieved by standardization of, for example, the time(s) of intake of the study medication and the methods used to measure clinical parameters, but most important for comparability is randomization of the participants.
In RCTs the patients are randomly assigned to the different study groups. This is intended to ensure that all potential confounding factors are divided equally among the groups that will later be compared (structural equivalence). Only if the groups are structurally equivalent can any differences in the results be attributed to a treatment effect rather than the influence of confounders. If the confounders are known, structural equivalence of the patient groups can be attained by stratified randomization (Box).
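Stratified randomization as described above can be sketched with permuted blocks generated separately within each stratum. This is one common way to implement it, not necessarily the procedure used in the ALIFE study; the stratum labels, block size, arm labels, and function name below are all hypothetical choices made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_block_randomization(patients, arms, block_size, seed=None):
    """Assign patients to arms using permuted blocks within each stratum.

    patients: list of (patient_id, stratum) in order of enrollment.
    block_size must be a multiple of the number of arms.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    pending = defaultdict(list)  # stratum -> unused assignments in current block
    allocation = {}
    for patient_id, stratum in patients:
        if not pending[stratum]:
            # Start a new balanced block for this stratum and shuffle it.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
            pending[stratum] = block
        allocation[patient_id] = pending[stratum].pop()
    return allocation


# Hypothetical example: 12 patients in two age strata, three arms.
arms = ["aspirin plus heparin", "aspirin alone", "placebo"]
patients = [(i, "under 35" if i % 2 == 0 else "35 or older") for i in range(12)]
allocation = stratified_block_randomization(patients, arms, block_size=3, seed=42)

per_stratum = Counter((stratum, allocation[pid]) for pid, stratum in patients)
for (stratum, arm), n in sorted(per_stratum.items()):
    print(f"{stratum:12s} {arm:22s} {n}")
```

Each completed block contains every arm equally often, so the arms stay balanced within each stratum. In practice the allocation list would be generated and held centrally so that the next assignment cannot be predicted by study staff.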
In the ALIFE study the patients were randomly assigned to the three treatment groups. If patients were allocated to treatment groups by conscious or unconscious selection for prognosis-related characteristics, rather than randomly, this could lead to biased treatment comparison and distorted results (selection bias).
The assignment to study groups must not be in any way predictable. Predictability of group allocation is avoided by ensuring the study staff are unaware to which treatment the next patient will be allotted. Alternating assignment to the different treatments is not truly random.
Bias is avoided not only by randomization but also by blinding. A study may be double blind, single blind, or open. In a double-blind study neither patient nor study physician knows to which treatment the patient has been assigned. Double-blind studies are advantageous if knowledge of the treatment might influence the course and therefore the results of the study. Thus it is particularly important that the study physician is blinded to treatment if the endpoints are subjective.
Blinding of patients to their treatment is important, for example, if their attitude could potentially affect their reliability in taking the test medication (compliance) or even their response to treatment.
If only one party, either patient or study physician, is blinded to the treatment, the study is called single blind; a study with no blinding is described as open.
The highest possible degree of blinding should be chosen to minimize bias. The data subjected to statistical analysis in RCTs are those gathered from patient populations defined in the study protocol.
The primary population for analysis is the so-called intention-to-treat (ITT) population, comprising all randomized patients. In analysis according to the ITT principle, patients are allocated to the group to which they were randomized, thus retaining the advantages of randomization such as structural equivalence. Because the ITT population includes all patients who were randomized, the data for analysis include some patients whose treatment was interrupted, prematurely discontinued, or did not take place at all.
The analysis strategy for ITT data is therefore conservative. Many studies define a modified ITT (mITT) population, which may, for example, comprise the patients who received at least a defined amount of study treatment.
An alternative strategy is to restrict analysis to the data from the per-protocol (PP) population. Patients in whom study conduct deviated from the protocol are excluded from analysis. These so-called protocol violations include, for example, failure concerning the application of inclusion or exclusion criteria and incorrect administration of the study treatment.
In analysis according to the PP principle, patients are allocated to the treatment groups depending on the treatment they actually received. Because the PP population includes only those patients who completed the study according to the protocol, the results may be distorted in favor of the investigational intervention. To assess the robustness of the study findings, PP evaluation is carried out as a sensitivity analysis if the ITT population is the patient population for the primary efficacy analysis. If the results of PP and ITT evaluation of the primary endpoint are very similar, they can be regarded as reliable.
Should this not be the case, the possible reasons for the discrepancy between the results of the ITT and PP analyses must be discussed in the results section of the publication. The rates of live births in the three treatment groups did not differ significantly (Table 1).
Analysis according to the PP principle confirmed this finding. Neither aspirin and heparin combined nor aspirin alone were demonstrated to have a greater effect than placebo on the live birth rate. Relative risk and absolute difference were calculated for the comparisons between aspirin plus heparin and placebo and between aspirin alone and placebo.
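The ITT and PP populations, and the relative risk and absolute difference computed from them, can be sketched as follows. The patient records are invented purely for illustration and are not the ALIFE data; the `Patient` fields and helper names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    randomized_arm: str       # arm assigned at randomization
    received_arm: str         # treatment actually received
    protocol_violation: bool
    live_birth: bool


def event_counts(patients, arm_of):
    """Return {arm: (events, total)}, grouping patients by arm_of(patient)."""
    counts = {}
    for p in patients:
        events, total = counts.get(arm_of(p), (0, 0))
        counts[arm_of(p)] = (events + int(p.live_birth), total + 1)
    return counts


def relative_risk(treated, control):
    (e1, n1), (e0, n0) = treated, control
    return (e1 / n1) / (e0 / n0)


def risk_difference(treated, control):
    (e1, n1), (e0, n0) = treated, control
    return e1 / n1 - e0 / n0


# Invented data: 10 patients per arm; 2 aspirin patients violated the protocol.
patients = (
    [Patient("aspirin", "aspirin", False, True)] * 5
    + [Patient("aspirin", "aspirin", False, False)] * 3
    + [Patient("aspirin", "none", True, True)]
    + [Patient("aspirin", "none", True, False)]
    + [Patient("placebo", "placebo", False, True)] * 5
    + [Patient("placebo", "placebo", False, False)] * 5
)

# ITT: every randomized patient, analyzed in the arm assigned at randomization.
itt = event_counts(patients, lambda p: p.randomized_arm)
# PP: protocol violators excluded, analyzed by treatment actually received.
pp = event_counts(
    [p for p in patients if not p.protocol_violation], lambda p: p.received_arm
)

itt_rr = relative_risk(itt["aspirin"], itt["placebo"])
itt_rd = risk_difference(itt["aspirin"], itt["placebo"])
pp_rr = relative_risk(pp["aspirin"], pp["placebo"])
print(f"ITT: RR = {itt_rr:.2f}, difference = {itt_rd:.2f}; PP: RR = {pp_rr:.2f}")
```

In this invented example the PP estimate is slightly more favorable to the treatment than the ITT estimate, which is the direction of distortion the text warns about. Confidence intervals and p-values are omitted to keep the sketch short.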
The p-value applies to all treatment group comparisons. Clinical trials have to be performed according to national and international regulations. The Declaration of Helsinki, first formulated by the World Medical Association and revised several times since [20], lays down fundamental ethical principles for research on human beings.
The aim of Good Clinical Practice (GCP) is to protect study participants and ensure high quality of study data. The International Committee of Medical Journal Editors has made registration of a clinical trial in a public registry a precondition for its publication. The professional code of conduct for physicians in Germany demands that every study in human subjects be submitted to the responsible ethics committee for approval.
The applications have to be accompanied by the study protocol, the information to be supplied to the patients, the consent form for participation, and confirmation that adequate insurance has been arranged. Trials of drugs and medical devices also have to be registered with state authorities. There are legally defined obligations to report suspected unexpected serious adverse reactions or early termination of a study, and the final study report must also be submitted.
In other words, information revealing the identity of a patient (name or initials) must be replaced by a code. Only patients who have agreed in advance to the recording, storage, processing and dissemination of their data may participate in a clinical study. Any publication of an RCT must lucidly describe the planning, conduct, and analysis of the study. The most important aspects that have to be described in the publication are listed in Table 2. The progress of patients through an RCT and the numbers of patients whose data were analyzed can be depicted in a flow diagram (Figure).
Patient flow in a randomized controlled trial (adapted from [23]).