Chapter 2 Introduction

In this section, we will lay out the notation for the following chapters, discuss what assumptions are required for these methods to be applied, and survey the literature on covariate adjustment. While the examples and notation focus on trials with two treatment arms, most methods can be applied to trials with more than two treatment arms.


2.1 Notation and Assumptions

Let \(A\) denote a binary treatment assignment: \(A = 1\) indicates assignment to receive the treatment of interest, and \(A = 0\) indicates assignment to the control or comparator group. Let \(Y\) denote the outcome of interest, and \(X\) denote a vector of baseline covariates. If stratified randomization is used, let \(X_{S}\) denote the stratification variables, and \(X_{\bar{S}}\) denote the other baseline covariates, and \(X = (X_{S}, X_{\bar{S}})\). We assume that treatment assignment is independent of the baseline covariates, i.e. \(A \perp X\), or if stratified randomization is used, that the treatment assignment is conditionally independent of the other covariates given the stratification variables, i.e. \(A \perp\!\!\!\perp X_{\bar{S}} \vert X_{S}\). Each participant’s data is assumed to be independent, identically distributed (IID) draws from an unknown distribution \(P(X, A, Y)\).

In order to include covariate information in an analysis, we have to specify how these covariates relate to the outcome in a regression model, which models a conditional distribution of the outcome as as a function of the covariates. In most circumstances, the validity of our inference from regression models depends on how close the regression model’s specification reflects the true data generation mechanism, which is almost always unknown. This may make investigators understandably reluctant to use covariate adjusted methods, as they do not want the validity of the analysis to depend on assumptions of a correctly specified model.

What may be surprising is that there are covariate adjustment methods that provide valid estimates of the treatment effect, even if the models used to obtain these estimates are arbitrarily misspecified. Having models that more closely reflect the underlying data generating mechanism improves precision and power. Additionally, covariate adjusted estimators also can have precision that is equal or better than the unadjusted estimators.


2.2 The Analysis of Covariance (ANCOVA)

The ANCOVA is perhaps the best known method for covariate adjustment with a continuous outcome. The simplest version of an ANCOVA is a linear regression of the final outcome \(Y\) on the baseline covariate \(X\), usually the outcome measured at baseline, and treatment assignment \(A\):

\[Y_{i} = \beta_{0} + \beta_{X}X_{i} + \beta_{A}A_{i} + \epsilon_{i}\]

This regression model assumes that the final outcome is linearly related to the outcome at baseline, with an additive effect of treatment. Even if the true relationship between the final outcome and the baseline covariate is nonlinear, includes interactions, and includes other variables that are omitted from the model, the ANCOVA estimate will provide a consistent estimate of the treatment effect that is as precise or even more precise than an unadjusted estimate. An outcome model that more accurately reflects the data generating mechanism will improve precision, but is not required for a consistent estimate of the outcome.

Linear regression involves the conditional mean, but we are interested in a marginal treatment effect. In order to obtain a marginal treatment effect, we need to marginalize (i.e. average over) the variation in the covariates:

\[\hat{\mu}_{1} = \hat{E}[Y \vert A = 1] = \frac{1}{n} \sum_{i=1}^{n} \hat{E}[Y_{i} \vert A_{i} = 1, X_{i}]\]

In the case of the simple ANCOVA model with only one covariate, the marginal mean under treatment would be:

\[\hat{\mu}_{1} = \frac{1}{n} \sum_{i=1}^{n} \hat{E}[Y_{i} \vert A_{i} = 1, X_{i}] = \frac{1}{n} \sum_{i=1}^{n} (\hat{\beta}_{0} + \hat{\beta}_{X}X_{i} + \hat{\beta}_{A}) = \hat{\beta}_{0} + \hat{\beta}_{X}\bar{X}_{n} + \hat{\beta}_{A}\] Note that this utilizes the covariate data from the entire sample, not just those who were assigned to receive the active treatment. The marginal mean under control would be:

\[\hat{\mu}_{0} = \frac{1}{n} \sum_{i=1}^{n} \hat{E}[Y_{i} \vert A_{i} = 0, X_{i}] = \frac{1}{n} \sum_{i=1}^{n} (\hat{\beta}_{0} + \hat{\beta}_{X}X_{i}) = \hat{\beta}_{0} + \hat{\beta}_{X}\bar{X}_{n}\] Likewise, our marginal estimate of the mean under control utilizes the covariate data from the entire sample, not just those who were assigned to receive the control treatment. Our estimate of the average treatment effect would be the contrast between these quantities:

\[\hat{\theta}_{ATE} = \hat{\mu}_{1} - \hat{\mu}_{0} = (\hat{\beta}_{0} + \hat{\beta}_{X}\bar{X}_{n} + \hat{\beta}_{A}) - (\hat{\beta}_{0} + \hat{\beta}_{X}\bar{X}_{n}) = \hat{\beta}_{A}\]

The procedure for gives us the same result as we would have otherwise used to estimate the treatment effect: the regression coefficient associated with treatment. When there are no treatment-by-covariate interactions and the model does not use a nonlinear link function, the conditional effect and the marginal effect coincide. However, the approach outlined here is applicable to a wide range of outcome models, including those using link functions, like logistic regression, or models that include treatment-by-covariate interactions. Computing the variance or standard errors of the estimate involves the use of the nonparametric bootstrap or appropriate robust standard errors for the study design and model specification.


2.3 Generalizing the ANCOVA: G-computation

In general, if we have a regression model \(\hat{f}(X, A) = \hat{E}[Y \vert A, X]\), we can obtain the average treatment effect by respectively computing the marginal mean under treatment and control, and taking a contrast between these quantities:

\[\hat{\theta}_{ATE} = \hat{\mu_{1}} - \hat{\mu_{0}} = \left(\frac{1}{n} \sum_{i=1}^{n} \hat{f}(X_{i}, A_{i} = 1) \right) - \left(\frac{1}{n} \sum_{i=1}^{n} \hat{f}(X_{i}, A_{i} = 0) \right)\] This approach is known as G-computation or the standardization estimator. This allows us to obtain the average treatment effect for regression models which may include interactions or use a nonlinear link function, such as logistic regression.


2.4 Combining Covariate Adjustment and Stratified Randomization

Stratified randomization is a way of addressing the potential for baseline imbalance in covariates at the design phase by having a separate randomization sequence for each stratum of baseline variables. Each randomization sequence is constructed by randomly choosing a block size (e.g. 1, 2, or 4 participants per treatment arm), and then permuting the order of the treatment labels in each block. For example, below is a sequence of three blocks, the first with size 3, the second and third with size 2.

[ABBABA][BABA][BAAB]

Randomly choosing block sizes makes it difficult to anticipate when blocks begin and end, and thus which treatment the next participant will receive. Balance between treatment arms within a stratum is achieved whenever a block is completed. Stratifying randomization on too many variables can lead to many incomplete blocks, and thus poor within-stratum balance in treatment allocation. Stratified randomization generally provides the greatest benefit in efficiency when there is high variability between strata and lower variability within strata.

When stratified randomization is utilized, it is generally recommended to adjust for the stratification variables in the analysis. Unfortunately, this may not be done in practice. Additional improvements in efficiency can be realized using a variance estimator specifically designed for stratified randomization designs (Wang et al. 2021).

References

Wang, Bingkai, Ryoko Susukida, Ramin Mojtabai, Masoumeh Amin-Esmaeili, and Michael Rosenblum. 2021. “Model-Robust Inference for Clinical Trials That Improve Precision by Stratified Randomization and Covariate Adjustment.” Journal of the American Statistical Association 118 (542): 1152–63. https://doi.org/10.1080/01621459.2021.1981338.