Introduction

The cfcausal package implements weighted conformal inference-based procedures for counterfactuals and individual treatment effects proposed in Lei and Candès (2020a) and extended in Lei and Candès (2020b). It includes both the weighted split conformal inference described and the weighted cross-validation+, a weighted variant of Barber et al. (2019). For each type of conformal inference, both conformalized quantile regression (CQR) (Romano, Patterson, and Candès 2019) and standard conformal inference are supported. It provides a pool of convenient learners and allows flexible user-defined learners for conditional mean and quantiles. In this tutorial we illustrate the usage of conformalCf for counterfactual inference and conformalIte for inference of individual treatment effects.

# Install the "cfcausal" package from github.
# if (!require("devtools")){
#     install.packages("devtools")
# }
# devtools::install_github("lihualei71/cfcausal")
library("cfcausal")

conformalCf: inference of counterfactuals

We illustrate the usage of conformalCf using the numerical study in Section 3.6 of Lei and Candès (2020a). Here we choose the simplest scenario with \(n = 1000, d = 10\), homoscedastic errors and independent covariates. We summarize the data-generating process below.

  • The covariates \(X_{ij}\) are i.i.d. generated from \(\mathrm{Unif}([0, 1])\).

  • \(Y_{i}(0)\equiv 0\)

  • \(\mathbb{E}[Y_{i}(1) \mid X_{i}] = f(X_{i1})f(X_{i2})\) where \[f(x) = \frac{2}{1 + \exp(-12(x - 0.5))}\]

  • \(Y_{i}(1) = \mathbb{E}[Y_{i}(1) \mid X_{i}] + \epsilon_{i}\) where \(\epsilon_{i}\stackrel{i.i.d.}{\sim}N(0, 1)\).

  • The propensity score \(e(x) \triangleq P(T_{i} = 1\mid X_{i} = x)\) is set as follows: \[e(x) = \frac{1}{4}(1 + \beta_{2, 4}(x),\] where \(\beta_{2, 4}(x)\) is the cdf of the Beta-distribution with parameters \(2\) and \(4\).

The outcome variable Y used by conformalCf should be a mixture of observed values and missing values while the covariate matrix X should have no missing values. It can handle general missing value problems with ignorable missing mechanisms. The inference of counterfactuals under the potential outcome framework with ignorable treatment assignment is thus a special case.

# Generate data
set.seed(2020)
genY <- function(X){
    2 / (1 + exp(-12 * (X[, 1] - 0.5))) * 2 / (1 + exp(-12 * (X[, 2] - 0.5))) + rnorm(n)
}
n <- 1000
d <- 10
X <- matrix(runif(n * d), nrow = n, ncol = d)
Y <- genY(X)
ps <- (1 + pbeta(X[, 1], 2, 4)) / 4
T <- as.numeric(ps < runif(n))
Y[!T] <- NA
summary(Y)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
## -2.5271 -0.3099  0.6025  0.8511  1.7124  5.8222     427

Let \((X, Y)\) denote a generic observation and \(T\) denote the indicator of \(Y\) being observed. conformalCf can produce intervals with three types of coverage guarantees:

  • when estimand = "unconditional", \(\mathbb{P}(Y\in \hat{C}(X))\ge 1 - \alpha\);

  • when estimand = "nonmissing", \(\mathbb{P}(Y\in \hat{C}(X)\mid T = 1)\ge 1 - \alpha\);

  • when estimand = "missing", \(\mathbb{P}(Y\in \hat{C}(X)\mid T = 0)\ge 1 - \alpha\).

Throughout the tutorial, we focus on the first type of coverage guarantee. We start by using weighed split conformalized quantile regression (CQR) with the built-in quantile random forest to illustrate the usage. See ?conformalCf for a list of built-in learners.

obj <- conformalCf(X, Y, type = "CQR",
                   quantiles = c(0.05, 0.95),
                   outfun = "quantRF", useCV = FALSE)
## Loading required namespace: grf
class(obj)
## [1] "conformalSplit"

obj is a conformalSplit object that is produced by the function conformal. See ?conformal for details. To generate counterfactual intervals, we first generate \(5\) testing points.

ntest <- 5
Xtest <- matrix(runif(ntest * d), nrow = ntest, ncol = d)

Then we call the generic predict function on obj to obtain the intervals. See ?predict.conformalSplit for details.

CI <- predict(obj, Xtest, alpha = 0.1)
CI
##        lower    upper
## 1 -0.8851895 4.444696
## 2 -1.3931902 2.660598
## 3 -1.2131488 3.708961
## 4 -1.5857651 3.223314
## 5 -1.7173005 3.386824

The output is a data.frame with two columns with lower being the lower confidence bound and upper being the upper confidence bound. As a sanity check, we generate \(10000\) testing points with their outcomes and compute the empirical coverage. As expected the coverage is above \(1 - \alpha = 0.9\).

ntest_large <- 10000
Xtest_large <- matrix(runif(ntest_large * d), nrow = ntest_large, ncol = d)
Ytest_large <- genY(Xtest_large)
CI_large <- predict(obj, Xtest_large, alpha = 0.1)
mean(CI_large[, 1] <= Ytest_large & CI_large[, 2] >= Ytest_large)
## [1] 0.9166

Now we tweak the inputs of conformalCf. First we can replace split-CQR with CQR-CV+ to improve the data efficiency by setting useCV = TRUE. The default number of folds is nfolds = 10. It will be much slower than the split-CQR due to the repeated fitting.

obj <- conformalCf(X, Y, type = "CQR",
                   quantiles = c(0.05, 0.95),
                   outfun = "quantRF", useCV = TRUE,
                   nfolds = 10)
predict(obj, Xtest, alpha = 0.1)
##        lower    upper
## 1 -0.6729283 4.421649
## 2 -1.3671891 2.381733
## 3 -1.1704087 3.497659
## 4 -1.6124339 2.884751
## 5 -1.5832323 2.585830

Second, we can replace CQR with the standard conformal inference that calibrates an estimate of conditional mean. In this case, the built-in "quantRF" learner should be also be replaced by the built-in "RF" learner. See ?conformalCf for a list of available learners.

obj <- conformalCf(X, Y, type = "mean",
                   outfun = "RF", useCV = FALSE)
## Loading required namespace: randomForest
predict(obj, Xtest, alpha = 0.1)
##        lower    upper
## 1  1.1794250 4.735874
## 2 -1.5167359 2.039713
## 3 -0.7787671 2.777682
## 4 -2.2798321 1.276617
## 5 -1.9231280 1.633321

Third, conformalCf produces two-sided intervals under the default setting. One-sided intervals can be produced by changing the argument side to above and below. In both cases, the argument quantiles should be a scalar.

obj <- conformalCf(X, Y, type = "CQR", side = "above",
                   quantiles = 0.95,
                   outfun = "quantRF", useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
##   lower    upper
## 1  -Inf 3.944405
## 2  -Inf 1.166249
## 3  -Inf 2.816787
## 4  -Inf 1.482637
## 5  -Inf 1.550704
obj <- conformalCf(X, Y, type = "CQR", side = "below",
                   quantiles = 0.05,
                   outfun = "quantRF", useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
##        lower upper
## 1 -0.3188462   Inf
## 2 -1.1976574   Inf
## 3 -1.1758072   Inf
## 4 -1.2536275   Inf
## 5 -1.1932263   Inf

Next, we can replace the built-in learners with any user-defined function that satisfies certain minimal requirements. For CQR, outfun should have at least four inputs: Y for the outcome, X for the covariates, Xtest for the covariates of testing points and quantiles that is either a vector of length 2 or a scalar, depending on the argument side.The output of must be a matrix with two columns giving the conditional quantile estimates when is a vector of length 2; otherwise, it must be a vector giving the conditional quantile estimate. Here we re-define the quantile random forest from scratch. For this purpose we need to install grf package (Tibshirani, Athey, and Wager (2019)).

# Install grf package
if (!require("grf")){
    install.packages("grf")
}
## Loading required package: grf
# User-defined quantile random forest
quantRF <- function(Y, X, Xtest, quantiles, ...){
    fit <- grf::quantile_forest(X, Y, quantiles = quantiles, ...)
    res <- predict(fit, Xtest, quantiles = quantiles)$predictions
    if (length(quantiles) == 1){
        res <- as.numeric(res)
    } else {
        res <- as.matrix(res)
    }
    return(res)
}
# conformalCf with user-defined quantRF
obj <- conformalCf(X, Y, type = "CQR",
                   quantiles = c(0.05, 0.95),
                   outfun = quantRF, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
##       lower    upper
## 1 -1.013241 4.277715
## 2 -1.629999 3.206267
## 3 -1.629999 3.782109
## 4 -2.293557 3.762190
## 5 -2.042822 3.575441

For standard conformal inference, outfun should have at least three inputs: Y for the outcome, X for the covariates, and Xtest for the covariates of testing points. The output of must be a vector giving the conditional quantile estimate. Here we define a linear learner.

linearReg <- function(Y, X, Xtest){
    X <- as.data.frame(X)
    Xtest <- as.data.frame(Xtest)
    data <- data.frame(Y = Y, X)
    fit <- lm(Y ~ ., data = data)
    as.numeric(predict(fit, Xtest))
}
obj <- conformalCf(X, Y, type = "mean", 
                   outfun = linearReg, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
##        lower    upper
## 1  0.2000005 4.341969
## 2 -2.2590300 1.882938
## 3 -1.7579230 2.384045
## 4 -2.4456027 1.696365
## 5 -2.9286439 1.213324

Finally, we can replace the built-in propensity score method with any user-defined function that satisfies certain minimal requirements. Specifically psfun should have at least three inputs: Y for the binary outcome, X for the covariates, Xtest for the covariates of testing points. The output of must be a vector of predicted probabilities. Here we define a learner via logistic regression.

logitReg <- function(Y, X, Xtest, ...){
    X <- as.data.frame(X)
    Xtest <- as.data.frame(Xtest)
    data <- data.frame(Y = Y, X)
    fit <- glm(Y ~ ., data = data, family = "binomial", ...)
    as.numeric(predict(fit, Xtest, type = "response"))
}
obj <- conformalCf(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
                   outfun = "quantRF", psfun = logitReg, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
##        lower    upper
## 1 -0.8949307 4.474130
## 2 -1.6331808 3.038276
## 3 -1.1362423 3.699452
## 4 -1.6357136 3.495265
## 5 -1.8677126 3.038276

conformalIte: inference of individual treatment effects

We illustrate the usage of conformalCf using the same numerical example as above except that \(Y(0)\) is generated as a standard normal variable.

# Generate training data
set.seed(2020)
genY <- function(X){
    2 / (1 + exp(-12 * (X[, 1] - 0.5))) * 2 / (1 + exp(-12 * (X[, 2] - 0.5))) + rnorm(n)
}
n <- 1000
d <- 10
X <- matrix(runif(n * d), nrow = n, ncol = d)
Y1 <- genY(X)
Y0 <- rnorm(n)
ps <- (1 + pbeta(X[, 1], 2, 4)) / 4
T <- as.numeric(ps < runif(n))
Y <- ifelse(T == 1, Y1, Y0)

# Generate testing data
ntest <- 5
Xtest <- matrix(runif(ntest * d), nrow = ntest, ncol = d)
pstest <- (1 + pbeta(Xtest[, 1], 2, 4)) / 4
Ttest <- as.numeric(pstest < runif(ntest))
Y1test <- genY(Xtest)
Y0test <- rnorm(ntest)
Ytest <- ifelse(Ttest == 1, Y1test, Y0test)

conformalIte supports four algorithms: the nested approach with exact and inexact calibration for cases with both potential outcomes missing, the naive approach for cases with both potential outcomes missing and the counterfactual inference for cases with only one potential outcome missing. The output of conformalIte is a function that outputs the interval estimates on a given dataset. When algo = "nest" or "naive", it take a single input X; when algo = "counterfactual", it takes three inputs X, Y and T.

# Inexact nested method
CIfun <- conformalIte(X, Y, T, alpha = 0.1, algo = "nest", exact = FALSE, type = "CQR",
                      quantiles = c(0.05, 0.95), outfun = "quantRF", useCV = FALSE)
CIfun(Xtest)
##       lower    upper
## 1 -1.999178 3.985715
## 2 -2.169225 2.937253
## 3 -1.674621 4.120571
## 4 -1.618675 3.973736
## 5 -2.210159 3.077535
# Exact nested method
CIfun <- conformalIte(X, Y, T, alpha = 0.1, algo = "nest", exact = TRUE, type = "CQR",
                      quantiles = c(0.05, 0.95), outfun = "quantRF",  useCV = FALSE)
CIfun(Xtest)
##       lower    upper
## 1 -3.900474 5.701648
## 2 -4.104725 5.233109
## 3 -3.555527 7.109269
## 4 -3.706492 5.890286
## 5 -4.213421 5.004732
# naive method
CIfun <- conformalIte(X, Y, T, alpha = 0.1, algo = "naive", type = "CQR",
                      quantiles = c(0.05, 0.95), outfun = "quantRF",  useCV = FALSE)
CIfun(Xtest)
##       lower    upper
## 1 -3.486724 6.475139
## 2 -4.114607 5.295473
## 3 -3.303448 6.902150
## 4 -3.613348 6.254615
## 5 -3.992367 5.943778
# counterfactual method, Y and T needs to be observed
CIfun <- conformalIte(X, Y, T, alpha = 0.1, algo = "counterfactual", type = "CQR",
                      quantiles = c(0.05, 0.95), outfun = "quantRF",  useCV = FALSE)
CIfun(Xtest, Ytest, Ttest)
##        lower    upper
## 1 -2.4026089 2.885503
## 2 -1.4348278 2.344889
## 3 -0.7435855 4.744477
## 4  1.4073219 4.756660
## 5 -0.5229815 3.216475

References

Barber, Rina Foygel, Emmanuel J Candès, Aaditya Ramdas, and Ryan J Tibshirani. 2019. “Predictive Inference with the Jackknife+.” arXiv Preprint arXiv:1905.02928.

Lei, Lihua, and Emmanuel Candès. 2020a. “Conformal Inference of Counterfactuals and Individual Treatment Effects.” arXiv Preprint arXiv:2006.06138.

———. 2020b. “Theory of Weighted Conformal Inference.” Unpublished Manuscript.

Romano, Yaniv, Evan Patterson, and Emmanuel Candès. 2019. “Conformalized Quantile Regression.” In Advances in Neural Information Processing Systems, 3538–48.

Tibshirani, Julie, Susan Athey, and Stefan Wager. 2019. Grf: Generalized Random Forests. https://CRAN.R-project.org/package=grf.