cfcausal
packagecfcausal_demo.Rmd
The cfcausal
package implements weighted conformal inference-based procedures for counterfactuals and individual treatment effects proposed in Lei and Candès (2020a) and extended in Lei and Candès (2020b). It includes both the weighted split conformal inference described and the weighted cross-validation+, a weighted variant of Barber et al. (2019). For each type of conformal inference, both conformalized quantile regression (CQR) (Romano, Patterson, and Candès 2019) and standard conformal inference are supported. It provides a pool of convenient learners and allows flexible user-defined learners for conditional mean and quantiles. In this tutorial we illustrate the usage of conformalCf
for counterfactual inference and conformalIte
for inference of individual treatment effects.
# Install the "cfcausal" package from github.
# if (!require("devtools")){
# install.packages("devtools")
# }
# devtools::install_github("lihualei71/cfcausal")
library("cfcausal")
conformalCf
: inference of counterfactuals
We illustrate the usage of conformalCf
using the numerical study in Section 3.6 of Lei and Candès (2020a). Here we choose the simplest scenario with \(n = 1000, d = 10\), homoscedastic errors and independent covariates. We summarize the data-generating process below.
The covariates \(X_{ij}\) are i.i.d. generated from \(\mathrm{Unif}([0, 1])\).
\(Y_{i}(0)\equiv 0\)
\(\mathbb{E}[Y_{i}(1) \mid X_{i}] = f(X_{i1})f(X_{i2})\) where \[f(x) = \frac{2}{1 + \exp(-12(x - 0.5))}\]
\(Y_{i}(1) = \mathbb{E}[Y_{i}(1) \mid X_{i}] + \epsilon_{i}\) where \(\epsilon_{i}\stackrel{i.i.d.}{\sim}N(0, 1)\).
The propensity score \(e(x) \triangleq P(T_{i} = 1\mid X_{i} = x)\) is set as follows: \[e(x) = \frac{1}{4}(1 + \beta_{2, 4}(x),\] where \(\beta_{2, 4}(x)\) is the cdf of the Beta-distribution with parameters \(2\) and \(4\).
The outcome variable Y
used by conformalCf
should be a mixture of observed values and missing values while the covariate matrix X
should have no missing values. It can handle general missing value problems with ignorable missing mechanisms. The inference of counterfactuals under the potential outcome framework with ignorable treatment assignment is thus a special case.
# Generate data
set.seed(2020)
genY <- function(X){
2 / (1 + exp(-12 * (X[, 1] - 0.5))) * 2 / (1 + exp(-12 * (X[, 2] - 0.5))) + rnorm(n)
}
n <- 1000
d <- 10
X <- matrix(runif(n * d), nrow = n, ncol = d)
Y <- genY(X)
ps <- (1 + pbeta(X[, 1], 2, 4)) / 4
T <- as.numeric(ps < runif(n))
Y[!T] <- NA
summary(Y)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -2.5271 -0.3099 0.6025 0.8511 1.7124 5.8222 427
Let \((X, Y)\) denote a generic observation and \(T\) denote the indicator of \(Y\) being observed. conformalCf
can produce intervals with three types of coverage guarantees:
when estimand = "unconditional"
, \(\mathbb{P}(Y\in \hat{C}(X))\ge 1 - \alpha\);
when estimand = "nonmissing"
, \(\mathbb{P}(Y\in \hat{C}(X)\mid T = 1)\ge 1 - \alpha\);
when estimand = "missing"
, \(\mathbb{P}(Y\in \hat{C}(X)\mid T = 0)\ge 1 - \alpha\).
Throughout the tutorial, we focus on the first type of coverage guarantee. We start by using weighed split conformalized quantile regression (CQR) with the built-in quantile random forest to illustrate the usage. See ?conformalCf
for a list of built-in learners.
obj <- conformalCf(X, Y, type = "CQR",
quantiles = c(0.05, 0.95),
outfun = "quantRF", useCV = FALSE)
## Loading required namespace: grf
class(obj)
## [1] "conformalSplit"
obj
is a conformalSplit
object that is produced by the function conformal
. See ?conformal
for details. To generate counterfactual intervals, we first generate \(5\) testing points.
Then we call the generic predict
function on obj
to obtain the intervals. See ?predict.conformalSplit
for details.
CI <- predict(obj, Xtest, alpha = 0.1)
CI
## lower upper
## 1 -0.8851895 4.444696
## 2 -1.3931902 2.660598
## 3 -1.2131488 3.708961
## 4 -1.5857651 3.223314
## 5 -1.7173005 3.386824
The output is a data.frame with two columns with lower
being the lower confidence bound and upper
being the upper confidence bound. As a sanity check, we generate \(10000\) testing points with their outcomes and compute the empirical coverage. As expected the coverage is above \(1 - \alpha = 0.9\).
ntest_large <- 10000
Xtest_large <- matrix(runif(ntest_large * d), nrow = ntest_large, ncol = d)
Ytest_large <- genY(Xtest_large)
CI_large <- predict(obj, Xtest_large, alpha = 0.1)
mean(CI_large[, 1] <= Ytest_large & CI_large[, 2] >= Ytest_large)
## [1] 0.9166
Now we tweak the inputs of conformalCf
. First we can replace split-CQR with CQR-CV+ to improve the data efficiency by setting useCV = TRUE
. The default number of folds is nfolds = 10
. It will be much slower than the split-CQR due to the repeated fitting.
obj <- conformalCf(X, Y, type = "CQR",
quantiles = c(0.05, 0.95),
outfun = "quantRF", useCV = TRUE,
nfolds = 10)
predict(obj, Xtest, alpha = 0.1)
## lower upper
## 1 -0.6729283 4.421649
## 2 -1.3671891 2.381733
## 3 -1.1704087 3.497659
## 4 -1.6124339 2.884751
## 5 -1.5832323 2.585830
Second, we can replace CQR with the standard conformal inference that calibrates an estimate of conditional mean. In this case, the built-in "quantRF"
learner should be also be replaced by the built-in "RF"
learner. See ?conformalCf
for a list of available learners.
obj <- conformalCf(X, Y, type = "mean",
outfun = "RF", useCV = FALSE)
## Loading required namespace: randomForest
predict(obj, Xtest, alpha = 0.1)
## lower upper
## 1 1.1794250 4.735874
## 2 -1.5167359 2.039713
## 3 -0.7787671 2.777682
## 4 -2.2798321 1.276617
## 5 -1.9231280 1.633321
Third, conformalCf
produces two-sided intervals under the default setting. One-sided intervals can be produced by changing the argument side
to above
and below
. In both cases, the argument quantiles
should be a scalar.
obj <- conformalCf(X, Y, type = "CQR", side = "above",
quantiles = 0.95,
outfun = "quantRF", useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
## lower upper
## 1 -Inf 3.944405
## 2 -Inf 1.166249
## 3 -Inf 2.816787
## 4 -Inf 1.482637
## 5 -Inf 1.550704
obj <- conformalCf(X, Y, type = "CQR", side = "below",
quantiles = 0.05,
outfun = "quantRF", useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
## lower upper
## 1 -0.3188462 Inf
## 2 -1.1976574 Inf
## 3 -1.1758072 Inf
## 4 -1.2536275 Inf
## 5 -1.1932263 Inf
Next, we can replace the built-in learners with any user-defined function that satisfies certain minimal requirements. For CQR, outfun
should have at least four inputs: Y
for the outcome, X
for the covariates, Xtest
for the covariates of testing points and quantiles
that is either a vector of length 2 or a scalar, depending on the argument side
.The output of must be a matrix with two columns giving the conditional quantile estimates when is a vector of length 2; otherwise, it must be a vector giving the conditional quantile estimate. Here we re-define the quantile random forest from scratch. For this purpose we need to install grf
package (Tibshirani, Athey, and Wager (2019)).
# Install grf package
if (!require("grf")){
install.packages("grf")
}
## Loading required package: grf
# User-defined quantile random forest
quantRF <- function(Y, X, Xtest, quantiles, ...){
fit <- grf::quantile_forest(X, Y, quantiles = quantiles, ...)
res <- predict(fit, Xtest, quantiles = quantiles)$predictions
if (length(quantiles) == 1){
res <- as.numeric(res)
} else {
res <- as.matrix(res)
}
return(res)
}
# conformalCf with user-defined quantRF
obj <- conformalCf(X, Y, type = "CQR",
quantiles = c(0.05, 0.95),
outfun = quantRF, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
## lower upper
## 1 -1.013241 4.277715
## 2 -1.629999 3.206267
## 3 -1.629999 3.782109
## 4 -2.293557 3.762190
## 5 -2.042822 3.575441
For standard conformal inference, outfun
should have at least three inputs: Y
for the outcome, X
for the covariates, and Xtest
for the covariates of testing points. The output of must be a vector giving the conditional quantile estimate. Here we define a linear learner.
linearReg <- function(Y, X, Xtest){
X <- as.data.frame(X)
Xtest <- as.data.frame(Xtest)
data <- data.frame(Y = Y, X)
fit <- lm(Y ~ ., data = data)
as.numeric(predict(fit, Xtest))
}
obj <- conformalCf(X, Y, type = "mean",
outfun = linearReg, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
## lower upper
## 1 0.2000005 4.341969
## 2 -2.2590300 1.882938
## 3 -1.7579230 2.384045
## 4 -2.4456027 1.696365
## 5 -2.9286439 1.213324
Finally, we can replace the built-in propensity score method with any user-defined function that satisfies certain minimal requirements. Specifically psfun
should have at least three inputs: Y
for the binary outcome, X
for the covariates, Xtest
for the covariates of testing points. The output of must be a vector of predicted probabilities. Here we define a learner via logistic regression.
logitReg <- function(Y, X, Xtest, ...){
X <- as.data.frame(X)
Xtest <- as.data.frame(Xtest)
data <- data.frame(Y = Y, X)
fit <- glm(Y ~ ., data = data, family = "binomial", ...)
as.numeric(predict(fit, Xtest, type = "response"))
}
obj <- conformalCf(X, Y, type = "CQR", quantiles = c(0.05, 0.95),
outfun = "quantRF", psfun = logitReg, useCV = FALSE)
predict(obj, Xtest, alpha = 0.1)
## lower upper
## 1 -0.8949307 4.474130
## 2 -1.6331808 3.038276
## 3 -1.1362423 3.699452
## 4 -1.6357136 3.495265
## 5 -1.8677126 3.038276
conformalIte
: inference of individual treatment effects
We illustrate the usage of conformalCf
using the same numerical example as above except that \(Y(0)\) is generated as a standard normal variable.
# Generate training data
set.seed(2020)
genY <- function(X){
2 / (1 + exp(-12 * (X[, 1] - 0.5))) * 2 / (1 + exp(-12 * (X[, 2] - 0.5))) + rnorm(n)
}
n <- 1000
d <- 10
X <- matrix(runif(n * d), nrow = n, ncol = d)
Y1 <- genY(X)
Y0 <- rnorm(n)
ps <- (1 + pbeta(X[, 1], 2, 4)) / 4
T <- as.numeric(ps < runif(n))
Y <- ifelse(T == 1, Y1, Y0)
# Generate testing data
ntest <- 5
Xtest <- matrix(runif(ntest * d), nrow = ntest, ncol = d)
pstest <- (1 + pbeta(Xtest[, 1], 2, 4)) / 4
Ttest <- as.numeric(pstest < runif(ntest))
Y1test <- genY(Xtest)
Y0test <- rnorm(ntest)
Ytest <- ifelse(Ttest == 1, Y1test, Y0test)
conformalIte
supports four algorithms: the nested approach with exact and inexact calibration for cases with both potential outcomes missing, the naive approach for cases with both potential outcomes missing and the counterfactual inference for cases with only one potential outcome missing. The output of conformalIte
is a function that outputs the interval estimates on a given dataset. When algo = "nest"
or "naive"
, it take a single input X
; when algo = "counterfactual"
, it takes three inputs X
, Y
and T
.
# Inexact nested method
CIfun <- conformalIte(X, Y, T, alpha = 0.1, algo = "nest", exact = FALSE, type = "CQR",
quantiles = c(0.05, 0.95), outfun = "quantRF", useCV = FALSE)
CIfun(Xtest)
## lower upper
## 1 -1.999178 3.985715
## 2 -2.169225 2.937253
## 3 -1.674621 4.120571
## 4 -1.618675 3.973736
## 5 -2.210159 3.077535
# Exact nested method
CIfun <- conformalIte(X, Y, T, alpha = 0.1, algo = "nest", exact = TRUE, type = "CQR",
quantiles = c(0.05, 0.95), outfun = "quantRF", useCV = FALSE)
CIfun(Xtest)
## lower upper
## 1 -3.900474 5.701648
## 2 -4.104725 5.233109
## 3 -3.555527 7.109269
## 4 -3.706492 5.890286
## 5 -4.213421 5.004732
# naive method
CIfun <- conformalIte(X, Y, T, alpha = 0.1, algo = "naive", type = "CQR",
quantiles = c(0.05, 0.95), outfun = "quantRF", useCV = FALSE)
CIfun(Xtest)
## lower upper
## 1 -3.486724 6.475139
## 2 -4.114607 5.295473
## 3 -3.303448 6.902150
## 4 -3.613348 6.254615
## 5 -3.992367 5.943778
# counterfactual method, Y and T needs to be observed
CIfun <- conformalIte(X, Y, T, alpha = 0.1, algo = "counterfactual", type = "CQR",
quantiles = c(0.05, 0.95), outfun = "quantRF", useCV = FALSE)
CIfun(Xtest, Ytest, Ttest)
## lower upper
## 1 -2.4026089 2.885503
## 2 -1.4348278 2.344889
## 3 -0.7435855 4.744477
## 4 1.4073219 4.756660
## 5 -0.5229815 3.216475
Barber, Rina Foygel, Emmanuel J Candès, Aaditya Ramdas, and Ryan J Tibshirani. 2019. “Predictive Inference with the Jackknife+.” arXiv Preprint arXiv:1905.02928.
Lei, Lihua, and Emmanuel Candès. 2020a. “Conformal Inference of Counterfactuals and Individual Treatment Effects.” arXiv Preprint arXiv:2006.06138.
———. 2020b. “Theory of Weighted Conformal Inference.” Unpublished Manuscript.
Romano, Yaniv, Evan Patterson, and Emmanuel Candès. 2019. “Conformalized Quantile Regression.” In Advances in Neural Information Processing Systems, 3538–48.
Tibshirani, Julie, Susan Athey, and Stefan Wager. 2019. Grf: Generalized Random Forests. https://CRAN.R-project.org/package=grf.