The dependence-adjusted Benjamini-Hochberg (BH) procedure and step-up procedures for multivariate t-statistics

dBH_mvt computes the rejection set of $dBH_\gamma(\alpha)$ and $dBH^2_\gamma(\alpha)$, as well as $dSU_{\gamma, \Delta}(\alpha)$ and $dSU^2_{\gamma, \Delta}(\alpha)$ a broad class of step-up procedures for multivariate t-statistics $t = (t_1, \ldots, t_m)$ where $t_i = z_i / \hat{\sigma}$, $z\sim N(\mu, \Sigma)$, and $\hat{\sigma}^2 \times df\sim \chi^2(df)$. dBH_mvt can handle both one-sided tests $H_i: \mu_i \le 0$ or $H_i: \mu_i \ge 0$, and two-sided tests $H_i: \mu_i = 0$.

dBH_mvt(
  tvals,
  df,
  Sigma = NULL,
  Sigmafun = NULL,
  vars = NULL,
  side = c("right", "left", "two"),
  alpha = 0.05,
  gamma = NULL,
  tautype = "QC",
  niter = 1,
  avals = NULL,
  avals_type = c("BH", "geom", "bonf", "manual"),
  geom_fac = 2,
  eps = 0.05,
  qcap = 2,
  gridsize = 20,
  exptcap = 0.9,
  is_safe = NULL,
  verbose = FALSE
)

Arguments

tvals	a vector of t-values. The marginal variances of the corresponding z-values do not have to be 1.
df	the degree-of-freedom.
Sigma	the covariance matrix. For very large m, it is preferable to use `Sigmafun` while leave `Sigma = NULL` as its default. See Details.
Sigmafun	the function that takes the row index as the input and outputs the i-th row of the covariance matrix divided by $\Sigma_{i, i}$. This is an alternative for `Sigma` when m is very large. See Details.
vars	the vector of marginal variances $(\sigma_1^2, \ldots, \sigma_m^2)$. Used only with `Sigmafun`. See Details.
side	a string that takes values in {"right", "left", "two"}, with "right" for one-sided tests $H_i: \mu_i \le 0$, "left" for one-sided tests, $H_i: \mu_i \ge 0$, and "two" for two-sided tests $H_i: \mu_i = 0$.
alpha	the target FDR level.
gamma	the parameter for the dBH and dSU procedures. The default is `NULL`, which gives dBY or the safe dSU with gamma = 1 / Lm defined in Appendix C.1.
tautype	the type of tau function. In the current version, only "QC" (q-value-calibration) is supported with $\tau(c; X) = cR_{BH}(c) / m$.
niter	the number of iterations. In the current version it can only be 1, for dBH/dSU, or 2, for $dBH^2$/$dSU^2$.
avals	the a-values in the step-up procedures defined in Appendix C.1. The default is NULL, in which case `avals` is determined by `avals_type`. See Details.
avals_type	a string that takes values in {"BH", "geom", "bonf", "manual"}, which determines the type of thresholds in the step-up procedures. See Details.
geom_fac	a real number that is larger than 1. This is the growing rate of thresholds when `avals_type = "geom"`. See Details.
eps	a real number in [0, 1], which is used to determine $t_{hi}$ discussed in Section 4.2.
qcap	a real number that is larger than 1. It is used to filter out hypotheses with q-values above `qcap` X `alpha`, as discussed in Appendix C.2.2.
gridsize	an integer for the size of the grid used when `niter = 2`. For two-sided tests, `gridsize` knots will be used for both the positive and negative sides.
exptcap	a real number in [0, 1]. It is used by $dBH^2$/$dSU^2$ to filter out hypotheses with $g_i*(q_i \| S_i)$ below `exptcap` X `alpha` / m in their initializations, as discussed in Appendix C.2.5.
is_safe	a logical or NULL indicating whether the procedure is taken as safe. DON'T set `is_safe = TRUE` unless the covariance structure is known to be CPRDS. The default is `NULL`, which sets `is_safe = TRUE` if `gamma = NULL` or `gamma` is below 1 / Lm, and `is_safe = FALSE` otherwise.
verbose	a logical indicating whether a progress bar is shown.

Value

a list with the following attributes

rejs: the indices of rejected hypotheses (after the randomized pruning step if any);
initrejs: the indices of rejected hypotheses (before the randomized pruning step if any);
cand: the set of candidate hypotheses for which $g_i^{*}(q_i | S_i)$ is evaluated;
expt: $g_i^{*}(q_i | S_i)$ for each hypothesis in cand;
safe: TRUE iff the procedure is safe;
secBH: TRUE iff the randomzied pruning step (a.k.a. the secondary BH procedure) is invoked;
secBH_fac: a vector that gives $\hat{R}_i / R_{+}$ that is defined in Section 2.2. It only shows up in the output if secBH = TRUE.

Details

dBH_mvt supports two types of inputs for the covariance matrix.

When the covariance matrix fits into the memory, it can be inputted through Sigma. The diagonals do not have to be 1. In this case, Sigmafun and vars should be left as their default;
When the covariance matrix does not fit into the memory, it can be inputted through Sigmafun, for the correlation matrix, and vars, for marginal variances. Sigmafun should be a function with a single input i that gives the row index and a vector output $\Sigma_{i, }/\sqrt{\Sigma_{i, i}\Sigma_{j, j}}$ where $\Sigma_{i, }$ is the i-th row of $\Sigma$. The marginal variances $(\Sigma_{1, 1}, \ldots, \Sigma_{m, m})$ are inputted through vars. If all marginal variances are 1, vars can be left as its default.

dBH_mvt can handle all dSU procedures that are defined in Appendix C.1. with $$\Delta_{\alpha}(r) = \frac{\alpha a_{\ell}}{m}, r\in [a_{\ell}, a_{\ell + 1}), \ell = 0, 1, \ldots, L$$ for any set of integer a-values $1\le a_1 < \ldots < a_L\le m$ (with $a_0 = 0, a_{L+1} = m+1$). There are two-ways to input the a-values.

(Recommended) use avals_type while leave avals as its default. Three types of avals_type are supported.
- When avals_type = "BH", the BH procedure is used (i.e. $a_\ell = \ell, \ell = 1, \ldots, m$);
- When avals_type = "geom", the geometrically increasing a-values that are defined in Appendix C.1. are used with the growing rate specified by geom_fac, whose default is 2;
- When avals_type = "bonf", the Bonferroni procedure is used (i.e. $a_1 = 1, L = 1$).
(Not recommended) use avals while set avals_type = "manual". This option allows any set of increasing a-values. However, it can be much slower than the recommended approach.

Examples

# Generate mu and Sigma for an AR process
n <- 100
rho <- 0.8
Sigma <- rho^(abs(outer(1:n, 1:n, "-")))
mu1 <- 2.5
nalt <- 10
mu <- c(rep(mu1, nalt), rep(0, n - nalt))

# Generate the z-values
set.seed(1)
zvals <- rep(NA, n)
zvals[1] <- rnorm(1)
for (i in 2:n){
    zvals[i] <- zvals[i - 1] * rho + rnorm(1) * sqrt(1 - rho^2)
}
zvals <- zvals + mu

# Generate sigmahat and the t-values
df <- 50
sigmahat <- sqrt(rchisq(1, df = df) / df)
tvals <- zvals / sigmahat

# Run dBH_1(\alpha) for one-sided tests
alpha <- 0.05
res <- dBH_mvt(tvals = tvals, df = df, Sigma = Sigma, side = "right",
               alpha = alpha, gamma = 1, niter = 1, avals_type = "BH")

# Run dBH_1(\alpha) for two-sided tests
res <- dBH_mvt(tvals = tvals, df = df, Sigma = Sigma, side = "right",
               alpha = alpha, gamma = 1, niter = 1, avals_type = "BH") 

# Run dBH^2_1(\alpha) for one-sided tests
alpha <- 0.05
res <- dBH_mvt(tvals = tvals, df = df, Sigma = Sigma, side = "right",
               alpha = alpha, gamma = 1, niter = 2, avals_type = "BH")

# Run dBH^2_1(\alpha) for one-sided tests
res <- dBH_mvt(tvals = tvals, df = df, Sigma = Sigma, side = "right",
               alpha = alpha, gamma = 1, niter = 2, avals_type = "BH")

# Run dSU_1(\alpha) with the geometrically increasing a-values for one-sided tests
res <- dBH_mvt(tvals = tvals, df = df, Sigma = Sigma, side = "right",
               alpha = alpha, gamma = 1, niter = 1, avals_type = "geom",
               geom_fac = 2)

# Run dBY(\alpha) for one-sided tests
res <- dBH_mvt(tvals = tvals, df = df, Sigma = Sigma, side = "right",
               alpha = alpha, gamma = NULL, niter = 1, avals_type = "BH")

# Run dBY^2(\alpha) for one-sided tests
res <- dBH_mvt(tvals = tvals, df = df, Sigma = Sigma, side = "right",
               alpha = alpha, gamma = NULL, niter = 2, avals_type = "BH")

# Input the covariance matrix through \code{Sigmafun}
Sigmafun <- function(i){
   rho^(abs(1:n - i))
}
vars <- rep(1, n)
res1 <- dBH_mvt(tvals = tvals, df = df, Sigmafun = Sigmafun, vars = vars,
                side = "right", alpha = alpha, gamma = NULL, niter = 1,
                avals_type = "BH")
res2 <- dBH_mvt(tvals = tvals, df = df, Sigma = Sigma, side = "right",
                alpha = alpha, gamma = NULL, niter = 1, avals_type = "BH")
identical(res1, res2)
#> [1] TRUE

The dependence-adjusted Benjamini-Hochberg (BH) procedure and step-up procedures for multivariate t-statistics

Arguments

Value

Details

See also

Examples