This vignette uses the current causalBKMR API for a binary outcome. The workflow is the same as for a continuous outcome:

  1. Build Y, Z, and X.
  2. Convert them with prepare_gbkmr_data().
  3. Run the analysis with gbkmr_run(outcome_type = "binary").

Archived functions under R/old/ are not used by this workflow.

Data Layout

gbkmr_run() expects the wide-format data produced by prepare_gbkmr_data(). For T = 3, p = 2, one time-varying covariate, and one baseline covariate, the raw inputs are:

  • Y: binary vector of length n with values 0 or 1.
  • Z: n x 6 matrix with columns ordered as exposures 1 and 2 at time 0, then at time 1, then at time 2.
  • X: n x 4 matrix with the three time-varying covariate columns first and the baseline covariate column last.

library(causalBKMR)

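set.seed(7)
n <- 300        # subjects
T_points <- 3   # exposure time points (0, 1, 2)
p <- 2          # mixture components per time point
Ldim <- 1       # time-varying covariates
B <- 1          # baseline covariates

# Baseline covariate and the starting waist value
sex <- rbinom(n, 1, 0.5)
waist_0 <- rnorm(n, 85, 12)

# Exposures: columns 1-2 are time 0, 3-4 are time 1, 5-6 are time 2
Z <- matrix(rlnorm(n * p * T_points, meanlog = 1, sdlog = 0.5),
            nrow = n)

# Waist at each time depends on its previous value and on exposure 1 at that time
waist_t0 <- waist_0 + 0.10 * Z[, 1] + rnorm(n, 0, 2)
waist_t1 <- waist_t0 + 0.10 * Z[, 3] + rnorm(n, 0, 2)
waist_t2 <- waist_t1 + 0.10 * Z[, 5] + rnorm(n, 0, 2)

# Time-varying covariate columns first, baseline covariate last
X <- cbind(waist_t0, waist_t1, waist_t2, sex)

# Log-odds of the outcome: sex, final waist, and both exposures at time 2
linpred <- -3 +
  0.40 * sex +
  0.02 * waist_t2 +
  0.20 * log1p(Z[, 5]) +
  0.20 * log1p(Z[, 6])

Y <- rbinom(n, 1, plogis(linpred))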
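
The column ordering of Z can be checked with a small index helper. This is an illustrative sketch only: z_col() is a hypothetical helper, not part of causalBKMR, and it assumes time points are numbered from 0 as in the layout description.

```r
# Hypothetical helper (not part of causalBKMR): column of Z that holds
# exposure `j` (1-based) at time `t` (0-based) when there are `p` exposures.
z_col <- function(j, t, p) t * p + j

p <- 2
T_points <- 3

# Enumerate every (exposure, time) pair; exposure varies fastest,
# matching "exposures 1 and 2 at time 0, then time 1, then time 2".
layout <- expand.grid(exposure = seq_len(p), time = 0:(T_points - 1))
layout$column <- z_col(layout$exposure, layout$time, p)
print(layout)
```

Under this indexing, Z[, 5] and Z[, 6] are exposures 1 and 2 at time 2, which are the columns used in the outcome model of the simulation.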

Prepare the Wide g-BKMR Data

prepared <- prepare_gbkmr_data(
  Y = Y,
  Z = Z,
  X = X,
  time_points = T_points,
  mixture_components = p,
  td_covariates = Ldim,
  baseline_covariates = B,
  td_covariate_names = "waist"
)

detect_variable_patterns(prepared, T = T_points)
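
Before converting the data, it can help to confirm that the raw inputs have the dimensions the design implies. The sketch below is a possible pre-flight check, not something the package provides: check_raw_inputs() is a hypothetical function, and the small demo objects are stand-ins with the same shape as the example.

```r
# Hypothetical pre-flight check (not part of causalBKMR): confirm the raw
# inputs have the dimensions implied by the design before converting them.
check_raw_inputs <- function(Y, Z, X, time_points, mixture_components,
                             td_covariates, baseline_covariates) {
  n <- length(Y)
  stopifnot(
    all(Y %in% c(0, 1)),                                   # binary outcome
    nrow(Z) == n,
    ncol(Z) == time_points * mixture_components,           # exposures x times
    nrow(X) == n,
    ncol(X) == time_points * td_covariates + baseline_covariates
  )
  invisible(TRUE)
}

# Tiny stand-in data with the same shape as the example (n = 5)
set.seed(1)
Y_demo <- rbinom(5, 1, 0.5)
Z_demo <- matrix(rlnorm(5 * 6), nrow = 5)
X_demo <- matrix(rnorm(5 * 4), nrow = 5)

ok <- check_raw_inputs(Y_demo, Z_demo, X_demo,
                       time_points = 3, mixture_components = 2,
                       td_covariates = 1, baseline_covariates = 1)
```

The same call with the real Y, Z, and X would stop with an error if, for example, a column of Z were missing.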

Fit a Binary Outcome Model

Binary outcomes are supported with the standard bkmr engine. The outcome model uses bkmr::kmbayes(..., family = "binomial"), and causalBKMR returns counterfactual means on the probability scale.

fit_binary <- gbkmr_run(
  data = prepared,
  outcome = "Y",
  outcome_type = "binary",
  time_points = T_points,
  engine = "bkmr",
  n = nrow(prepared),
  iter = 15000,
  K = 1000,
  n_knots = 50
)

The causal effect is a risk difference:

fit_binary$causal_effect
fit_binary$counterfactual_means
fit_binary$diagnostics

If fit_binary$causal_effect$estimate is 0.04, the estimated risk under the high exposure trajectory is four percentage points higher than under the low exposure trajectory.
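
To make the risk-difference scale concrete, the sketch below summarises a difference between two sets of simulated risk draws. The draws are made-up stand-ins generated with rbeta(); they are not output from gbkmr_run(), and the internal structure of fit_binary$counterfactual_means is not assumed here.

```r
# Simulated stand-ins for posterior draws of the counterfactual risk under
# each trajectory; these are NOT produced by causalBKMR.
set.seed(11)
risk_high <- rbeta(1000, 30, 70)   # high exposure trajectory
risk_low  <- rbeta(1000, 26, 74)   # low exposure trajectory

# Risk difference, one value per draw
rd_draws <- risk_high - risk_low

# Posterior mean and a 95% equal-tailed interval
rd_summary <- c(
  estimate = mean(rd_draws),
  lower = unname(quantile(rd_draws, 0.025)),
  upper = unname(quantile(rd_draws, 0.975))
)

# Report in percentage points
round(100 * rd_summary, 1)
```

Multiplying by 100 converts the probability-scale difference to percentage points, which is the reading used in the interpretation above.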

Engine Restrictions

fastbkmr does not currently support binary outcomes because the current public fbkmr::skmbayes() path is Gaussian-only.

gbkmr_run(
  data = prepared,
  outcome = "Y",
  outcome_type = "binary",
  time_points = T_points,
  engine = "fastbkmr"
)

That call intentionally stops with a clear error. Use engine = "bkmr" for binary outcomes.
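
In scripts that handle both outcome types, one option is to choose the engine up front rather than rely on the error. The sketch below is only an illustration: choose_engine() is a hypothetical helper, not part of causalBKMR, and the string "continuous" is an assumed label for the non-binary case.

```r
# Hypothetical helper (not part of causalBKMR): pick an engine that supports
# the requested outcome type, falling back to "bkmr" for binary outcomes.
choose_engine <- function(outcome_type, preferred = "fastbkmr") {
  if (outcome_type == "binary" && preferred == "fastbkmr") {
    message("fastbkmr is Gaussian-only; using engine = \"bkmr\" instead.")
    return("bkmr")
  }
  preferred
}

engine_binary <- choose_engine("binary")          # falls back to "bkmr"
engine_continuous <- choose_engine("continuous")  # keeps the preferred engine
```

The returned string could then be passed as the engine argument of gbkmr_run().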

Time-Varying Covariates

Binary time-varying covariates can be included in X, but the current mediator models treat time-varying covariates as Gaussian responses. For binary intermediate covariates, interpret the mediator part of the model as an approximation and check diagnostics carefully.
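
The size of that approximation can be explored outside the package. The sketch below uses plain lm() as a simplified stand-in for a Gaussian mediator model and glm() as a reference; the package's mediator models are not lm() fits, and all variable names here are illustrative.

```r
set.seed(3)
n_demo <- 500
exposure <- rnorm(n_demo)

# Binary intermediate covariate with a strong dependence on exposure
L_bin <- rbinom(n_demo, 1, plogis(-2 + 1.5 * exposure))

gauss_fit <- lm(L_bin ~ exposure)                      # Gaussian working model
logit_fit <- glm(L_bin ~ exposure, family = binomial)  # reference model

p_gauss <- fitted(gauss_fit)
p_logit <- fitted(logit_fit)

# Share of Gaussian fitted values that are not valid probabilities
share_outside <- mean(p_gauss < 0 | p_gauss > 1)

# Largest disagreement between the two sets of fitted values
max_gap <- max(abs(p_gauss - p_logit))

c(share_outside = share_outside, max_gap = max_gap)
```

A large share of fitted values outside [0, 1], or a large gap from the logistic fit, suggests the Gaussian treatment of a binary covariate deserves extra scrutiny.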