Diagnostics and Troubleshooting

This vignette covers diagnostics for the current causalBKMR workflow. The relevant user-facing functions are prepare_gbkmr_data(), detect_variable_patterns(), and gbkmr_run(). Archived code under R/old/ is not part of this workflow.

Check the Prepared Data

Before running MCMC, inspect the columns that gbkmr_run() will use.

patterns <- detect_variable_patterns(prepared, T = 3)
patterns

mixture_cols <- grep("^logM\\d+_\\d+$", names(prepared), value = TRUE)
mixture_cols

Confirm that:

patterns$p is the expected number of mixture components per time.
mixture_cols contains the expected logM*_t columns.
patterns$Ldim is the expected number of time-varying covariates per time.
patterns$td_covariate_names contains the expected covariate names.
patterns$td_vars_by_time maps the follow-up covariate columns correctly.
Y is present and points to the intended outcome column.
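
The checklist above can also be scripted. The following is a minimal sketch, assuming the logM<component>_<time> naming implied by the pattern above. The toy data frame and the values of p and n_times are illustrative stand-ins, not package output.

```r
# Expected layout (illustrative values, not taken from a real analysis).
p <- 2        # expected mixture components per time point
n_times <- 3  # expected number of time points
expected_cols <- paste0(
  "logM", rep(seq_len(p), times = n_times),
  "_", rep(seq_len(n_times), each = p)
)

# Toy stand-in for the prepared data: the mixture columns plus an outcome.
prepared_toy <- as.data.frame(matrix(0, nrow = 4, ncol = p * n_times + 1))
names(prepared_toy) <- c(expected_cols, "Y")

# Columns actually present, using the same pattern as above.
found_cols <- grep("^logM\\d+_\\d+$", names(prepared_toy), value = TRUE)

# Anything listed here points to a problem in the inputs.
missing_cols <- setdiff(expected_cols, found_cols)
extra_cols <- setdiff(found_cols, expected_cols)

stopifnot(
  length(missing_cols) == 0,
  length(extra_cols) == 0,
  "Y" %in% names(prepared_toy)
)
```

To use it on real data, replace prepared_toy with the prepared object and set p and n_times to the values you expect. A non-empty missing_cols usually means Z had the wrong number of columns.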

If the detected columns are wrong, return to the Y, Z, and X matrices passed into prepare_gbkmr_data(). The most common issue is an incorrect ordering of X: all time-varying covariates must come before baseline covariates.
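
As an illustration of the ordering rule, here is a small sketch that assembles X with the time-varying block first and the baseline block last. All column names are hypothetical, and the order within the time-varying block is not specified in this section, so follow the prepare_gbkmr_data() documentation for that detail.

```r
set.seed(1)
n <- 5

# Hypothetical covariates: one time-varying covariate measured at three visits,
# plus two baseline covariates.
td <- cbind(L1_1 = rnorm(n), L1_2 = rnorm(n), L1_3 = rnorm(n))
baseline <- cbind(age = rnorm(n, 50, 10), sex = rbinom(n, 1, 0.5))

# Time-varying block first, baseline block last.
X <- cbind(td, baseline)

# Guard: every time-varying column must precede every baseline column.
td_idx <- match(colnames(td), colnames(X))
base_idx <- match(colnames(baseline), colnames(X))
stopifnot(max(td_idx) < min(base_idx))
```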

Use the Input Audit

With verbose = TRUE, gbkmr_run() prints an input audit before fitting any models.

fit <- gbkmr_run(
  data = prepared,
  outcome = "Y",
  outcome_type = "continuous",
  time_points = 3,
  engine = "auto",
  iter = 15000,
  K = 1000,
  verbose = TRUE
)

Read this audit before starting a long run. Check the outcome type, number of time points, number of mixture components, selected engine, sample size, and intervention contrast.

Inspect Stored Diagnostics

The result object stores convergence diagnostics when available.

fit$diagnostics

If warnings appear, rerun with more iterations and a later posterior selection window:

fit_long <- gbkmr_run(
  data = prepared,
  outcome = "Y",
  outcome_type = "continuous",
  time_points = 3,
  iter = 60000,
  sel = seq(floor(60000 * 0.8), 60000, by = 25),
  K = 1000
)

The sel argument must contain MCMC iteration indices, none larger than iter. If the early part of the chain looks unstable, start the window later so that those draws are excluded.
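
One way to keep sel consistent with iter is to build both from the same values. The helper below is a sketch in base R. make_sel(), burn_frac, and thin are names introduced here for illustration and are not part of causalBKMR.

```r
# Build a thinned selection window over the last part of the chain.
# burn_frac: fraction of iterations to discard from the start.
# thin: keep every thin-th iteration within the window.
make_sel <- function(iter, burn_frac = 0.8, thin = 25) {
  start <- max(1, floor(iter * burn_frac))
  sel <- seq(start, iter, by = thin)
  # Indices must be valid iteration numbers.
  stopifnot(all(sel >= 1), all(sel <= iter))
  sel
}

iter <- 60000
sel <- make_sel(iter)
range(sel)
length(sel)
```

The resulting vector can be passed as sel = sel alongside iter = iter, so the two cannot drift apart when you change the run length.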

Inspect Raw Fits

gbkmr_run() returns lower-level model objects in raw_results.

raw <- fit$raw_results

names(raw)
raw$fit_mediators
raw$fit_y
raw$meta

For the standard BKMR engine, raw$fit_y is a bkmrfit object. For fastBKMR, raw$fit_y is a list of subset fits.
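
Because the shape of raw$fit_y depends on the engine, downstream diagnostics are easier to write if both cases are treated as a list of fits. The sketch below assumes only what is stated above: a single fit carries the class bkmrfit, and fastBKMR returns a plain list of subset fits. as_fit_list() is a hypothetical helper, and the mock objects stand in for real fits.

```r
# Wrap a single bkmrfit in a list; leave a list of subset fits unchanged.
as_fit_list <- function(fit_y) {
  if (inherits(fit_y, "bkmrfit")) list(fit_y) else fit_y
}

# Mock objects standing in for real fits (illustration only).
mock_single <- structure(list(), class = "bkmrfit")
mock_subsets <- list(mock_single, mock_single, mock_single)

n_single <- length(as_fit_list(mock_single))
n_subsets <- length(as_fit_list(mock_subsets))
```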

Trace Plots

Use bkmr::TracePlot() for standard BKMR fits.

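# Covariate coefficients and residual variance.
bkmr::TracePlot(fit = fit$raw_results$fit_y, par = "beta")
bkmr::TracePlot(fit = fit$raw_results$fit_y, par = "sigsq.eps")
# Kernel parameters; the comp argument selects which component to plot (default 1).
bkmr::TracePlot(fit = fit$raw_results$fit_y, par = "r")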

For fastBKMR, inspect subset fits:

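# For fastBKMR, fit_y is a list with one fit per data subset.
fit_list <- fit$raw_results$fit_y

# Trace plot for the first subset only.
bkmr::TracePlot(fit = fit_list[[1]], par = "beta")

# Trace plots for every subset, labelled by subset index.
for (i in seq_along(fit_list)) {
  bkmr::TracePlot(fit = fit_list[[i]], par = "beta", main = paste("Subset", i))
}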

Long trends, abrupt jumps, or chains that remain stuck for long periods are signs that the run needs more iterations or a different sel window.
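
A visual check can be backed by a rough numeric one. The sketch below compares the mean of the first and second half of a chain, scaled by the overall standard deviation. It is a crude heuristic written for this vignette, not a formal convergence diagnostic, and half_shift() is a hypothetical name. Here draws is any numeric vector of posterior draws for one parameter; how to extract it from a fit object is left open, and the chains below are simulated.

```r
# Standardized shift in mean between the two halves of a chain.
# Values far from zero suggest a trend that has not settled.
half_shift <- function(draws) {
  n <- length(draws)
  half <- floor(n / 2)
  first <- draws[seq_len(half)]
  second <- draws[(half + 1):n]
  (mean(second) - mean(first)) / sd(draws)
}

set.seed(42)
stable <- rnorm(2000)                                    # stationary chain
trending <- seq(0, 5, length.out = 2000) + rnorm(2000)   # drifting chain

shift_stable <- half_shift(stable)
shift_trending <- half_shift(trending)
```

A drifting chain gives a value well away from zero, while a stationary one stays close to zero. Any cutoff is a judgment call and should be read alongside the trace plots.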

Causal Effect Draws

The g-computation stage stores posterior draws of the counterfactual means under the two exposure levels being contrasted. The code below reads them from raw$Ya and raw$Yastar.

raw <- fit$raw_results
ace_draws <- raw$Yastar - raw$Ya

plot(ace_draws, type = "l",
     xlab = "Posterior draw",
     ylab = "ACE draw")

hist(ace_draws, breaks = 30,
     main = "Posterior ACE",
     xlab = "Y(a*) - Y(a)")

fit$causal_effect

For binary outcomes, these draws are on the risk-difference scale.
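
The draws can also be reduced to a posterior mean, an equal-tailed 95% interval, and the posterior probability that the effect is positive. This is a sketch that assumes Ya and Yastar are numeric vectors of equal length, as the subtraction above implies. summarize_ace() is a hypothetical helper, the inputs are simulated, and the summary stored in fit$causal_effect may be computed differently.

```r
# Summarize the posterior of Y(a*) - Y(a) from two vectors of draws.
summarize_ace <- function(ya, yastar, probs = c(0.025, 0.975)) {
  stopifnot(length(ya) == length(yastar))
  ace <- yastar - ya
  c(mean = mean(ace), quantile(ace, probs), pr_gt_0 = mean(ace > 0))
}

# Simulated stand-ins for raw$Ya and raw$Yastar (illustration only).
set.seed(7)
ya_sim <- rnorm(2000, mean = 1.0, sd = 0.2)
yastar_sim <- rnorm(2000, mean = 1.5, sd = 0.2)

ace_summary <- summarize_ace(ya_sim, yastar_sim)
ace_summary
```

With a real fit, call summarize_ace(raw$Ya, raw$Yastar) and compare the result with fit$causal_effect.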

Common Problems

Symptom	Likely cause	Fix
Mixture columns are missing	`Z` has the wrong number of columns	Use `mixture_components * time_points` columns
Baseline covariates detected incorrectly	`X` columns are in the wrong order	Put all time-varying columns before baseline columns
Binary outcome uses continuous model	`outcome_type` left at the default	Set `outcome_type = "binary"`
fastBKMR requested for a binary outcome	Unsupported engine/outcome combination	Use `engine = "bkmr"`
Warnings about low effective sample size	Too few stable posterior draws	Increase `iter` and move `sel` later
Counterfactual values look implausible	Contrast outside support	Use `a_probs`, or choose `a_vals`/`astar_vals` within observed ranges
Parallel fastBKMR fails	Worker startup failed	Set `n_cores = 1` or run on a compute node
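
For the "Contrast outside support" row, a quick range check can flag candidate values before a run. This is a sketch only: it assumes the contrast is written as a named numeric vector keyed by column name, which may not match the format that a_vals and astar_vals expect. outside_support() and the toy data are hypothetical.

```r
# Flag contrast values that fall outside the observed range of their column.
# vals: named numeric vector, names matching columns of data (assumed layout).
outside_support <- function(data, vals) {
  vapply(names(vals), function(col) {
    rng <- range(data[[col]], na.rm = TRUE)
    vals[[col]] < rng[1] || vals[[col]] > rng[2]
  }, logical(1))
}

# Toy exposure columns and a candidate contrast (illustration only).
toy <- data.frame(logM1_1 = c(0.1, 0.5, 0.9), logM2_1 = c(1.0, 1.5, 2.0))
candidate <- c(logM1_1 = 0.5, logM2_1 = 3.0)

flags <- outside_support(toy, candidate)
flags
```

A TRUE entry marks a value outside the observed data for that column, which is where counterfactual predictions rely on extrapolation.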