Detect variable patterns in g-BKMR data

Automatically detects the structure of exposure variables and time-dependent covariates, including their naming patterns, in prepared g-BKMR data. The automated analysis pipeline relies on this function.

detect_variable_patterns(data, T)

Arguments

data

Data frame containing the variables to analyze. It should be in g-BKMR format: wide format, with exposure and time-dependent covariate columns named according to the patterns described under Details.

T

Integer. Number of time points in the study.

Value

A list containing the detected variable structure:

p

Integer. Number of exposure variables per time point

Ldim

Integer. Number of time-dependent covariates per time point

td_covariate_names

Character vector. Names of time-dependent covariates

detected_pattern

Character. Name of the naming pattern that was matched during detection (see Details)

baseline_td_vars

Character vector. Baseline time-dependent variables

td_vars_by_time

List. Time-dependent variables for each time point

Details

The function recognizes several naming patterns for time-dependent covariates:

  • known_with_underscore: bmi_0, bp_0, bmi_1, bp_1, etc.

  • known_ending_zero: bmi0, bp0, bmi1, bp1, etc.

  • generated_format: waist0_1, waist0_2, waist1_1, waist1_2, etc.

  • generic_format: td_covariate1_0, td_covariate2_0, etc.
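
The naming patterns above can be told apart with regular expressions. The sketch below is illustrative only: the regexes, the vector `pattern_regex`, and the helper `classify_name()` are assumptions made for this example and are not taken from the package's implementation.

```r
# Assumed regular expressions, one per naming pattern listed above.
# These are illustrative guesses, not the package's internal rules.
# Order matters: the first matching pattern wins.
pattern_regex <- c(
  generic_format        = "^td_covariate[0-9]+_[0-9]+$",
  generated_format      = "^[A-Za-z]+[0-9]+_[0-9]+$",
  known_with_underscore = "^[A-Za-z]+_[0-9]+$",
  known_ending_zero     = "^[A-Za-z]+[0-9]+$"
)

# Hypothetical helper: return the label of the first regex that matches,
# or NA when the name follows none of the patterns.
classify_name <- function(name) {
  matched <- vapply(pattern_regex, grepl, logical(1), x = name)
  hits <- names(pattern_regex)[matched]
  if (length(hits) == 0) NA_character_ else hits[1]
}

# Classify a few example column names
example_names <- c("bmi_0", "bmi0", "waist0_1", "td_covariate1_0", "sex")
labels <- vapply(example_names, classify_name, character(1))
print(labels)
```

Under these assumed regexes, a name that follows the exposure convention (such as logM1_0) would also match the generated_format expression, so a real detector would need to set exposure columns aside before classifying covariates.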

For exposure variables, the function looks for names of the form logM1_0, logM2_0, logM1_1, logM2_1, etc.
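
As a rough illustration of how the number of exposures could be derived from such names, the sketch below parses the exposure convention with a regular expression. The regex, the column names, and the variables `p_sketch` and `exposures_by_time` are assumptions for this example; the package may implement detection differently.

```r
# Hypothetical column names following the exposure convention described above
cols <- c("id", "sex", "bmi_0", "bp_0",
          "logM1_0", "logM2_0", "logM1_1", "logM2_1",
          "bmi_1", "bp_1", "Y")

# Assumed regex: "logM", an exposure index, an underscore, then a time index
exposure_regex <- "^logM([0-9]+)_([0-9]+)$"
exposure_cols <- grep(exposure_regex, cols, value = TRUE)

# Extract the exposure index and the time index from each matching name
exposure_index <- as.integer(sub(exposure_regex, "\\1", exposure_cols))
time_index <- as.integer(sub(exposure_regex, "\\2", exposure_cols))

# Number of distinct exposures; this plays the role of p in the returned list
p_sketch <- length(unique(exposure_index))

# Exposure columns grouped by time index
exposures_by_time <- split(exposure_cols, time_index)
print(p_sketch)
print(exposures_by_time)
```

Grouping by the time index makes it easy to check that every time point carries the same set of exposures before the data are passed on.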

Examples

if (FALSE) { # \dontrun{
# Create test data in g-BKMR format
set.seed(123)  # make the simulated values reproducible
test_data <- data.frame(
  id = 1:100,
  sex = rbinom(100, 1, 0.5),
  bmi_0 = rnorm(100, 25, 3),
  bp_0 = rnorm(100, 120, 15),
  logM1_0 = rnorm(100, 0, 1),
  logM2_0 = rnorm(100, 0, 1),
  logM1_1 = rnorm(100, 0, 1),
  logM2_1 = rnorm(100, 0, 1),
  bmi_1 = rnorm(100, 25, 3),
  bp_1 = rnorm(100, 120, 15),
  Y = rnorm(100, 0, 1)
)

# Detect variable patterns
detection_result <- detect_variable_patterns(test_data, T = 2)

# View results
print(detection_result)
} # }