
Longitudinal Taxa Association Test
Source:R/generate_taxa_association_test_long.R
generate_taxa_association_test_long.RdTests associations between taxa abundances and a grouping variable in longitudinal microbiome data using LinDA mixed-effects models.
Usage
generate_taxa_association_test_long(
data.obj,
subject.var,
group.var,
adj.vars = NULL,
prev.filter = 0,
abund.filter = 0,
feature.level,
feature.dat.type = c("count", "proportion", "other"),
...
)Arguments
- data.obj
A MicrobiomeStat data object, which is a list containing at minimum the following components:
feature.tab: A matrix of feature abundances (taxa/genes as rows, samples as columns)meta.dat: A data frame of sample metadata (samples as rows)
Optional components include:
feature.ann: A matrix/data frame of feature annotations (e.g., taxonomy)tree: A phylogenetic tree object (class "phylo")feature.agg.list: Pre-aggregated feature tables by taxonomy
Data objects can be created using converters like
mStat_convert_phyloseq_to_data_objor importers likemStat_import_qiime2_as_data_obj.- subject.var
Character string specifying the column name in meta.dat that uniquely identifies each subject or sample unit. Required for longitudinal and paired designs to track repeated measurements.
- group.var
Character string specifying the column name in meta.dat containing the grouping variable (e.g., treatment, condition, phenotype). Used for between-group comparisons.
- adj.vars
Character vector specifying column names in meta.dat to be used as covariates for adjustment in statistical models. These variables will be included as fixed effects.
- prev.filter
Numeric value between 0 and 1. Features with prevalence (proportion of non-zero samples) below this threshold will be excluded from analysis. Default is usually 0 (no filtering).
- abund.filter
Numeric value. Features with mean abundance below this threshold will be excluded from analysis. Default is usually 0 (no filtering).
- feature.level
Character vector specifying the taxonomic or annotation level(s) for analysis. Should match column names in feature.ann, such as "Phylum", "Family", "Genus", etc. Use "original" to analyze at the original feature level without aggregation.
- feature.dat.type
Character string specifying the data type of feature.tab. One of:
"count": Raw count data (will be normalized)
"proportion": Relative abundance data (should sum to 1 per sample)
"other": Pre-transformed data (no transformation applied)
- ...
Additional arguments passed to underlying functions.
Value
A nested list structure where:
First level: Named by
feature.level(e.g., "Phylum", "Genus")Second level: Named by tested comparisons between groups (e.g., "Level vs Reference (Reference)" for categorical variables, or variable name for continuous variables)
Each element is a data.frame with the following columns:
Variable: Feature/taxon nameCoefficient: Log2 fold change (categorical) or slope (continuous)SE: Standard error of the coefficientP.Value: Raw p-value from LinDA's statistical testAdjusted.P.Value: FDR-adjusted p-value (Benjamini-Hochberg)Mean.Abundance: Mean abundance across all samplesPrevalence: Proportion of samples where feature is present (non-zero)
Analysis uses LinDA mixed-effects models to test associations between taxa abundances and the grouping variable, accounting for subject-level random effects.
Details
Based on whether adj.vars is NULL, the formula tests:
- When adj.vars is NOT NULL: - Tests group.var and adj.vars main effects. - Adjusted for adj.vars.
- When adj.vars is NULL: - Tests group.var main effects only. - Unadjusted analysis.
Subject variability is accounted for through random effects.
Examples
if (FALSE) { # \dontrun{
# Example 1: Generate taxa association tests and volcano plots for the ecam dataset
data("ecam.obj")
test.list <- generate_taxa_association_test_long(
data.obj = ecam.obj,
subject.var = "studyid",
group.var = "delivery",
feature.level = c("Phylum", "Class"),
feature.dat.type = c("count")
)
volcano_plots_ecam <- generate_taxa_volcano_single(
data.obj = ecam.obj,
group.var = "delivery",
test.list = test.list,
feature.sig.level = 0.1,
feature.mt.method = "fdr"
)
# Example 2: Generate taxa association tests and volcano plots for a subset of the T2D dataset
data("subset_T2D.obj")
test.list_T2D <- generate_taxa_association_test_long(
data.obj = subset_T2D.obj,
subject.var = "subject_id",
feature.level = "Genus",
group.var = "subject_race",
feature.dat.type = c("count"),
prev.filter = 0.1,
abund.filter = 0.001
)
volcano_plots_T2D <- generate_taxa_volcano_single(
data.obj = subset_T2D.obj,
group.var = "subject_race",
test.list = test.list_T2D,
feature.sig.level = 0.1,
feature.mt.method = "none"
)
} # }