
Differential Abundance Testing for Single Time Point
Source:R/generate_taxa_test_single.R
generate_taxa_test_single.RdPerforms differential abundance analysis using LinDA (for count/proportion data) or standard linear models (for other data types). Automatically detects whether the predictor is categorical or continuous.
Usage
generate_taxa_test_single(
data.obj,
time.var = NULL,
t.level = NULL,
group.var,
ref.level = NULL,
adj.vars = NULL,
prev.filter = 0,
abund.filter = 0,
feature.level,
feature.dat.type = c("count", "proportion", "other"),
...
)Arguments
- data.obj
A MicrobiomeStat data object, which is a list containing at minimum the following components:
feature.tab: A matrix of feature abundances (taxa/genes as rows, samples as columns)meta.dat: A data frame of sample metadata (samples as rows)
Optional components include:
feature.ann: A matrix/data frame of feature annotations (e.g., taxonomy)tree: A phylogenetic tree object (class "phylo")feature.agg.list: Pre-aggregated feature tables by taxonomy
Data objects can be created using converters like
mStat_convert_phyloseq_to_data_objor importers likemStat_import_qiime2_as_data_obj.- time.var
Character string specifying the column name in meta.dat containing the time variable. Required for longitudinal and paired analyses. Should be a factor or character with meaningful time point labels.
- t.level
Character string specifying the time level/value to subset data to. Default NULL does not subset data.
- group.var
Character string specifying the column name in meta.dat containing the grouping variable (e.g., treatment, condition, phenotype). Used for between-group comparisons.
- ref.level
Character string specifying the reference level for categorical group.var. If NULL, the first level alphabetically is used. Ignored for continuous variables.
- adj.vars
Character vector specifying column names in meta.dat to be used as covariates for adjustment in statistical models. These variables will be included as fixed effects.
- prev.filter
Numeric value between 0 and 1. Features with prevalence (proportion of non-zero samples) below this threshold will be excluded from analysis. Default is usually 0 (no filtering).
- abund.filter
Numeric value. Features with mean abundance below this threshold will be excluded from analysis. Default is usually 0 (no filtering).
- feature.level
Character vector specifying the taxonomic or annotation level(s) for analysis. Should match column names in feature.ann, such as "Phylum", "Family", "Genus", etc. Use "original" to analyze at the original feature level without aggregation.
- feature.dat.type
Character string specifying the data type of feature.tab. One of:
"count": Raw count data (will be normalized)
"proportion": Relative abundance data (should sum to 1 per sample)
"other": Pre-transformed data (no transformation applied)
- ...
Additional arguments passed to the linda function.
Value
A nested list structure where:
First level: Named by
feature.level(e.g., "Phylum", "Genus")Second level: Named by comparisons
For categorical
group.var: One element per non-reference level, named as "Level vs Reference (Reference)" (e.g., "Treatment vs Control (Reference)")For continuous
group.var: One element named by the variable name (e.g., "age")
Each element is a data.frame with the following columns:
Variable: Feature/taxon nameCoefficient: Log2 fold change (for categorical) or slope coefficient (for continuous)SE: Standard error of the coefficientP.Value: Raw p-value from the statistical testAdjusted.P.Value: FDR-adjusted p-value using Benjamini-Hochberg methodMean.Abundance: Mean abundance of the feature across all samplesPrevalence: Proportion of samples where the feature is present (non-zero)
Details
## Statistical Methods
The function automatically selects the appropriate statistical method based on feature.dat.type:
**For "count" or "proportion" data:** Uses LinDA (Linear models for Differential Abundance), which:
Handles compositional nature of microbiome data
Performs internal normalization (no pre-normalization needed)
Handles zero-inflation via pseudo-count or imputation
Corrects bias from centered log-ratio transformation
**For "other" data:** Uses standard linear regression models, suitable for:
Pre-transformed data (e.g., log-transformed, CLR-transformed)
Data where compositional adjustments are not needed
## Variable Type Handling
The function automatically detects the type of group.var:
**Categorical variables (factor or character):**
Uses dummy coding with k-1 coefficients for k levels
First level (alphabetically) becomes the reference
Each coefficient compares a level to the reference
Example: For treatment groups A, B, C, output includes "B vs A (Reference)" and "C vs A (Reference)"
**Continuous variables (numeric or integer):**
Uses single slope coefficient
Tests for linear association with abundance
Coefficient represents effect per unit increase
Example: For age, coefficient shows abundance change per year
## Analysis Workflow
1. **Data Subsetting**: If time.var and t.level provided, subsets to that timepoint
2. **Feature Aggregation**: Aggregates features to specified taxonomic level if not "original"
3. **Filtering**: Applies prevalence and abundance filters
4. **Statistical Testing**: Runs LinDA or linear models depending on feature.dat.type
5. **Results Compilation**: Extracts coefficients, standard errors, p-values, and calculates FDR-adjusted p-values using Benjamini-Hochberg method
6. **Metadata Addition**: Adds mean abundance and prevalence for each feature
The function supports covariate adjustment via adj.vars and allows taxonomic
aggregation at multiple levels for customized analyses.
Examples
if (FALSE) { # \dontrun{
data(peerj32.obj)
test.list <- generate_taxa_test_single(
data.obj = peerj32.obj,
time.var = "time",
t.level = "2",
group.var = "group",
adj.vars = "sex",
feature.dat.type = "count",
feature.level = c("Phylum","Genus","Family"),
prev.filter = 0.1,
abund.filter = 0.0001,
)
plot.list <- generate_taxa_volcano_single(
data.obj = peerj32.obj,
group.var = "group",
test.list = test.list,
feature.sig.level = 0.1,
feature.mt.method = "none"
)
} # }