
Generate Longitudinal Microbiome Analysis Report
Source: R/mStat_generate_report_long.R
Generates a comprehensive PDF/HTML report for longitudinal microbiome analysis, including alpha diversity, beta diversity, and taxonomic composition over time.
Usage
mStat_generate_report_long(
data.obj,
group.var,
vis.adj.vars = NULL,
test.adj.vars = NULL,
strata.var = NULL,
subject.var,
time.var,
t0.level = NULL,
ts.levels = NULL,
depth = NULL,
alpha.obj = NULL,
alpha.name = c("shannon", "observed_species"),
dist.obj = NULL,
dist.name = c("BC", "Jaccard"),
pc.obj = NULL,
prev.filter = 0.1,
abund.filter = 1e-04,
bar.area.feature.no = 30,
heatmap.feature.no = 30,
vis.feature.level = NULL,
test.feature.level = NULL,
feature.dat.type = c("count", "proportion", "other"),
feature.analysis.rarafy = TRUE,
feature.change.func = "relative change",
feature.mt.method = "none",
feature.sig.level = 0.1,
feature.box.axis.transform = c("sqrt"),
base.size = 16,
theme.choice = "bw",
custom.theme = NULL,
palette = NULL,
pdf = TRUE,
file.ann = NULL,
pdf.wid = 11,
pdf.hei = 8.5,
output.file,
output.format = c("pdf", "html")
)
Arguments
- data.obj
A MicrobiomeStat data object, which is a list containing at minimum the following components:
feature.tab: A matrix of feature abundances (taxa/genes as rows, samples as columns)
meta.dat: A data frame of sample metadata (samples as rows)
Optional components include:
feature.ann: A matrix/data frame of feature annotations (e.g., taxonomy)
tree: A phylogenetic tree object (class "phylo")
feature.agg.list: Pre-aggregated feature tables by taxonomy
Data objects can be created using converters like mStat_convert_phyloseq_to_data_obj or importers like mStat_import_qiime2_as_data_obj.
- group.var
Character string specifying the column name in meta.dat containing the grouping variable (e.g., treatment, condition, phenotype). Used for between-group comparisons.
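As an illustration of the data.obj layout and of how group.var refers to a column of meta.dat, the following is a minimal base R sketch. All taxa, sample IDs, and column names (subject_id, visit, treatment) are hypothetical, and the list is assembled by hand rather than through the package's converters.

```r
# Hypothetical toy data: 3 taxa (rows) x 4 samples (columns).
feature.tab <- matrix(
  c(10, 0, 5, 3,
     2, 8, 0, 1,
     0, 4, 7, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon1", "taxon2", "taxon3"),
                  c("s1", "s2", "s3", "s4"))
)

# Sample metadata: one row per sample, row names matching the sample IDs.
meta.dat <- data.frame(
  subject_id = c("A", "A", "B", "B"),
  visit      = c("0", "1", "0", "1"),
  treatment  = c("ctrl", "ctrl", "drug", "drug"),
  row.names  = colnames(feature.tab),
  stringsAsFactors = FALSE
)

# The two required components, assembled as a plain list.
data.obj <- list(feature.tab = feature.tab, meta.dat = meta.dat)

# Sanity checks: sample IDs line up and the grouping column exists.
ids_match <- identical(colnames(data.obj$feature.tab),
                       rownames(data.obj$meta.dat))
has_group <- "treatment" %in% colnames(data.obj$meta.dat)
```

With an object like this, group.var would be passed as "treatment".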
- vis.adj.vars
Character vector of covariate names to visualize in plots.
- test.adj.vars
Character vector of covariate names for statistical adjustment.
- strata.var
Character string specifying the column name in meta.dat for stratification. When provided, analyses and visualizations will be performed separately within each stratum (e.g., by site, batch, or sex).
- subject.var
Character string specifying the column name in meta.dat that uniquely identifies each subject or sample unit. Required for longitudinal and paired designs to track repeated measurements.
- time.var
Character string specifying the column name in meta.dat containing the time variable. Required for longitudinal and paired analyses. Should be a factor or character with meaningful time point labels.
- t0.level
Character or numeric value specifying the baseline time point for longitudinal or paired analyses. Should match a value in the time.var column.
- ts.levels
Character vector specifying the follow-up time points for longitudinal or paired analyses. Should match values in the time.var column.
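Because t0.level and ts.levels should match values in the time.var column, it can help to check this before generating a report. The sketch below uses hypothetical metadata and column names; it is a pre-flight check written in base R, not something the function is documented to do itself.

```r
# Hypothetical longitudinal metadata: 2 subjects, 3 visits each.
meta.dat <- data.frame(
  subject_id = rep(c("A", "B"), each = 3),
  visit      = rep(c("baseline", "week4", "week8"), times = 2),
  stringsAsFactors = FALSE
)

t0.level  <- "baseline"
ts.levels <- c("week4", "week8")

# Every requested level must occur in the time column.
levels_ok <- all(c(t0.level, ts.levels) %in% unique(meta.dat$visit))

# Every subject should have a baseline measurement.
has_baseline <- tapply(meta.dat$visit == t0.level, meta.dat$subject_id, any)
```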
- depth
Numeric value or NULL. Rarefaction depth for diversity calculations. If NULL, uses minimum sample depth or no rarefaction.
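To make the role of depth concrete, here is a base R sketch that rarefies a hypothetical count table to the minimum sample depth and then computes two of the alpha diversity indices named below. It is an illustration under simple assumptions (subsampling reads without replacement, natural-log Shannon), not the package's implementation.

```r
set.seed(42)  # rarefaction is random; fix the seed for reproducibility

# Hypothetical counts, filled column by column: each line below is one sample.
counts <- matrix(
  c(50, 30, 20,  0,
    10,  5,  5,  0,
    40, 40, 10, 10),
  nrow = 4,
  dimnames = list(paste0("taxon", 1:4), c("s1", "s2", "s3"))
)

# Subsample `depth` reads from one sample without replacement.
rarefy_sample <- function(x, depth) {
  reads <- rep(seq_along(x), times = x)            # one entry per read
  kept  <- reads[sample.int(length(reads), size = depth)]
  tabulate(kept, nbins = length(x))                # back to per-taxon counts
}

depth <- min(colSums(counts))                      # minimum sample depth
rarefied <- apply(counts, 2, rarefy_sample, depth = depth)
rownames(rarefied) <- rownames(counts)

# Shannon index and observed richness on the rarefied table.
shannon <- apply(rarefied, 2, function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
})
observed_species <- colSums(rarefied > 0)
```

After rarefaction every sample has the same total, so richness-type indices are compared at equal sequencing effort.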
- alpha.obj
A list containing pre-calculated alpha diversity indices. If NULL and alpha diversity is needed, it will be calculated automatically. Names should match the alpha.name parameter (e.g., "shannon", "simpson"). See mStat_calculate_alpha_diversity.
- alpha.name
Character vector specifying which alpha diversity indices to analyze. Options include:
"shannon": Shannon diversity index
"simpson": Simpson diversity index
"observed_species": Observed species richness
"chao1": Chao1 richness estimator
"ace": ACE richness estimator
"pielou": Pielou's evenness
- dist.obj
A list of pre-calculated distance matrices. If NULL and distances are needed, they will be calculated automatically. List names should match dist.name (e.g., "BC" for Bray-Curtis). See mStat_calculate_beta_diversity.
- dist.name
Character vector specifying which distance metrics to use. Options depend on available methods:
"BC": Bray-Curtis dissimilarity
"Jaccard": Jaccard distance
"UniFrac": Unweighted UniFrac (requires tree)
"GUniFrac": Generalized UniFrac (requires tree)
"WUniFrac": Weighted UniFrac (requires tree)
"JS": Jensen-Shannon divergence
- pc.obj
Pre-calculated PCoA/PCA results from mStat_calculate_PC. If NULL, computed automatically.
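The relationship between a distance matrix and its PCoA coordinates can be sketched in base R. The example computes Bray-Curtis dissimilarities by hand for a hypothetical count table and ordinates them with stats::cmdscale. The classes the package expects inside dist.obj and pc.obj are not stated here, so the list below is only an assumption about naming, not a drop-in input.

```r
# Hypothetical counts: 3 taxa (rows) x 4 samples (columns), equal depth.
counts <- cbind(
  s1 = c(6, 3, 3),
  s2 = c(3, 6, 3),
  s3 = c(3, 3, 6),
  s4 = c(4, 4, 4)
)
rownames(counts) <- paste0("taxon", 1:3)

# Bray-Curtis dissimilarity between two abundance vectors.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Fill a symmetric sample-by-sample matrix.
n  <- ncol(counts)
bc <- matrix(0, n, n, dimnames = list(colnames(counts), colnames(counts)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    bc[i, j] <- bray_curtis(counts[, i], counts[, j])
  }
}

# Name the list entry after the metric, mirroring dist.name (assumed layout).
dist.list <- list(BC = as.dist(bc))

# Classical multidimensional scaling (PCoA) on the first two axes.
pcoa_bc <- cmdscale(dist.list$BC, k = 2)
```

Each row of pcoa_bc holds the coordinates of one sample on the first two principal coordinates.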
- prev.filter
Numeric value between 0 and 1. Features with prevalence (proportion of non-zero samples) below this threshold will be excluded from analysis. Default is 0.1; set to 0 for no filtering.
- abund.filter
Numeric value. Features with mean abundance below this threshold will be excluded from analysis. Default is 1e-04; set to 0 for no filtering.
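The two filters can be pictured with a short base R sketch. It assumes that counts are first converted to proportions, that a feature must pass both thresholds to be kept, and that values equal to a threshold are kept; the package may combine or apply the filters differently. The data are hypothetical.

```r
# Hypothetical counts: 3 taxa (rows) x 4 samples (columns).
counts <- matrix(
  c(600, 500, 700, 40000,
    400, 500, 300, 59996,
      0,   0,   0,     4),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:3), paste0("s", 1:4))
)

# Convert to per-sample proportions (each column sums to 1).
prop <- sweep(counts, 2, colSums(counts), "/")

prev.filter  <- 0.1
abund.filter <- 1e-4

prevalence <- rowMeans(prop > 0)   # share of samples where the taxon is present
mean_abund <- rowMeans(prop)       # mean relative abundance across samples

# Assumption: keep a feature only if it passes both thresholds.
keep     <- prevalence >= prev.filter & mean_abund >= abund.filter
filtered <- prop[keep, , drop = FALSE]
```

Here taxon3 is present in one of four samples, so it passes the prevalence filter, but its mean abundance of 1e-05 falls below abund.filter and it is dropped.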
- bar.area.feature.no
Number of top features to show in bar/area plots (default 30).
- heatmap.feature.no
Number of top features to show in heatmaps (default 30).
- vis.feature.level
Taxonomic level(s) for visualization.
- test.feature.level
Taxonomic level(s) for statistical testing.
- feature.dat.type
Character string specifying the data type of feature.tab. One of:
"count": Raw count data (will be normalized)
"proportion": Relative abundance data (should sum to 1 per sample)
"other": Pre-transformed data (no transformation applied)
- feature.analysis.rarafy
Logical, whether to rarefy data for feature analysis (default TRUE).
- feature.change.func
Method for calculating change: "relative change", "absolute change", or "log fold change".
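The three change options can be illustrated with the base R sketch below. The formulas are assumptions chosen for illustration: in particular, the symmetric form used for "relative change" and the pseudocount in "log fold change" should be checked against the package source before relying on them. The abundance values are hypothetical.

```r
# Assumed definitions; the package's exact formulas may differ.
change_funcs <- list(
  "absolute change" = function(t0, t1) t1 - t0,
  # Symmetric relative change, bounded in [-1, 1]; 0 when both values are 0.
  "relative change" = function(t0, t1) {
    ifelse(t0 + t1 == 0, 0, (t1 - t0) / (t1 + t0))
  },
  # log2 ratio with a small pseudocount (hypothetical value) to avoid log(0).
  "log fold change" = function(t0, t1, pseudo = 1e-6) {
    log2(t1 + pseudo) - log2(t0 + pseudo)
  }
)

# Hypothetical relative abundances of three taxa at baseline and follow-up.
t0 <- c(0.10, 0.20, 0.00)
t1 <- c(0.30, 0.10, 0.00)

# One column per method, one row per taxon.
changes <- sapply(change_funcs, function(f) f(t0, t1))
```

The third taxon is absent at both time points, so all three methods return 0 for it under these definitions.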
- feature.mt.method
Multiple testing correction: "fdr" or "none" (default "none").
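The effect of the correction can be seen with stats::p.adjust on a hypothetical vector of raw p-values. This only illustrates the general difference between "fdr" (Benjamini-Hochberg) and "none" at a given significance level; how the report applies the correction internally is not shown here.

```r
# Hypothetical raw p-values for five features.
p_raw <- c(0.001, 0.03, 0.08, 0.09, 0.5)
sig.level <- 0.1

p_none <- p.adjust(p_raw, method = "none")  # unchanged
p_fdr  <- p.adjust(p_raw, method = "fdr")   # Benjamini-Hochberg adjustment

# Number of features flagged at the chosen level under each option.
n_sig_none <- sum(p_none < sig.level)
n_sig_fdr  <- sum(p_fdr  < sig.level)
```

With these values, four features fall below 0.1 without correction and two remain after FDR adjustment.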
- feature.sig.level
Significance threshold for highlighting features (default 0.1).
- feature.box.axis.transform
Y-axis transformation for boxplots: "identity", "sqrt", or "log".
- base.size
Numeric value specifying the base font size for plot text elements. Default is 16.
- theme.choice
Character string specifying the ggplot2 theme to use. Options include:
"bw": Black and white theme (theme_bw)
"classic": Classic theme (theme_classic)
"minimal": Minimal theme (theme_minimal)
"prism": GraphPad Prism-like theme
"nature": Nature journal style
"light": Light theme (theme_light)
Can also use a custom ggplot2 theme object via custom.theme.
- custom.theme
A custom ggplot2 theme object to override theme.choice. Should be created using ggplot2::theme() or a complete theme function.
- palette
Character vector of colors or a named palette for the plot. If NULL, uses default MicrobiomeStat color scheme. Can be:
A vector of color codes (e.g., c("#E41A1C", "#377EB8"))
A palette name recognized by the plotting function
- pdf
Logical. If TRUE, saves the plot(s) to PDF file(s) in the current working directory. Default is TRUE.
- file.ann
Character string for additional annotation to append to output filenames. Useful for distinguishing multiple outputs.
- pdf.wid
Numeric value specifying the width of PDF output in inches. Default is 11.
- pdf.hei
Numeric value specifying the height of PDF output in inches. Default is 8.5.
- output.file
Output report filename (required).
- output.format
Output format: "pdf" or "html".
Value
A report written to output.file in the format given by output.format (PDF or HTML), containing:
- Summary statistics table: Sample size, covariates, time points
- Alpha diversity boxplots: Boxplots colored by time and groups
- Alpha diversity spaghetti plots: Line plots of trajectories
- Alpha diversity trends: Tables with LME model results
- Alpha volatility: Tables with LM model results
- Beta diversity PCoA plots: Ordination plots colored by time and groups
- Beta distance boxplots: Boxplots colored by time and groups
- Beta spaghetti plots: Individual line plots
- Beta diversity trends: Tables with LME results on distances
- Beta volatility: Tables with LM results on distances
- Feature area plots: Stacked area plots showing composition
- Feature heatmaps: Heatmaps colored by relative abundance
- Feature volcano plots: With trend/volatility significance
- Feature boxplots: Distribution by time and groups
- Feature spaghetti plots: Individual line plots
Details
This function generates a comprehensive longitudinal microbiome report with:
1. Summary Statistics
   - Table with sample size, number of timepoints, covariates
2. Alpha Diversity Analysis
   - Boxplots of alpha diversity vs time, colored by groups
   - Spaghetti plots of alpha trajectories for each subject
   - Linear mixed effects model results for alpha diversity trend
   - Linear model results for alpha diversity volatility
3. Beta Diversity Analysis
   - PCoA plots colored by time points and groups
   - Boxplots of PCoA coordinate 1 vs time, colored by groups
   - Spaghetti plots of distance from baseline vs time
   - Linear mixed effects models for beta diversity distance trend
   - Linear models for beta diversity volatility
4. Taxonomic Composition Analysis
   - Stacked area plots of average composition by time and groups
   - Heatmaps of relative abundance colored from low (blue) to high (red)
   - Volcano plots highlighting significant taxa in trend and volatility
   - Boxplots of significant taxa by time and groups
   - Spaghetti plots for significant taxa vs time
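The volatility analyses listed above can be pictured with a small base R sketch. It assumes volatility is the mean absolute change per unit time between consecutive visits of a subject, followed by an ordinary linear model of that per-subject value on the group. This definition and all data below are assumptions for illustration, not the package's documented method.

```r
# Hypothetical Shannon values: 4 subjects, 3 time points each.
alpha.df <- data.frame(
  subject = rep(c("A", "B", "C", "D"), each = 3),
  group   = rep(c("ctrl", "ctrl", "drug", "drug"), each = 3),
  time    = rep(c(0, 1, 2), times = 4),
  shannon = c(2.0, 2.1, 2.0,
              1.8, 1.9, 1.8,
              2.0, 2.6, 1.9,
              1.7, 2.4, 1.6),
  stringsAsFactors = FALSE
)

# Assumed definition: mean |change| per unit time between consecutive visits.
volatility_one <- function(d) {
  d <- d[order(d$time), ]
  mean(abs(diff(d$shannon)) / diff(d$time))
}
vol <- sapply(split(alpha.df, alpha.df$subject), volatility_one)

# One row per subject, then a linear model of volatility on group.
subject.df <- unique(alpha.df[, c("subject", "group")])
subject.df$volatility <- unname(vol[subject.df$subject])
fit <- lm(volatility ~ group, data = subject.df)
```

The coefficient for the non-reference group estimates the difference in mean volatility between groups; covariates such as those in test.adj.vars could be added to the formula in the same way.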
Author
Chen Yang
Examples
if (FALSE) { # \dontrun{
data(subset_T2D.obj)
mStat_generate_report_long(
data.obj = subset_T2D.obj,
group.var = "subject_race",
strata.var = NULL,
test.adj.vars = NULL,
vis.adj.vars = NULL,
subject.var = "subject_id",
time.var = "visit_number_num",
t0.level = NULL,
ts.levels = NULL,
alpha.obj = NULL,
alpha.name = c("shannon","observed_species"),
dist.obj = NULL,
dist.name = c("BC",'Jaccard'),
pc.obj = NULL,
feature.mt.method = "none",
feature.sig.level = 0.3,
vis.feature.level = c("Family","Genus"),
test.feature.level = c("Family"),
feature.change.func = "relative change",
feature.dat.type = "count",
prev.filter = 0.1,
abund.filter = 1e-4,
bar.area.feature.no = 40,
heatmap.feature.no = 40,
feature.box.axis.transform = "sqrt",
theme.choice = "bw",
base.size = 20,
output.file = "/Users/apple/Research/MicrobiomeStat/result/per_time_test_report.pdf",
output.format = "pdf"
)
mStat_generate_report_long(
data.obj = subset_T2D.obj,
group.var = "subject_race",
strata.var = NULL,
test.adj.vars = NULL,
vis.adj.vars = NULL,
subject.var = "subject_id",
time.var = "visit_number_num",
t0.level = NULL,
ts.levels = NULL,
alpha.obj = NULL,
alpha.name = c("shannon","observed_species"),
dist.obj = NULL,
dist.name = c("BC",'Jaccard'),
pc.obj = NULL,
feature.mt.method = "none",
feature.sig.level = 0.3,
vis.feature.level = c("Family","Genus"),
test.feature.level = c("Family"),
feature.change.func = "relative change",
feature.dat.type = "count",
prev.filter = 0.1,
abund.filter = 1e-4,
bar.area.feature.no = 40,
heatmap.feature.no = 40,
feature.box.axis.transform = "sqrt",
theme.choice = "bw",
base.size = 20,
output.file = "/Users/apple/Research/MicrobiomeStat/result/per_time_test_report.html",
output.format = "html"
)
data(ecam.obj)
mStat_generate_report_long(
data.obj = ecam.obj,
group.var = "antiexposedall",
strata.var = NULL,
test.adj.vars = "delivery",
vis.adj.vars = "delivery",
subject.var = "subject.id",
time.var = "month_num",
t0.level = NULL,
ts.levels = NULL,
alpha.obj = NULL,
alpha.name = c("shannon","observed_species"),
dist.obj = NULL,
dist.name = c("BC",'Jaccard'),
pc.obj = NULL,
feature.mt.method = "none",
feature.sig.level = 0.3,
vis.feature.level = c("Family","Genus"),
test.feature.level = c("Family"),
feature.change.func = "relative change",
feature.dat.type = "proportion",
prev.filter = 0.1,
abund.filter = 1e-4,
bar.area.feature.no = 40,
heatmap.feature.no = 40,
feature.box.axis.transform = "sqrt",
theme.choice = "bw",
base.size = 20,
output.file = "/Users/apple/Research/MicrobiomeStat/result/ecam_obj_report_long.pdf",
output.format = "pdf"
)
mStat_generate_report_long(
data.obj = ecam.obj,
group.var = "antiexposedall",
strata.var = NULL,
test.adj.vars = "delivery",
vis.adj.vars = "delivery",
subject.var = "subject.id",
time.var = "month_num",
t0.level = NULL,
ts.levels = NULL,
alpha.obj = NULL,
alpha.name = c("shannon","observed_species"),
dist.obj = NULL,
dist.name = c("BC",'Jaccard'),
pc.obj = NULL,
feature.mt.method = "none",
feature.sig.level = 0.3,
vis.feature.level = c("Family","Genus"),
test.feature.level = c("Family"),
feature.change.func = "relative change",
feature.dat.type = "proportion",
prev.filter = 0.1,
abund.filter = 1e-4,
bar.area.feature.no = 40,
heatmap.feature.no = 40,
feature.box.axis.transform = "sqrt",
theme.choice = "bw",
base.size = 20,
output.file = "/Users/apple/Research/MicrobiomeStat/result/ecam_obj_report_long.html",
output.format = "html"
)
} # }