Workflows Overview

MetaGEAR provides several specialized workflows for metagenomic analysis.

Each workflow targets a specific analysis goal and can be run on its own or combined with others. Pick the workflows that match your needs, or run them in the recommended order for a comprehensive analysis.


🗂️ Download Databases

Purpose: Download and set up required reference databases
Prerequisites: None (run this first)
Output: Reference databases for all other workflows

🧬 QC DNA/RNA

Purpose: Quality control for DNA and RNA sequencing data
Prerequisites: Download Databases
Output: High-quality, host-decontaminated reads (DNA) or rRNA-depleted reads (RNA)

🦠 Microbial Profiles

Purpose: Taxonomic and functional profiling of microbial communities
Prerequisites: QC DNA or quality-controlled reads
Output: Species abundance profiles and functional pathway profiles

🧬 Gene Analysis

Purpose: Comprehensive gene-centric analysis of metagenomic data
Prerequisites: QC DNA or quality-controlled reads
Output: Gene and protein profiles

Recommended Workflow Order

# 1. Set up databases (run once)
metagear download_databases

# 2. Quality control your data
metagear qc_dna --input raw_samples.csv    # For DNA sequencing data
# OR
metagear qc_rna --input raw_samples.csv    # For RNA sequencing data

# 3. Choose your analysis approach:

# Option A: Species-centric analysis
metagear microbial_profiles --input clean_samples.csv

# Option B: Gene-centric analysis
metagear gene_analysis --input clean_samples.csv

# Option C: Both approaches (recommended for comprehensive analysis)
metagear microbial_profiles --input clean_samples.csv
metagear gene_analysis --input clean_samples.csv

Input File Format

All workflows use a standard CSV input format:

sample,fastq_1,fastq_2
SAMPLE-01,/path/to/sample1_R1.fastq.gz,/path/to/sample1_R2.fastq.gz
SAMPLE-02,/path/to/sample2_R1.fastq.gz,/path/to/sample2_R2.fastq.gz
SAMPLE-03,/path/to/sample3_R1.fastq.gz,/path/to/sample3_R2.fastq.gz
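
Writing this file by hand is error-prone with many samples, so the sketch below generates and checks one from a directory of FASTQ files. It assumes paired-end files named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`, as in the example paths above. `make_samplesheet` and `check_samplesheet` are hypothetical helper names and are not part of MetaGEAR. The check covers only the header and field count shown above; MetaGEAR itself may apply other rules.

```bash
# Hypothetical helpers (not part of MetaGEAR) for the samplesheet format above.

# Print a samplesheet for every R1/R2 pair found directly inside a directory.
# Assumption: files are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
make_samplesheet() {
    local dir="$1"
    local r1 r2 sample
    echo "sample,fastq_1,fastq_2"
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue              # glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"   # expected mate file
        if [ ! -e "$r2" ]; then
            echo "warning: no R2 mate for $r1, skipping" >&2
            continue
        fi
        sample="$(basename "$r1" _R1.fastq.gz)"
        printf '%s,%s,%s\n' "$sample" "$r1" "$r2"
    done
}

# Check the header and that every row has three non-empty fields.
# Assumption: paired-end rows only, as in the example above.
# Returns non-zero and reports problems on stderr if a check fails.
check_samplesheet() {
    local csv="$1"
    awk -F',' '
        NR == 1 {
            if ($0 != "sample,fastq_1,fastq_2") {
                print "line 1: unexpected header: " $0
                bad = 1
            }
            next
        }
        NF != 3 {
            print "line " NR ": expected 3 fields, got " NF
            bad = 1
            next
        }
        $1 == "" || $2 == "" || $3 == "" {
            print "line " NR ": empty field"
            bad = 1
        }
        END { exit bad }
    ' "$csv" >&2
}
```

For example, `make_samplesheet /data/fastq > raw_samples.csv && check_samplesheet raw_samples.csv` would write a samplesheet and then check it (the directory path here is a placeholder). The check does not confirm that the listed files exist or are readable.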

Workflow Comparison

Workflow | Input | Analysis Type | Output | Computational Requirements
Download Databases | None | Database setup | Reference databases | Low (download only)
QC DNA/RNA | Raw FASTQ | Quality control | Clean reads | Medium
Microbial Profiles | Clean reads | Taxonomic/functional | Species profiles | Medium-High
Gene Analysis | Clean reads | Gene-centric | Gene catalogs | High
