RNA-Seq Data Analysis
Welcome to CDI Applied RNA-Seq Analysis
What this guide is
What this guide is not
Who this guide is for
How the guide is structured
Free track — Foundations
Premium track — Applied analysis and interpretation
Reproducibility philosophy (CDI standard)
How to use this guide
A note on scope and evolution
1 Installation and Environment
1.1 Learning outcomes
1.2 Why the environment matters
1.3 What you need
1.4 Verify your R installation
1.5 Install required R packages (CDI pattern)
1.6 Verify package loading
1.7 CDI project structure
1.8 Record session information
1.9 Common issues and fixes
1.10 Takeaway
2 RNA-Seq Study Design and Metadata
2.1 Why study design comes first
2.2 Key concepts in RNA-Seq study design
2.3 Sample metadata structure
2.4 Loading demo metadata
2.5 Inspecting metadata
2.6 Common metadata variables
2.7 Preparing metadata for analysis
2.8 Takeaway
3 Data Intake and Basic QC Checks
3.1 Learning outcomes
3.2 Why we do intake + QC before analysis
3.3 Load demo data
3.4 Inspect structure
3.5 Validate sample alignment (namespace-safe)
3.6 Compute basic QC summaries
3.7 Save QC table for downstream analysis
3.8 QC Visualization and Interpretation
3.9 Learning outcomes
3.10 Load QC results
3.11 Quick sanity checks
3.12 Notebook initialization
3.13 Plot 1: library size distribution
3.14 Plot 2: detected genes per sample
3.15 Plot 3: zero fraction per sample
3.16 Interpreting QC in context
3.17 Takeaway
4 Quantification and Count Matrix Concepts
4.1 Learning outcomes
4.2 What is RNA-Seq quantification?
4.3 From reads to counts (conceptual overview)
4.4 The demo count matrix used in this guide
4.5 Gene identifiers and sample columns
4.6 Properties of raw counts
4.7 Library size and sequencing depth
4.8 Why normalization is required
4.9 Counts vs transformed values
4.10 Preview: rlog matrix
4.11 Common pitfalls
4.12 Takeaway
5 Normalization and Exploratory Data Analysis (Computation)
5.1 Learning outcomes
5.2 Why normalization before EDA
5.3 Load required data
5.4 Prepare count matrix
5.5 Load rlog-transformed matrix
5.6 Prepare rlog matrix for EDA
5.7 Principal component analysis (PCA)
5.8 Save PCA results for visualization
5.9 Takeaway
5.10 Exploratory Data Analysis and Interpretation
5.11 Learning outcomes
5.12 Load PCA results
5.13 Initialize CDI visualization
5.14 PCA: PC1 vs PC2
5.15 Interpreting PCA structure
5.16 Takeaway
6 Differential Expression Modeling Concepts
6.1 Learning outcomes
6.2 What is differential expression?
6.3 Differential expression is a modeling problem
6.4 Why raw counts are modeled (not transformed values)
6.5 The basic ingredients of a DE model
6.6 Experimental conditions and contrasts
6.7 What a DE result represents
6.8 Common assumptions in RNA-Seq DE models
6.9 Why exploratory analysis comes first
6.10 What we are not doing yet
6.11 Takeaway
🎉 Congratulations on Completing the CDI Free Track!
🚀 What You’ve Accomplished
🌟 Your Next Step: Unlock the Full Premium Track
🔥 Advanced Skills You’ll Gain
💼 Why Upgrade Now?
References