Differential Expression Analysis (DESeq2) & RNA-seq Processing Pipeline Across Soft Tissue Sarcoma Groups
This repo contains scripts for preprocessing and DESeq2 analysis of sarcoma RNA-seq data.
Core flow: raw counts → filtered counts → DESeq2 inputs → analysis outputs.
- data/ → input and output files
  - counts_filtered.csv → gene x sample matrix after QC filtering
  - sarcoma_counts.rds → DESeq2-ready DESeqDataSet object
  - sarcoma_colData.csv → sample metadata (design/condition/type)
  - sarcoma_rowData.csv → gene annotations
- scripts/ → R scripts for filtering, sanity matrices, DESeq2 setup, analysis
- results/ → DESeq2 outputs, plots, differential expression tables
- renv/ → environment snapshot for reproducibility

Key files in data/:
- counts_filtered.csv → raw count matrix (genes x samples) after QC filtering.
- sarcoma_counts.rds → DESeq2-ready DESeqDataSet object.
- sarcoma_colData.csv → sample metadata (design, condition, type).
- sarcoma_rowData.csv → gene annotations linked to counts.
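
These files are only usable together if their identifiers line up: the sample columns of the count matrix should match the rows of the sample metadata, and its gene rows should match the rows of the gene annotations. A minimal base-R sketch of such a check is shown here. It assumes each table carries its IDs as row names; `check_alignment` is a hypothetical helper for illustration, not a function from this repo's scripts, and the toy data stands in for the real files.

```r
# Hypothetical helper: confirm that counts, sample metadata and gene
# annotations describe the same genes and samples, in the same order.
check_alignment <- function(counts, col_data, row_data) {
  counts <- as.matrix(counts)
  stopifnot(
    # no missing or negative values
    !anyNA(counts), all(counts >= 0),
    # DESeq2 expects whole-number counts
    all(counts == round(counts)),
    # gene and sample IDs must be unique
    !anyDuplicated(rownames(counts)), !anyDuplicated(colnames(counts)),
    # samples and genes must match the metadata tables exactly, in order
    identical(colnames(counts), rownames(col_data)),
    identical(rownames(counts), rownames(row_data))
  )
  invisible(TRUE)
}

# Toy data standing in for the real tables
counts <- matrix(c(10, 0, 5, 3, 8, 1), nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
col_data <- data.frame(condition = c("A", "B"), row.names = c("s1", "s2"))
row_data <- data.frame(symbol = c("X", "Y", "Z"),
                       row.names = c("g1", "g2", "g3"))

ok <- check_alignment(counts, col_data, row_data)
```

With the real files, the tables could be loaded with something like `read.csv("data/counts_filtered.csv", row.names = 1, check.names = FALSE)`, assuming the first column of each CSV holds the gene or sample IDs.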

Pipeline steps:
- Filtering: remove low-expression genes, save counts_filtered.csv.
- Alignment sanity checks: verify count matrix integrity.
- DESeq2 Prep: build sarcoma_counts.rds with matching colData + rowData.
- Analysis: run DESeq2 differential expression, diagnostics, downstream plots.
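
The filtering and DESeq2 prep steps could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in scripts/: the low-expression rule (at least `min_count` reads in at least `min_samples` samples) and its thresholds are placeholders, the design formula `~ condition` assumes a `condition` column in the sample metadata, and `filter_low_counts` and `build_dds` are hypothetical names. The DESeq2 part is only defined here, not run, since it needs the DESeq2 package and real data.

```r
# Hypothetical helper: keep genes with at least `min_count` reads
# in at least `min_samples` samples (thresholds are assumptions).
filter_low_counts <- function(counts, min_count = 10, min_samples = 3) {
  counts <- as.matrix(counts)
  keep <- rowSums(counts >= min_count) >= min_samples
  counts[keep, , drop = FALSE]
}

# Hypothetical helper: build a DESeqDataSet from counts plus metadata.
# Assumes colnames(counts) match rownames(col_data), and that
# rownames(counts) match rownames(row_data) when row_data is given.
build_dds <- function(counts, col_data, row_data = NULL,
                      design = ~ condition) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts,
    colData   = col_data,
    design    = design
  )
  if (!is.null(row_data)) {
    # attach gene annotations as row-level metadata
    S4Vectors::mcols(dds) <- S4Vectors::DataFrame(S4Vectors::mcols(dds),
                                                  row_data)
  }
  dds
}

# Toy matrix to show the filter: geneB never reaches 10 reads
toy <- matrix(
  c(50, 60, 55, 70,
     0,  1,  0,  2,
    12,  9, 15, 11),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:4))
)
filtered <- filter_low_counts(toy, min_count = 10, min_samples = 3)
kept_genes <- rownames(filtered)  # "geneA" "geneC"

# Possible use on the real data (not run here):
# dds <- build_dds(filtered_counts, col_data, row_data)
# saveRDS(dds, "data/sarcoma_counts.rds")
# res <- DESeq2::results(DESeq2::DESeq(dds))
```

Keeping the filter separate from the DESeq2 object construction means the thresholds can be changed and counts_filtered.csv regenerated without touching the model setup.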

Use renv to restore the exact R package versions:

    renv::restore()