methylation-aggregation
Install: claude install-skill ammawla/encode-toolkit
# Aggregate DNA Methylation Data Across Studies
## When to Use
- User wants to build a tissue-level DNA methylation landscape from multiple WGBS experiments
- User asks "where is DNA methylated in brain?" or "find hypomethylated regions across donors"
- User needs to identify HMRs (hypomethylated regions), UMRs (unmethylated regions), or PMDs (partially methylated domains) from aggregated WGBS data
- User wants per-CpG weighted methylation averages from multiple experiments
- Example queries: "aggregate WGBS data for liver", "build methylation map across donors", "find unmethylated CpG islands in pancreas"
Build a comprehensive methylation landscape for a tissue/cell type by merging WGBS bedMethyl files from multiple ENCODE experiments.
## Scientific Rationale
**The question**: "What is the DNA methylation state across the genome in my tissue?"
DNA methylation is **fundamentally different** from histone marks and accessibility:
| Property | Histone/Accessibility | DNA Methylation |
|----------|----------------------|-----------------|
| Signal type | Binary (bound/open or not) | Continuous (0-100% methylated) |
| Default state | Unmarked | ~70-80% methylated (CpG context) |
| Biology of interest | Where marks ARE present | Where methylation is ABSENT or REDUCED |
| Aggregation approach | Union of peak calls | Average/median of methylation levels per CpG |
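
The per-CpG averaging in the last table row can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation. It assumes the ENCODE bedMethyl layout in which column 10 holds read coverage and column 11 holds percent methylation, and it assumes that "weighted" means weighting each experiment by its coverage at that CpG. The function names `parse_bedmethyl_line` and `aggregate_methylation` and the `min_total_coverage` parameter are hypothetical.

```python
from collections import defaultdict


def parse_bedmethyl_line(line):
    """Parse one bedMethyl row into (site_key, coverage, percent_methylated).

    Assumes column 10 is coverage and column 11 is percent methylated.
    """
    fields = line.split()
    site = (fields[0], int(fields[1]), fields[5])  # (chrom, start, strand)
    coverage = int(fields[9])
    percent = float(fields[10])
    return site, coverage, percent


def aggregate_methylation(experiments, min_total_coverage=1):
    """Coverage-weighted mean methylation (0-100) per CpG across experiments.

    `experiments` is an iterable of iterables of bedMethyl text lines,
    one inner iterable per experiment (hypothetical input shape).
    """
    methylated_reads = defaultdict(float)  # estimated methylated read count per site
    total_reads = defaultdict(int)         # summed coverage per site

    for lines in experiments:
        for line in lines:
            # Skip blank lines and common header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            site, coverage, percent = parse_bedmethyl_line(line)
            methylated_reads[site] += coverage * percent / 100.0
            total_reads[site] += coverage

    return {
        site: 100.0 * methylated_reads[site] / total
        for site, total in total_reads.items()
        if total > 0 and total >= min_total_coverage
    }
```

Weighting by coverage means a CpG covered by 30 reads in one donor contributes three times as much as the same CpG covered by 10 reads in another. An unweighted mean or median of the per-experiment percentages is an alternative when each donor should count equally.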
**The key insight**: Unlike histone ChIP-seq, where we want the union of all peaks, for methylation we want the **average methylation level per CpG site** across individ