alterlab-pysam

Solid

Read and write genomic alignment and variant files in Python with pysam (htslib bindings) — SAM/BAM/CRAM alignments, VCF/BCF variants, and FASTA/FASTQ sequences, plus region extraction and per-base coverage/pileup. Use when scripting NGS data-processing pipelines that parse, filter, index, or compute coverage over BAM/CRAM/VCF files. Part of the AlterLab Academic Skills suite.

AI & Automation · 27 stars · 4 forks · Updated today · MIT

Quality Score: 87/100

Stars (20%): 48
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Pysam

## Overview

Pysam is a Python module for reading, manipulating, and writing genomic datasets. Read/write SAM/BAM/CRAM alignment files, VCF/BCF variant files, and FASTA/FASTQ sequences with a Pythonic interface to htslib. Query tabix-indexed files, perform pileup analysis for coverage, and execute samtools/bcftools commands.

## When to Use This Skill

This skill should be used when:

- Working with sequencing alignment files (BAM/CRAM)
- Analyzing genetic variants (VCF/BCF)
- Extracting reference sequences or gene regions
- Processing raw sequencing data (FASTQ)
- Calculating coverage or read depth
- Implementing bioinformatics analysis pipelines
- Quality control of sequencing data
- Variant calling and annotation workflows

## Quick Start

### Installation

```bash
uv pip install pysam
```

### Basic Examples

**Read alignment file:**

```python
import pysam

# Open BAM file and fetch reads in region
samfile = pysam.AlignmentFile("example.bam", "rb")
for read in samfile.fetch("chr1", 1000, 2000):
    print(f"{read.query_name}: {read.reference_start}")
samfile.close()
```

**Read variant file:**

```python
# Open VCF file and iterate variants
vcf = pysam.VariantFile("variants.vcf")
for variant in vcf:
    print(f"{variant.chrom}:{variant.pos} {variant.ref}>{variant.alts}")
vcf.close()
```

**Query reference sequence:**

```python
# Open FASTA and extract sequence
fasta = pysam.FastaFile("reference.fasta")
sequence = fasta.fetch("chr1", 1000, 2000)
print(sequence)
fasta....
```
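
**Coordinate conventions (illustrative sketch):**

The Quick Start passes numeric coordinates to `fetch("chr1", 1000, 2000)`. Pysam interprets numeric `start`/`stop` arguments as 0-based, half-open intervals, whereas samtools-style region strings such as `chr1:1001-2000` and the VCF `POS` column (exposed as `variant.pos`) are 1-based. The sketch below shows one way to convert between the two before querying. The helper names `parse_region` and `fetch_region` are hypothetical and not part of pysam, and the BAM file is assumed to be coordinate-sorted and indexed.

```python
from typing import List, Tuple


def parse_region(region: str) -> Tuple[str, int, int]:
    """Convert a 1-based inclusive region string to 0-based half-open coordinates.

    Hypothetical helper (not part of pysam): "chr1:1001-2000" -> ("chr1", 1000, 2000).
    """
    # Split on the last colon so contig names that contain ':' still parse.
    contig, colon, span = region.rpartition(":")
    if not colon or not contig:
        raise ValueError(f"expected 'contig:start-end', got {region!r}")

    start_text, dash, end_text = span.partition("-")
    if not dash:
        raise ValueError(f"expected 'contig:start-end', got {region!r}")

    # Tolerate thousands separators such as "1,000".
    start_1based = int(start_text.replace(",", ""))
    end_1based = int(end_text.replace(",", ""))
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError(f"invalid interval in {region!r}")

    # 1-based inclusive [start, end] becomes 0-based half-open [start - 1, end).
    return contig, start_1based - 1, end_1based


def fetch_region(bam_path: str, region: str) -> List[Tuple[str, int]]:
    """Return (query_name, reference_start) for reads overlapping the region.

    Hypothetical wrapper; assumes an index (.bai) exists next to the BAM file.
    """
    import pysam  # imported lazily so parse_region works without pysam installed

    contig, start, end = parse_region(region)
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        return [
            (read.query_name, read.reference_start)
            for read in bam.fetch(contig, start, end)
        ]
```

With this helper, `fetch_region("example.bam", "chr1:1001-2000")` would query the same interval as the `fetch("chr1", 1000, 2000)` call in the Quick Start. Keeping the conversion in one place makes off-by-one errors easier to spot when mixing BAM coordinates with VCF positions.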

Details

Author: AlterLab-IEU
Repository: AlterLab-IEU/AlterLab-Academic-Skills
Created: 2 months ago
Last Updated: today
Language: Python
License: MIT
