biopython-sequence-analysis

Solid

Biopython sequence analysis: parse FASTA/FASTQ/GenBank/GFF (SeqIO), NCBI Entrez (esearch/efetch/elink), remote/local BLAST, pairwise/MSA alignment (PairwiseAligner, MUSCLE/ClustalW), phylogenetic trees (Phylo). Use for gene family studies, phylogenomics, comparative genomics, NCBI pipelines. For PCR/restriction/cloning use biopython-molecular-biology; for SAM/BAM use pysam.

Data & Documents | 286 stars | 26 forks | Updated 4 days ago | License: NOASSERTION

Quality Score: 82/100

Skill Content

# Biopython: Sequence Analysis Toolkit

## Overview

Biopython provides a comprehensive suite of modules for sequence-centric bioinformatics: reading and writing every major biological file format (FASTA, FASTQ, GenBank, GFF), querying NCBI databases programmatically, running BLAST searches and parsing results, aligning sequences pairwise or in multiple-sequence alignments, and building and visualizing phylogenetic trees. This skill focuses on analysis workflows, from NCBI data retrieval through alignment to phylogenetic inference. For PCR primer design, restriction enzyme digestion, cloning simulation, protein structure analysis (Bio.PDB), and molecular weight/Tm calculations, see **biopython-molecular-biology**.

## When to Use

- Download a gene family from NCBI Nucleotide/Protein, align sequences, and construct a phylogenetic tree
- Parse GenBank or GFF3 annotation files and extract CDS sequences for a set of features
- Run a BLAST search against NCBI `nt` or `nr`, filter significant hits, and fetch their full sequences
- Compute pairwise sequence identities or score alignments with BLOSUM62/PAM250 matrices
- Index a large multi-FASTA or FASTQ file with `SeqIO.index()` for random-access retrieval without loading all sequences into RAM
- Convert between sequence formats (FASTA ↔ GenBank ↔ FASTQ ↔ PHYLIP) in a single call
- Traverse, root, prune, and annotate a Newick or Nexus phylogenetic tree programmatically
- Use **pysam** instead when working with SAM/BAM/CRAM alignm...
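
Three of the tasks listed under When to Use (parsing FASTA, computing pairwise identity with BLOSUM62, and walking a Newick tree) can be tried offline. The following is a minimal sketch, not the skill's own code: the toy sequences, the Newick string, the gap penalties, and the helper names `read_records`, `make_aligner`, `percent_identity`, and `leaf_names` are all assumptions made for illustration. Identity is defined here as matches over ungapped aligned columns, which is only one of several common conventions.

```python
from io import StringIO

from Bio import Phylo, SeqIO
from Bio.Align import PairwiseAligner, substitution_matrices

# Hypothetical toy data (not from the skill): three short protein records
# and a three-leaf Newick tree, standing in for real input files.
FASTA_TEXT = """>seq1
MKTAYIAKQR
>seq2
MKTAYIAKQR
>seq3
MKTAHIAKQR
"""
NEWICK_TEXT = "((A:0.1,B:0.2):0.3,C:0.4);"


def read_records(fasta_text):
    """Parse FASTA text into a list of SeqRecord objects."""
    return list(SeqIO.parse(StringIO(fasta_text), "fasta"))


def make_aligner():
    """Global aligner scored with BLOSUM62; gap penalties are assumed values."""
    aligner = PairwiseAligner()
    aligner.mode = "global"
    aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
    aligner.open_gap_score = -10
    aligner.extend_gap_score = -0.5
    return aligner


def percent_identity(aligner, seq_a, seq_b):
    """Identical residues as a percentage of ungapped aligned columns."""
    alignment = aligner.align(seq_a, seq_b)[0]
    matches = 0
    columns = 0
    # alignment.aligned holds matching (start, end) blocks for each sequence.
    for (a_start, a_end), (b_start, b_end) in zip(*alignment.aligned):
        block_a = seq_a[a_start:a_end]
        block_b = seq_b[b_start:b_end]
        matches += sum(x == y for x, y in zip(block_a, block_b))
        columns += len(block_a)
    return 100.0 * matches / columns if columns else 0.0


def leaf_names(newick_text):
    """Return the terminal (leaf) names of a Newick tree."""
    tree = Phylo.read(StringIO(newick_text), "newick")
    return [leaf.name for leaf in tree.get_terminals()]


if __name__ == "__main__":
    records = read_records(FASTA_TEXT)
    aligner = make_aligner()
    for other in records[1:]:
        identity = percent_identity(
            aligner, str(records[0].seq), str(other.seq)
        )
        print(records[0].id, other.id, f"{identity:.1f}%")
    print(leaf_names(NEWICK_TEXT))
```

For real data, the `StringIO` handles would be replaced by file paths, and `SeqIO.index()` is the better fit when the file is too large to hold in memory as a list.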

Details

Author: jaechang-hits
Repository: jaechang-hits/SciAgent-Skills
Created: 5 months ago
Last Updated: 4 days ago
Language: Python
License: NOASSERTION

Bundled in these plugins

sciagent-skills