mamba-architecture

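A state-space model architecture whose compute scales linearly, O(n), with sequence length, versus O(n²) for Transformer self-attention. The Mamba authors report up to 5× higher inference throughput than Transformers and scaling to million-token sequences; generation keeps a fixed-size recurrent state instead of a growing KV cache. Built on a selective SSM with a hardware-aware implementation. Covers Mamba-1 (d_state=16) and Mamba-2 (d_state=128, multi-head). Pretrained models from 130M to 2.8B parameters are available on HuggingFace.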

AI & Automation · 6,478 stars · 505 forks · Updated 1 month ago · MIT

Quality Score: 99/100

Skill Content

# Mamba - Selective State Space Models

## Quick start

Mamba is a state-space model architecture that achieves O(n) linear complexity for sequence modeling.

**Installation**:

```bash
# Install causal-conv1d (optional, for efficiency).
# The quotes stop the shell from treating ">=" as an output redirect.
pip install "causal-conv1d>=1.4.0"

# Install Mamba
pip install mamba-ssm

# Or both together
pip install "mamba-ssm[causal-conv1d]"
```

**Prerequisites**: Linux, NVIDIA GPU, PyTorch 1.12+, CUDA 11.6+

**Basic usage** (Mamba block):

```python
import torch
from mamba_ssm import Mamba

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim).to("cuda")

model = Mamba(
    d_model=dim,  # Model dimension
    d_state=16,   # SSM state dimension
    d_conv=4,     # Conv1d kernel size
    expand=2      # Expansion factor
).to("cuda")

y = model(x)  # Cost grows linearly with sequence length
assert y.shape == x.shape
```

## Common workflows

### Workflow 1: Language model with Mamba-2

**Complete LM with generation**:

```python
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
from mamba_ssm.models.config_mamba import MambaConfig
import torch

# Configure Mamba-2 LM
config = MambaConfig(
    d_model=1024,      # Hidden dimension
    n_layer=24,        # Number of layers
    vocab_size=50277,  # Vocabulary size
    ssm_cfg=dict(
        layer="Mamba2",  # Use Mamba-2
        d_state=128,     # Larger state for Mamba-2
        headdim=64,      # Head dimension
        ngroups=1        # Number of groups
    )
)

model ...
```
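
**Why O(n) and no KV cache** (illustrative sketch): The linear-complexity claim in the quick start can be made concrete with a sequential reference scan. The pure-Python sketch below is not the `mamba_ssm` kernel. It assumes a single channel, a diagonal `A`, and per-step `delta`, `B`, and `C` passed in directly, whereas a real Mamba block computes them from the input with learned projections. The function name `selective_scan` and all argument names are hypothetical. It shows one pass over the sequence that updates a state whose size depends only on `d_state`.

```python
import math
import random


def selective_scan(x, delta, A, B, C):
    """Sequential reference scan for one channel (illustrative only).

    x, delta: lists of length L (inputs and positive step sizes)
    A:        list of length N (diagonal state matrix; negative values decay)
    B, C:     lists of length L, each entry a list of length N (input-dependent)
    """
    n_state = len(A)
    h = [0.0] * n_state  # fixed-size state, independent of sequence length
    ys = []
    for x_t, d_t, B_t, C_t in zip(x, delta, B, C):
        for n in range(n_state):
            a_bar = math.exp(d_t * A[n])              # discretized state transition
            h[n] = a_bar * h[n] + d_t * B_t[n] * x_t  # state update
        ys.append(sum(c * s for c, s in zip(C_t, h)))  # readout y_t = C_t . h_t
    return ys, h


random.seed(0)
L, N = 8, 4  # sequence length and state size (toy values)
x = [random.gauss(0, 1) for _ in range(L)]
delta = [0.1 + random.random() for _ in range(L)]
A = [-(n + 1.0) for n in range(N)]
B = [[random.gauss(0, 1) for _ in range(N)] for _ in range(L)]
C = [[random.gauss(0, 1) for _ in range(N)] for _ in range(L)]

ys, h = selective_scan(x, delta, A, B, C)
print(len(ys), len(h))  # 8 4
```

The loop does O(N) work per token, so total cost is O(L·N), and the only memory carried between tokens is `h`. During generation that fixed-size state plays the role a growing KV cache plays in a Transformer. The optimized library replaces this Python loop with fused GPU kernels.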

Details

Author: Orchestra-Research
Repository: Orchestra-Research/AI-Research-SKILLs
Created: 6 months ago
Last Updated: 1 month ago
Language: TeX
License: MIT

Integrates with

Hugging Face · AI
