model-merging

Solid

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding + chat), improving performance beyond single models, or experimenting rapidly with model variants. Covers SLERP, TIES-Merging, DARE, Task Arithmetic, linear merging, and production deployment strategies.

AI & Automation · 9,609 stars · 724 forks · Updated 1 month ago · MIT

Quality Score: 94/100

Stars (20%): 100
Recency (20%): 75
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Model Merging: Combining Pre-trained Models

## When to Use This Skill

Use Model Merging when you need to:

- **Combine capabilities** from multiple fine-tuned models without retraining
- **Create specialized models** by blending domain-specific expertise (math + coding + chat)
- **Improve performance** beyond single models (often +5-10% on benchmarks)
- **Reduce training costs** - no GPUs needed, merges run on CPU
- **Experiment rapidly** - create new model variants in minutes, not days
- **Preserve multiple skills** - merge without catastrophic forgetting

**Success Stories**: Marcoro14-7B-slerp (best on Open LLM Leaderboard 02/2024), many top HuggingFace models use merging

**Tools**: mergekit (Arcee AI), LazyMergekit, Model Soup

## Installation

```bash
# Install mergekit from source
git clone https://github.com/arcee-ai/mergekit.git
cd mergekit
pip install -e .

# Or via pip
pip install mergekit

# Optional: Transformers library
pip install transformers torch
```

## Quick Start

### Simple Linear Merge

```yaml
# config.yml - Merge two models with equal weights
merge_method: linear
models:
  - model: mistralai/Mistral-7B-v0.1
    parameters:
      weight: 0.5
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.5
dtype: bfloat16
```

```bash
# Run merge (--cuda is optional; omit it to merge on CPU)
mergekit-yaml config.yml ./merged-model --cuda

# Load the merged model with transformers to check that it works
python -c "from transformers import AutoModelForCausalLM; AutoModelForCausalLM.from_pretrained('./merged-model')"
```

### SLERP Merge (Best for 2 Models)

```yaml
# config.yml - Sph...
```
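To make the two merge methods named above more concrete, here is a minimal sketch of the arithmetic behind a linear merge and a SLERP merge, applied to small flat lists of floats standing in for model weight tensors. The function names `linear_merge` and `slerp` are hypothetical helpers written for illustration and are not mergekit's API. The sketch assumes linear-merge weights are normalized to sum to 1 and that SLERP falls back to plain linear interpolation when the two vectors are nearly parallel; mergekit's actual implementation may differ in these details.

```python
import math


def linear_merge(tensors, weights):
    """Weighted average of equally sized flat parameter vectors.

    Assumption: weights are normalized so that they sum to 1.
    """
    total = sum(weights)
    size = len(tensors[0])
    return [
        sum(w * t[i] for t, w in zip(tensors, weights)) / total
        for i in range(size)
    ]


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between flat vectors a and b.

    t = 0 returns a, t = 1 returns b; intermediate values move along
    the arc between the two directions instead of the straight line.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))

    def lerp():
        return [(1 - t) * x + t * y for x, y in zip(a, b)]

    # A zero vector has no direction, so interpolate linearly instead.
    if norm_a < eps or norm_b < eps:
        return lerp()

    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)

    # Nearly parallel vectors: the SLERP formula is numerically unstable.
    if abs(sin_omega) < eps:
        return lerp()

    weight_a = math.sin((1 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


if __name__ == "__main__":
    # Toy "layers" from two models (made-up values for illustration).
    layer_a = [1.0, 2.0]
    layer_b = [3.0, 4.0]
    print(linear_merge([layer_a, layer_b], [0.5, 0.5]))
    print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

A real merge applies this kind of operation tensor by tensor across the checkpoints; the toy vectors here only show how the interpolation weights are derived.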

Details

Author: Orchestra-Research
Repository: Orchestra-Research/AI-Research-SKILLs
Created: 7 months ago
Last Updated: 1 month ago
Language: TeX
License: MIT
