finetuning

Solid

Generates a Jupyter notebook that fine-tunes a base model using SageMaker serverless training jobs. Use when the user says "start training", "fine-tune my model", "I'm ready to train", or when the plan reaches the finetuning step. Supports SFT, DPO, and RLVR trainers, including RLVR Lambda reward function creation.

AI & Automation · 765 stars · 108 forks · Updated 2 days ago · Apache-2.0

Quality Score: 95/100

Score weights: Stars 20%, Recency 20%, Frontmatter 20%, Documentation 15%, Issue Health 10%, License 10%, Description 5%

Skill Content

# Prerequisites

Before starting this workflow, verify:

1. A `use_case_spec.md` file exists
   - If missing: Activate the `use-case-specification` skill first, then resume
   - DON'T EVER offer to create a use case spec without activating the use-case-specification skill.
2. A fine-tuning technique (SFT, DPO, or RLVR) and base model have already been selected
   - If missing: Activate the `finetuning-setup` skill to collect what's missing, then resume
   - Don't make recommendations on the spot. You MUST activate the finetuning-setup skill.
3. A base model name available on SageMakerHub has been identified
   - If missing: Activate the `finetuning-setup` skill to get it
   - **Important:** Only use the model name that `finetuning-setup` retrieves, as it may differ from other commonly used names for the same model

# Critical Rules

## Code Generation Rules

- ✅ Use EXACTLY the imports shown in each cell template
- ❌ Do NOT add additional imports even if they seem helpful
- ❌ Do NOT create variables before they're needed in that cell
- 📋 Copy the code structure precisely - no improvisation
- 🎯 Follow the minimal code principle strictly
- ✅ When writing a notebook cell, make sure the indentation and f strings are correct

## User Communication Rules

- ❌ NEVER offer to run the notebook for the user (you don't have the tools)
- ❌ NEVER offer to move on to a downstream skill while training is in progress (logically impossible)
- ❌ NEVER set ACCEPT_EULA to True yourself (user...
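
The prerequisite order in the preview above can be read as a small decision procedure: check for the spec file first, then for the technique and base model. The Python sketch below is only an illustration of that ordering. The function name `next_skill_to_activate`, its parameters, and the choice to pass the technique and model as plain strings are assumptions made here, not part of the skill; the skill itself only names the `use_case_spec.md` file and the two skills to activate.

```python
from pathlib import Path
from typing import Optional

# Techniques named in the skill description.
SUPPORTED_TECHNIQUES = {"SFT", "DPO", "RLVR"}


def next_skill_to_activate(
    workdir: str,
    technique: Optional[str],
    base_model: Optional[str],
) -> Optional[str]:
    """Return the skill to activate first, or None if all prerequisites hold.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Prerequisite 1: the use case spec must exist before anything else.
    if not (Path(workdir) / "use_case_spec.md").is_file():
        return "use-case-specification"

    # Prerequisites 2 and 3: a supported technique and a base model name
    # must already have been selected via the finetuning-setup skill.
    if technique not in SUPPORTED_TECHNIQUES or not base_model:
        return "finetuning-setup"

    # Everything is in place; the fine-tuning workflow can proceed.
    return None
```

A caller would invoke the helper once before generating the notebook and activate whichever skill it returns, repeating until it returns `None`.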

Details

Author: awslabs
Repository: awslabs/agent-plugins
Created: 3 months ago
Last Updated: 2 days ago
Language: Shell
License: Apache-2.0

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

finetuning-setup

Selects a base model and fine-tuning technique (SFT, DPO, or RLVR) for the user's use case by querying SageMaker Hub. Use when the user asks which model or technique to use, wants to start fine-tuning, or mentions a model name or family (e.g., "Llama", "Mistral") — always activate even for known model names because the exact Hub model ID must be resolved. Queries available models, validates technique compatibility, and confirms selections.

765 Updated 2 days ago

awslabs

AI & Automation Solid

fine-tuning-expert

Use when fine-tuning LLMs, training custom models, or adapting foundation models for specific tasks. Invoke for configuring LoRA/QLoRA adapters, preparing JSONL training datasets, setting hyperparameters for fine-tuning runs, adapter training, transfer learning, finetuning with Hugging Face PEFT, OpenAI fine-tuning, instruction tuning, RLHF, DPO, or quantizing and deploying fine-tuned models. Trigger terms include: LoRA, QLoRA, PEFT, finetuning, fine-tuning, adapter tuning, LLM training, model training, custom model.

9,537 Updated 1 week ago

Jeffallan

AI & Automation Listed

fine-tuning-expert

Use when fine-tuning LLMs, training custom models, or optimizing model performance for specific tasks. Invoke for parameter-efficient methods, dataset preparation, or model adaptation.

2 Updated today

zacklecon

AI & Automation Listed

mission-control-keras-finetuning

Route Keras fine-tuning and transfer-learning work through Mission Control with explicit baseline, unfreeze, and evaluation steps.

1 Updated today

MN755

AI & Automation Featured

together-core-workflow-b

Together AI core workflow b for inference, fine-tuning, and model deployment. Use when working with Together AI's OpenAI-compatible API. Trigger: "together core workflow b".

2,274 Updated today

jeremylongshore