together-prod-checklist

Solid

Together AI prod checklist for inference, fine-tuning, and model deployment. Use when working with Together AI's OpenAI-compatible API. Trigger: "together prod checklist".

AI & Automation 2,266 stars 315 forks Updated today MIT

Quality Score: 99/100

Stars (weight 20%): 100
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Together AI Production Checklist

## Overview

Together AI provides OpenAI-compatible inference across 100+ open-source models (Llama, Mixtral, Qwen, FLUX) plus fine-tuning and batch processing. A production integration routes completions, embeddings, or image generation through Together's API. Failures mean inference latency spikes, model availability gaps, or unexpected cost overruns from uncontrolled batch jobs.

## Authentication & Secrets

- [ ] `TOGETHER_API_KEY` stored in secrets manager (not source code)
- [ ] API key restricted to production workspace
- [ ] Key rotation schedule documented (90-day cycle)
- [ ] Separate keys for dev/staging/prod environments
- [ ] Fine-tuning job tokens scoped separately from inference tokens

## API Integration

- [ ] Production base URL configured (`https://api.together.xyz/v1`)
- [ ] Rate limit handling with exponential backoff
- [ ] Model IDs validated against `client.models.list()` before deployment
- [ ] Completion streaming implemented for real-time use cases
- [ ] Embedding batch size optimized (max 2048 inputs per request)
- [ ] Batch inference configured for non-real-time workloads (50% cost savings)
- [ ] Fallback model configured if primary model is unavailable

## Error Handling & Resilience

- [ ] Circuit breaker configured for Together API outages
- [ ] Retry with backoff for 429/5xx responses
- [ ] Model-not-found errors caught before user-facing requests
- [ ] Token usage tracked per request to prevent budget overru...
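
The "retry with backoff for 429/5xx" and "fallback model" items above can be sketched without tying the logic to any particular client library. The following is a minimal illustration, not the skill author's implementation. `ApiError`, `call_with_backoff`, and `complete_with_fallback` are hypothetical names introduced here, and the set of retryable status codes is an assumption. In a real integration, the exception type and status attribute should be mapped from whatever the chosen SDK or HTTP client actually raises.

```python
import random
import time

# Assumption: which HTTP statuses are worth retrying (rate limits and server errors).
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code, message=""):
        super().__init__(message or f"HTTP {status_code}")
        self.status_code = status_code


def call_with_backoff(request_fn, max_attempts=5, base_delay=0.5,
                      max_delay=8.0, sleep=time.sleep):
    """Call request_fn, retrying retryable ApiErrors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except ApiError as err:
            last_attempt = attempt == max_attempts - 1
            if err.status_code not in RETRYABLE_STATUS or last_attempt:
                raise
            # Double the delay on each attempt, cap it, then add jitter
            # so that many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


def complete_with_fallback(send, models, **backoff_kwargs):
    """Try each model ID in order; move to the next one if a model keeps failing."""
    if not models:
        raise ValueError("at least one model ID is required")
    last_error = None
    for model in models:
        try:
            return call_with_backoff(lambda: send(model), **backoff_kwargs)
        except ApiError as err:
            last_error = err  # remember the failure and try the next model
    raise last_error
```

Here `send` is any callable that takes a model ID, performs one request, and raises `ApiError` on failure, so the retry policy can be unit-tested with a fake `send` and an injected `sleep` before it is wired to the real API.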

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT
