together-performance-tuning


Together AI performance tuning for inference, fine-tuning, and model deployment. Use when working with Together AI's OpenAI-compatible API. Trigger: "together performance tuning".

AI & Automation 2,266 stars 315 forks Updated today MIT



Skill Content

# Together AI Performance Tuning

## Overview

Guidance for performance tuning with the Together AI inference and fine-tuning APIs.

## Instructions

### Key Points

- Together AI is OpenAI-compatible: `base_url = 'https://api.together.xyz/v1'`
- Use the `together` Python SDK or any OpenAI client library
- Supports 100+ open-source models (Llama, Mixtral, Qwen, FLUX)
- Fine-tuning is available for supported models
- Batch inference is available at a 50% cost reduction

## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| `401 Unauthorized` | Invalid API key | Check the key at api.together.xyz |
| `Model not found` | Wrong model ID | List valid IDs with `client.models.list()` |
| `429 Rate limit` | Too many requests | Implement backoff |
| `500 Server error` | Model overloaded | Retry with backoff |

## Resources

- [Together AI Docs](https://docs.together.ai/)
- [API Reference](https://docs.together.ai/reference/chat-completions-1)
- [Model List](https://docs.together.ai/docs/inference-models)

## Next Steps

See related Together AI skills for more patterns.
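The error-handling guidance above recommends backoff for `429` and `500` responses but does not show one. A minimal sketch of exponential backoff with jitter follows, using only the standard library. The names `RetryableError`, `backoff_delays`, and `call_with_backoff` are hypothetical and not part of any Together AI or OpenAI SDK; the retry count, base delay, and cap are assumed defaults, not documented limits.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. HTTP 429 or 500)."""


def backoff_delays(max_retries=5, base=1.0, cap=30.0):
    """Return the un-jittered delay before each retry: base * 2**attempt, capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_backoff(fn, max_retries=5, base=1.0, cap=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(); on RetryableError, wait with jittered exponential backoff and retry.

    `sleep` and `rng` are injectable so the behaviour can be tested without waiting.
    """
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RetryableError:
            if attempt == max_retries:
                # Retries exhausted: surface the last error to the caller.
                raise
            # Full jitter: sleep a random fraction of the exponential delay.
            sleep(delays[attempt] * rng())
```

In practice, the wrapped `fn` would make the API call and translate the client library's rate-limit or server-error exception into `RetryableError`; which exception classes to catch depends on the client in use and is left as an assumption here. Jitter spreads retries from concurrent workers so they do not all hit the API at the same instant.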

Details

Author: jeremylongshore
Repository: jeremylongshore/claude-code-plugins-plus-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT
