perplexity-performance-tuning

Optimize Perplexity Sonar API performance with caching, streaming, model routing, and batching. Use when experiencing slow API responses, implementing caching strategies, or optimizing request throughput for Perplexity integrations. Trigger with phrases like "perplexity performance", "optimize perplexity", "perplexity latency", "perplexity caching", "perplexity slow".

AI & Automation 2,266 stars 315 forks Updated today MIT

Skill Content

# Perplexity Performance Tuning

## Overview

Optimize Perplexity Sonar API for latency, throughput, and cost. Key insight: every Perplexity call performs a live web search, so response times are inherently variable. Typical latencies: sonar 1-3s, sonar-pro 3-8s, sonar-deep-research 10-60s.

## Latency Benchmarks

| Model | Typical Latency | Max Tokens | Best For |
|-------|----------------|------------|----------|
| `sonar` | 1-3s | 4096 | Quick answers, simple facts |
| `sonar-pro` | 3-8s | 8192 | Deep research, many citations |
| `sonar-reasoning-pro` | 5-15s | 8192 | Multi-step analysis |
| `sonar-deep-research` | 10-60s | 8192 | Comprehensive reports |

## Prerequisites

- Perplexity API key configured
- Understanding of search-augmented generation latency patterns
- Cache infrastructure (Redis or in-memory LRU)

## Instructions

### Step 1: Smart Model Routing

```typescript
import OpenAI from "openai";

const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

type QueryComplexity = "simple" | "standard" | "deep";

function classifyQuery(query: string): QueryComplexity {
  const words = query.split(/\s+/).length;
  const simplePatterns = [/^what is/i, /^who is/i, /^when did/i, /^define/i, /^how many/i];
  const deepPatterns = [/compare.*vs/i, /analysis of/i, /comprehensive/i, /pros and cons/i, /in-depth/i];
  if (simplePatterns.some((p) => p.test(query)) && words < 15) return "simple";
  if (deepPatterns.some((...
```
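The prerequisites list cache infrastructure (Redis or in-memory LRU), and since every call performs a live web search, a repeated query is likely the cheapest latency to eliminate. The following is a minimal in-memory sketch of that idea, not the skill's own implementation. The names `TtlLruCache`, `cacheKey`, and `cachedAnswer`, the injectable clock, and the `fetchAnswer` callback are all hypothetical and introduced only for illustration; `fetchAnswer` stands in for whatever function actually calls the API.

```typescript
// Hypothetical in-memory cache with TTL expiry and LRU eviction.
// It relies on Map preserving insertion order: the first key is the
// least recently used entry.
class TtlLruCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock so expiry can be tested deterministically
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Stale entry: drop it and report a miss.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // Evict the least recently used entry (first key in the Map).
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Normalize case and whitespace so trivially different phrasings share a key.
// The model is part of the key because different models give different answers.
function cacheKey(model: string, query: string): string {
  return `${model}:${query.trim().toLowerCase().replace(/\s+/g, " ")}`;
}

// Return a cached answer when one is fresh; otherwise call the API and store the result.
async function cachedAnswer(
  cache: TtlLruCache<string>,
  model: string,
  query: string,
  fetchAnswer: (model: string, query: string) => Promise<string>,
): Promise<string> {
  const key = cacheKey(model, query);
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const answer = await fetchAnswer(model, query);
  cache.set(key, answer);
  return answer;
}
```

The TTL is a freshness trade-off: answers come from a live web search, so time-sensitive queries probably warrant a short TTL or no caching at all, while stable factual queries can tolerate a longer one. The same key scheme would carry over to Redis if a shared cache is needed.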

Details

Author
jeremylongshore
Repository
jeremylongshore/claude-code-plugins-plus-skills
Created
7 months ago
Last Updated
today
Language
Python
License
MIT

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Featured

perplexity-cost-tuning

Optimize Perplexity costs through model routing, caching, token limits, and budget monitoring. Use when analyzing Perplexity billing, reducing API costs, or implementing budget alerts for Perplexity Sonar API. Trigger with phrases like "perplexity cost", "perplexity billing", "reduce perplexity costs", "perplexity pricing", "perplexity budget".

2,266 Updated today
jeremylongshore
AI & Automation Featured

perplexity-observability

Set up monitoring for Perplexity Sonar API with latency, cost, citation quality, and error tracking. Use when implementing monitoring dashboards, setting up alerts, or tracking Perplexity API health in production. Trigger with phrases like "perplexity monitoring", "perplexity metrics", "perplexity observability", "monitor perplexity", "perplexity dashboard".

2,266 Updated today
jeremylongshore
AI & Automation Featured

perplexity-architecture-variants

Choose and implement Perplexity architecture blueprints for different scales: direct search widget, cached research layer, and multi-query pipeline. Trigger with phrases like "perplexity architecture", "perplexity blueprint", "how to structure perplexity", "perplexity project layout".

2,266 Updated today
jeremylongshore
AI & Automation Featured

perplexity-reliability-patterns

Implement reliability patterns for Perplexity Sonar API: circuit breaker, model fallback, streaming timeout, and citation validation. Trigger with phrases like "perplexity reliability", "perplexity circuit breaker", "perplexity fallback", "perplexity resilience", "perplexity timeout".

2,266 Updated today
jeremylongshore
AI & Automation Featured

perplexity-reference-architecture

Implement Perplexity reference architecture with model routing, citation pipeline, and research automation. Use when designing new Perplexity integrations, reviewing project structure, or establishing architecture for search-augmented apps. Trigger with phrases like "perplexity architecture", "perplexity project structure", "how to organize perplexity", "perplexity design patterns".

2,266 Updated today
jeremylongshore