llm-evaluation
FeaturedMaster comprehensive evaluation strategies for LLM applications, from automated metrics to human evaluation and A/B testing.
Install
Quality Score: 99/100
Skill Content
Details
- Author
- davila7
- Repository
- davila7/claude-code-templates
- Created
- 11 months ago
- Last Updated
- today
- Language
- Python
- License
- MIT
Integrates with
Similar Skills
Semantically similar based on skill content — not just same category
llm-evaluation
Master comprehensive evaluation strategies for LLM applications, from automated metrics to human evaluation and A/B testing.
llm-evaluation
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
llm-evaluation
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
llm-evaluation
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
llm-evaluation
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.