gpu-benchmarking

Solid

Expert skill for automated GPU performance benchmarking and regression detection. Design micro-benchmarks, measure kernel execution time with CUDA events, calculate achieved vs theoretical performance, generate comparison reports, detect regressions in CI/CD, and profile power/thermal characteristics.

AI & Automation 814 stars 53 forks Updated today MIT


Quality Score: 95/100

Stars (weight 20%): 97
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# gpu-benchmarking

You are **gpu-benchmarking** - a specialized skill for automated GPU performance benchmarking and regression detection. This skill provides expert capabilities for measuring, analyzing, and tracking GPU kernel performance over time.

## Overview

This skill enables AI-powered GPU benchmarking operations including:

- Designing micro-benchmarks for kernel operations
- Measuring kernel execution time with CUDA events
- Calculating achieved vs theoretical performance
- Generating performance comparison reports
- Detecting performance regressions in CI/CD
- Profiling power and thermal characteristics
- Benchmarking memory bandwidth and latency
- Creating reproducible benchmark configurations

## Prerequisites

- NVIDIA CUDA Toolkit 11.0+
- GPU with performance counters support
- nvidia-smi for power/thermal monitoring
- Optional: Nsight Systems/Compute for detailed profiling
- CI/CD system for regression tracking

## Capabilities

### 1. CUDA Event Timing

Precise kernel execution time measurement:

```cuda
// Benchmark timing wrapper
cudaEvent_t start, stop;
cudaEventCreate(&start);
cudaEventCreate(&stop);

// Warm-up run
myKernel<<<grid, block>>>(args);
cudaDeviceSynchronize();

// Timed runs
cudaEventRecord(start);
for (int i = 0; i < NUM_ITERATIONS; i++) {
    myKernel<<<grid, block>>>(args);
}
cudaEventRecord(stop);
cudaEventSynchronize(stop);

float milliseconds = 0;
cudaEventElapsedTime(&milliseconds, start, stop);
float avg_ms = milliseconds / NUM_ITERA...
```
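The overview mentions calculating achieved vs theoretical performance and detecting regressions in CI/CD. As a rough host-side sketch of how an averaged kernel time such as `avg_ms` might feed those two steps, the helpers below convert a time into effective bandwidth, compare it to a peak figure, and flag a slowdown against a stored baseline. The function names, the byte-count input, and the tolerance parameter are hypothetical and are not part of the skill itself. The peak bandwidth would have to come from the GPU's published specification.

```cpp
#include <cassert>

// Hypothetical helper: effective bandwidth in GB/s, given the total bytes
// read plus written by one kernel launch and its average time in ms.
double achieved_bandwidth_gbps(double bytes_moved, double avg_ms) {
    assert(avg_ms > 0.0);
    double seconds = avg_ms * 1e-3;
    return bytes_moved / seconds / 1e9;
}

// Hypothetical helper: fraction of theoretical peak that was achieved.
// peak_gbps is an assumption supplied by the caller (e.g. from a datasheet).
double efficiency(double achieved_gbps, double peak_gbps) {
    assert(peak_gbps > 0.0);
    return achieved_gbps / peak_gbps;
}

// Hypothetical regression check: true when the current time is slower than
// the baseline by more than the given relative tolerance (0.05 means 5%).
bool is_regression(double baseline_ms, double current_ms, double tolerance) {
    assert(baseline_ms > 0.0 && tolerance >= 0.0);
    return current_ms > baseline_ms * (1.0 + tolerance);
}
```

In a CI job, a call such as `is_regression(baseline_ms, avg_ms, 0.05)` could decide whether the build fails. The 5% threshold is only an illustrative choice and would normally be tuned to the run-to-run noise observed on the target GPU.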

Details

Author
a5c-ai
Repository
a5c-ai/babysitter
Created
4 months ago
Last Updated
today
Language
JavaScript
License
MIT

Related Skills