Install: `claude install-skill aiskillstore/marketplace`
# IT Operations Expert
A comprehensive skill for managing IT infrastructure operations, ensuring service reliability, implementing monitoring and alerting strategies, managing incidents, and maintaining operational excellence through automation and best practices.
## Core Principles
### 1. Service Reliability First
- **Proactive Monitoring**: Implement comprehensive observability before incidents occur
- **Incident Management**: Structured response processes with clear escalation paths
- **SLA/SLO Management**: Define and maintain service level objectives aligned with business needs
- **Continuous Improvement**: Learn from incidents through blameless post-mortems
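One common way to make the SLA/SLO bullet above concrete is an error budget: the amount of unavailability an SLO still permits within a window. The following is a minimal sketch, assuming an availability SLO expressed as a fraction (for example `0.999`) and a rolling 30-day window. The function names `error_budget_minutes` and `budget_remaining` are hypothetical and chosen only for illustration.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO allows over the window.

    slo_target is a fraction strictly between 0 and 1, e.g. 0.999 for "three nines".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the fraction of the error budget still unspent.

    A negative result means the SLO has already been breached for this window.
    """
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget
```

Under these assumptions, a 99.9% target over 30 days allows roughly 43.2 minutes of downtime, so 21.6 minutes of recorded downtime leaves about half of the budget. A team might use the remaining fraction to decide whether to slow down risky changes.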
### 2. Automation Over Manual Processes
- **Infrastructure as Code**: Manage infrastructure configuration through version-controlled code
- **Runbook Automation**: Convert manual procedures into automated workflows
- **Self-Healing Systems**: Implement automated remediation for common issues
- **Configuration Management**: Maintain consistency across environments
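The runbook automation and self-healing bullets above can be sketched as a small control loop: check health, apply a known remediation a bounded number of times, and escalate to a human if the service does not recover. This is an illustrative sketch rather than a prescribed implementation. `self_heal`, `RemediationResult`, and the `check` and `remediate` callables are hypothetical names; in practice they would wrap whatever health probe and restart or failover action the environment provides.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RemediationResult:
    healthy: bool    # final health state after the loop
    attempts: int    # number of remediation attempts actually made
    escalated: bool  # True when automation gave up and a human is needed


def self_heal(check: Callable[[], bool],
              remediate: Callable[[], None],
              max_attempts: int = 3) -> RemediationResult:
    """Run a bounded check -> remediate -> re-check loop.

    check() returns True when the service is healthy.
    remediate() performs one automated fix (e.g. a restart); both are assumed
    to be supplied by the caller.
    """
    if check():
        # Already healthy: nothing to do.
        return RemediationResult(healthy=True, attempts=0, escalated=False)

    for attempt in range(1, max_attempts + 1):
        remediate()
        if check():
            return RemediationResult(healthy=True, attempts=attempt, escalated=False)

    # Bounded retries exhausted: stop and hand over to on-call.
    return RemediationResult(healthy=False, attempts=max_attempts, escalated=True)
```

Bounding the number of attempts is a deliberate choice in this sketch: an unbounded restart loop can mask a real fault, whereas an explicit escalation keeps the incident management process in the loop.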
### 3. ITIL Service Management
- **Service Strategy**: Align IT services with business objectives
- **Service Design**: Design resilient, scalable services
- **Service Transition**: Manage changes with minimal disruption
- **Service Operation**: Deliver and support services effectively
- **Continual Service Improvement**: Iteratively enhance service quality
### 4. Operational Excellence
- **Documentation**: Maintain current runbooks, procedures, and ar