it-operations

Solid

Manages IT infrastructure, monitoring, incident response, and service reliability. Provides frameworks for ITIL service management, observability strategies, automation, backup/recovery, capacity planning, and operational excellence practices.

DevOps & Infrastructure | 27,681 stars | 2,854 forks | Updated today | MIT

Quality Score: 93/100

Stars (weight 20%): 100
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# IT Operations Expert

A comprehensive skill for managing IT infrastructure operations, ensuring service reliability, implementing monitoring and alerting strategies, managing incidents, and maintaining operational excellence through automation and best practices.

## Core Principles

### 1. Service Reliability First

- **Proactive Monitoring**: Implement comprehensive observability before incidents occur
- **Incident Management**: Structured response processes with clear escalation paths
- **SLA/SLO Management**: Define and maintain service level objectives aligned with business needs
- **Continuous Improvement**: Learn from incidents through blameless post-mortems

### 2. Automation Over Manual Processes

- **Infrastructure as Code**: Manage infrastructure configuration through version-controlled code
- **Runbook Automation**: Convert manual procedures into automated workflows
- **Self-Healing Systems**: Implement automated remediation for common issues
- **Configuration Management**: Maintain consistency across environments

### 3. ITIL Service Management

- **Service Strategy**: Align IT services with business objectives
- **Service Design**: Design resilient, scalable services
- **Service Transition**: Manage changes with minimal disruption
- **Service Operation**: Deliver and support services effectively
- **Continual Service Improvement**: Iteratively enhance service quality

### 4. Operational Excellence

- **Documentation**: Maintain current runbooks, procedures, and ar...

Details

Author: davila7
Repository: davila7/claude-code-templates
Created: 11 months ago
Last Updated: today
Language: Python
License: MIT

Similar Skills

Semantically similar based on skill content — not just same category

DevOps & Infrastructure Listed

it-operations

Manages IT infrastructure, monitoring, incident response, and service reliability. Provides frameworks for ITIL service management, observability strategies, automation, backup/recovery, capacity planning, and operational excellence practices.

335 Updated today
aiskillstore
DevOps & Infrastructure Listed

devops

DevOps practices, CI/CD, and infrastructure management

0 Updated today
murtazatouqeer
DevOps & Infrastructure Solid

devops-sre-master

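DevOps & Site Reliability Engineering (SRE): the cognitive operating system of platform / infrastructure / reliability engineers, covering the full software delivery and operations lifecycle (CI/CD and release engineering: trunk-based development + progressive delivery with canary/blue-green/feature flags + GitOps with Argo CD/Flux / infrastructure as code: Terraform/OpenTofu/Pulumi/Ansible + policy-as-code with OPA / containers and orchestration: Docker/Kubernetes + Helm/Kustomize + service mesh Istio/Linkerd / observability: Prometheus + Loki + OpenTelemetry + Honeycomb + eBPF + RED/USE / SLO-SLI-error budgets and reliability engineering: the Google SRE discipline + capacity planning + graceful degradation / incident management and on-call: incident command + PagerDuty + runbooks + blameless postmortems + MTTR / cloud platforms and FinOps: AWS/GCP/Azure + cost optimization + elastic scaling / platform engineering and developer experience: IDP + Backstage + golden paths + Team Topologies / DevSecOps and supply-chain security: shift-left + SBOM + SLSA + sigstore + Vault / resilience and chaos engineering: fault injection + game days + safety science / DORA metrics and engineering effectiveness: deployment frequency + lead time for changes + change failure rate + the Accelerate research / database and stateful operations: schema migrations + backup and disaster recovery). Excludes general application development / cloud sales-certification cramming / the narrow misconception that 'DevOps = the job of running Jenkins' / traditional ITIL ticket-culture operations (the old paradigm appears only as a boundary) / treating manual ClickOps as a steady state (it is toil, the core anti-pattern of this skill).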

34 Updated 3 days ago
swaylq
DevOps & Infrastructure Solid

devops-iac-engineer

Implements infrastructure as code using Terraform, Kubernetes, and cloud platforms. Designs scalable architectures, CI/CD pipelines, and observability solutions. Provides security-first DevOps practices and site reliability engineering guidance.

27,681 Updated today
davila7
DevOps & Infrastructure Listed

devops-iac-engineer

Implements infrastructure as code using Terraform, Kubernetes, and cloud platforms. Designs scalable architectures, CI/CD pipelines, and observability solutions. Provides security-first DevOps practices and site reliability engineering guidance.

335 Updated today
aiskillstore