Use Checkly’s template to reflect on and learn from incidents in a structured, blameless way. Fill it out within 48 hours of incident resolution.
Title:
e.g. Checkout failure on EU production cluster
Date:
YYYY-MM-DD
Duration:
Start Time → End Time (e.g. 14:37–15:19 CET)
Severity Level:
SEV1 / SEV2 / SEV3
Impact:
Describe the user/business impact (e.g. 40% of users unable to complete payment)
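If you keep postmortems in a repository or feed them into reporting, the summary fields above can be captured as a small record. The following is a minimal sketch, not part of the template itself: the `IncidentSummary` class, its field names, and the date `2024-01-15` are hypothetical, and it assumes start and end times are given in the same time zone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Severity labels taken from the template; adjust to your own scale.
ALLOWED_SEVERITIES = {"SEV1", "SEV2", "SEV3"}


@dataclass
class IncidentSummary:
    """Hypothetical record mirroring the summary fields of the template."""
    title: str
    date: str       # YYYY-MM-DD
    start: str      # HH:MM, same time zone as `end`
    end: str        # HH:MM
    timezone: str   # label only, e.g. "CET"; no conversion is done
    severity: str
    impact: str

    def __post_init__(self) -> None:
        if self.severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")

    def duration(self) -> timedelta:
        fmt = "%Y-%m-%d %H:%M"
        started = datetime.strptime(f"{self.date} {self.start}", fmt)
        ended = datetime.strptime(f"{self.date} {self.end}", fmt)
        # Assumption: an end time earlier than the start means the
        # incident ran past midnight into the next day.
        if ended < started:
            ended += timedelta(days=1)
        return ended - started


summary = IncidentSummary(
    title="Checkout failure on EU production cluster",
    date="2024-01-15",  # hypothetical date for illustration
    start="14:37",
    end="15:19",
    timezone="CET",
    severity="SEV1",
    impact="40% of users unable to complete payment",
)
minutes = int(summary.duration().total_seconds() // 60)
print(f"{summary.title}: {minutes} minutes ({summary.severity})")
```

With the example values from the template, the duration works out to 42 minutes. Rejecting unknown severity labels early keeps records comparable across incidents.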
| Time | Event |
|---|---|
| 14:37 CET | Alert fired on API latency spike |
| 14:39 CET | On-call engineer paged via PagerDuty |
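A timeline in this table format also makes it easy to measure the gaps between events, such as how long it took to page someone after the alert fired. The sketch below is illustrative only: `parse_timeline` and `gaps_in_minutes` are hypothetical helpers, and the parsing assumes every row uses `HH:MM` followed by a time zone label, all events fall on the same day, and event text contains no `|` characters.

```python
from datetime import datetime

TIMELINE = """\
| Time | Event |
|---|---|
| 14:37 CET | Alert fired on API latency spike |
| 14:39 CET | On-call engineer paged via PagerDuty |
"""


def parse_timeline(table: str) -> list[tuple[datetime, str]]:
    """Return (time, event) pairs from the table, skipping the two header rows."""
    events = []
    for line in table.strip().splitlines()[2:]:
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        clock = cells[0].split()[0]  # drop the time zone label, keep HH:MM
        events.append((datetime.strptime(clock, "%H:%M"), cells[1]))
    return events


def gaps_in_minutes(events: list[tuple[datetime, str]]) -> list[int]:
    """Minutes elapsed between each pair of consecutive events."""
    return [
        int((later[0] - earlier[0]).total_seconds() // 60)
        for earlier, later in zip(events, events[1:])
    ]


events = parse_timeline(TIMELINE)
gaps = gaps_in_minutes(events)
print(gaps)
```

For the two example rows, the single gap is 2 minutes between the alert and the page.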
What caused the incident?
Explain the technical or process failure, not just the symptom.
Example: The primary database cluster in the EU region became overloaded by increased traffic. The load balancer then failed to redirect traffic to backup nodes because of a misconfiguration.