Open
Description
We should produce the following metrics as Prometheus metrics on a per-PromotionStrategy basis. These form the basis that DORA metrics can be built on.
- Deployments to production (counter): increment every time a change is merged to the active branch of the terminal environment.
- Lead Time for Changes (gauge in seconds)
  - Time starts at the DRY commit time.
  - Time ends after a change is merged to the terminal environment and the active commit status of that environment becomes successful.
  - If a change does not reach the terminal environment and become successful before a new commit arrives, the lead time should start at the commit time of the incomplete release instead of the new commit. This likely means adding a status field that keeps track of that commit time even as new commits arrive.
  - We should produce an event and a log line any time an incomplete release is interrupted by a new commit.
- Change Failure Rate (counter): increment any time the active commit status is marked failed on an environment, but only once per commit SHA. Don't increment once the environment has gone healthy for that commit, even if it later becomes failed. Produce this for all environments and label each one to differentiate them. Add a label that marks the terminal environment so the user doesn't have to know the environment name.
- Mean Time to Restore (gauge in seconds): if a commit goes to a failed state in any environment, track the time it takes to get the environment back to a healthy state. Like failure rate, track this per environment and add a label to identify the terminal environment.
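
The lead time rule for interrupted releases is the subtle part of the list above, so here is a minimal sketch of the bookkeeping it seems to imply. Everything in it is an assumption for illustration: `LeadTimeTracker`, its fields, and its method names are hypothetical and do not exist in the project, and times are plain Unix seconds rather than whatever type the status field would actually use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeadTimeTracker:
    """Lead time bookkeeping for one PromotionStrategy (hypothetical helper)."""

    # DRY commit time (Unix seconds) of the oldest release that has not yet
    # become successful in the terminal environment; None when nothing is pending.
    pending_since: Optional[float] = None
    # SHA of the newest DRY commit that is expected to reach the terminal environment.
    pending_sha: Optional[str] = None
    # Number of incomplete releases interrupted by a newer commit.
    interruptions: int = 0

    def on_dry_commit(self, sha: str, commit_time: float) -> None:
        if self.pending_since is None:
            # No release in flight: the clock starts at this commit's time.
            self.pending_since = commit_time
        elif sha != self.pending_sha:
            # An incomplete release is interrupted: keep its start time and
            # record the interruption (the issue asks for an event and log line here).
            self.interruptions += 1
        self.pending_sha = sha

    def on_terminal_success(self, sha: str, now: float) -> Optional[float]:
        """Return the lead time in seconds, or None if sha is not the pending release."""
        if self.pending_since is None or sha != self.pending_sha:
            return None
        lead_time = now - self.pending_since
        self.pending_since = None
        self.pending_sha = None
        return lead_time


tracker = LeadTimeTracker()
tracker.on_dry_commit("a", 100.0)   # clock starts at 100
tracker.on_dry_commit("b", 160.0)   # "a" never finished; clock stays at 100
lead_time = tracker.on_terminal_success("b", 400.0)
```

With this approach the gauge would report the time since the oldest unfinished commit, which is why the start time has to be persisted somewhere that survives new commits.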
Produce a log line and an event any time one of these metrics is updated.
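
As a rough illustration of the failure counter, the restore gauge, and the log line on every update, the sketch below tracks one environment in memory. It is not the project's implementation: `EnvironmentFailureTracker` is a hypothetical name, the status strings `"failure"` and `"success"` are assumed, and the choice to ignore a failure that arrives after a commit has already been healthy is one reading of the rule above. Event emission is left out and only the log line is shown.

```python
import logging
from typing import Optional, Set

logger = logging.getLogger("dora_sketch")  # hypothetical logger name


class EnvironmentFailureTracker:
    """Failure and restore bookkeeping for one environment (hypothetical helper)."""

    def __init__(self, environment: str, is_terminal: bool) -> None:
        self.environment = environment
        self.is_terminal = is_terminal  # would become the terminal-environment label
        self.failures = 0  # value of the change failure counter
        self.last_restore_seconds: Optional[float] = None  # value of the restore gauge
        self._failed_shas: Set[str] = set()
        self._healthy_shas: Set[str] = set()
        self._failed_since: Optional[float] = None

    def on_status(self, sha: str, status: str, now: float) -> None:
        if status == "failure":
            # Count each SHA at most once, and never after it has been healthy.
            if sha in self._failed_shas or sha in self._healthy_shas:
                return
            self._failed_shas.add(sha)
            self.failures += 1
            if self._failed_since is None:
                self._failed_since = now  # start of the outage
            logger.info("change failure: environment=%s terminal=%s sha=%s",
                        self.environment, self.is_terminal, sha)
        elif status == "success":
            self._healthy_shas.add(sha)
            if self._failed_since is not None:
                # The environment is healthy again: record the restore time.
                self.last_restore_seconds = now - self._failed_since
                self._failed_since = None
                logger.info("restored: environment=%s terminal=%s seconds=%.0f",
                            self.environment, self.is_terminal,
                            self.last_restore_seconds)


tracker = EnvironmentFailureTracker("production", is_terminal=True)
tracker.on_status("a", "failure", 10.0)  # counted
tracker.on_status("a", "failure", 20.0)  # same SHA, ignored
tracker.on_status("b", "success", 70.0)  # restore time is 60 seconds
```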
Write docs to explain how to use these metrics to produce DORA metrics.
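
For the docs, one possible way to show how the raw values turn into DORA numbers is a small worked calculation. The sketch below assumes the counter values have already been read at the start and end of a time window; the function names are hypothetical, and counter resets are deliberately not handled.

```python
from typing import Optional

SECONDS_PER_DAY = 86400


def deployment_frequency_per_day(count_start: float, count_end: float,
                                 window_seconds: float) -> float:
    """Deployments per day from two readings of the deployment counter."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return (count_end - count_start) * SECONDS_PER_DAY / window_seconds


def change_failure_rate(deployments: float, failures: float) -> Optional[float]:
    """Fraction of deployments that failed, or None when there were no deployments."""
    if deployments <= 0:
        return None
    return failures / deployments


# 14 deployments over a 7-day window, 3 of which were marked failed.
frequency = deployment_frequency_per_day(96, 110, 7 * SECONDS_PER_DAY)
failure_rate = change_failure_rate(14, 3)
```

Both calculations should use the series for the terminal environment, which is what the terminal-environment label is meant to make easy to select.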