mirror of
https://github.com/prometheus/prometheus.git
synced 2025-07-03 02:53:24 +00:00
notifier: fix increment of metric prometheus_notifications_errors_total
Previously, prometheus_notifications_errors_total was incremented by one whenever a batch of alerts was affected by an error during sending to a specific alertmanager. However, the corresponding metric prometheus_notifications_sent_total, counting all alerts that were sent (including those where the sent ended in error), is incremented by the batch size, i.e. the number of alerts. Therefore, the ratio used in the mixin for the PrometheusErrorSendingAlertsToSomeAlertmanagers alert is inconsistent. This commit changes the increment of prometheus_notifications_errors_total to the number of alerts that were sent in the attempt that ended in an error. It also adjusts the metrics help string accordingly and makes the wording in the alert in the mixin more precise. Signed-off-by: beorn7 <beorn@grafana.com>
This commit is contained in:
parent
a6fb16fcb4
commit
e01c5cefac
3 changed files with 7 additions and 6 deletions
|
@ -84,8 +84,8 @@
|
|||
severity: 'warning',
|
||||
},
|
||||
annotations: {
|
||||
summary: 'Prometheus has encountered more than 1% errors sending alerts to a specific Alertmanager.',
|
||||
description: '{{ printf "%%.1f" $value }}%% errors while sending alerts from Prometheus %(prometheusName)s to Alertmanager {{$labels.alertmanager}}.' % $._config,
|
||||
summary: 'More than 1% of alerts sent by Prometheus to a specific Alertmanager were affected by errors.',
|
||||
description: '{{ printf "%%.1f" $value }}%% of alerts sent by Prometheus %(prometheusName)s to Alertmanager {{$labels.alertmanager}} were affected by errors.' % $._config,
|
||||
},
|
||||
},
|
||||
{
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue