Rules

backup

7.744s ago

594.5us

Rule State Error Last Evaluation Evaluation Time
alert: Backup age expr: time() - last_backup > 60 * 60 * 48 for: 2m labels: severity: important annotations: summary: '{{ $labels.instance }} last backup age is >48 hours' ok 7.745s ago 582.7us
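
For readers who want to see the flattened row above as it might appear in a rules file, here is a minimal sketch. The rule fields (alert name, expression, `for`, labels, annotations) are taken from the row; the surrounding file layout and the use of the page's group heading as the group name are assumptions.

```yaml
# Sketch only: file layout is assumed, rule fields are copied from the row above.
groups:
  - name: backup
    rules:
      - alert: Backup age
        # 60 * 60 * 48 = 172800 seconds, i.e. 48 hours since the last backup.
        expr: time() - last_backup > 60 * 60 * 48
        # The condition must hold for 2 minutes before the alert fires.
        for: 2m
        labels:
          severity: important
        annotations:
          summary: '{{ $labels.instance }} last backup age is >48 hours'
```

The other rows on this page follow the same mapping: each `key: value` run in the Rule column corresponds to one field of a rule entry.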

builds.sr.ht

1.018s ago

4.444ms

Rule State Error Last Evaluation Evaluation Time
alert: High rate of build job submission expr: increase(buildsrht_builds_started_total[5m]) > 25 labels: severity: important annotations: summary: Unusual rate of build job submissions on {{$labels.instance}} ok 1.019s ago 366.7us
alert: High number of builds timing out expr: increase(buildsrht_builds_finished_total{status="timeout"}[1d]) > 10 labels: severity: important annotations: summary: High number of builds are timing out ok 1.019s ago 4.068ms

meta.sr.ht

319ms ago

981.1us

Rule State Error Last Evaluation Evaluation Time
alert: High rate of login failures expr: delta(meta_logins_failed_total[10m]) > 5 labels: security: "true" severity: important annotations: summary: Unusual number of failed logins ok 319ms ago 375.8us
alert: High rate of password resets expr: delta(meta_pw_resets_total[10m]) > 5 labels: security: "true" severity: urgent annotations: summary: Unusual number of password resets ok 319ms ago 278.4us
alert: High rate of user registrations expr: delta(meta_registrations_total[10m]) > 5 labels: severity: interesting annotations: summary: High rate of user registrations ok 319ms ago 313.7us
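
The three rules above apply `delta()` to metrics whose names end in `_total`, which suggests (but does not prove) that they are counters. The Prometheus documentation recommends `delta()` for gauges and `increase()` for counters, because `increase()` compensates for counter resets such as a service restart. The following is a hedged sketch of the first rule rewritten that way; it is an alternative under that assumption, not the configuration actually deployed.

```yaml
# Sketch only: assumes meta_logins_failed_total is a counter.
groups:
  - name: meta.sr.ht
    rules:
      - alert: High rate of login failures
        # increase() adjusts for counter resets within the 10m window,
        # which delta() does not.
        expr: increase(meta_logins_failed_total[10m]) > 5
        labels:
          security: "true"
          severity: important
        annotations:
          summary: Unusual number of failed logins
```

The builds.sr.ht rules on this page already use `increase()` for their `_total` metrics, so this variant would also make the two groups consistent.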

node

1.406s ago

10.78ms

Rule State Error Last Evaluation Evaluation Time
alert: Instance down expr: up == 0 for: 2m labels: severity: urgent annotations: summary: Instance {{ $labels.instance }} is down ok 1.406s ago 697.6us
alert: Instance rebooted expr: time() - node_boot_time_seconds < 60 labels: severity: interesting annotations: summary: Instance {{ $labels.instance }} was rebooted ok 1.406s ago 458.6us
alert: Read-only filesystem expr: node_filesystem_readonly{mountpoint=~"/|/var"} != 0 labels: severity: urgent annotations: summary: Instance {{ $labels.instance }} read-only filesystem on {{ $labels.mountpoint }} ok 1.406s ago 570.4us
alert: High disk usage expr: (node_filesystem_size_bytes{mountpoint=~"/|/var"} - node_filesystem_avail_bytes{mountpoint=~"/|/var"}) / node_filesystem_size_bytes{mountpoint=~"/|/var"} > 0.9 labels: severity: important annotations: summary: Instance {{ $labels.instance }} has high disk usage on {{ $labels.mountpoint }} ok 1.406s ago 1.366ms
alert: High tmpfs usage expr: (node_filesystem_size_bytes{mountpoint=~"/tmp"} - node_filesystem_avail_bytes{mountpoint=~"/tmp"}) / node_filesystem_size_bytes{mountpoint=~"/tmp"} > 0.8 for: 5m labels: severity: urgent annotations: summary: Instance {{ $labels.instance }} has high tmpfs usage ok 1.404s ago 263.7us
alert: High CPU usage expr: rate(node_cpu_seconds_total{mode="user"}[2m]) > 0.75 for: 5m labels: severity: interesting annotations: summary: Instance {{ $labels.instance }} is under high CPU usage ok 1.404s ago 4.274ms
alert: Sustained high CPU usage expr: cpu_gt_75pct for: 20m labels: severity: important annotations: summary: Instance {{ $labels.instance }} is under sustained high CPU usage ok 1.4s ago 79.04us
alert: Prolonged high CPU usage expr: cpu_gt_75pct for: 1h labels: severity: urgent annotations: summary: Instance {{ $labels.instance }} is under prolonged high CPU usage ok 1.401s ago 53.17us
alert: High network activity expr: (rate(node_network_receive_bytes_total{device=~"eth0|ens3|enp.*"}[5m]) / 1024 ^ 2 > 10) or (rate(node_network_transmit_bytes_total{device=~"eth0|ens3|enp.*"}[5m]) / 1024 ^ 2 > 10) for: 5m labels: severity: interesting annotations: summary: Instance {{ $labels.instance }} >10 MiB/s network use ok 1.401s ago 1.471ms
alert: Sustained high network activity expr: net_gt_10mibsec for: 20m labels: severity: important annotations: summary: Instance {{ $labels.instance }} sustained >10 MiB/s network use ok 1.4s ago 88.76us
alert: Prolonged high network activity expr: net_gt_10mibsec for: 1h labels: severity: urgent annotations: summary: Instance {{ $labels.instance }} prolonged >10 MiB/s network use ok 1.4s ago 55.96us
alert: High disk I/O expr: (rate(node_disk_read_bytes_total{device=~"sd.*|vd.*"}[5m]) / 1024 ^ 2 > 5) or (rate(node_disk_write_bytes_total{device=~"sd.*|vd.*"}[5m]) / 1024 ^ 2 > 5) for: 5m labels: severity: interesting annotations: summary: Instance {{ $labels.instance }} >5 MiB/s disk I/O ok 1.4s ago 1.197ms
alert: Sustained high disk I/O expr: disk_gt_5mibsec for: 20m labels: severity: important annotations: summary: Instance {{ $labels.instance }} sustained >5 MiB/s disk I/O ok 1.399s ago 71.69us
alert: Prolonged high disk I/O expr: disk_gt_5mibsec for: 1h labels: severity: urgent annotations: summary: Instance {{ $labels.instance }} prolonged >5 MiB/s disk I/O ok 1.399s ago 89.77us
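
The "Sustained" and "Prolonged" rules above reference `cpu_gt_75pct`, `net_gt_10mibsec`, and `disk_gt_5mibsec`, which are not defined anywhere on this page. They are presumably recording rules. The sketch below shows one way they could be defined, assuming they simply reuse the expressions of the corresponding short-window alerts; the group name `node_recording` is hypothetical, and the real definitions may differ.

```yaml
# Sketch only: the record names come from the page, the expressions are assumed
# to mirror the matching "High ..." alerts, and the group name is hypothetical.
groups:
  - name: node_recording
    rules:
      # Series exist only while user-mode CPU time exceeds 75% of one core.
      - record: cpu_gt_75pct
        expr: rate(node_cpu_seconds_total{mode="user"}[2m]) > 0.75
      # Series exist only while receive or transmit throughput exceeds 10 MiB/s.
      - record: net_gt_10mibsec
        expr: >-
          (rate(node_network_receive_bytes_total{device=~"eth0|ens3|enp.*"}[5m]) / 1024 ^ 2 > 10)
          or
          (rate(node_network_transmit_bytes_total{device=~"eth0|ens3|enp.*"}[5m]) / 1024 ^ 2 > 10)
      # Series exist only while read or write throughput exceeds 5 MiB/s.
      - record: disk_gt_5mibsec
        expr: >-
          (rate(node_disk_read_bytes_total{device=~"sd.*|vd.*"}[5m]) / 1024 ^ 2 > 5)
          or
          (rate(node_disk_write_bytes_total{device=~"sd.*|vd.*"}[5m]) / 1024 ^ 2 > 5)
```

Because a comparison without the `bool` modifier filters out non-matching series, each recorded metric is present only while its threshold is exceeded. An alert with `expr: cpu_gt_75pct` and `for: 20m` would then fire once the series has been continuously present for 20 minutes.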

ssl

6.034s ago

789.2us

Rule State Error Last Evaluation Evaluation Time
alert: SSL expiration expr: (certificate_expiration - time()) / 60 / 60 / 24 < 7 for: 2m labels: severity: important annotations: summary: '{{ $labels.instance }} SSL certificate expires in < 1 week' ok 6.034s ago 779.5us