Monitoring Nginx with Prometheus and the Nginx Exporter

June 9, 2023

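Nginx exposes a small set of monitoring metrics through its stub_status page. The Nginx Prometheus Exporter scrapes these metrics from a single Nginx instance, converts them into Prometheus format, and exposes them over HTTP for the Prometheus server to scrape. With the exporter in place, we can report the key metrics we care about and use them for alerting and dashboards.

This walkthrough uses the http_stub_status_module module as the data source.

Installing Nginx

If Nginx is not installed yet, the following script installs it; just run it: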

wget https://d.frps.cn/file/tools/nginx/nginx_install.sh
sh nginx_install.sh

# The Nginx version installed by the script will change over time; this walkthrough uses nginx-1.22

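Configuring stub_status

After confirming that the stub_status module is enabled (the output of nginx -V should include --with-http_stub_status_module), edit the Nginx configuration file to define the URL of the status page: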

[root@ops conf.d]# cat nginx.conf
server {
    listen       80;
    server_name  localhost;

    # Status page read by the exporter
    location /nginx_status {
        stub_status;
        access_log off;
        # Allow local access only
        allow 127.0.0.1;
        deny all;
    }
}

After reloading Nginx (for example with nginx -s reload), check that the status page responds:

[root@ops conf.d]# curl localhost/nginx_status
Active connections: 1
server accepts handled requests
 1 1 1
Reading: 0 Writing: 1 Waiting: 0
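
How the fields of this page map onto the exporter's metric names can be sketched in Python. This is only an illustration under stated assumptions, not the exporter's actual implementation (the exporter is written in Go). The function name parse_stub_status is hypothetical; the metric names are the ones the exporter reports on its /metrics endpoint.

```python
import re

def parse_stub_status(text):
    """Turn the stub_status page text into a dict keyed by exporter-style metric names."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    # The three counters sit on their own line, in the order: accepts, handled, requests.
    totals = re.search(r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text)
    if not (active and totals and states):
        raise ValueError("unexpected stub_status format")
    accepted, handled, requests = (int(v) for v in totals.groups())
    reading, writing, waiting = (int(v) for v in states.groups())
    return {
        "nginx_connections_active": int(active.group(1)),
        "nginx_connections_accepted": accepted,
        "nginx_connections_handled": handled,
        "nginx_http_requests_total": requests,
        "nginx_connections_reading": reading,
        "nginx_connections_writing": writing,
        "nginx_connections_waiting": waiting,
    }

# The page content shown above, used as sample input.
sample = """Active connections: 1
server accepts handled requests
 1 1 1
Reading: 0 Writing: 1 Waiting: 0
"""
for name, value in parse_stub_status(sample).items():
    print(name, value)
```

The second line of the page is only a header; the numbers on the third line are cumulative counters, while the first and fourth lines hold current (gauge) values.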

Installing the Nginx Exporter

Project releases: https://github.com/nginxinc/nginx-prometheus-exporter/releases

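  • Docker
    $ docker run -p 9113:9113 nginx/nginx-prometheus-exporter:0.10.0 -nginx.scrape-uri=http://<nginx>:8080/stub_status
    (replace http://<nginx>:8080/stub_status with the status URL of your own Nginx instance)
  • Directly on the host
    In the commands below, http://localhost:80/nginx_status is the Nginx status URL configured above.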

    # Download and unpack the Nginx exporter
    [root@ops ~]# wget https://github.com/nginxinc/nginx-prometheus-exporter/releases/download/v0.11.0/nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
    [root@ops ~]# tar -xzf nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz

    # Test run in the foreground
    [root@ops ~]# ./nginx-prometheus-exporter -nginx.scrape-uri http://localhost:80/nginx_status
    NGINX Prometheus Exporter version=0.11.0 commit=e4a6810d4f0b776f7fde37fea1d84e4c7284b72a date=2022-09-07T21:09:51Z, dirty=false, arch=linux/amd64, go=go1.19
    2023/06/07 14:31:35 Starting...
    2023/06/07 14:31:35 Listening on :9113
    2023/06/07 14:31:35 NGINX Prometheus Exporter has successfully started
    
    # Run in the background
    [root@ops ~]# nohup ./nginx-prometheus-exporter -nginx.scrape-uri http://localhost:80/nginx_status &

    Check the exposed metrics:

    [root@ops ~]# curl localhost:9113/metrics
    # HELP nginx_connections_accepted Accepted client connections
    # TYPE nginx_connections_accepted counter
    nginx_connections_accepted 4
    # HELP nginx_connections_active Active client connections
    # TYPE nginx_connections_active gauge
    nginx_connections_active 1
    # HELP nginx_connections_handled Handled client connections
    # TYPE nginx_connections_handled counter
    nginx_connections_handled 4
    # HELP nginx_connections_reading Connections where NGINX is reading the request header
    # TYPE nginx_connections_reading gauge
    nginx_connections_reading 0
    # HELP nginx_connections_waiting Idle client connections
    # TYPE nginx_connections_waiting gauge
    nginx_connections_waiting 0
    # HELP nginx_connections_writing Connections where NGINX is writing the response back to the client
    # TYPE nginx_connections_writing gauge
    nginx_connections_writing 1
    # HELP nginx_http_requests_total Total http requests
    # TYPE nginx_http_requests_total counter
    nginx_http_requests_total 5
    # HELP nginx_up Status of the last metric scrape
    # TYPE nginx_up gauge
    nginx_up 1
    # HELP nginxexporter_build_info Exporter build information
    # TYPE nginxexporter_build_info gauge
    nginxexporter_build_info{arch="linux/amd64",commit="e4a6810d4f0b776f7fde37fea1d84e4c7284b72a",date="2022-09-07T21:09:51Z",dirty="false",go="go1.19",version="0.11.0"} 1
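
For a quick scripted check of this output, for example to confirm that nginx_up is 1, the text format can be read with a few lines of Python. This is a minimal sketch: parse_metrics is a hypothetical helper, it assumes samples carry no timestamps (as in the output above), and it skips samples that have labels.

```python
def parse_metrics(text):
    """Parse label-free samples from Prometheus text exposition output into {name: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts
        # Samples that carry labels (e.g. nginxexporter_build_info{...}) are ignored here.
        if "{" in name:
            continue
        samples[name] = float(value)
    return samples

# A shortened copy of the output above, used as sample input.
sample = """# HELP nginx_http_requests_total Total http requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total 5
# HELP nginx_up Status of the last metric scrape
# TYPE nginx_up gauge
nginx_up 1
"""
values = parse_metrics(sample)
print("nginx_up:", values["nginx_up"])
print("requests:", values["nginx_http_requests_total"])
```

In practice the text would come from http://localhost:9113/metrics rather than from a string literal.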

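    The metrics are described below. All of them come from the stub_status module.

    Name                          Description                                                         Labels
    nginx_connections_accepted    Accepted client connections                                         []
    nginx_connections_active      Active client connections                                           []
    nginx_connections_handled     Handled client connections                                          []
    nginx_connections_reading     Connections where NGINX is reading the request header               []
    nginx_connections_waiting     Idle client connections                                             []
    nginx_connections_writing     Connections where NGINX is writing the response back to the client  []
    nginx_http_requests_total     Total number of HTTP requests                                       []
    nginx_up                      NGINX scrape status: 1 if the last scrape succeeded, 0 if it failed []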
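
Two values are commonly derived from these metrics: the number of dropped connections (accepted minus handled, which is normally zero unless a resource limit such as worker_connections is reached) and the per-second rate of a counter between two scrapes. The sketch below illustrates the arithmetic. The function names are hypothetical, and the counter-reset handling is a simplification of what PromQL's rate() does.

```python
def dropped_connections(accepted, handled):
    """Connections Nginx accepted but did not handle."""
    return accepted - handled

def per_second_rate(prev_value, curr_value, seconds):
    """Average per-second increase of a counter between two scrapes.

    A counter restarts from zero when Nginx restarts; in that case the current
    value alone is taken as the increase.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if curr_value >= prev_value:
        increase = curr_value - prev_value
    else:
        increase = curr_value  # counter reset detected
    return increase / seconds

# Values from the /metrics output above: accepted=4, handled=4.
print(dropped_connections(4, 4))

# Hypothetical second scrape 15 seconds later: requests grew from 5 to 35.
print(per_second_rate(5, 35, 15))
```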

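    Configuring the Prometheus scrape job

    The exporter runs as a separate process from Nginx, so the default instance label (the exporter's own address) does not identify which Nginx instance the data describes. To make the data easier to search and read, we overwrite the instance label with the real address of the Nginx instance: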

      - job_name: 'abcdocker_nginx_exporter'
        static_configs:
        - targets: ['192.168.31.101:9113']
        relabel_configs:
         - source_labels: [__address__]
           regex: '.*'
           target_label: instance
           replacement: '192.168.31.101:80'
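
What this relabel rule does to a target's labels can be approximated in Python. The sketch below mimics the default replace action: the source label values are joined with ";", the regex must match the whole joined string, and the target label is set to the replacement. The function apply_replace is hypothetical, and Python's re module stands in for the RE2 syntax that Prometheus uses, so treat it as an illustration only.

```python
import re

def apply_replace(labels, source_labels, target_label, replacement,
                  regex="(.*)", separator=";"):
    """Approximate Prometheus's `replace` relabel action on a dict of labels."""
    value = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, value)  # Prometheus anchors the regex at both ends
    if match is None:
        return dict(labels)  # no match: labels are left unchanged
    # Convert Prometheus-style $1 / ${1} references to Python's \g<1> form.
    template = re.sub(r"\$\{?(\w+)\}?", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result

target = {"__address__": "192.168.31.101:9113", "job": "abcdocker_nginx_exporter"}

# The rule from the job above: a fixed replacement value.
print(apply_replace(target, ["__address__"], "instance", "192.168.31.101:80", regex=".*"))

# A hypothetical variant that derives the value from the exporter address instead.
print(apply_replace(target, ["__address__"], "instance", "$1:80", regex="(.*):9113"))
```

Both calls produce instance="192.168.31.101:80". The second form avoids repeating the IP address when several targets are listed, assuming every exporter listens on port 9113 and every Nginx on port 80.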

    Adding the Grafana dashboard

    Download dashboard.json and import it into Grafana:

    https://github.com/nginxinc/nginx-prometheus-exporter/blob/main/grafana/dashboard.json
    Mirror: https://d.frps.cn/file/tools/nginx/nginx_exporter_dashboard.json

    Alerting with Alertmanager

    [root@prometheus ~]# cat /etc/prometheus/rules/nginx_exporter.yaml
    groups:
      - name: Nginx Exporter monitoring
        rules:
          # NOTE: stub_status reports no per-status breakdown, so the nginx_http_requests_total
          # series from this exporter has no "status" label. The two error-rate rules below
          # only match if another metric source provides that label.
          - alert: NginxHighHttp4xxErrorRate
            expr: sum(rate(nginx_http_requests_total{status=~"^4.."}[1m])) / sum(rate(nginx_http_requests_total[1m])) * 100 > 5
            for: 1m
            labels:
              severity: critical
            annotations:
              summary: Nginx high HTTP 4xx error rate (instance {{ $labels.instance }})
              description: "Too many HTTP requests with status 4xx (> 5%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
          - alert: NginxHighHttp5xxErrorRate
            expr: sum(rate(nginx_http_requests_total{status=~"^5.."}[1m])) / sum(rate(nginx_http_requests_total[1m])) * 100 > 5
            for: 1m
            labels:
              severity: critical
            annotations:
              summary: Nginx high HTTP 5xx error rate (instance {{ $labels.instance }})
              description: "Too many HTTP requests with status 5xx (> 5%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
          - alert: NginxStatus
            expr: nginx_up != 1
            for: 1m
            labels:
              severity: critical
            annotations:
              summary: Nginx is down (instance {{ $labels.instance }})
              description: "Nginx is down  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"

    With the rules in place, we stop Nginx to demonstrate the alert. The resulting notification:

    ********** Alert notification **********
    Alert type: NginxStatus
    Severity: critical
    =====================
    Summary: Nginx is down (instance 192.168.31.101:80)
    Details: Nginx is down  VALUE = 0
      LABELS = map[__name__:nginx_up instance:192.168.31.101:80 job:abcdocker_nginx_exporter]
    Fault time: 2023-06-07 17:23:48.61 +0800 CST
    Instance: 192.168.31.101:80
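
The for: 1m clause in the NginxStatus rule means the alert does not fire as soon as nginx_up drops to 0. It first becomes pending and fires only once the condition has held continuously for one minute. The timing can be sketched in Python. The function first_firing_time is hypothetical, and the 15-second evaluation interval in the example is an assumption, not a value taken from this setup.

```python
def first_firing_time(samples, hold_seconds=60):
    """Return the timestamp at which an alert on `nginx_up != 1` would start firing.

    samples: iterable of (timestamp_seconds, nginx_up_value), one per rule evaluation.
    Returns None if the condition never holds for hold_seconds.
    """
    active_since = None
    for timestamp, up in samples:
        if up != 1:
            if active_since is None:
                active_since = timestamp  # condition just became true: alert is pending
            if timestamp - active_since >= hold_seconds:
                return timestamp          # held long enough: alert fires
        else:
            active_since = None           # condition cleared: pending state resets
    return None

# Nginx stops between t=0 and t=15; rules are evaluated every 15 s (assumed interval).
evaluations = [(0, 1), (15, 0), (30, 0), (45, 0), (60, 0), (75, 0)]
print(first_firing_time(evaluations))
```

A brief outage that recovers before the minute is up resets the pending state, so short restarts of Nginx do not produce a notification.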
