Monitoring Kubernetes Components with Prometheus Operator

May 4, 2023

By default, the Prometheus Operator already monitors our cluster, but it cannot monitor kube-controller-manager or kube-scheduler. In this post we bring those two components under monitoring, and put Prometheus and Grafana behind Traefik so they can be reached through an Ingress.
For an introduction to the Operator itself, see the earlier article:

Prometheus Operator

Organizing the manifests

First, sort the operator manifests into per-component directories:

    wget -P /root/ http://down.i4t.com/abcdocker-prometheus-operator.yaml.zip
    cd /root/
    unzip abcdocker-prometheus-operator.yaml.zip
    mkdir kube-prom
    cp -a kube-prometheus-master/manifests/* kube-prom/
    cd kube-prom/
    mkdir -p node-exporter alertmanager grafana kube-state-metrics prometheus serviceMonitor adapter operator
    mv *-serviceMonitor* serviceMonitor/
    mv setup operator/
    mv grafana-* grafana/
    mv kube-state-metrics-* kube-state-metrics/
    mv alertmanager-* alertmanager/
    mv node-exporter-* node-exporter/
    mv prometheus-adapter* adapter/
    mv prometheus-* prometheus/
    mv 0prometheus-operator-* operator/
    mv 00namespace-namespace.yaml operator/
    
    
## The installation order also changes (skip this if it is already installed)
    [root@k8s-01 kube-prom]# kubectl apply -f operator/
    namespace/monitoring created
    customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
    customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
    clusterrole.rbac.authorization.k8s.io/prometheus-operator created
    clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
    deployment.apps/prometheus-operator created
    service/prometheus-operator created
    serviceaccount/prometheus-operator created
    
Once the operator Pod is up, the rest can be applied:
    [root@k8s-01 kube-prom]# kubectl -n monitoring get pod
    NAME                                   READY   STATUS    RESTARTS   AGE
    prometheus-operator-69bd579bf9-7kpd7   1/1     Running   0          7s
    
# Remaining steps
    kubectl apply -f adapter/
    kubectl apply -f alertmanager/
    kubectl apply -f node-exporter/
    kubectl apply -f kube-state-metrics/
    kubectl apply -f grafana/
    kubectl apply -f prometheus/
    kubectl apply -f serviceMonitor/
    
When everything is applied and looks healthy, this part is done:
    [root@k8s-01 kube-prom]# kubectl get -n monitoring all
    

Configuring Ingress

Traefik must be installed first; exposing the UIs via NodePort performs poorly, so Traefik is recommended. See the earlier article:

    Kubernetes Traefik Ingress

Environment initialization

First, make sure every Service created by the Prometheus Operator is of type ClusterIP. If you did not change anything, they already are, since ClusterIP is the default type:

    [root@k8s-01 ingress]# kubectl get pod,svc -n monitoring
    NAME                                       READY   STATUS    RESTARTS   AGE
    pod/alertmanager-main-0                    2/2     Running   0          88s
    pod/alertmanager-main-1                    2/2     Running   0          77s
    pod/alertmanager-main-2                    2/2     Running   0          69s
    pod/grafana-558647b59-mj85j                1/1     Running   0          96s
    pod/kube-state-metrics-5bfc7db74d-kpgh2    4/4     Running   0          96s
    pod/node-exporter-5kz8x                    2/2     Running   0          94s
    pod/node-exporter-jnmr7                    2/2     Running   0          94s
    pod/node-exporter-pztln                    2/2     Running   0          93s
    pod/node-exporter-ts455                    2/2     Running   0          94s
    pod/prometheus-adapter-57c497c557-6tscz    1/1     Running   0          91s
    pod/prometheus-k8s-0                       3/3     Running   1          78s
    pod/prometheus-k8s-1                       3/3     Running   1          78s
    pod/prometheus-operator-69bd579bf9-rrf96   1/1     Running   1          98s
    
NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/alertmanager-main       ClusterIP   10.254.201.109   <none>        9093/TCP            99s
service/alertmanager-operated   ClusterIP   None             <none>        9093/TCP,6783/TCP   89s
service/grafana                 ClusterIP   10.254.19.174    <none>        3000/TCP            97s
service/kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP   96s
service/node-exporter           ClusterIP   None             <none>        9100/TCP            95s
service/prometheus-adapter      ClusterIP   10.254.197.151   <none>        443/TCP             93s
service/prometheus-k8s          ClusterIP   10.254.120.188   <none>        9090/TCP            89s
service/prometheus-operated     ClusterIP   None             <none>        9090/TCP            78s
service/prometheus-operator     ClusterIP   None             <none>        8080/TCP            99s
    

Next we create Ingresses for the Prometheus UI, Grafana, and Alertmanager
(these can be split into separate files instead of one):

    vim ingress.yaml
    
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: prometheus-ing
      namespace: monitoring
    spec:
      rules:
      - host: prometheus.i4t.com
        http:
          paths:
          - backend:
              serviceName: prometheus-k8s
              servicePort: 9090
    ---
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: grafana-ing
      namespace: monitoring
    spec:
      rules:
      - host: grafana.i4t.com
        http:
          paths:
          - backend:
              serviceName: grafana
              servicePort: 3000
    ---
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: alertmanager-ing
      namespace: monitoring
    spec:
      rules:
      - host: alertmanager.i4t.com
        http:
          paths:
          - backend:
              serviceName: alertmanager-main
              servicePort: 9093
    
    
## host is your domain name; serviceName and servicePort are the name and port of the corresponding Service
    [root@k8s-01 ingress]# kubectl apply -f ingress.yaml
ingress.extensions/prometheus-ing created
ingress.extensions/grafana-ing created
ingress.extensions/alertmanager-ing created
    
    
    [root@k8s-01 ingress]# kubectl get ingress -n monitoring
    NAME               HOSTS                  ADDRESS   PORTS   AGE
    alertmanager-ing   alertmanager.i4t.com             80      13s
    grafana-ing        grafana.i4t.com                  80      13s
    prometheus-ing     prometheus.i4t.com               80      13s
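Note that the manifests above use the `extensions/v1beta1` Ingress API, which was removed in Kubernetes 1.22. On newer clusters the equivalent would look roughly like the sketch below (same host and Service as above; whether you also need an `ingressClassName` depends on how Traefik is installed):

```yaml
# networking.k8s.io/v1 equivalent of the Prometheus Ingress above (sketch).
# In v1 the backend moves under "service:" and every path needs a pathType.
# The Grafana and Alertmanager Ingresses change in exactly the same way.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-ing
  namespace: monitoring
spec:
  rules:
  - host: prometheus.i4t.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090
```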
    

The routes can also be inspected in the Traefik dashboard.
Next, resolve the domains (here I simply edit the hosts file to demonstrate):

    #mac
    ➜  ~ sudo vim /etc/hosts
    Password:
    
    #windows
C:\Windows\System32\drivers\etc
    

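For example, assuming 192.168.0.10 is one of the nodes running Traefik (substitute a node IP from your own cluster), the hosts entries would look like:

```
# /etc/hosts - point the three demo domains at a Traefik node
192.168.0.10 prometheus.i4t.com
192.168.0.10 grafana.i4t.com
192.168.0.10 alertmanager.i4t.com
```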

Monitoring the Kubernetes components

As the targets page shows, the Prometheus Operator has not picked up kube-controller-manager or kube-scheduler. My cluster was installed from binaries, so nothing was discovered for them.
This is because a ServiceMonitor selects Services by label, and the ServiceMonitors in question only look in the kube-system namespace:

    [root@k8s-01 manifests]#  grep -2 selector prometheus-serviceMonitorKube*
    prometheus-serviceMonitorKubeControllerManager.yaml-    matchNames:
    prometheus-serviceMonitorKubeControllerManager.yaml-    - kube-system
    prometheus-serviceMonitorKubeControllerManager.yaml:  selector:
    prometheus-serviceMonitorKubeControllerManager.yaml-    matchLabels:
    prometheus-serviceMonitorKubeControllerManager.yaml-      k8s-app: kube-controller-manager
    --
    prometheus-serviceMonitorKubelet.yaml-    matchNames:
    prometheus-serviceMonitorKubelet.yaml-    - kube-system
    prometheus-serviceMonitorKubelet.yaml:  selector:
    prometheus-serviceMonitorKubelet.yaml-    matchLabels:
    prometheus-serviceMonitorKubelet.yaml-      k8s-app: kubelet
    --
    prometheus-serviceMonitorKubeScheduler.yaml-    matchNames:
    prometheus-serviceMonitorKubeScheduler.yaml-    - kube-system
    prometheus-serviceMonitorKubeScheduler.yaml:  selector:
    prometheus-serviceMonitorKubeScheduler.yaml-    matchLabels:
    prometheus-serviceMonitorKubeScheduler.yaml-      k8s-app: kube-scheduler
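Pieced together from the grep output above, the selector-related part of prometheus-serviceMonitorKubeScheduler.yaml looks roughly like this (a sketch; metadata fields may differ slightly in your kube-prometheus version):

```yaml
# Excerpt of the kube-scheduler ServiceMonitor: it only matches
# Services in the kube-system namespace carrying the label below.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-scheduler
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-scheduler
```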
    

And by default no Service in kube-system carries the matching labels:

    [root@k8s-01 manifests]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d8h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   38m
    

The Endpoints objects do exist, however (at least on my binary install):

    [root@k8s-01 manifests]# kubectl get ep -n kube-system
NAME                      ENDPOINTS                                                               AGE
kube-controller-manager   <none>                                                                  31d
kube-dns                  172.30.248.2:53,172.30.72.4:53,172.30.248.2:53 + 3 more...              31d
kube-scheduler            <none>                                                                  31d
kubelet                   192.168.0.10:10255,192.168.0.11:10255,192.168.0.12:10255 + 9 more...    2d8h
kubernetes-dashboard      172.30.232.2:8443                                                       31d
traefik-ingress-service   172.30.232.5:80,172.30.232.5:8080                                       39m
    

The fix

Create a Service for each of the two components and label them k8s-app: kube-controller-manager and k8s-app: kube-scheduler respectively, so that the ServiceMonitors can select them.

Fix for binary installs (the cluster here was installed following the earlier article):

Kubernetes 1.14 Binary Cluster Installation

Create a Service for each component to bind to:

    apiVersion: v1
    kind: Service
    metadata:
      namespace: kube-system
      name: kube-controller-manager
      labels:
        k8s-app: kube-controller-manager
    spec:
      selector:
        component: kube-controller-manager
      type: ClusterIP
      clusterIP: None
      ports:
      - name: http-metrics
        port: 10252
        targetPort: 10252
        protocol: TCP
    ---
    apiVersion: v1
    kind: Service
    metadata:
      namespace: kube-system
      name: kube-scheduler
      labels:
        k8s-app: kube-scheduler
    spec:
      selector:
        component: kube-scheduler
      type: ClusterIP
      clusterIP: None
      ports:
      - name: http-metrics
        port: 10251
        targetPort: 10251
        protocol: TCP
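One caveat worth knowing: these Services carry a pod selector (component: ...) that matches no pods on a binary install, and the endpoints controller may treat the Endpoints objects we create next as its own and reconcile them away. A selector-less variant side-steps this (a sketch; same name, labels, and ports as above):

```yaml
# Selector-less headless Service: with no spec.selector, Kubernetes
# leaves the Endpoints object of the same name entirely under manual
# control, so our hand-written addresses are never overwritten.
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
```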
    

Manually create the Endpoints for each Service; the Endpoints name and namespace must match the Service, and the port names must correspond:

    apiVersion: v1
    kind: Endpoints
    metadata:
      labels:
        k8s-app: kube-controller-manager
      name: kube-controller-manager
      namespace: kube-system
    subsets:
    - addresses:
      - ip: 192.168.0.10
      - ip: 192.168.0.11
      - ip: 192.168.0.12
      ports:
      - name: http-metrics
        port: 10252
        protocol: TCP
    ---
    apiVersion: v1
    kind: Endpoints
    metadata:
      labels:
        k8s-app: kube-scheduler
      name: kube-scheduler
      namespace: kube-system
    subsets:
    - addresses:
      - ip: 192.168.0.10
      - ip: 192.168.0.11
      - ip: 192.168.0.12
      ports:
      - name: http-metrics
        port: 10251
        protocol: TCP
    

Checking the Services again, they are now bound to our Endpoints:

    [root@k8s-01 test]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-controller-manager   ClusterIP   None             <none>        10252/TCP                     64s
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kube-scheduler            ClusterIP   None             <none>        10251/TCP                     64s
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d9h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   126m
    [root@k8s-01 test]# kubectl describe svc -n kube-system kube-scheduler
    Name:              kube-scheduler
    Namespace:         kube-system
    Labels:            k8s-app=kube-scheduler
    Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                         {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kube-scheduler"},"name":"kube-scheduler","namespace"...
    Selector:          component=kube-scheduler
    Type:              ClusterIP
    IP:                None
    Port:              http-metrics  10251/TCP
    TargetPort:        10251/TCP
    Endpoints:         192.168.0.10:10251,192.168.0.11:10251,192.168.0.12:10251
    Session Affinity:  None
Events:            <none>
    

There are three masters here, so kube-scheduler and kube-controller-manager each show three targets.
For kubeadm clusters the approach below can serve as a reference; I have no kubeadm environment at hand, so it is not demonstrated.

    apiVersion: v1
    kind: Endpoints
    metadata:
      labels:
        k8s-app: kubelet
      name: kubelet
      namespace: kube-system
    subsets:
    - addresses:
      - ip: 172.16.0.14
        targetRef:
          kind: Node
          name: k8s-n2
      - ip: 172.16.0.18
        targetRef:
          kind: Node
          name: k8s-n3
      - ip: 172.16.0.2
        targetRef:
          kind: Node
          name: k8s-m1
      - ip: 172.16.0.20
        targetRef:
          kind: Node
          name: k8s-n4
      - ip: 172.16.0.21
        targetRef:
          kind: Node
          name: k8s-n5
      ports:
      - name: http-metrics
        port: 10255
        protocol: TCP
      - name: cadvisor
        port: 4194
        protocol: TCP
      - name: https-metrics
        port: 10250
        protocol: TCP
    

If, after adding the monitoring, the targets report ip:10251 Connection refused (note that newer Kubernetes releases serve metrics only on the secure ports, 10259 for the scheduler and 10257 for the controller-manager, so the ports used above would need adjusting there as well):

  • Binary install
  • Modify the scheduler's startup configuration

Add this to the startup (unit) file:
    --bind-address=0.0.0.0
    
  • kubeadm install
  • The change goes into the component's Pod definition instead; I am not very familiar with kubeadm, so I won't go into detail
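For reference only (untested here, as noted above): on kubeadm the control-plane components run as static Pods, so the flag is changed in the manifest on each master, e.g. /etc/kubernetes/manifests/kube-scheduler.yaml; the kubelet restarts the Pod automatically when the file changes:

```yaml
# Excerpt of /etc/kubernetes/manifests/kube-scheduler.yaml (sketch):
# change bind-address from 127.0.0.1 to 0.0.0.0 so Prometheus can
# reach the metrics port from outside the host.
spec:
  containers:
  - name: kube-scheduler
    command:
    - kube-scheduler
    - --bind-address=0.0.0.0   # was 127.0.0.1
    # ...remaining flags unchanged...
```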

Related articles:

1. Kubernetes 1.14 Binary Cluster Installation
2. Kubernetes 1.13.5 Binary Cluster Installation
3. Kubernetes 1.11 Binary Cluster Installation
4. Prometheus Operator
