Prometheus Operator: Monitoring Kubernetes Components
By default, Prometheus Operator already monitors most of the cluster, but it cannot monitor kube-controller-manager and kube-scheduler. In this post we add monitoring for those two components and put Prometheus, Grafana, and Alertmanager behind Traefik, accessing them through Ingress.
For an introduction to the Operator itself, see the earlier article:


Prometheus Operator
新闻联播老司机
Organizing the Files
First, sort the Operator manifests into directories:
wget -P /root/ http://down.i4t.com/abcdocker-prometheus-operator.yaml.zip
cd /root/
unzip abcdocker-prometheus-operator.yaml.zip
mkdir kube-prom
cp -a kube-prometheus-master/manifests/* kube-prom/
cd kube-prom/
mkdir -p node-exporter alertmanager grafana kube-state-metrics prometheus serviceMonitor adapter operator
mv *-serviceMonitor* serviceMonitor/
mv setup operator/
mv grafana-* grafana/
mv kube-state-metrics-* kube-state-metrics/
mv alertmanager-* alertmanager/
mv node-exporter-* node-exporter/
mv prometheus-adapter* adapter/
mv prometheus-* prometheus/
mv 0prometheus-operator-* operator/
mv 00namespace-namespace.yaml operator/

## The install order also changes (skip this if everything is already installed)
[root@k8s-01 kube-prom]# kubectl apply -f operator/
namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created

Once the operator Pod is running, apply the rest:

[root@k8s-01 kube-prom]# kubectl -n monitoring get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-69bd579bf9-7kpd7   1/1     Running   0          7s

# Remaining steps
kubectl apply -f adapter/
kubectl apply -f alertmanager/
kubectl apply -f node-exporter/
kubectl apply -f kube-state-metrics/
kubectl apply -f grafana/
kubectl apply -f prometheus/
kubectl apply -f serviceMonitor/

When everything checks out, this step is done:

[root@k8s-01 kube-prom]# kubectl get -n monitoring all
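The post-operator apply order above can also be captured in a small loop; sketched here with echo as a dry run so it works without a cluster (drop the echo to actually apply, assuming kubectl is configured):

```shell
# Apply order for the remaining manifest directories.
# echo makes this a dry run; remove it to run kubectl for real.
for d in adapter alertmanager node-exporter kube-state-metrics grafana prometheus serviceMonitor; do
  echo kubectl apply -f "$d/"
done
```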
Configuring Ingress
First install Traefik. NodePort is inefficient, so Traefik is recommended:


Kubernetes Traefik Ingress
Environment Preparation
First make sure all of the Prometheus Operator Services are of type ClusterIP; if you have not changed anything, ClusterIP is the default.
[root@k8s-01 ingress]# kubectl get pod,svc -n monitoring
NAME                                       READY   STATUS    RESTARTS   AGE
pod/alertmanager-main-0                    2/2     Running   0          88s
pod/alertmanager-main-1                    2/2     Running   0          77s
pod/alertmanager-main-2                    2/2     Running   0          69s
pod/grafana-558647b59-mj85j                1/1     Running   0          96s
pod/kube-state-metrics-5bfc7db74d-kpgh2    4/4     Running   0          96s
pod/node-exporter-5kz8x                    2/2     Running   0          94s
pod/node-exporter-jnmr7                    2/2     Running   0          94s
pod/node-exporter-pztln                    2/2     Running   0          93s
pod/node-exporter-ts455                    2/2     Running   0          94s
pod/prometheus-adapter-57c497c557-6tscz    1/1     Running   0          91s
pod/prometheus-k8s-0                       3/3     Running   1          78s
pod/prometheus-k8s-1                       3/3     Running   1          78s
pod/prometheus-operator-69bd579bf9-rrf96   1/1     Running   1          98s

NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/alertmanager-main       ClusterIP   10.254.201.109   <none>        9093/TCP            99s
service/alertmanager-operated   ClusterIP   None             <none>        9093/TCP,6783/TCP   89s
service/grafana                 ClusterIP   10.254.19.174    <none>        3000/TCP            97s
service/kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP   96s
service/node-exporter           ClusterIP   None             <none>        9100/TCP            95s
service/prometheus-adapter      ClusterIP   10.254.197.151   <none>        443/TCP             93s
service/prometheus-k8s          ClusterIP   10.254.120.188   <none>        9090/TCP            89s
service/prometheus-operated     ClusterIP   None             <none>        9090/TCP            78s
service/prometheus-operator     ClusterIP   None             <none>        8080/TCP            99s
Next we create Ingresses for the Prometheus UI, Grafana, and Alertmanager (they can also be split into separate files instead of kept in one):
vim ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-ing
  namespace: monitoring
spec:
  rules:
  - host: prometheus.i4t.com
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana-ing
  namespace: monitoring
spec:
  rules:
  - host: grafana.i4t.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: alertmanager-ing
  namespace: monitoring
spec:
  rules:
  - host: alertmanager.i4t.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093
## host is the domain name; serviceName and servicePort are the name and port of the Service being exposed
[root@k8s-01 ingress]# kubectl apply -f ingress.yaml
ingress.extensions/prometheus-ing created
ingress.extensions/grafana-ing created
ingress.extensions/alertmanager-ing created
[root@k8s-01 ingress]# kubectl get ingress -n monitoring
NAME               HOSTS                  ADDRESS   PORTS   AGE
alertmanager-ing   alertmanager.i4t.com             80      13s
grafana-ing        grafana.i4t.com                  80      13s
prometheus-ing     prometheus.i4t.com               80      13s
We can also check the routes in the Traefik UI.

Next, set up name resolution for the domains (here I demonstrate by editing the hosts file).

# macOS
➜ ~ sudo vim /etc/hosts
Password:

# Windows
C:\Windows\System32\drivers\etc
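For example, if the Traefik node's IP were 192.168.0.10 (an assumption; substitute one of your own node IPs), the hosts entry would be:

```
192.168.0.10  prometheus.i4t.com  grafana.i4t.com  alertmanager.i4t.com
```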

Monitoring the Kubernetes Components
Here we can see that Prometheus Operator has not picked up kube-controller-manager or kube-scheduler. My cluster was installed from binaries, so no data is collected for these two components.

This is because a ServiceMonitor selects Services by label, and we can see that the corresponding ServiceMonitors only look in the kube-system namespace:
[root@k8s-01 manifests]# grep -2 selector prometheus-serviceMonitorKube*
prometheus-serviceMonitorKubeControllerManager.yaml-    matchNames:
prometheus-serviceMonitorKubeControllerManager.yaml-    - kube-system
prometheus-serviceMonitorKubeControllerManager.yaml:  selector:
prometheus-serviceMonitorKubeControllerManager.yaml-    matchLabels:
prometheus-serviceMonitorKubeControllerManager.yaml-      k8s-app: kube-controller-manager
--
prometheus-serviceMonitorKubelet.yaml-    matchNames:
prometheus-serviceMonitorKubelet.yaml-    - kube-system
prometheus-serviceMonitorKubelet.yaml:  selector:
prometheus-serviceMonitorKubelet.yaml-    matchLabels:
prometheus-serviceMonitorKubelet.yaml-      k8s-app: kubelet
--
prometheus-serviceMonitorKubeScheduler.yaml-    matchNames:
prometheus-serviceMonitorKubeScheduler.yaml-    - kube-system
prometheus-serviceMonitorKubeScheduler.yaml:  selector:
prometheus-serviceMonitorKubeScheduler.yaml-    matchLabels:
prometheus-serviceMonitorKubeScheduler.yaml-      k8s-app: kube-scheduler
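Pieced together from that grep context, the selector portion of prometheus-serviceMonitorKubeScheduler.yaml looks roughly like this (a sketch, not the complete manifest; fields outside namespaceSelector and selector are abbreviated):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-scheduler
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames:
    - kube-system               # only Services in kube-system are considered
  selector:
    matchLabels:
      k8s-app: kube-scheduler   # ...and only those carrying this label
```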
And by default kube-system contains no Service with a matching label:
[root@k8s-01 manifests]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d8h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   38m
But the Endpoints objects do exist (on my binary install, at least):
[root@k8s-01 manifests]# kubectl get ep -n kube-system
NAME                      ENDPOINTS                                                               AGE
kube-controller-manager   <none>                                                                  31d
kube-dns                  172.30.248.2:53,172.30.72.4:53,172.30.248.2:53 + 3 more...              31d
kube-scheduler            <none>                                                                  31d
kubelet                   192.168.0.10:10255,192.168.0.11:10255,192.168.0.12:10255 + 9 more...    2d8h
kubernetes-dashboard      172.30.232.2:8443                                                       31d
traefik-ingress-service   172.30.232.5:80,172.30.232.5:8080                                       39m
The Fix
Create a Service for each of the two control-plane components and label it k8s-app: kube-controller-manager or k8s-app: kube-scheduler respectively, so that the ServiceMonitors can select it.
Fix for binary installations


Kubernetes 1.14 Binary Cluster Installation
Create the Services for the Endpoints to bind to:
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
Then fill in the matching Endpoints objects by hand; each Endpoints object's name and labels must line up with its Service:
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
Check the Services; they are now bound to our Endpoints:
[root@k8s-01 test]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-controller-manager   ClusterIP   None             <none>        10252/TCP                     64s
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kube-scheduler            ClusterIP   None             <none>        10251/TCP                     64s
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d9h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   126m
[root@k8s-01 test]# kubectl describe svc -n kube-system kube-scheduler
Name:              kube-scheduler
Namespace:         kube-system
Labels:            k8s-app=kube-scheduler
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kube-scheduler"},"name":"kube-scheduler","namespace"...
Selector:          component=kube-scheduler
Type:              ClusterIP
IP:                None
Port:              http-metrics  10251/TCP
TargetPort:        10251/TCP
Endpoints:         192.168.0.10:10251,192.168.0.11:10251,192.168.0.12:10251
Session Affinity:  None
Events:            <none>
My cluster has only three masters, so there are just three kube-scheduler and kube-controller-manager endpoints.


For kubeadm clusters, the approach below should work; I have no kubeadm environment at hand, so I will not demonstrate it.
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kubelet
  name: kubelet
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.16.0.14
    targetRef:
      kind: Node
      name: k8s-n2
  - ip: 172.16.0.18
    targetRef:
      kind: Node
      name: k8s-n3
  - ip: 172.16.0.2
    targetRef:
      kind: Node
      name: k8s-m1
  - ip: 172.16.0.20
    targetRef:
      kind: Node
      name: k8s-n4
  - ip: 172.16.0.21
    targetRef:
      kind: Node
      name: k8s-n5
  ports:
  - name: http-metrics
    port: 10255
    protocol: TCP
  - name: cadvisor
    port: 4194
    protocol: TCP
  - name: https-metrics
    port: 10250
    protocol: TCP
If, after adding the monitoring, a target reports ip:10251 Connection refused, the scheduler is only listening on the loopback interface and its configuration needs to change: add --bind-address=0.0.0.0 to its startup arguments. On kubeadm this change goes into the static Pod manifest; I am not familiar enough with kubeadm to cover it in detail here.
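For a binary install, the flag goes on the kube-scheduler unit's ExecStart line. A minimal sketch of that edit against an example unit file (the path and the existing flags below are assumptions; point the sed at your real unit, then run systemctl daemon-reload and restart kube-scheduler):

```shell
# Example unit file; a real one usually lives under /etc/systemd/system/.
UNIT=/tmp/kube-scheduler.service.example
cat > "$UNIT" <<'EOF'
[Service]
ExecStart=/usr/local/bin/kube-scheduler --leader-elect=true --v=2
EOF

# Append --bind-address=0.0.0.0 to the ExecStart line so the metrics
# port (10251) listens on all interfaces instead of only 127.0.0.1.
sed -i 's|^ExecStart=.*|& --bind-address=0.0.0.0|' "$UNIT"
grep ExecStart "$UNIT"
```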
Related articles:
- Kubernetes 1.14 Binary Cluster Installation
- Kubernetes 1.13.5 Binary Cluster Installation
- Kubernetes 1.11 Binary Cluster Installation
- Prometheus Operator