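By default, Prometheus Operator already monitors the cluster, but it cannot scrape kube-controller-manager or kube-scheduler. In this post we add monitoring for these two components and put Prometheus and Grafana behind Traefik so that they can be reached through Ingress.
For an introduction to the operator, see the earlier article: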
Prometheus Operator
新闻联播老司机
Organizing the manifests
First, sort the operator manifests into one directory per component
wget -P /root/ http://down.i4t.com/abcdocker-prometheus-operator.yaml.zip
cd /root/
unzip abcdocker-prometheus-operator.yaml.zip
mkdir kube-prom
cp -a kube-prometheus-master/manifests/* kube-prom/
cd kube-prom/
mkdir -p node-exporter alertmanager grafana kube-state-metrics prometheus serviceMonitor adapter operator
mv *-serviceMonitor* serviceMonitor/
mv setup operator/
mv grafana-* grafana/
mv kube-state-metrics-* kube-state-metrics/
mv alertmanager-* alertmanager/
mv node-exporter-* node-exporter/
mv prometheus-adapter* adapter/
mv prometheus-* prometheus/
mv 0prometheus-operator-* operator/
mv 00namespace-namespace.yaml operator/

## The install order changes as well (skip this if the stack is already installed)
[root@k8s-01 kube-prom]# kubectl apply -f operator/
namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created

# Once the operator Pod is running, the rest can be applied
[root@k8s-01 kube-prom]# kubectl -n monitoring get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-69bd579bf9-7kpd7   1/1     Running   0          7s

# Remaining steps
kubectl apply -f adapter/
kubectl apply -f alertmanager/
kubectl apply -f node-exporter/
kubectl apply -f kube-state-metrics/
kubectl apply -f grafana/
kubectl apply -f prometheus/
kubectl apply -f serviceMonitor/

# When everything has been applied, check that all resources are healthy
[root@k8s-01 kube-prom]# kubectl get -n monitoring all
Configuring Ingress
Traefik has to be installed first. Exposing the services through NodePort is inefficient, so Traefik is recommended:
Kubernetes Traefik Ingress
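Preparing the environment
First, make sure that every Service in the Prometheus Operator stack is of type ClusterIP. If you have not changed them, ClusterIP is already the default.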
[root@k8s-01 ingress]# kubectl get pod,svc -n monitoring
NAME                                       READY   STATUS    RESTARTS   AGE
pod/alertmanager-main-0                    2/2     Running   0          88s
pod/alertmanager-main-1                    2/2     Running   0          77s
pod/alertmanager-main-2                    2/2     Running   0          69s
pod/grafana-558647b59-mj85j                1/1     Running   0          96s
pod/kube-state-metrics-5bfc7db74d-kpgh2    4/4     Running   0          96s
pod/node-exporter-5kz8x                    2/2     Running   0          94s
pod/node-exporter-jnmr7                    2/2     Running   0          94s
pod/node-exporter-pztln                    2/2     Running   0          93s
pod/node-exporter-ts455                    2/2     Running   0          94s
pod/prometheus-adapter-57c497c557-6tscz    1/1     Running   0          91s
pod/prometheus-k8s-0                       3/3     Running   1          78s
pod/prometheus-k8s-1                       3/3     Running   1          78s
pod/prometheus-operator-69bd579bf9-rrf96   1/1     Running   1          98s

NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/alertmanager-main       ClusterIP   10.254.201.109   <none>        9093/TCP            99s
service/alertmanager-operated   ClusterIP   None             <none>        9093/TCP,6783/TCP   89s
service/grafana                 ClusterIP   10.254.19.174    <none>        3000/TCP            97s
service/kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP   96s
service/node-exporter           ClusterIP   None             <none>        9100/TCP            95s
service/prometheus-adapter      ClusterIP   10.254.197.151   <none>        443/TCP             93s
service/prometheus-k8s          ClusterIP   10.254.120.188   <none>        9090/TCP            89s
service/prometheus-operated     ClusterIP   None             <none>        9090/TCP            78s
service/prometheus-operator     ClusterIP   None             <none>        8080/TCP            99s
Next, create Ingress resources for the Prometheus UI, Grafana, and Alertmanager
(they can also be split into separate files instead of being kept in one)
vim ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-ing
  namespace: monitoring
spec:
  rules:
  - host: prometheus.i4t.com
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana-ing
  namespace: monitoring
spec:
  rules:
  - host: grafana.i4t.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: alertmanager-ing
  namespace: monitoring
spec:
  rules:
  - host: alertmanager.i4t.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093

## host is the domain name; serviceName and servicePort are the name and port of the target svc (for example the prometheus svc)
[root@k8s-01 ingress]# kubectl apply -f ingress.yaml
ingress.extensions/prometheus-operator created
[root@k8s-01 ingress]# kubectl get ingress -n monitoring
NAME               HOSTS                  ADDRESS   PORTS   AGE
alertmanager-ing   alertmanager.i4t.com             80      13s
grafana-ing        grafana.i4t.com                  80      13s
prometheus-ing     prometheus.i4t.com               80      13s
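The new routes can also be checked in the Traefik web UI
Next, set up name resolution for the three hosts (here I simply edit the hosts file for demonstration)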
# macOS
➜ ~ sudo vim /etc/hosts
Password:

# Windows
C:\Windows\System32\drivers\etc
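For the hosts-file approach, the entry to add maps the three Ingress hostnames to a node on which Traefik is reachable. A small sketch follows; the function name hosts_entry is hypothetical, and 192.168.0.10 in the usage line is only a placeholder, so substitute the address of your own Traefik node.

```shell
#!/usr/bin/env bash
# Sketch: print the hosts-file line for the three Ingress hosts defined in ingress.yaml.
# $1 is the IP of a node that serves Traefik (a placeholder is used in the usage line below).
hosts_entry() {
  local ip="$1"
  printf '%s prometheus.i4t.com grafana.i4t.com alertmanager.i4t.com\n' "$ip"
}

# Usage (appends the line to the hosts file on macOS/Linux):
#   hosts_entry 192.168.0.10 | sudo tee -a /etc/hosts
```

Keeping all three names on one line makes the entry easy to remove again once real DNS records exist.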
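Monitoring the Kubernetes components
At this point Prometheus Operator is not scraping kube-controller-manager or kube-scheduler. My cluster was installed from binaries, so no data is collected for these two components.
The reason is that a ServiceMonitor selects Services by label, and the ServiceMonitors in question only look in the kube-system namespace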
[root@k8s-01 manifests]# grep -2 selector prometheus-serviceMonitorKube*
prometheus-serviceMonitorKubeControllerManager.yaml-    matchNames:
prometheus-serviceMonitorKubeControllerManager.yaml-    - kube-system
prometheus-serviceMonitorKubeControllerManager.yaml:  selector:
prometheus-serviceMonitorKubeControllerManager.yaml-    matchLabels:
prometheus-serviceMonitorKubeControllerManager.yaml-      k8s-app: kube-controller-manager
--
prometheus-serviceMonitorKubelet.yaml-    matchNames:
prometheus-serviceMonitorKubelet.yaml-    - kube-system
prometheus-serviceMonitorKubelet.yaml:  selector:
prometheus-serviceMonitorKubelet.yaml-    matchLabels:
prometheus-serviceMonitorKubelet.yaml-      k8s-app: kubelet
--
prometheus-serviceMonitorKubeScheduler.yaml-    matchNames:
prometheus-serviceMonitorKubeScheduler.yaml-    - kube-system
prometheus-serviceMonitorKubeScheduler.yaml:  selector:
prometheus-serviceMonitorKubeScheduler.yaml-    matchLabels:
prometheus-serviceMonitorKubeScheduler.yaml-      k8s-app: kube-scheduler
However, by default no Service in kube-system carries a matching label for the controller manager or the scheduler
[root@k8s-01 manifests]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d8h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   38m
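Instead of reading the whole Service list, the label selector of each ServiceMonitor can be mirrored directly with kubectl's -l filter. This is a minimal sketch: the function name check_monitored_services and the KUBECTL override variable are hypothetical, while the three k8s-app values are the ones from the ServiceMonitor manifests in this post.

```shell
#!/usr/bin/env bash
# Sketch: list the Services in kube-system that the kube-prometheus ServiceMonitors would select.
# KUBECTL is a hypothetical override; KUBECTL=echo prints the arguments instead of running kubectl.
check_monitored_services() {
  local kubectl="${KUBECTL:-kubectl}"
  local app
  for app in kube-controller-manager kube-scheduler kubelet; do
    # -l filters by label, mirroring the matchLabels of the corresponding ServiceMonitor
    "$kubectl" get svc -n kube-system -l "k8s-app=$app"
  done
}

# Usage:
#   check_monitored_services
```

An empty result for a component means its ServiceMonitor currently has nothing to select, which is the situation for kube-controller-manager and kube-scheduler on this cluster.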
The Endpoints objects do exist, though (they are present in my binary install)
[root@k8s-01 manifests]# kubectl get ep -n kube-system
NAME                      ENDPOINTS                                                               AGE
kube-controller-manager   <none>                                                                  31d
kube-dns                  172.30.248.2:53,172.30.72.4:53,172.30.248.2:53 + 3 more...              31d
kube-scheduler            <none>                                                                  31d
kubelet                   192.168.0.10:10255,192.168.0.11:10255,192.168.0.12:10255 + 9 more...    2d8h
kubernetes-dashboard      172.30.232.2:8443                                                       31d
traefik-ingress-service   172.30.232.5:80,172.30.232.5:8080                                       39m
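Solution
Create one Service for each of the two control-plane components and label it k8s-app: kube-controller-manager or k8s-app: kube-scheduler respectively, so that the corresponding ServiceMonitor can select it
Fix for binary installs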
Kubernetes 1.14 binary cluster installation
Create the Services that the Endpoints will be bound to
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
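Then fill in the Endpoints for each Service by hand. The Endpoints name must match the Service name, and its attributes (such as the port name and number) must line up with the Service as well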
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 192.168.0.10
  - ip: 192.168.0.11
  - ip: 192.168.0.12
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
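Since the master IPs differ from cluster to cluster, the Endpoints manifest can be generated rather than edited by hand. The following is a sketch only: render_endpoints is a hypothetical helper, and the IPs in the usage line are the master addresses used in this post.

```shell
#!/usr/bin/env bash
# Sketch: render an Endpoints manifest for a control-plane component.
# Arguments: NAME PORT IP [IP...]   (the function name is hypothetical)
render_endpoints() {
  local name="$1" port="$2"
  shift 2

  cat <<EOF
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: ${name}
  name: ${name}
  namespace: kube-system
subsets:
- addresses:
EOF

  # One address entry per master IP passed on the command line
  local ip
  for ip in "$@"; do
    printf '  - ip: %s\n' "$ip"
  done

  cat <<EOF
  ports:
  - name: http-metrics
    port: ${port}
    protocol: TCP
EOF
}

# Usage:
#   render_endpoints kube-scheduler 10251 192.168.0.10 192.168.0.11 192.168.0.12 | kubectl apply -f -
```

The port name stays http-metrics so that it keeps matching the port name declared in the Service.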
Check the Services; they are now bound to our Endpoints
[root@k8s-01 test]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                       AGE
kube-controller-manager   ClusterIP   None             <none>        10252/TCP                     64s
kube-dns                  ClusterIP   10.254.0.2       <none>        53/UDP,53/TCP,9153/TCP        31d
kube-scheduler            ClusterIP   None             <none>        10251/TCP                     64s
kubelet                   ClusterIP   None             <none>        10250/TCP                     2d9h
kubernetes-dashboard      NodePort    10.254.194.101   <none>        80:30000/TCP                  31d
traefik-ingress-service   NodePort    10.254.160.25    <none>        80:23633/TCP,8080:15301/TCP   126m

[root@k8s-01 test]# kubectl describe svc -n kube-system kube-scheduler
Name:              kube-scheduler
Namespace:         kube-system
Labels:            k8s-app=kube-scheduler
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kube-scheduler"},"name":"kube-scheduler","namespace"...
Selector:          component=kube-scheduler
Type:              ClusterIP
IP:                None
Port:              http-metrics  10251/TCP
TargetPort:        10251/TCP
Endpoints:         192.168.0.10:10251,192.168.0.11:10251,192.168.0.12:10251
Session Affinity:  None
Events:            <none>
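I have only three masters, so there are only three kube-scheduler and three kube-controller-manager instances
For kubeadm, the manifest below can serve as a reference. I do not have such an environment, so I do not demonstrate it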
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kubelet
  name: kubelet
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.16.0.14
    targetRef:
      kind: Node
      name: k8s-n2
  - ip: 172.16.0.18
    targetRef:
      kind: Node
      name: k8s-n3
  - ip: 172.16.0.2
    targetRef:
      kind: Node
      name: k8s-m1
  - ip: 172.16.0.20
    targetRef:
      kind: Node
      name: k8s-n4
  - ip: 172.16.0.21
    targetRef:
      kind: Node
      name: k8s-n5
  ports:
  - name: http-metrics
    port: 10255
    protocol: TCP
  - name: cadvisor
    port: 4194
    protocol: TCP
  - name: https-metrics
    port: 10250
    protocol: TCP
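If the target reports "ip:10251 Connection refused" after the monitoring has been added,
the scheduler configuration needs to be changed so that it no longer listens on the loopback address only:
add --bind-address=0.0.0.0 to its startup file and restart the service
With kubeadm the flag has to be added in the component's Pod manifest instead. I am not familiar with kubeadm, so I will not go into more detail here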
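Before changing any flag, it can help to confirm on a master node that the refused connection really comes from a loopback-only listener. The sketch below parses the output of ss -lnt for that purpose; the function name loopback_only is hypothetical, and the parsing assumes the usual column layout in which the fourth field is "Local Address:Port". Whether the flag to change is --bind-address or the older --address depends on the scheduler version, so checking kube-scheduler --help on the node is a reasonable first step.

```shell
#!/usr/bin/env bash
# Sketch: read `ss -lnt` output on stdin and succeed (exit 0) only if PORT
# is listening on loopback addresses exclusively.
loopback_only() {
  awk -v port="$1" '
    {
      n = split($4, parts, ":")            # $4 is "Local Address:Port"
      if (parts[n] == port) {
        seen = 1
        if ($4 !~ /^(127\.|\[::1\])/) open = 1   # any non-loopback listener
      }
    }
    END { if (seen && !open) exit 0; exit 1 }
  '
}

# Usage on a master node:
#   ss -lnt | loopback_only 10251 && echo "10251 is bound to loopback only"
```

If the check succeeds, Prometheus cannot reach the port from another node, which matches the Connection refused symptom described above.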
Related articles:
- Kubernetes 1.14 binary cluster installation
- Kubernetes 1.13.5 binary cluster installation
- Kubernetes 1.11 binary cluster installation
- Prometheus Operator