prometheus
metrics-server covers the core metrics. Metrics beyond node and pod CPU/memory cannot be obtained from it; for those we turn to Prometheus.
The resource metrics Prometheus provides cannot be parsed by Kubernetes directly. To use Prometheus on Kubernetes, an extra conversion layer is needed to translate Prometheus data into a format the Kubernetes API can handle, so that it can be consumed as metric data.
The architecture looks like this:
Prometheus itself is a monitoring system that works with agent-style plugins, much like the zabbix and zabbix-agent pairing.
Assume Prometheus is the server side. For Prometheus to pull data from a monitored host that is not a pod but a virtual or physical machine, a dedicated piece of software is deployed on it: node_exporter, which acts as the client. node_exporter is the component that lets Prometheus scrape metric data, but all it does is expose, output, and collect node-level data for the current node. To collect anything else, such as HAProxy, you need the HAProxy exporter; pods likewise have their own dedicated exporter endpoints.
Container logs also live on each node under /var/log/containers/, where they are written out through the runtime's interface. In practice, collecting those files from the node is enough to easily obtain the logs of the pod containers running on it.
Simply put, Prometheus pulls data from each pod via its metrics URL. Once collected, the data is queried with PromQL, whose query expressions are exposed over a RESTful-style interface. Through that interface all kinds of collected metrics can be retrieved, but these metrics cannot be interpreted by the Kubernetes API server, because the two formats are not compatible with each other.
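For instance, once the Prometheus server below is running, its RESTful query API can be exercised directly with curl. A sketch: /api/v1/query is Prometheus's standard instant-query endpoint, and the service name prometheus.prom.svc assumes the deployment that follows.

# instant query for the built-in "up" metric (1 = target scraped successfully)
curl -s 'http://prometheus.prom.svc:9090/api/v1/query?query=up'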
If we want to fetch Prometheus-collected metrics through the Kubernetes API server the same way other data is requested from the API, the PromQL results must be converted into Kubernetes' custom query interface format (the custom metrics API). That means nesting a component under the custom metrics API: k8s-prometheus-adapter, provided by a third party. kube-state-metrics is responsible for converting the data, and k8s-prometheus-adapter for receiving the converted data. As follows:
PromQL statements perform the queries against Prometheus, and the results are converted into metric-format data on the Kubernetes API, retrievable through the API. The prerequisite is that the custom metrics API is aggregated into the API server; once it is, the API shows up normally in api-versions.
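For reference, that aggregation is declared with an APIService object along these lines. This is a minimal sketch; the actual manifest ships with k8s-prometheus-adapter and is applied later in this walkthrough, and the prom namespace here is our own choice.

apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver   # the adapter's Service
    namespace: prom                  # assumed here; upstream uses its own namespace
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100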
Prometheus is a stateful application and is itself a time series database.
prometheus-configmap.yaml    collects pod data and defines the runtime environment
kube-state-metrics-deployment.yaml    deploys kube-state-metrics so the custom metric data can be used by the system
alertmanager-pvc.yaml    the Alertmanager, which uses a PVC
prometheus-statefulset.yaml    the Prometheus configuration requirements
kube-state-metrics-deployment.yaml    these handle converting the Prometheus data:
kube-state-metrics-rbac.yaml
kube-state-metrics-service.yaml
To aggregate this into the API server's functionality, one more component needs to be deployed: k8s-prometheus-adapter.
Prometheus is a stateful application, so it is controlled with a StatefulSet. If one server pod is not enough, several are needed, and that is exactly why a StatefulSet is used.
The deployment experiment below uses the files under https://github.com/iKubernetes/k8s-prom
- Create a prom namespace and run Prometheus in it
k8s-prometheus-adapter    serves the custom metric data
kube-state-metrics    kube-state-metrics deployment manifests
namespace.yaml    creates the prom namespace
node_exporter    collects node metrics
prometheus    deploys Prometheus with a Deployment
Clone the repository:
[root@linuxea opt]# git clone https://github.com/iKubernetes/k8s-prom.git
Cloning into 'k8s-prom'...
remote: Enumerating objects: 46, done.
remote: Total 46 (delta 0), reused 0 (delta 0), pack-reused 46
Unpacking objects: 100% (46/46), done.
[root@linuxea opt]# cd k8s-prom/
Create the namespace
[root@linuxea k8s-prom]# kubectl apply -f namespace.yaml
namespace/prom created
Deploy node_exporter
Apply the node_exporter manifests:
[root@linuxea k8s-prom]# kubectl apply -f node_exporter/
daemonset.apps/prometheus-node-exporter created
service/prometheus-node-exporter created
The pods are now running in the prom namespace. Their kind is DaemonSet, so one pod runs on every node, including the masters, to collect the node's host resources.
The node-exporter pods here use image version prom/node-exporter:v0.16.0.
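The manifest in the repository is roughly of this shape. This is a trimmed sketch for orientation, not the full file, and the label names are assumptions:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: prometheus-node-exporter
  namespace: prom
spec:
  selector:
    matchLabels:
      app: prometheus-node-exporter
  template:
    metadata:
      labels:
        app: prometheus-node-exporter
    spec:
      hostNetwork: true               # metrics are served on each node's own IP
      tolerations:                    # so the pod is also scheduled on masters
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: prometheus-node-exporter
        image: prom/node-exporter:v0.16.0
        ports:
        - containerPort: 9100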
[root@linuxea k8s-prom]# kubectl get pods,svc -n prom
NAME READY STATUS RESTARTS AGE IP
prometheus-node-exporter-b7f2s 1/1 Running 0 2m4s 10.10.240.161
prometheus-node-exporter-cqnwh 1/1 Running 0 2m4s 10.10.240.142
prometheus-node-exporter-k2q7f 1/1 Running 0 2m4s 10.10.240.203
prometheus-node-exporter-x86b4 1/1 Running 0 2m4s 10.10.240.202
prometheus-node-exporter-znhb8 1/1 Running 0 2m4s 10.10.240.143
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/prometheus-node-exporter ClusterIP None <none> 9100/TCP 2m31s
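Because the exporter listens on the node network, the raw metrics can be spot-checked against any of the pod IPs listed above:

# print the first raw metric lines exposed by one node's exporter
curl -s http://10.10.240.161:9100/metrics | head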
Deploy prometheus
Create Prometheus:
[root@linuxea k8s-prom]# kubectl apply -f prometheus/
configmap/prometheus-config created
deployment.apps/prometheus-server created
clusterrole.rbac.authorization.k8s.io/prometheus created
serviceaccount/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
service/prometheus created
Prometheus is exposed on NodePort 30090:
[root@linuxea k8s-prom]# kubectl get pods,svc -n prom
NAME READY STATUS RESTARTS AGE
pod/prometheus-node-exporter-4xq5h 1/1 Running 0 37m
pod/prometheus-node-exporter-8qh82 1/1 Running 0 37m
pod/prometheus-node-exporter-b5kwx 1/1 Running 0 37m
pod/prometheus-node-exporter-dgvfv 1/1 Running 0 37m
pod/prometheus-node-exporter-gm9pv 1/1 Running 0 37m
pod/prometheus-server-5f8cd4755-ns5l7 1/1 Running 0 11m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/prometheus NodePort 10.105.244.208 <none> 9090:30090/TCP 11m
service/prometheus-node-exporter ClusterIP None <none> 9100/TCP 37m
Then access it via port 30090.
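Besides the browser, the NodePort can be checked from the shell; /api/v1/targets is Prometheus's standard endpoint for listing scrape targets, and any node IP from earlier works:

curl -s http://10.10.240.161:30090/api/v1/targets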
Deploy kube-state-metrics
Apply the kube-state-metrics manifests:
[root@linuxea k8s-prom]# kubectl apply -f kube-state-metrics/
deployment.apps/kube-state-metrics created
serviceaccount/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
service/kube-state-metrics created
The service kube-state-metrics exposes listens on port 8080 and serves metrics outward.
[root@linuxea k8s-prom]# kubectl get all -n prom
NAME READY STATUS RESTARTS AGE
pod/kube-state-metrics-68d7c699c6-gstgg 1/1 Running 0 9m34s
pod/prometheus-node-exporter-4xq5h 1/1 Running 0 84m
pod/prometheus-node-exporter-8qh82 1/1 Running 0 84m
pod/prometheus-node-exporter-b5kwx 1/1 Running 0 84m
pod/prometheus-node-exporter-dgvfv 1/1 Running 0 84m
pod/prometheus-node-exporter-gm9pv 1/1 Running 0 84m
pod/prometheus-server-5f8cd4755-ns5l7 1/1 Running 0 59m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-state-metrics ClusterIP 10.98.44.126 <none> 8080/TCP 9m35s
service/prometheus NodePort 10.105.244.208 <none> 9090:30090/TCP 59m
service/prometheus-node-exporter ClusterIP None <none> 9100/TCP 84m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/prometheus-node-exporter 5 5 5 5 5 <none> 84m
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deployment.apps/kube-state-metrics 1 1 1 1 9m35s
deployment.apps/prometheus-server 1 1 1 1 59m
NAME DESIRED CURRENT READY AGE
replicaset.apps/kube-state-metrics-68d7c699c6 1 1 1 9m35s
replicaset.apps/prometheus-server-5f8cd4755 1 1 1 59m
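From inside the cluster network its output can be sampled via the ClusterIP shown above; kube_pod_status_phase is one of the standard metrics kube-state-metrics exports:

# sample a few cluster-state metric lines
curl -s http://10.98.44.126:8080/metrics | grep kube_pod_status_phase | head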
Deploy prometheus-adapter
k8s-prometheus-adapter has to serve over HTTPS, but by default it speaks plain HTTP. We need to provide a certificate so it runs as HTTPS, and the certificate must be signed by a CA this Kubernetes cluster recognizes. Making one ourselves is enough.
Then create a secret whose name matches the one in custom-metrics-apiserver-deployment.yaml:
secret:
secretName: cm-adapter-serving-certs
Make the certificate ourselves
[root@linuxea ~]# cd /etc/kubernetes/pki
Generate the private key:
[root@linuxea pki]# (umask 077;openssl genrsa -out serving.key 2048)
Generating RSA private key, 2048 bit long modulus
..............+++
...............................+++
e is 65537 (0x10001)
Generate the certificate signing request:
[root@linuxea pki]# openssl req -new -key serving.key -out serving.csr -subj "/CN=serving"
[root@linuxea pki]# ll serving.*
-rw-r--r--. 1 root root 887 Nov 10 09:38 serving.csr
-rw-------. 1 root root 1675 Nov 10 09:36 serving.key
Sign the certificate:
[root@linuxea pki]# openssl x509 -req -in serving.csr -CA ./ca.crt -CAkey ./ca.key -CAcreateserial -out serving.crt -days 3650
Signature ok
subject=/CN=serving
Getting CA Private Key
[root@linuxea pki]# ll serving.*
-rw-r--r--. 1 root root 977 Nov 10 09:39 serving.crt
-rw-r--r--. 1 root root 887 Nov 10 09:38 serving.csr
-rw-------. 1 root root 1675 Nov 10 09:36 serving.key
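Optionally, verify the subject and validity period of the freshly signed certificate before using it:

openssl x509 -in serving.crt -noout -subject -dates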
Create the secret
Create the secret in the prom namespace:
[root@linuxea pki]# kubectl create secret generic cm-adapter-serving-certs --from-file=serving.crt=./serving.crt --from-file=serving.key=./serving.key -n prom
secret/cm-adapter-serving-certs created
[root@linuxea pki]# kubectl get secret -n prom
NAME TYPE DATA AGE
cm-adapter-serving-certs Opaque 2 4s
default-token-x8t89 kubernetes.io/service-account-token 3 144m
kube-state-metrics-token-6n8rj kubernetes.io/service-account-token 3 26m
prometheus-token-8lqgk kubernetes.io/service-account-token 3 76m
prometheus-adapter
Before applying, we modify a few parameters.
First, move the repository's custom-metrics-apiserver-deployment.yaml file out of the way:
[root@linuxea k8s-prom]# mv /opt/k8s-prom/k8s-prometheus-adapter/custom-metrics-apiserver-deployment.yaml /opt/
Download the upstream k8s-prometheus-adapter deployment manifest:
[root@linuxea k8s-prom]# curl -Lks https://raw.githubusercontent.com/DirectXMan12/k8s-prometheus-adapter/master/deploy/manifests/custom-metrics-apiserver-deployment.yaml -o /opt/k8s-prom/k8s-prometheus-adapter/custom-metrics-apiserver-deployment.yaml
Replace the namespace in it with prom, as sketched below.
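One way to do that replacement in bulk, assuming the upstream manifest uses namespace: custom-metrics, which the project's deploy files did at the time:

# rewrite the namespace in the downloaded manifest to prom
sed -i 's/namespace: custom-metrics/namespace: prom/g' \
    /opt/k8s-prom/k8s-prometheus-adapter/custom-metrics-apiserver-deployment.yaml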
Then apply:
[root@linuxea k8s-prom]# kubectl apply -f k8s-prometheus-adapter/
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/custom-metrics-auth-reader created
deployment.apps/custom-metrics-apiserver created
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics-resource-reader created
serviceaccount/custom-metrics-apiserver created
service/custom-metrics-apiserver created
apiservice.apiregistration.k8s.io/v1beta1.custom.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/custom-metrics-server-resources created
clusterrole.rbac.authorization.k8s.io/custom-metrics-resource-reader created
clusterrolebinding.rbac.authorization.k8s.io/hpa-controller-custom-metrics created
Make sure custom.metrics.k8s.io/v1beta1 appears in api-versions:
[root@linuxea k8s-prom]# kubectl api-versions|grep custom.metrics.k8s.io/v1beta1
custom.metrics.k8s.io/v1beta1
Open a proxy and curl this API through that port to retrieve metric data:
[root@linuxea ~]# kubectl proxy --port=1808
Starting to serve on 127.0.0.1:1808
[root@linuxea k8s-prometheus-adapter]# curl localhost:1808/apis/custom.metrics.k8s.io/v1beta1
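The same endpoint is also reachable without a standing proxy through kubectl's raw API access (piping to jq, if installed, makes the JSON readable):

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq .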
grafana
Again change the namespace to prom, and add type: NodePort to the service's network settings; see the sketch below.
If InfluxDB settings are present, they can be commented out.
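The relevant edits in grafana.yaml amount to something like this, a sketch of just the changed fields rather than the full manifest:

# in the Deployment and Service metadata:
metadata:
  namespace: prom        # replaces the original namespace
# and under the Service spec:
spec:
  type: NodePort         # added so grafana is reachable from outside the cluster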
[root@linuxea metrics]# kubectl apply -f grafana.yaml
deployment.apps/monitoring-grafana created
service/monitoring-grafana created
Check whether the port is reachable:
[root@linuxea metrics]# kubectl get pods,svc -n prom
NAME READY STATUS RESTARTS AGE
pod/custom-metrics-apiserver-65f545496-pj89v 1/1 Running 0 18m
pod/kube-state-metrics-58dffdf67d-7sv77 1/1 Running 0 20m
pod/monitoring-grafana-ffb4d59bd-hjtc4 1/1 Running 0 30s
pod/prometheus-node-exporter-m74w9 1/1 Running 0 21m
pod/prometheus-node-exporter-xgpqs 1/1 Running 0 21m
pod/prometheus-server-65f5d59585-nx5b2 1/1 Running 0 21m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/custom-metrics-apiserver ClusterIP 10.97.212.117 <none> 443/TCP 18m
service/kube-state-metrics ClusterIP 10.100.137.12 <none> 8080/TCP 21m
service/monitoring-grafana NodePort 10.98.244.37 <none> 80:30980/TCP 30s
service/prometheus NodePort 10.109.127.27 <none> 9090:30090/TCP 21m
service/prometheus-node-exporter ClusterIP None <none> 9100/TCP 21m
For the data source URL, use the service name path with Prometheus's port 9090: http://prometheus.prom.svc:9090
Then download dashboard templates: one for the cluster, one for pods, and one for nodes.