Introduction
The previous article covered autoscaling based on CPU metrics. The resource metrics pipeline only exposes CPU and memory, which is usually enough. But if you want to drive HPA from custom metrics, such as request QPS or the number of 5xx errors, you need the custom metrics pipeline; the most mature implementation today is Prometheus custom metrics. The custom metrics are provided by Prometheus and then aggregated into the apiserver by k8s-prometheus-adapter, giving the same effect as the core metrics pipeline (metrics-server).
Below we demonstrate scaling Kubernetes Pods based on QPS, using a custom metric collected by Prometheus.
A Prometheus environment is assumed to be already deployed.
Prepare the application to scale
Deploy an application; the application must allow its metrics to be scraped by Prometheus.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: metrics-app
  name: metrics-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: metrics-app
  template:
    metadata:
      labels:
        app: metrics-app
      annotations:
        prometheus.io/scrape: "true"   # allow Prometheus to scrape this Pod
        prometheus.io/port: "80"       # port Prometheus scrapes
        prometheus.io/path: "/metrics" # path Prometheus scrapes
    spec:
      containers:
      - image: metrics-app
        name: metrics-app
        ports:
        - name: web
          containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-app
  labels:
    app: metrics-app
spec:
  ports:
  - name: web
    port: 80
    targetPort: 80
  selector:
    app: metrics-app
```
After the application is deployed, check the Prometheus web UI to verify that the targets have been auto-discovered.
Also verify that the metric can be scraped correctly.
With that, the application we want to scale is ready. The rest of this post covers how the HPA obtains its QPS and uses it to scale.
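A quick sanity check (a minimal sketch; the temporary Pod name and curl image are just illustrative) is to hit the Service's /metrics endpoint from inside the cluster and look for the http_requests_total counter, and to run the query http_requests_total in the Prometheus web UI:

```bash
# Spin up a throwaway Pod and fetch the Service's metrics endpoint;
# the output should contain an http_requests_total counter line
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://metrics-app.default.svc.cluster.local/metrics
```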
Deploy the Custom Metrics Adapter
The metrics scraped by Prometheus cannot be consumed by Kubernetes directly because the data formats are incompatible. Another component, k8s-prometheus-adapter, is needed to convert Prometheus metrics into a format the Kubernetes API can understand. Since this is a custom API, it also has to be registered with the main APIServer through the Kubernetes aggregator, so that it can be accessed directly under /apis/.
prometheus-adapter on GitHub: https://github.com/DirectXMan12/k8s-prometheus-adapter
prometheus-adapter has a stable Helm chart, which we use directly. Here we use Helm 3.0 and the Azure China chart mirror.
First prepare the Helm environment (skip if you already have one):
```bash
wget https://get.helm.sh/helm-v3.0.0-linux-amd64.tar.gz
tar zxvf helm-v3.0.0-linux-amd64.tar.gz
mv linux-amd64/helm /usr/bin/
helm repo add stable http://mirror.azure.cn/kubernetes/charts
helm repo update
helm repo list
```
Deploy prometheus-adapter, pointing it at the Prometheus address:
```bash
# helm install prometheus-adapter stable/prometheus-adapter --namespace kube-system --set prometheus.url=http://prometheus.kube-system,prometheus.port=9090
# helm list -n kube-system
```
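The prometheus.url above must resolve to your Prometheus Service (here a Service named prometheus in the kube-system namespace is assumed); if your Prometheus lives elsewhere, adjust the URL and port accordingly. One way to check before installing (illustrative):

```bash
# Confirm the Prometheus Service name and namespace to use in prometheus.url
kubectl get svc -n kube-system | grep prometheus
```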
Verify the deployment succeeded:
```
# kubectl get pods -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE
prometheus-adapter-86574f7ff4-t9px4   1/1     Running   0          65s
```
Make sure the adapter is registered with the APIServer:
```
# kubectl get apiservices |grep custom
# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
```
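With the stable chart the adapter registers an APIService for the custom.metrics.k8s.io/v1beta1 group (the exact APIService name below is an assumption based on the default chart values); it should report AVAILABLE as True, and the raw query above should return a JSON list of metric resources:

```bash
# The custom metrics APIService should show Available=True once the adapter is up
kubectl get apiservice v1beta1.custom.metrics.k8s.io
```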
Create the HPA policy
The scaling rule we configure is: scale out when the per-Pod request rate exceeds 0.8 requests per second (a note on how the HPA evaluates this target follows the manifest).
```
[root@k8s-master-02 ~]# cat app-hpa-v2.yml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: metrics-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: metrics-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 800m   # 800m = 0.8 requests/second; for a threshold of 10 requests/second, use 10000m
[root@k8s-master-02 ~]# kubectl apply -f app-hpa-v2.yml
```
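For context, with an AverageValue target the HPA controller computes the desired replica count roughly as in the Kubernetes HPA algorithm, and clamps it between minReplicas and maxReplicas. A worked example with made-up numbers against our 800m (0.8/s) target:

```
desiredReplicas = ceil(currentReplicas × currentAverageValue / targetAverageValue)
                = ceil(3 × 2.0 / 0.8)          # 3 Pods averaging 2 requests/second each
                = 8                             # then clamped to [minReplicas, maxReplicas]
```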
Configure the Prometheus custom metric
Right after the HPA is created it has no data, because the adapter does not yet know which metric you want (http_requests_per_second), so the HPA cannot get a value from the Pods. Next we fix the missing metric value by configuring the custom metric rule.
```
[root@k8s-master-02 ~]# kubectl get hpa
NAME              REFERENCE                TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
metrics-app-hpa   Deployment/metrics-app   <unknown>/800m   1         10        3          36s
```
Edit the prometheus-adapter ConfigMap in the kube-system namespace and add a new seriesQuery rule at the top of the rules: section. The rule takes the http_requests_total counter of every Pod of the service, computes its per-second rate over a 2-minute window, sums it per Pod, and exposes the result as http_requests_per_second:
```
[root@k8s-master-02 ~]# kubectl edit cm prometheus-adapter -n kube-system
apiVersion: v1
data:
  config.yaml: |
    rules:
    - seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
...
```
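For reference, for the metrics-app Pods in the default namespace the metricsQuery template expands to roughly the PromQL below (an illustrative expansion; the adapter actually substitutes an explicit list of Pod names into the label matcher). You can paste it into the Prometheus web UI to confirm it returns a per-Pod value:

```
sum(rate(http_requests_total{kubernetes_namespace="default",kubernetes_pod_name=~"metrics-app-.*"}[2m])) by (kubernetes_pod_name)
```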
After changing the configuration, note that the adapter Pod does not reload its config dynamically, so we need to delete the Pod and let the Deployment recreate it with the new configuration:
```
[root@k8s-master-02 ~]# kubectl delete pod prometheus-adapter-86574f7ff4-t9px4 -n kube-system
```
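Alternatively (assuming kubectl 1.15 or newer), you can restart the whole Deployment instead of deleting the Pod by name:

```bash
# Triggers a rolling restart so the new ConfigMap content is picked up
kubectl rollout restart deployment prometheus-adapter -n kube-system
```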
A minute or two after the Pod has been recreated with the new configuration, the HPA's target value becomes normal:
```
[root@k8s-master-02 ~]# kubectl get hpa
NAME              REFERENCE                TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
metrics-app-hpa   Deployment/metrics-app   416m/800m   1         10        2          17m
```
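You can also query the metric straight from the aggregated API to confirm the adapter is serving it; the path below follows the custom metrics API convention for our namespace and metric name (pretty-printing via python3 is just for readability and assumes python3 is available on the host):

```bash
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" | python3 -m json.tool
```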
Verify scaling based on the custom metric
Next we verify scale-out and scale-in with a load test:
```
[root@k8s-master-02 ~]# kubectl get svc
NAME          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes    ClusterIP   10.0.0.1     <none>        443/TCP   84d
metrics-app   ClusterIP   10.0.0.80    <none>        80/TCP    3d2h
[root@k8s-master-02 ~]# ab -n 100000 -c 100 http://10.0.0.80/metrics
```
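If ab is not installed, on CentOS/RHEL it ships with the httpd-tools package (this assumes the load test is run from a yum-based host):

```bash
yum install -y httpd-tools
```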
While the load test is running, watch the HPA and the Pods; the Deployment automatically scales out to 10 Pods:
```
[root@k8s-master-02 ~]# kubectl get hpa
NAME              REFERENCE                TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
metrics-app-hpa   Deployment/metrics-app   289950m/800m   1         10        10         38m
[root@k8s-master-02 ~]# kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
metrics-app-7674cfb699-2gznv   1/1     Running   0          40s
metrics-app-7674cfb699-2hk6r   1/1     Running   0          40s
metrics-app-7674cfb699-5926q   1/1     Running   0          40s
metrics-app-7674cfb699-5qgg2   1/1     Running   0          2d2h
metrics-app-7674cfb699-9zkk4   1/1     Running   0          25s
metrics-app-7674cfb699-dx8cj   1/1     Running   0          56s
metrics-app-7674cfb699-fmgpp   1/1     Running   0          56s
metrics-app-7674cfb699-k9thm   1/1     Running   0          25s
metrics-app-7674cfb699-wzxhk   1/1     Running   0          2d2h
metrics-app-7674cfb699-zdbtg   1/1     Running   0          40s
```
Some time after the load test stops, the number of Pods automatically scales back down to 1 according to the HPA policy, which confirms our configuration works.