kube-prometheus remote storage with VictoriaMetrics

Cloud Computing 2023-07-15

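As we know, once the data volume in Prometheus reaches a certain scale, range queries become very slow, and a single query can even crash Prometheus. Even if we do not need to keep data for long, once the cluster runs enough pods the short-term data alone is still very large, which remains a real problem for Prometheus's own storage engine, so external storage becomes necessary. InfluxDB was popular early on, but its community was not friendly toward Prometheus, so I dropped it early.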

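Previously I tested Promscale and TimescaleDB as Prometheus remote storage; later discussions suggested that VictoriaMetrics is the preferable option. VictoriaMetrics also comes with its own monitoring system.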

In its official introduction, VictoriaMetrics takes a strong swipe at TimescaleDB:

It provides high data compression, so up to 70x more data points may be crammed into limited storage comparing to TimescaleDB and up to 7x less storage space is required compared to Prometheus, Thanos or Cortex.

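VictoriaMetrics is one of the time series databases that can serve as long-term remote storage for Prometheus monitoring data. Its GitHub page describes it as follows (excerpt):

It can be used directly in Grafana as a Prometheus data source
Metric ingestion and querying offer high performance and good scalability, outperforming InfluxDB and TimescaleDB by up to 20x
Memory usage is optimized too: 10x less RAM than InfluxDB and up to 7x less than Prometheus, Thanos or Cortex

Other claims that are easy to follow:

Optimized for storage with high-latency IO and low IOPS
Provides a global query view: multiple Prometheus instances or any other data sources may ingest data into VictoriaMetrics
VictoriaMetrics consists of a single small executable without external dependencies
All configuration is done via explicit command-line flags with reasonable defaults
All data is stored in the directory pointed to by the -storageDataPath command-line flag
Instant snapshots can be backed up easily and quickly to S3 or GCS object storage with the vmbackup/vmrestore tools
Supports importing data from third-party time series databases
Thanks to its storage architecture, it protects the storage from data corruption on unclean shutdown (i.e. OOM, hardware reset or kill -9)
Also supports metric relabeling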

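Note

VictoriaMetrics does not support reads from Prometheus itself (the remote read API). To keep alerting working, the developers suggest setting --storage.tsdb.retention.time=24h so that Prometheus keeps 24 hours of data locally, while all data is also written to the remote VictoriaMetrics and displayed through Grafana.

The VictoriaMetrics wiki says Prometheus remote read is not supported because it would transfer a very large amount of data. A remote_read API could address the alerting problem: one could start a Prometheus instance that contains only a remote_read section and a rules section. Alerting with VictoriaMetrics works very well!

Because of this issue in Prometheus, the Prometheus remote read API is not designed for reading data that other Prometheus instances have written to remote storage.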

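As for alerting in Prometheus, set the Prometheus local storage retention so that it covers the durations of all configured alerting rules. Usually 24 hours is enough: --storage.tsdb.retention.time=24h. In this case Prometheus evaluates alerting rules against locally stored data, while remote_write replicates all data to the configured URL as usual.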
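The retention rule above can be checked mechanically. The following is a minimal sketch, not part of Prometheus: `parse_duration` and `retention_covers` are hypothetical helpers, the rule windows are made-up examples, and only single-unit durations such as `5m` or `24h` are handled.

```python
import re
from datetime import timedelta

# Seconds per unit for simple single-unit durations such as "5m" or "24h".
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

def parse_duration(text: str) -> timedelta:
    """Parse a single-unit duration like '30s', '5m', '24h' or '1d'."""
    match = re.fullmatch(r"(\d+)([smhdw])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=int(match.group(1)) * _UNITS[match.group(2)])

def retention_covers(retention: str, rule_windows: list[str]) -> bool:
    """True if local retention is at least as long as every rule window."""
    limit = parse_duration(retention)
    return all(parse_duration(window) <= limit for window in rule_windows)

# Hypothetical windows: the longest range selector or `for:` clause of each rule.
windows = ["5m", "1h", "6h"]
print(retention_covers("24h", windows))  # True
print(retention_covers("24h", ["2d"]))   # False
```

If any rule looks back further than the local retention, either raise the retention or shorten the rule window.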

These points are explained in the GitHub wiki, under "Why doesn't VictoriaMetrics support the Prometheus remote read API?"

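The remote read API requires transferring all the raw data for all the requested metrics over the given time range. For example, if a query covers 1000 metrics with 10K values each, then the remote read API has to return 1000*10K = 10M metric values to Prometheus. This is slow and expensive. Prometheus' remote read API isn't intended for querying foreign data, also known as a global query view. See this issue for details.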
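The arithmetic in this example is easy to reproduce. The sketch below only multiplies the numbers from the text; the 16 bytes per sample (an 8-byte timestamp plus an 8-byte float value) is my own assumption for an uncompressed in-memory estimate, not a figure from the VictoriaMetrics documentation.

```python
def raw_samples(series: int, samples_per_series: int) -> int:
    """Number of raw samples a remote read has to transfer."""
    return series * samples_per_series

total = raw_samples(1000, 10_000)
print(total)  # 10000000

# Assumption: 16 bytes per uncompressed sample (8-byte timestamp + 8-byte value).
BYTES_PER_SAMPLE = 16
print(total * BYTES_PER_SAMPLE / 1e6)  # 160.0 (megabytes)
```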

So simply query VictoriaMetrics directly via vmui, the Prometheus querying API, or a Prometheus data source in Grafana.
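As an illustration of the querying API route, here is a minimal sketch that builds a `/api/v1/query_range` request URL. The base URL assumes the `victoria-metrics` Service on port 8428 that is created later in this article; `range_query_url` is a hypothetical helper and the timestamps are arbitrary.

```python
from urllib.parse import urlencode

# Assumption: the in-cluster Service name and port used later in this article.
VM_BASE = "http://victoria-metrics:8428"

def range_query_url(query: str, start: int, end: int, step: str,
                    base: str = VM_BASE) -> str:
    """Build a Prometheus-compatible range query URL."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base}/api/v1/query_range?{params}"

url = range_query_url("up", 1651150000, 1651153600, "60s")
print(url)
```

The resulting URL can be fetched with any HTTP client from inside the cluster; the response follows the usual Prometheus JSON format.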

VictoriaMetrics

The VictoriaMetrics storage design is described as follows:

VictoriaMetrics uses their modified version of LSM tree (Logging Structure Merge Tree). All the tables and indexes on the disk are immutable once created. When it's making the snapshot, they just create the hard link to the immutable files.

VictoriaMetrics stores the data in MergeTree, which is from ClickHouse and similar to LSM. The MergeTree has particular design decision compared to canonical LSM.

MergeTree is column-oriented. Each column is stored separately. And the data is sorted by the "primary key", and the "primary key" doesn't have to be unique. It speeds up the look-up through the "primary key", and gets the better compression ratio. The "parts" is similar to SSTable in LSM; it can be merged into bigger parts. But it doesn't have strict levels.

The Inverted Index is built on "mergeset" (A data structure built on top of MergeTree ideas). It's used for fast lookup by given the time-series selector.

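The technical points mentioned are the LSM tree and MergeTree.

In summary: VictoriaMetrics stores data in a MergeTree, a structure that comes from ClickHouse and resembles an LSM tree but makes its own design decisions. MergeTree is column-oriented, with each column stored separately and the data sorted by a "primary key" that does not have to be unique, which speeds up lookups and improves the compression ratio. "Parts" play the role of SSTables in an LSM tree and can be merged into bigger parts, but there are no strict levels. The inverted index is built on "mergeset" (a data structure based on MergeTree ideas) and provides fast lookup for a given time series selector.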
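To make the two ideas concrete, here is a toy sketch in plain Python. It is not how VictoriaMetrics is implemented: it only shows merging sorted immutable parts into a bigger part without any level bookkeeping, and an inverted index from label pairs to series ids. All names and sample data are invented for illustration.

```python
import heapq
from collections import defaultdict

def merge_parts(*parts):
    """Merge immutable parts (lists sorted by key) into one bigger sorted part.

    Keys need not be unique, so duplicates are kept, as with a non-unique primary key.
    """
    return list(heapq.merge(*parts, key=lambda row: row[0]))

# Each row is (primary_key, value); here the key is (series_id, timestamp).
part_a = [((1, 10), 0.5), ((1, 20), 0.7), ((2, 10), 3.0)]
part_b = [((1, 15), 0.6), ((2, 20), 3.1)]
merged = merge_parts(part_a, part_b)
print([key for key, _ in merged])
# [(1, 10), (1, 15), (1, 20), (2, 10), (2, 20)]

def build_inverted_index(series_labels):
    """Map each (label, value) pair to the set of series ids that carry it."""
    index = defaultdict(set)
    for series_id, labels in series_labels.items():
        for pair in labels.items():
            index[pair].add(series_id)
    return index

def select(index, **matchers):
    """Return the ids of series that match every label=value matcher."""
    sets = [index.get(pair, set()) for pair in matchers.items()]
    return set.intersection(*sets) if sets else set()

labels = {
    1: {"__name__": "up", "job": "node-exporter"},
    2: {"__name__": "up", "job": "grafana"},
}
index = build_inverted_index(labels)
print(select(index, __name__="up", job="grafana"))  # {2}
```

Because the input parts are never modified, a snapshot only needs to reference the existing files, which is the property the quoted text relies on when it mentions hard links.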

For more background, see "LSM Tree原理详解" (a detailed explanation of LSM tree principles): https://www.jianshu.com/p/b43b856e09bb

Applying it to kube-prometheus

Install the kube-prometheus release that matches your Kubernetes version, according to the table below:

kube-prometheus stack	Kubernetes 1.19	Kubernetes 1.20	Kubernetes 1.21	Kubernetes 1.22	Kubernetes 1.23
`release-0.7`	✔	✔	✗	✗	✗
`release-0.8`	✗	✔	✔	✗	✗
`release-0.9`	✗	✗	✔	✔	✗
`release-0.10`	✗	✗	✗	✔	✔
`main`	✗	✗	✗	✔	✔
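The table can also be expressed as a small lookup. This is only a convenience sketch that copies the rows above; `compatible_branches` is a hypothetical helper, and the data should be refreshed from the kube-prometheus README for newer releases.

```python
# Rows copied from the compatibility table above.
SUPPORT = {
    "release-0.7": {"1.19", "1.20"},
    "release-0.8": {"1.20", "1.21"},
    "release-0.9": {"1.21", "1.22"},
    "release-0.10": {"1.22", "1.23"},
    "main": {"1.22", "1.23"},
}

def compatible_branches(k8s_version: str) -> list[str]:
    """Return the kube-prometheus branches marked compatible with a version."""
    minor = ".".join(k8s_version.lstrip("v").split(".")[:2])
    return [branch for branch, versions in SUPPORT.items() if minor in versions]

print(compatible_branches("v1.21.5"))  # ['release-0.8', 'release-0.9']
```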

Quickstart

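Find the release that matches your cluster and install it. If you are on ACK (Alibaba Cloud Container Service for Kubernetes), you need to uninstall ack-arms-prometheus.

Replace the images (original first, replacement second)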

k8s.gcr.io/prometheus-adapter/prometheus-adapter:v0.9.1
v5cn/prometheus-adapter:v0.9.1

k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.4.2
bitnami/kube-state-metrics:2.4.2

quay.io/brancz/kube-rbac-proxy:v0.12.0
bitnami/kube-rbac-proxy:0.12.0
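Replacing these by hand across many manifests is error-prone, so a small script may help. The sketch below assumes the three image pairs listed above and does plain string replacement on manifest text; `replace_images` is a hypothetical helper, and the sample line is invented.

```python
# Original image -> replacement image, taken from the pairs listed above.
MIRRORS = {
    "k8s.gcr.io/prometheus-adapter/prometheus-adapter:v0.9.1": "v5cn/prometheus-adapter:v0.9.1",
    "k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.4.2": "bitnami/kube-state-metrics:2.4.2",
    "quay.io/brancz/kube-rbac-proxy:v0.12.0": "bitnami/kube-rbac-proxy:0.12.0",
}

def replace_images(manifest: str) -> str:
    """Return the manifest text with every known image swapped for its mirror."""
    for original, mirror in MIRRORS.items():
        manifest = manifest.replace(original, mirror)
    return manifest

sample = "        image: quay.io/brancz/kube-rbac-proxy:v0.12.0\n"
print(replace_images(sample), end="")
```

Run it over each file under the manifests directory before applying them, and review the diff, since the image tags differ between releases.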

Start the deployment

$ cd kube-prometheus
$ git checkout main
$ kubectl.exe create -f ./manifests/setup
$ kubectl.exe create -f ./manifests

Configure ingress-nginx

> kubectl.exe -n monitoring get svc
NAME                    TYPE        CLUSTER-IP      PORT(S)                   
alertmanager-main       ClusterIP   192.168.31.49   9093/TCP,8080/TCP         
alertmanager-operated   ClusterIP   None            9093/TCP,9094/TCP,9094/UDP
blackbox-exporter       ClusterIP   192.168.31.69   9115/TCP,19115/TCP        
grafana                 ClusterIP   192.168.130.3   3000/TCP                  
kube-state-metrics      ClusterIP   None            8443/TCP,9443/TCP         
node-exporter           ClusterIP   None            9100/TCP                  
prometheus-adapter      ClusterIP   192.168.13.123  443/TCP                   
prometheus-k8s          ClusterIP   192.168.118.39  9090/TCP,8080/TCP         
prometheus-operated     ClusterIP   None            9090/TCP                  
prometheus-operator     ClusterIP   None            8443/TCP

ingress-nginx

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitoring-ui
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: local.grafana.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-ui
  namespace: monitoring
spec:
  ingressClassName: nginx
  rules:
  - host: local.prom.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090

Configure NFS for testing

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: nfs-client-provisioner
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: fuseim.pri/ifs
            - name: NFS_SERVER
              value: 192.168.3.19
            - name: NFS_PATH
              value: /data/nfs-k8s
      volumes:
        - name: nfs-client-root
          nfs:
            server: 192.168.3.19
            path: /data/nfs-k8s
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  # replace with namespace where provisioner is deployed
  namespace: default
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    # replace with namespace where provisioner is deployed
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io

VictoriaMetrics configuration

Create a PVC named pvc-victoriametrics

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-storage
  namespace: default 
provisioner: fuseim.pri/ifs # or choose another name; must match the deployment's PROVISIONER_NAME env
parameters:
  archiveOnDelete: "false"
# Supported policies: Delete, Retain; default is Delete
reclaimPolicy: Retain
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata: 
  name: pvc-victoriametrics
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteMany 
  storageClassName: nfs-storage
  resources: 
    requests:
      storage: 10Gi

Prepare the PVC

[linuxea.com ~/victoriametrics]# kubectl apply -f pvc.yaml 
storageclass.storage.k8s.io/nfs-storage created
persistentvolumeclaim/pvc-victoriametrics created
[linuxea.com ~/victoriametrics]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES RECLAIM POLICY STATUS CLAIM
...
pvc-97bea5fe-0131-4fb5-aaa9-66eee0802cb4   10Gi       RWX          Retain         Bound  monitoring/pvc-victoriametrics
...
[linuxea.com ~/victoriametrics]# kubectl get pvc -A
NAMESPACE    NAME                                 STATUS   VOLUME                                     CAPACITY   
...   
monitoring   pvc-victoriametrics                  Bound    pvc-97bea5fe-0131-4fb5-aaa9-66eee0802cb4   10Gi

Create VictoriaMetrics and mount the PVC created above

-retentionPeriod=1w keeps one week of data

# vmctoriametrics.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: victoria-metrics
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: victoria-metrics
  template:
    metadata:
      labels:
        app: victoria-metrics
    spec:
      containers:
        - name: vm
          image: victoriametrics/victoria-metrics:v1.76.1
          imagePullPolicy: IfNotPresent
          args:
            - -storageDataPath=/var/lib/victoria-metrics-data
            - -retentionPeriod=1w
          ports:
            - containerPort: 8428
              name: http
          resources:
            limits:
              cpu: "1"
              memory: 2048Mi
            requests:
              cpu: 100m
              memory: 512Mi          
          readinessProbe:
            httpGet:
              path: /health
              port: 8428
            initialDelaySeconds: 30
            timeoutSeconds: 30
          livenessProbe:
            httpGet:
              path: /health
              port: 8428
            initialDelaySeconds: 120
            timeoutSeconds: 30              
          volumeMounts:
            - mountPath: /var/lib/victoria-metrics-data
              name: victoriametrics-storage
      volumes:
        - name: victoriametrics-storage
          persistentVolumeClaim:
            claimName: pvc-victoriametrics
---
apiVersion: v1
kind: Service
metadata:
  name: victoria-metrics
  namespace: monitoring
spec:
  ports:
  - name: http
    port: 8428
    protocol: TCP
    targetPort: 8428
  selector:
    app: victoria-metrics
  type: ClusterIP

apply

[linuxea.com ~/victoriametrics]# kubectl apply -f vmctoriametrics.yaml
deployment.apps/victoria-metrics created
service/victoria-metrics created
[linuxea.com ~/victoriametrics]# kubectl -n monitoring get pod 
NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   88         268d
blackbox-exporter-55c457d5fb-6rc8m     3/3     Running   114        260d
grafana-756dc9b545-b2skg               1/1     Running   38         260d
kube-state-metrics-76f6cb7996-j2hx4    3/3     Running   153        260d
node-exporter-4hxzp                    2/2     Running   120        316d
node-exporter-54t9p                    2/2     Running   124        316d
node-exporter-8rfht                    2/2     Running   120        316d
node-exporter-hqzzn                    2/2     Running   126        316d
prometheus-adapter-59df95d9f5-7shw5    1/1     Running   78         260d
prometheus-k8s-0                       2/2     Running   89         268d
prometheus-operator-7775c66ccf-x2wv4   2/2     Running   115        260d
promoter-66f6dd475c-fdzrx              1/1     Running   3          8d
victoria-metrics-56d47f6fb-qmthh       0/1     Running   0          15s
[linuxea.com ~/victoriametrics]# kubectl -n monitoring get svc
NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main           NodePort    10.68.30.147    <none>        9093:30092/TCP               316d
alertmanager-operated       ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   316d
blackbox-exporter           ClusterIP   10.68.25.245    <none>        9115/TCP,19115/TCP           316d
etcd-k8s                    ClusterIP   None            <none>        2379/TCP                     316d
external-node-k8s           ClusterIP   None            <none>        9100/TCP                     315d
external-pve-k8s            ClusterIP   None            <none>        9221/TCP                     305d
external-windows-node-k8s   ClusterIP   None            <none>        9182/TCP                     316d
grafana                     NodePort    10.68.133.224   <none>        3000:30091/TCP               316d
kube-state-metrics          ClusterIP   None            <none>        8443/TCP,9443/TCP            316d
node-exporter               ClusterIP   None            <none>        9100/TCP                     316d
prometheus-adapter          ClusterIP   10.68.138.175   <none>        443/TCP                      316d
prometheus-k8s              NodePort    10.68.207.185   <none>        9090:30090/TCP               316d
prometheus-operated         ClusterIP   None            <none>        9090/TCP                     316d
prometheus-operator         ClusterIP   None            <none>        8443/TCP                     316d
promoter                    ClusterIP   10.68.26.69     <none>        8080/TCP                     11d
victoria-metrics            ClusterIP   10.68.225.139   <none>        8428/TCP                     18s

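Modify the Prometheus remote storage configuration. The main changes are shown below; see the official documentation for the other parameters.

First, configure remote write to VictoriaMetrics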

  remoteWrite:
  - url: "http://victoria-metrics:8428/api/v1/write"
    queueConfig:
      capacity: 5000
    remoteTimeout: 30s

and set the Prometheus retention to one day

retention: 1d

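One day of local storage is only there to serve alerting; everything else is remote-written to VictoriaMetrics and viewed through Grafana.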

Prometheus-prometheus.yaml

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.35.0
  name: k8s
  namespace: monitoring
spec:
  retention: 1d
  alerting:
    alertmanagers:
    - apiVersion: v2
      name: alertmanager-main
      namespace: monitoring
      port: web
  enableFeatures: []
  externalLabels: {}
  image: quay.io/prometheus/prometheus:v2.35.0
  nodeSelector:
    kubernetes.io/os: linux
  podMetadata:
    labels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: k8s
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: kube-prometheus
      app.kubernetes.io/version: 2.35.0
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  probeNamespaceSelector: {}
  probeSelector: {}
  replicas: 1
  resources:
    requests:
      memory: 400Mi
  remoteWrite:
  - url: "http://victoria-metrics:8428/api/v1/write"
    queueConfig:
      capacity: 5000
    remoteTimeout: 30s
  ruleNamespaceSelector: {}
  ruleSelector: {}
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: 2.35.0

If all goes well, the configuration now shows up on the Prometheus /config page:

remote_write:
- url: http://victoria-metrics:8428/api/v1/write
  remote_timeout: 5m
  follow_redirects: true
  queue_config:
    capacity: 5000
    max_shards: 200
    min_shards: 1
    max_samples_per_send: 500
    batch_send_deadline: 5s
    min_backoff: 30ms
    max_backoff: 100ms
  metadata_config:
    send: true
    send_interval: 1m
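The same check can be scripted against the Prometheus HTTP API, which exposes the rendered configuration at `/api/v1/status/config`. This is a minimal sketch: `remote_write_applied` and `fetch_config_yaml` are hypothetical helpers, the check is a plain substring match rather than a YAML parse, and the base URL you pass in depends on how Prometheus is exposed in your cluster.

```python
import json
from urllib.request import urlopen

def remote_write_applied(config_yaml: str, url: str) -> bool:
    """True if the rendered config has a remote_write section containing url."""
    _, found, section = config_yaml.partition("remote_write:")
    return bool(found) and f"url: {url}" in section

def fetch_config_yaml(base: str) -> str:
    """Fetch the rendered configuration from /api/v1/status/config."""
    with urlopen(f"{base}/api/v1/status/config", timeout=10) as resp:
        return json.load(resp)["data"]["yaml"]

# Offline example using a fragment of the output shown above.
sample = (
    "remote_write:\n"
    "- url: http://victoria-metrics:8428/api/v1/write\n"
    "  remote_timeout: 5m\n"
)
print(remote_write_applied(sample, "http://victoria-metrics:8428/api/v1/write"))  # True
```

Against a live instance, call `remote_write_applied(fetch_config_yaml(base), url)` with the address of your Prometheus.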

Check the logs

level=info ts=2022-04-28T15:26:12.047Z caller=main.go:944 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
ts=2022-04-28T15:26:12.053Z caller=dedupe.go:112 component=remote level=info remote_name=1a1964 url=http://victoria-metrics:8428/api/v1/write msg="Starting WAL watcher" queue=1a1964
ts=2022-04-28T15:26:12.053Z caller=dedupe.go:112 component=remote level=info remote_name=1a1964 url=http://victoria-metrics:8428/api/v1/write msg="Starting scraped metadata watcher"
ts=2022-04-28T15:26:12.053Z caller=dedupe.go:112 component=remote level=info remote_name=1a1964 url=http://victoria-metrics:8428/api/v1/write msg="Replaying WAL" queue=1a1964
....
totalDuration=55.219178ms remote_storage=85.51µs web_handler=440ns query_engine=719ns scrape=45.6µs scrape_sd=1.210328ms notify=4.99µs notify_sd=352.209µs rules=47.503195ms

Back on the NFS server, check the data directory

[root@Node-172_16_100_49 /data/nfs-k8s/monitoring-pvc-victoriametrics-pvc-97bea5fe-0131-4fb5-aaa9-66eee0802cb4]# ll
total 0
drwxr-xr-x 4 root root 48 Apr 28 22:37 data
-rw-r--r-- 1 root root  0 Apr 28 22:37 flock.lock
drwxr-xr-x 5 root root 71 Apr 28 22:37 indexdb
drwxr-xr-x 2 root root 43 Apr 28 22:37 metadata
drwxr-xr-x 2 root root  6 Apr 28 22:37 snapshots
drwxr-xr-x 3 root root 27 Apr 28 22:37 tmp

Modify the Grafana configuration

At this point the data shown still comes from Prometheus; change Grafana so that it reads from VictoriaMetrics

  datasources.yaml: |-
    {
        "apiVersion": 1,
        "datasources": [
            {
                "access": "proxy",
                "editable": false,
                "name": "prometheus",
                "orgId": 1,
                "type": "prometheus",
                "url": "http://victoria-metrics:8428",
                "version": 1
            }
        ]
    }
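If the data source definition is generated rather than edited by hand, a sketch like the following produces the same JSON. The field values mirror the fragment above; `datasource_config` is a hypothetical helper, and only the URL is treated as variable.

```python
import json

def datasource_config(url: str, name: str = "prometheus") -> str:
    """Render a Grafana provisioning document with one Prometheus-type data source."""
    doc = {
        "apiVersion": 1,
        "datasources": [{
            "access": "proxy",
            "editable": False,
            "name": name,
            "orgId": 1,
            "type": "prometheus",
            "url": url,
            "version": 1,
        }],
    }
    return json.dumps(doc, indent=4)

rendered = datasource_config("http://victoria-metrics:8428")
print(json.loads(rendered)["datasources"][0]["url"])  # http://victoria-metrics:8428
```

Keeping the data source name as `prometheus` means existing dashboards that reference it by name continue to work while the backend changes.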

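Change the time zone while we are at it

stringData:
# change the time zone (Grafana expects an IANA name; Asia/Shanghai assumes the original CST meant China Standard Time)
  grafana.ini: |
    [date_formats]
    default_timezone = Asia/Shanghai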

The full manifests are as follows

apiVersion: v1
kind: Secret
metadata:
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 8.5.0
  name: grafana-datasources
  namespace: monitoring
stringData:
# change the data source URL
  datasources.yaml: |-
    {
        "apiVersion": 1,
        "datasources": [
            {
                "access": "proxy",
                "editable": false,
                "name": "prometheus",
                "orgId": 1,
                "type": "prometheus",
                "url": "http://victoria-metrics:8428",
                "version": 1
            }
        ]
    }
type: Opaque
---
apiVersion: v1
kind: Secret
metadata:
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 8.5.0
  name: grafana-config
  namespace: monitoring
stringData:
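# change the time zone (Grafana expects an IANA name; Asia/Shanghai assumes the original CST meant China Standard Time)
  grafana.ini: |
    [date_formats]
    default_timezone = Asia/Shanghai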
type: Opaque
# grafana:
#   sidecar:
#     datasources:
#       enabled: true
#       label: grafana_datasource
#       searchNamespace: ALL
#       defaultDatasourceEnabled: false
#   additionalDataSources:
#     - name: Loki
#       type: loki
#       url: http://loki-stack.loki-stack:3100/
#       access: proxy
#     - name: VictoriaMetrics
#       type: prometheus
#       url: http://victoria-metrics-single-server.victoria-metrics-single:8428
#       access: proxy

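Now the data source points at VictoriaMetrics: Prometheus remote-writes to VictoriaMetrics and Grafana reads from VictoriaMetrics, while Prometheus itself still queries its own local data.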

Monitoring VictoriaMetrics

The dashboards depend on the version: https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/dashboards

Then add the scrape configuration

# victoriametrics-metrics
apiVersion: v1
kind: Service
metadata:
  name: victoriametrics-metrics
  namespace: monitoring
  labels:
    app: victoriametrics-metrics
  annotations:
    prometheus.io/port: "8428"
    prometheus.io/scrape: "true"
spec:
  type: ClusterIP
  ports:
  - name: metrics
    port: 8428
    targetPort: 8428
    protocol: TCP
  selector:
  # matches the app label used by the victoria-metrics deployment
    app: victoria-metrics 
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
    name: victoriametrics-metrics
    namespace: monitoring
spec:
  endpoints:
  - interval: 15s
    port: metrics
    path: /metrics
  namespaceSelector:
    matchNames:
    - monitoring
  selector:
    matchLabels:
      app: victoriametrics-metrics
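A ServiceMonitor only produces targets when two label matches and one port name line up. The sketch below models that chain with plain dictionaries copied from the manifests above; `selects` is a hypothetical helper that mimics equality-based `matchLabels`, not a Kubernetes API call.

```python
def selects(match_labels: dict, labels: dict) -> bool:
    """Equality-based selector: every key/value in match_labels must be present."""
    return all(labels.get(key) == value for key, value in match_labels.items())

# Values copied from the manifests in this article.
pod_labels = {"app": "victoria-metrics"}
service = {
    "labels": {"app": "victoriametrics-metrics"},
    "selector": {"app": "victoria-metrics"},
    "ports": ["metrics"],
}
monitor = {"matchLabels": {"app": "victoriametrics-metrics"}, "port": "metrics"}

print(selects(service["selector"], pod_labels))            # True: Service finds the pod
print(selects(monitor["matchLabels"], service["labels"]))  # True: ServiceMonitor finds the Service
print(monitor["port"] in service["ports"])                 # True: endpoint port name exists
```

If the target does not appear in Prometheus, these three checks are a reasonable first place to look.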