Kubernetes Cluster Installation Guide (v1.6)


This article was written in April 2017. It contains a number of errors and ambiguities that went uncorrected here for a long time. All of the content has since been migrated to the "Best Practices" chapter of the kubernetes-handbook (http://jimmysong.io/kubernetes-handbook); please treat the kubernetes-handbook as the authoritative version. The book is hosted on GitHub at https://github.com/rootsongjc/kubernetes-handbook.

Deploy a Kubernetes Cluster with Me, Step by Step

[Figure: Kubernetes cluster installation guide v1.6 - 1]

This series describes every step of deploying a Kubernetes cluster from binaries, rather than with automated tools such as kubeadm, and enables TLS authentication for the whole cluster.

During the deployment, the startup parameters of each component are listed in detail, the configuration files are provided, and their meaning and the problems you may run into are explained.

After finishing the deployment you will understand how the system components interact, which lets you troubleshoot real problems quickly.

This document is therefore aimed at people who already have some Kubernetes background and want to learn the system configuration and inner workings by deploying it step by step.

The project repository provides the consolidated installation guide in Markdown and PDF format; the PDF version is available for download.

Note: installing Docker and a private image registry is not covered in this document.

All configuration files provided

All configuration files used by the components during installation are contained in the following directories:

  • etc: environment-variable files for the services
  • manifest: YAML files for the Kubernetes applications
  • systemd: systemd service unit files

Cluster details

  • Kubernetes 1.6.0
  • Docker 1.12.5 (installed with yum)
  • Etcd 3.1.5
  • Flannel 0.7 (vxlan network)
  • TLS-encrypted communication (for all components: etcd, the Kubernetes masters and nodes)
  • RBAC authorization
  • kubelet TLS bootstrapping
  • Cluster add-ons: kubedns, dashboard, heapster (influxdb, grafana), EFK (elasticsearch, fluentd, kibana)
  • Private Docker registry Harbor (deploy it yourself; Harbor ships an offline installer that can be started directly with docker-compose)

Overview of the steps

  • Create the certificates and keys needed for TLS communication
  • Create the kubeconfig files
  • Create a three-node highly available etcd cluster
  • Install the kubectl command-line tool
  • Deploy the highly available master components
  • Deploy the nodes
  • kubedns add-on
  • Dashboard add-on
  • Heapster add-on
  • EFK add-on

    1. Creating the certificates and keys for TLS-encrypted communication between Kubernetes components

    The Kubernetes components use TLS certificates to encrypt their communication. This document uses cfssl, CloudFlare's PKI toolkit, to generate the Certificate Authority (CA) and the other certificates.

    The generated CA certificate and key files are:

    • ca-key.pem
    • ca.pem
    • kubernetes-key.pem
    • kubernetes.pem
    • kube-proxy.pem
    • kube-proxy-key.pem
    • admin.pem
    • admin-key.pem

    The components use the certificates as follows:

    • etcd: uses ca.pem, kubernetes-key.pem, kubernetes.pem;
    • kube-apiserver: uses ca.pem, kubernetes-key.pem, kubernetes.pem;
    • kubelet: uses ca.pem;
    • kube-proxy: uses ca.pem, kube-proxy-key.pem, kube-proxy.pem;
    • kubectl: uses ca.pem, admin-key.pem, admin.pem;

    kube-controller-manager and kube-scheduler currently have to run on the same machine as kube-apiserver and talk to it over the insecure port, so they do not need certificates.

    Install cfssl

    Option 1: install the pre-built binaries directly

    $ wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
    $ chmod +x cfssl_linux-amd64
    $ sudo mv cfssl_linux-amd64 /root/local/bin/cfssl
    
    $ wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
    $ chmod +x cfssljson_linux-amd64
    $ sudo mv cfssljson_linux-amd64 /root/local/bin/cfssljson
    
    $ wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
    $ chmod +x cfssl-certinfo_linux-amd64
    $ sudo mv cfssl-certinfo_linux-amd64 /root/local/bin/cfssl-certinfo
    
    $ export PATH=/root/local/bin:$PATH

    Option 2: install with the go command

    Go 1.7.5 is already installed on our system, so installing with go get is even quicker:

    $go get -u github.com/cloudflare/cfssl/cmd/...
    $echo $GOPATH
    /usr/local
    $ls /usr/local/bin/cfssl*
    cfssl cfssl-bundle cfssl-certinfo cfssljson cfssl-newkey cfssl-scan
    

    The commands starting with cfssl are now available in the $GOPATH/bin directory.

    Create the CA (Certificate Authority)

    Create the CA configuration file

    Use the cfssl print-defaults output as a template and create the following ca-config.json:

    $ mkdir /root/ssl
    $ cd /root/ssl
    $ cfssl print-defaults config > config.json
    $ cfssl print-defaults csr > csr.json
    $ cat ca-config.json
    {
      "signing": {
        "default": {
          "expiry": "8760h"
        },
        "profiles": {
          "kubernetes": {
            "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ],
            "expiry": "8760h"
          }
        }
      }
    }

    Field descriptions

    • ca-config.json: can define multiple profiles with different expiry times, usage scenarios and other parameters; a specific profile is selected later when signing certificates (a sketch with an extra profile follows this list);
    • signing: indicates that the certificate can be used to sign other certificates; the generated ca.pem has CA=TRUE;
    • server auth: indicates that a client may use this CA to verify certificates presented by servers;
    • client auth: indicates that a server may use this CA to verify certificates presented by clients;
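    To illustrate the multi-profile point above, the sketch below shows a variant of ca-config.json with an additional, purely illustrative "server" profile (server-auth only, shorter expiry); the profile name, expiry and the some-server-csr.json file are assumptions for the example, not part of this deployment:

    $ cat ca-config-multi.json
    {
      "signing": {
        "default": { "expiry": "8760h" },
        "profiles": {
          "kubernetes": {
            "usages": ["signing", "key encipherment", "server auth", "client auth"],
            "expiry": "8760h"
          },
          "server": {
            "usages": ["signing", "key encipherment", "server auth"],
            "expiry": "4380h"
          }
        }
      }
    }
    $ # sign a CSR with the extra profile (illustrative)
    $ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config-multi.json -profile=server some-server-csr.json | cfssljson -bare some-server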

    Create the CA certificate signing request

    $ cat ca-csr.json
    {
      "CN": "kubernetes",
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "System"
        }
      ]
    }
    • "CN": Common Name. kube-apiserver extracts this field from the certificate and uses it as the requesting user name (User Name); browsers use it to check whether a site is legitimate;
    • "O": Organization. kube-apiserver extracts this field from the certificate and uses it as the group (Group) the requesting user belongs to;

    Generate the CA certificate and private key

    $ cfssl gencert -initca ca-csr.json | cfssljson -bare ca
    $ ls ca*
    ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem

    Create the kubernetes certificate

    Create the kubernetes certificate signing request

    $ cat kubernetes-csr.json
    {
        "CN": "kubernetes",
        "hosts": [
          "127.0.0.1",
          "172.20.0.112",
          "172.20.0.113",
          "172.20.0.114",
          "172.20.0.115",
          "10.254.0.1",
          "kubernetes",
          "kubernetes.default",
          "kubernetes.default.svc",
          "kubernetes.default.svc.cluster",
          "kubernetes.default.svc.cluster.local"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "CN",
                "ST": "BeiJing",
                "L": "BeiJing",
                "O": "k8s",
                "OU": "System"
            }
        ]
    }
    • If the hosts field is not empty, it must list the IPs and domain names that are authorized to use the certificate. Because this certificate is later used by both the etcd cluster and the Kubernetes master cluster, the list above contains the host IPs of the etcd cluster and the Kubernetes masters, plus the Kubernetes service IP (normally the first IP of the service-cluster-ip-range configured on kube-apiserver, here 10.254.0.1).

    Generate the kubernetes certificate and private key

    $ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
    $ ls kubernetes*
    kubernetes.csr  kubernetes-csr.json  kubernetes-key.pem  kubernetes.pem

    Or pass the relevant parameters directly on the command line:

    $ echo '{"CN":"kubernetes","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes -hostname="127.0.0.1,172.20.0.112,172.20.0.113,172.20.0.114,172.20.0.115,kubernetes,kubernetes.default" - | cfssljson -bare kubernetes

    Create the admin certificate

    Create the admin certificate signing request

    $ cat admin-csr.json
    {
      "CN": "admin",
      "hosts": [],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "system:masters",
          "OU": "System"
        }
      ]
    }
    • kube-apiserver later uses RBAC to authorize requests from clients (such as kubelet, kube-proxy and Pods);
    • kube-apiserver predefines some RoleBindings used by RBAC; for example, cluster-admin binds the Group system:masters to the ClusterRole cluster-admin, which grants permission to call every kube-apiserver API;
    • O sets the Group of this certificate to system:masters. When this certificate is used to access kube-apiserver, authentication succeeds because the certificate is signed by the CA, and because its group is the pre-authorized system:masters, it is granted access to all APIs (this can be verified on the certificate itself, as shown below);
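    As a quick sanity check (a minimal sketch, not part of the original procedure), you can confirm that the generated admin certificate really carries O=system:masters in its subject:

    $ openssl x509 -noout -subject -in admin.pem
    subject= /C=CN/ST=BeiJing/L=BeiJing/O=system:masters/OU=System/CN=admin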

    Generate the admin certificate and private key

    $ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
    $ ls admin*
    admin.csr  admin-csr.json  admin-key.pem  admin.pem

    Create the kube-proxy certificate

    Create the kube-proxy certificate signing request

    $ cat kube-proxy-csr.json
    {
      "CN": "system:kube-proxy",
      "hosts": [],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "System"
        }
      ]
    }
    • CN sets the User of this certificate to system:kube-proxy;
    • the RoleBinding predefined by kube-apiserver, system:node-proxier, binds User system:kube-proxy to the ClusterRole system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs (you can inspect it as shown below);
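    If you want to verify this on a running cluster, the predefined binding can be inspected like so (a sketch; it assumes the default RBAC bootstrap objects of Kubernetes 1.6 are in place):

    $ kubectl get clusterrolebinding system:node-proxier -o yaml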

    Generate the kube-proxy client certificate and private key

    $ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes  kube-proxy-csr.json | cfssljson -bare kube-proxy
    $ ls kube-proxy*
    kube-proxy.csr  kube-proxy-csr.json  kube-proxy-key.pem  kube-proxy.pem

    Verify the certificates

    Using the kubernetes certificate as an example.

    Using the openssl command

    $ openssl x509  -noout -text -in  kubernetes.pem
    ...
        Signature Algorithm: sha256WithRSAEncryption
            Issuer: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=Kubernetes
            Validity
                Not Before: Apr  5 05:36:00 2017 GMT
                Not After : Apr  5 05:36:00 2018 GMT
            Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes
    ...
            X509v3 extensions:
                X509v3 Key Usage: critical
                    Digital Signature, Key Encipherment
                X509v3 Extended Key Usage:
                    TLS Web Server Authentication, TLS Web Client Authentication
                X509v3 Basic Constraints: critical
                    CA:FALSE
                X509v3 Subject Key Identifier:
                    DD:52:04:43:10:13:A9:29:24:17:3A:0E:D7:14:DB:36:F8:6C:E0:E0
                X509v3 Authority Key Identifier:
                    keyid:44:04:3B:60:BD:69:78:14:68:AF:A0:41:13:F6:17:07:13:63:58:CD
    
                X509v3 Subject Alternative Name:
                    DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:172.20.0.112, IP Address:172.20.0.113, IP Address:172.20.0.114, IP Address:172.20.0.115, IP Address:10.254.0.1
    ...
    • confirm that the Issuer fields match the contents of ca-csr.json;
    • confirm that the Subject fields match the contents of kubernetes-csr.json;
    • confirm that the X509v3 Subject Alternative Name entries match the contents of kubernetes-csr.json;
    • confirm that X509v3 Key Usage and Extended Key Usage match the kubernetes profile in ca-config.json (a combined chain check is sketched below);
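    In addition to the manual checks above, a quick way to confirm that every generated certificate chains back to the CA is shown below (a minimal sketch; run it in the directory that contains the .pem files):

    $ for cert in kubernetes.pem admin.pem kube-proxy.pem; do openssl verify -CAfile ca.pem $cert; done
    kubernetes.pem: OK
    admin.pem: OK
    kube-proxy.pem: OK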

    Using the cfssl-certinfo command

    $ cfssl-certinfo -cert kubernetes.pem
    ...
    {
      "subject": {
        "common_name": "kubernetes",
        "country": "CN",
        "organization": "k8s",
        "organizational_unit": "System",
        "locality": "BeiJing",
        "province": "BeiJing",
        "names": [
          "CN",
          "BeiJing",
          "BeiJing",
          "k8s",
          "System",
          "kubernetes"
        ]
      },
      "issuer": {
        "common_name": "Kubernetes",
        "country": "CN",
        "organization": "k8s",
        "organizational_unit": "System",
        "locality": "BeiJing",
        "province": "BeiJing",
        "names": [
          "CN",
          "BeiJing",
          "BeiJing",
          "k8s",
          "System",
          "Kubernetes"
        ]
      },
      "serial_number": "174360492872423263473151971632292895707129022309",
      "sans": [
        "kubernetes",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc.cluster",
        "kubernetes.default.svc.cluster.local",
        "127.0.0.1",
        "10.64.3.7",
        "10.254.0.1"
      ],
      "not_before": "2017-04-05T05:36:00Z",
      "not_after": "2018-04-05T05:36:00Z",
      "sigalg": "SHA256WithRSA",
    ...

    Distribute the certificates

    Copy the generated certificate and key files (the .pem files) to the /etc/kubernetes/ssl directory on every machine for later use;

    $ sudo mkdir -p /etc/kubernetes/ssl
    $ sudo cp *.pem /etc/kubernetes/ssl
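    If you manage several machines, a loop like the one below is one way to push the files to every host (a sketch only; the node list and the use of root over SSH are assumptions, adjust them to your environment):

    $ for node in 172.20.0.112 172.20.0.113 172.20.0.114 172.20.0.115; do
        ssh root@$node 'mkdir -p /etc/kubernetes/ssl'
        scp *.pem root@$node:/etc/kubernetes/ssl/
      done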

    References

    • Generate self-signed certificates
    • Setting up a Certificate Authority and Creating TLS Certificates
    • Client Certificates V/s Server Certificates
    • A primer on digital certificates and CAs

    2. Creating the kubeconfig files

    Processes running on the Node machines, such as kubelet and kube-proxy, need to authenticate and be authorized when they talk to the kube-apiserver process on the Master machines.

    Starting with Kubernetes 1.4, kube-apiserver supports TLS bootstrapping, i.e. generating TLS client certificates for clients itself, so you no longer need to create a certificate for every client; this feature currently only supports generating certificates for kubelet.

    Create the TLS bootstrapping token

    Token auth file

    The token can be any string containing 128 bits of entropy and can be generated with a secure random number generator.

    export BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
    # token auth file format: token,user,uid,"group"
    cat > token.csv <<EOF
    ${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
    EOF

    Accessing 172.20.0.113:32724, 172.20.0.114:32724 or 172.20.0.115:32724 all returns the default nginx welcome page.

    [Figure: welcome-nginx]

    7. Installing and configuring the kubedns add-on

    Official YAML directory: kubernetes/cluster/addons/dns

    The add-on is deployed directly on Kubernetes; the official manifests reference the following images:

    gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.1
    gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.1
    gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.1
    

    I cloned these images and pushed them to my private registry:

    sz-pg-oam-docker-hub-001.tendcloud.com/library/k8s-dns-dnsmasq-nanny-amd64:1.14.1
    sz-pg-oam-docker-hub-001.tendcloud.com/library/k8s-dns-kube-dns-amd64:1.14.1
    sz-pg-oam-docker-hub-001.tendcloud.com/library/k8s-dns-sidecar-amd64:1.14.1
    

    I also pushed a copy to Tenxcloud as a backup:

    index.tenxcloud.com/jimmy/k8s-dns-dnsmasq-nanny-amd64:1.14.1
    index.tenxcloud.com/jimmy/k8s-dns-kube-dns-amd64:1.14.1
    index.tenxcloud.com/jimmy/k8s-dns-sidecar-amd64:1.14.1
    

    The YAML files below use the images from the private registry.

    kubedns-cm.yaml  
    kubedns-sa.yaml  
    kubedns-controller.yaml  
    kubedns-svc.yaml

    The modified YAML files are available here: dns

    Predefined RoleBinding

    The predefined ClusterRoleBinding system:kube-dns binds the kube-dns ServiceAccount in the kube-system namespace to the system:kube-dns ClusterRole, which grants access to the DNS-related kube-apiserver APIs;

    $ kubectl get clusterrolebindings system:kube-dns -o yaml
    apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      annotations:
        rbac.authorization.kubernetes.io/autoupdate: "true"
      creationTimestamp: 2017-04-11T11:20:42Z
      labels:
        kubernetes.io/bootstrapping: rbac-defaults
      name: system:kube-dns
      resourceVersion: "58"
      selfLink: /apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings/system%3Akube-dns
      uid: e61f4d92-1ea8-11e7-8cd7-f4e9d49f8ed0
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:kube-dns
    subjects:
    - kind: ServiceAccount
      name: kube-dns
      namespace: kube-system

    The Pods defined in kubedns-controller.yaml use the kube-dns ServiceAccount defined in kubedns-sa.yaml, so they have permission to access the DNS-related kube-apiserver APIs.

    Configure the kube-dns ServiceAccount

    No changes needed.

    Configure the kube-dns Service

    $ diff kubedns-svc.yaml.base kubedns-svc.yaml
    30c30
       clusterIP: 10.254.0.2
    • spec.clusterIP = 10.254.0.2, i.e. the kube-dns Service IP is set explicitly; this IP must match the kubelet's --cluster-dns parameter (see the sketch below);
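    For reference, the matching kubelet side looks roughly like this (a sketch; the node-deployment part that defines the kubelet unit is not reproduced here, so the file path and surrounding flags are assumptions):

    $ grep cluster- /etc/systemd/system/kubelet.service
      --cluster-dns=10.254.0.2 \
      --cluster-domain=cluster.local \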

    Configure the kube-dns Deployment

    $ diff kubedns-controller.yaml.base kubedns-controller.yaml
    58c58
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/k8s-dns-kube-dns-amd64:v1.14.1
    88c88
             - --domain=cluster.local.
    92c92
             #__PILLAR__FEDERATIONS__DOMAIN__MAP__
    110c110
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
    129c129
             - --server=/cluster.local./127.0.0.1#10053
    148c148
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/k8s-dns-sidecar-amd64:v1.14.1
    161,162c161,162
             - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local.,5,A
    • the Deployment uses the kube-dns ServiceAccount for which the system already provides a RoleBinding, so it has permission to access the DNS-related kube-apiserver APIs;

    Apply all the definition files

    $ pwd
    /root/kubedns
    $ ls *.yaml
    kubedns-cm.yaml  kubedns-controller.yaml  kubedns-sa.yaml  kubedns-svc.yaml
    $ kubectl create -f .

    Verify that kubedns works

    Create a new Deployment

    $ cat  my-nginx.yaml
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: my-nginx
    spec:
      replicas: 2
      template:
        metadata:
          labels:
            run: my-nginx
        spec:
          containers:
          - name: my-nginx
            image: sz-pg-oam-docker-hub-001.tendcloud.com/library/nginx:1.9
            ports:
            - containerPort: 80
    $ kubectl create -f my-nginx.yaml

    Expose the Deployment to create the my-nginx Service

    $ kubectl expose deploy my-nginx
    $ kubectl get services --all-namespaces |grep my-nginx
    default       my-nginx     10.254.179.239           80/TCP          42m

    Create another Pod and check whether its /etc/resolv.conf contains the --cluster-dns and --cluster-domain values configured on the kubelet, and whether the Service name my-nginx resolves to its Cluster IP 10.254.179.239.

    $ kubectl create -f nginx-pod.yaml
    $ kubectl exec  nginx -i -t -- /bin/bash
    root@nginx:/# cat /etc/resolv.conf
    nameserver 10.254.0.2
    search default.svc.cluster.local. svc.cluster.local. cluster.local. tendcloud.com
    options ndots:5
    
    root@nginx:/# ping my-nginx
    PING my-nginx.default.svc.cluster.local (10.254.179.239): 56 data bytes
    76 bytes from 119.147.223.109: Destination Net Unreachable
    ^C--- my-nginx.default.svc.cluster.local ping statistics ---
    
    root@nginx:/# ping kubernetes
    PING kubernetes.default.svc.cluster.local (10.254.0.1): 56 data bytes
    ^C--- kubernetes.default.svc.cluster.local ping statistics ---
    11 packets transmitted, 0 packets received, 100% packet loss
    
    root@nginx:/# ping kube-dns.kube-system.svc.cluster.local
    PING kube-dns.kube-system.svc.cluster.local (10.254.0.2): 56 data bytes
    ^C--- kube-dns.kube-system.svc.cluster.local ping statistics ---
    6 packets transmitted, 0 packets received, 100% packet loss

    The output shows that the Service names resolve correctly (the pings themselves fail because Cluster IPs are virtual and do not answer ICMP).
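    If you prefer a check that does not rely on ping at all, a plain name lookup works too (a sketch; it assumes the image in the Pod is glibc-based and ships getent, which not every image does):

    root@nginx:/# getent hosts my-nginx
    10.254.179.239  my-nginx.default.svc.cluster.local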

    8. Configuring and installing the dashboard

    Official directory: kubernetes/cluster/addons/dashboard

    The files we use:

    $ ls *.yaml
    dashboard-controller.yaml  dashboard-service.yaml dashboard-rbac.yaml

    The modified YAML files are available here: dashboard

    Because kube-apiserver has RBAC authorization enabled and the upstream dashboard-controller.yaml does not define an authorized ServiceAccount, later calls to the kube-apiserver APIs are rejected and the web UI shows:

    Forbidden (403)
    
    User "system:serviceaccount:kube-system:default" cannot list jobs.batch in the namespace "default". (get jobs.batch)
    

    We therefore add a dashboard-rbac.yaml file that defines a ServiceAccount named dashboard and binds it to the ClusterRole view (a sketch of such a file follows).
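    The file looks roughly as follows (a sketch based on the description above; remember that dashboard-controller.yaml must also reference the account via serviceAccountName: dashboard for it to take effect):

    $ cat dashboard-rbac.yaml
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: dashboard
      namespace: kube-system
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: dashboard
    subjects:
    - kind: ServiceAccount
      name: dashboard
      namespace: kube-system
    roleRef:
      kind: ClusterRole
      name: view
      apiGroup: rbac.authorization.k8s.io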

    Configure dashboard-service

    $ diff dashboard-service.yaml.orig dashboard-service.yaml
    10a11
    >   type: NodePort
    • the Service type is set to NodePort so that the dashboard can be reached from outside the cluster at nodeIP:nodePort;

    Configure dashboard-controller

    $ diff dashboard-controller.yaml.orig dashboard-controller.yaml
    23c23
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/kubernetes-dashboard-amd64:v1.6.0

    Apply all the definition files

    $ pwd
    /root/kubernetes/cluster/addons/dashboard
    $ ls *.yaml
    dashboard-controller.yaml  dashboard-service.yaml
    $ kubectl create -f  .
    service "kubernetes-dashboard" created
    deployment "kubernetes-dashboard" created

    Check the result

    Look up the allocated NodePort

    $ kubectl get services kubernetes-dashboard -n kube-system
    NAME                   CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
    kubernetes-dashboard   10.254.224.130          80:30312/TCP   25s
    • NodePort 30312 is mapped to port 80 of the dashboard Pod;

    Check the controller

    $ kubectl get deployment kubernetes-dashboard  -n kube-system
    NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
    kubernetes-dashboard   1         1         1            1           3m
    $ kubectl get pods  -n kube-system | grep dashboard
    kubernetes-dashboard-1339745653-pmn6z   1/1       Running   0          4m

    Access the dashboard

    There are three ways:

    • the kubernetes-dashboard Service exposes a NodePort, so the dashboard can be reached at http://NodeIP:nodePort;
    • through kube-apiserver (either the https port 6443 or the http port 8080);
    • through kubectl proxy.

    Access the dashboard through kubectl proxy

    Start the proxy

    $ kubectl proxy --address='172.20.0.113' --port=8086 --accept-hosts='^*$'
    Starting to serve on 172.20.0.113:8086
    • the --accept-hosts option is required; without it the browser shows "Unauthorized" when opening the dashboard;

    Opening http://172.20.0.113:8086/ui in the browser automatically redirects to: http://172.20.0.113:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default

    Access the dashboard through kube-apiserver

    Get the list of cluster service URLs

    $ kubectl cluster-info
    Kubernetes master is running at https://172.20.0.113:6443
    KubeDNS is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
    kubernetes-dashboard is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

    Open https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard in the browser (the browser will warn about the certificate because the connection is encrypted; to access the dashboard this way you first have to import the certificate into your computer). This is a pitfall I hit at the time: accessing the dashboard through kube-apiserver failed with User "system:anonymous" cannot proxy services in the namespace "kube-system" (issue #5), which has since been resolved.

    Import the certificate

    Convert the generated admin.pem certificate:

    openssl pkcs12 -export -in admin.pem  -out admin.p12 -inkey admin-key.pem
    

    Import the resulting admin.p12 into your computer; remember the password you set when exporting, as it is needed again when importing.

    If you do not want to use https, you can access the dashboard directly through the insecure port 8080: http://172.20.0.113:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

    [Figure: kubernetes-dashboard]

    Because the Heapster add-on is not installed yet, the dashboard cannot show CPU and memory metrics for Pods and Nodes.

    9. Configuring and installing Heapster

    Download the latest release from the heapster release page.

    $ wget https://github.com/kubernetes/heapster/archive/v1.3.0.zip
    $ unzip v1.3.0.zip
    $ mv v1.3.0.zip heapster-1.3.0

    Directory: heapster-1.3.0/deploy/kube-config/influxdb

    $ cd heapster-1.3.0/deploy/kube-config/influxdb
    $ ls *.yaml
    grafana-deployment.yaml  grafana-service.yaml  heapster-deployment.yaml  heapster-service.yaml  influxdb-deployment.yaml  influxdb-service.yaml heapster-rbac.yaml

    We created heapster's RBAC configuration ourselves in heapster-rbac.yaml.

    The modified YAML files are available here: heapster

    Configure grafana-deployment

    $ diff grafana-deployment.yaml.orig grafana-deployment.yaml
    16c16
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/heapster-grafana-amd64:v4.0.2
    40,41c40,41
               #value: /
    • if grafana is later accessed through kube-apiserver or kubectl proxy, GF_SERVER_ROOT_URL must be set to /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/; otherwise grafana later reports that the page http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/api/dashboards/home cannot be found (one way to set it is sketched below);
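    If you prefer to change a running Deployment instead of editing the YAML, a patch along these lines could set the variable (a sketch only; the container name grafana is an assumption about the upstream manifest):

    $ kubectl patch deployment monitoring-grafana -n kube-system -p \
      '{"spec":{"template":{"spec":{"containers":[{"name":"grafana","env":[{"name":"GF_SERVER_ROOT_URL","value":"/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/"}]}]}}}}'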

    Configure heapster-deployment

    $ diff heapster-deployment.yaml.orig heapster-deployment.yaml
    16c16
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/heapster-amd64:v1.3.0-beta.1

    Configure influxdb-deployment

    The influxdb project recommends querying the database through the command line or the HTTP API; the admin UI is disabled by default starting with v1.1.0 and will be removed in a later release.

    To enable the admin UI shipped in the image: first export the influxdb configuration file from the image, enable the admin plugin in it, write the modified file into a ConfigMap, and finally mount the ConfigMap into the Pod so that it overrides the original configuration:

    Note: the manifests directory already contains the modified ConfigMap definition file.

    $ # export the influxdb configuration file from the image
    $ docker run --rm --entrypoint 'cat'  -ti lvanneo/heapster-influxdb-amd64:v1.1.1 /etc/config.toml >config.toml.orig
    $ cp config.toml.orig config.toml
    $ # edit the file: enable the admin interface
    $ vim config.toml
    $ diff config.toml.orig config.toml
    35c35
       enabled = true
    $ # write the modified configuration into a ConfigMap object
    $ kubectl create configmap influxdb-config --from-file=config.toml  -n kube-system
    configmap "influxdb-config" created
    $ # mount the configuration file from the ConfigMap into the Pod so that it overrides the original configuration
    $ diff influxdb-deployment.yaml.orig influxdb-deployment.yaml
    16c16
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/heapster-influxdb-amd64:v1.1.1
    19a20,21
    >         - mountPath: /etc/
    >           name: influxdb-config
    22a25,27
    >       - name: influxdb-config
    >         configMap:
    >           name: influxdb-config

    Configure the monitoring-influxdb Service

    $ diff influxdb-service.yaml.orig influxdb-service.yaml
    12a13
    >   type: NodePort
    15a17,20
    >     name: http
    >   - port: 8083
    >     targetPort: 8083
    >     name: admin
    
    • the Service type is set to NodePort, and an extra mapping for the admin port is added so that the influxdb admin UI can be opened in a browser later;

    Apply all the definition files

    $ pwd
    /root/heapster-1.3.0/deploy/kube-config/influxdb
    $ ls *.yaml
    grafana-service.yaml      heapster-rbac.yaml     influxdb-cm.yaml          influxdb-service.yaml
    grafana-deployment.yaml  heapster-deployment.yaml  heapster-service.yaml  influxdb-deployment.yaml
    $ kubectl create -f  .
    deployment "monitoring-grafana" created
    service "monitoring-grafana" created
    deployment "heapster" created
    serviceaccount "heapster" created
    clusterrolebinding "heapster" created
    service "heapster" created
    configmap "influxdb-config" created
    deployment "monitoring-influxdb" created
    service "monitoring-influxdb" created

    Check the result

    Check the Deployments

    $ kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'
    heapster               1         1         1            1           2m
    monitoring-grafana     1         1         1            1           2m
    monitoring-influxdb    1         1         1            1           2m

    Check the Pods

    $ kubectl get pods -n kube-system | grep -E 'heapster|monitoring'
    heapster-110704576-gpg8v                1/1       Running   0          2m
    monitoring-grafana-2861879979-9z89f     1/1       Running   0          2m
    monitoring-influxdb-1411048194-lzrpc    1/1       Running   0          2m

    Open the Kubernetes dashboard and check whether it now shows CPU, memory and load utilization graphs for the Nodes and Pods;

    [Figure: Kubernetes cluster installation guide v1.6 - 4]

    Access grafana

  • Through kube-apiserver: get the monitoring-grafana service URL
    $ kubectl cluster-info
    Kubernetes master is running at https://172.20.0.113:6443
    Heapster is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/heapster
    KubeDNS is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
    kubernetes-dashboard is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
    monitoring-grafana is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
    monitoring-influxdb is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb
    
    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

    Open this URL in the browser: http://172.20.0.113:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

  • Through kubectl proxy: start the proxy
    $ kubectl proxy --address='172.20.0.113' --port=8086 --accept-hosts='^*$'
    Starting to serve on 172.20.0.113:8086

    Open this URL in the browser: http://172.20.0.113:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

    [Figure: Kubernetes cluster installation guide v1.6 - 5]

    Access the influxdb admin UI

    Look up the NodePort mapped to influxdb's http port 8086

    $ kubectl get svc -n kube-system|grep influxdb
    monitoring-influxdb    10.254.22.46           8086:32299/TCP,8083:30269/TCP   9m
    

    Open the influxdb admin UI through kube-apiserver's insecure port: http://172.20.0.113:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:8083/

    On that page, under "Connection Settings", enter a node IP as the Host and the NodePort mapped to 8086 (32299 in the output above) as the Port, then click "Save" (in my cluster the address is 172.20.0.113:32299):

    [Figure: Kubernetes cluster installation guide v1.6 - 6]

     

    10. Configuring and installing EFK

    Official directory: cluster/addons/fluentd-elasticsearch

    $ ls *.yaml
    es-controller.yaml  es-service.yaml  fluentd-es-ds.yaml  kibana-controller.yaml  kibana-service.yaml efk-rbac.yaml

    The EFK stack likewise needs an efk-rbac.yaml file that configures a ServiceAccount named efk.

    The modified YAML files are available here: EFK

    Configure es-controller.yaml

    $ diff es-controller.yaml.orig es-controller.yaml
    24c24
           - image: sz-pg-oam-docker-hub-001.tendcloud.com/library/elasticsearch:v2.4.1-2

    Configure es-service.yaml

    No changes needed;

    Configure fluentd-es-ds.yaml

    $ diff fluentd-es-ds.yaml.orig fluentd-es-ds.yaml
    26c26
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/fluentd-elasticsearch:1.22

    Configure kibana-controller.yaml

    $ diff kibana-controller.yaml.orig kibana-controller.yaml
    22c22
             image: sz-pg-oam-docker-hub-001.tendcloud.com/library/kibana:v4.6.1-1

    Label the Nodes

    The DaemonSet fluentd-es-v1.22 is defined with the nodeSelector beta.kubernetes.io/fluentd-ds-ready=true, so this label has to be set on every Node that should run fluentd;

    $ kubectl get nodes
    NAME        STATUS    AGE       VERSION
    172.20.0.113   Ready     1d        v1.6.0
    
    $ kubectl label nodes 172.20.0.113 beta.kubernetes.io/fluentd-ds-ready=true
    node "172.20.0.113" labeled

    Apply the same label to the other two nodes, for example with the loop below.
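    A quick way to label the remaining nodes in one go (a sketch; it assumes the other two nodes are 172.20.0.114 and 172.20.0.115, the addresses used elsewhere in this document):

    $ for node in 172.20.0.114 172.20.0.115; do
        kubectl label nodes $node beta.kubernetes.io/fluentd-ds-ready=true
      done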

    Apply the definition files

    $ kubectl create -f .
    serviceaccount "efk" created
    clusterrolebinding "efk" created
    replicationcontroller "elasticsearch-logging-v1" created
    service "elasticsearch-logging" created
    daemonset "fluentd-es-v1.22" created
    deployment "kibana-logging" created
    service "kibana-logging" created

    Check the result

    $ kubectl get deployment -n kube-system|grep kibana
    kibana-logging         1         1         1            1           2m
    
    $ kubectl get pods -n kube-system|grep -E 'elasticsearch|fluentd|kibana'
    elasticsearch-logging-v1-mlstp          1/1       Running   0          1m
    elasticsearch-logging-v1-nfbbf          1/1       Running   0          1m
    fluentd-es-v1.22-31sm0                  1/1       Running   0          1m
    fluentd-es-v1.22-bpgqs                  1/1       Running   0          1m
    fluentd-es-v1.22-qmn7h                  1/1       Running   0          1m
    kibana-logging-1432287342-0gdng         1/1       Running   0          1m
    
    $ kubectl get service  -n kube-system|grep -E 'elasticsearch|kibana'
    elasticsearch-logging   10.254.77.62            9200/TCP                        2m
    kibana-logging          10.254.8.113            5601/TCP                        2m

    The first time the kibana Pod starts it takes quite a long time (10-20 minutes) to optimize and cache the status page bundles; you can follow the progress by tailing the Pod's log:

    $ kubectl logs kibana-logging-1432287342-0gdng -n kube-system -f
    ELASTICSEARCH_URL=http://elasticsearch-logging:9200
    server.basePath: /api/v1/proxy/namespaces/kube-system/services/kibana-logging
    {"type":"log","@timestamp":"2017-04-12T13:08:06Z","tags":["info","optimize"],"pid":7,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}
    {"type":"log","@timestamp":"2017-04-12T13:18:17Z","tags":["info","optimize"],"pid":7,"message":"Optimization of bundles for kibana and statusPage complete in 610.40 seconds"}
    {"type":"log","@timestamp":"2017-04-12T13:18:17Z","tags":["status","plugin:kibana@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
    {"type":"log","@timestamp":"2017-04-12T13:18:18Z","tags":["status","plugin:elasticsearch@1.0.0","info"],"pid":7,"state":"yellow","message":"Status changed from uninitialized to yellow - Waiting for Elasticsearch","prevState":"uninitialized","prevMsg":"uninitialized"}
    {"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:kbn_vislib_vis_types@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
    {"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:markdown_vis@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
    {"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:metric_vis@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
    {"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:spyModes@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
    {"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:statusPage@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
    {"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:table_vis@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
    {"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["listening","info"],"pid":7,"message":"Server running at http://0.0.0.0:5601"}
    {"type":"log","@timestamp":"2017-04-12T13:18:24Z","tags":["status","plugin:elasticsearch@1.0.0","info"],"pid":7,"state":"yellow","message":"Status changed from yellow to yellow - No existing Kibana index found","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
    {"type":"log","@timestamp":"2017-04-12T13:18:29Z","tags":["status","plugin:elasticsearch@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from yellow to green - Kibana index ready","prevState":"yellow","prevMsg":"No existing Kibana index found"}

    Access Kibana

  • Through kube-apiserver: get the kibana-logging service URL
    $ kubectl cluster-info
    Kubernetes master is running at https://172.20.0.113:6443
    Elasticsearch is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging
    Heapster is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/heapster
    Kibana is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kibana-logging
    KubeDNS is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
    kubernetes-dashboard is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
    monitoring-grafana is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
    monitoring-influxdb is running at https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb

    Open this URL in the browser: https://172.20.0.113:6443/api/v1/proxy/namespaces/kube-system/services/kibana-logging/app/kibana

  • Through kubectl proxy: start the proxy
    $ kubectl proxy --address='172.20.0.113' --port=8086 --accept-hosts='^*$'
    Starting to serve on 172.20.0.113:8086

    Open this URL in the browser: http://172.20.0.113:8086/api/v1/proxy/namespaces/kube-system/services/kibana-logging

  • On the Settings -> Indices page, create an index (the equivalent of a database in MySQL), tick Index contains time-based events, keep the default logstash-* pattern and click Create;

    Possible problems

    If the Create button is greyed out and the Time-field name dropdown has no options: fluentd reads the logs under /var/log/containers/, which are symlinks to /var/lib/docker/containers/${CONTAINER_ID}/${CONTAINER_ID}-json.log. Check your Docker configuration: --log-driver must be set to json-file, while the default may be journald (see the Docker logging documentation). A way to check and change it is sketched below.
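    For example (a sketch; /etc/docker/daemon.json is the standard location and this assumes you are not already passing --log-driver on the dockerd command line, which would take precedence):

    $ docker info | grep -i 'logging driver'
    Logging Driver: journald
    $ cat /etc/docker/daemon.json
    {
      "log-driver": "json-file"
    }
    $ systemctl restart docker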

    [Figure: Kubernetes cluster installation guide v1.6 - 7]

    After the index is created, the logs aggregated in Elasticsearch can be browsed under Discover;

    [Figure: Kubernetes cluster installation guide v1.6 - 8]

    Notes

  • Because strict security mechanisms (mutual TLS authentication, RBAC authorization, etc.) are enabled, it is best to deploy from the very beginning rather than starting somewhere in the middle; otherwise authentication or authorization is likely to fail!
  • This document will be updated as the individual components are updated. If you hit any problem, feel free to open an issue!