EFK
ElasticSearch is a search engine written in Java. Logstash collects logs, transforms them, and then feeds them into ElasticSearch.
Logstash acts as the agent that collects logs, so it runs as a DaemonSet; once collected, the logs are sent to a Logstash server, which normalizes them and ingests them into ElasticSearch. ElasticSearch is usually a cluster, and Logstash may also be more than a single node; if the two cannot keep pace with each other, a message queue such as redis can be placed between them. A typical layout looks like the figure below:
Here Logstash also takes on extra work such as formatting the logs from all nodes (converting them into a unified format) before ingesting them into the ElasticSearch cluster.
As an agent, however, Logstash is heavyweight, so Filebeat is commonly used in its place (we will not use Filebeat here). Logstash and Filebeat are not the only tools that can run on a node and collect logs; Fluentd and others are available as well.
On a k8s cluster, every node runs many pods and every pod may contain several containers, and the logs from all of these containers need to be managed on a single platform.
Although we can view log content with kubectl logs, pods in k8s exist in groups, so that approach does not scale. Moreover, if a container in a pod crashes, the logs it wrote before the crash are likely to be lost, and kubectl logs cannot help with that.
To be able to inspect the logs of a container that has already gone down, they must have been collected in real time beforehand; in a cloud environment in particular, a unified logging platform is needed for this.
A complete k8s platform has four important add-ons: coredns, ingress, metrics-server + prometheus, and dashboard. EFK is also a basic add-on, and in most cases a k8s cluster has to provide it.
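For completeness, this is roughly what kubectl itself offers before any EFK stack is in place; the pod and container names below are placeholders, and --previous only helps while the restarted pod still exists on its node, which is exactly why centralized collection matters:

kubectl logs mypod -c mycontainer                # current logs of one container (names are examples)
kubectl logs mypod -c mycontainer --previous     # logs of the previous, crashed instance, if the pod still exists
kubectl logs -f mypod -c mycontainer             # follow logs in real time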
ElasticSearch
The way EFK, and especially ElasticSearch, is deployed through helm differs from a conventional ELK deployment. To let ElasticSearch run on k8s, the ElasticSearch project provides images that package the required components and run directly on top of k8s, generally in three roles:
ElasticSearch is split into two parts, master and data. The master nodes handle the lightweight query requests, while the data nodes handle heavyweight work such as building indices; querying and index building are thus implemented in two tiers. The master tier is the single entry point into ElasticSearch, and both tiers are distributed: the data tier scales horizontally with load, and the master tier runs multiple nodes for redundancy. The data tier needs persistent storage, i.e., volumes. In effect this means two clusters, a master cluster and a data cluster, as shown below:
On top of that there is a client cluster. The ingest nodes (see the figure above) accept logs from any collector, such as Fluentd, normalize them into a specific format, and hand them to the master nodes, so they can be thought of as the Logstash of this setup: in ELK, Logstash cleans the logs and passes them on to ElasticSearch, and the ingest nodes do the same job. This is also how x-pack does it.
- master and data are both stateful, which means they need volumes, and persistent ones.
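As a rough sketch (not taken from this chart's templates), the three roles correspond to the standard node settings in elasticsearch.yml:

# master pods: eligible for master election, hold no data
node.master: true
node.data: false
node.ingest: false

# data pods: store shards and build indices
node.master: false
node.data: true
node.ingest: false

# client/ingest pods: coordinate requests and pre-process incoming documents
node.master: false
node.data: false
node.ingest: true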
How logs are collected in k8s
Log collection can happen in two ways: inside the pod or outside it.
Inside: each pod ships its own logs to the log store, which means deploying extra log-collecting containers inside every pod; as the number of pods grows, so does the number of these collectors. A log collector can also be built into docker itself, but that approach is not recommended.
Outside: typically a single agent is deployed on each node; it collects the logs of every container on that node, and of the node itself, and forwards them to the logging platform. This collection system can live inside the cluster or outside it.
For now, set aside where ElasticSearch itself is deployed.
Suppose Fluentd does the collecting: whether Fluentd runs inside or outside the cluster is something to weigh. Inside the cluster, it only takes a DaemonSet with the host's log directory mounted into the pod. Running it directly on the host means that if Fluentd breaks, it is no longer under the cluster's control.
- Fluentd itself reads logs from the local /var/log, and each container's logs on a node end up under /var/log/containers/, as sketched below
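A minimal sketch of how such a DaemonSet reaches those files (standard Kubernetes fields, not this chart's actual template): the host's log directories are mounted into the collector pod read-only.

# fragment of a collector DaemonSet pod spec (illustrative only)
containers:
- name: log-agent
  image: fluent/fluentd            # or any other node-level collector image
  volumeMounts:
  - name: varlog
    mountPath: /var/log
    readOnly: true
volumes:
- name: varlog
  hostPath:
    path: /var/log                 # container log symlinks live under /var/log/containers/ on the host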
Deploying elasticsearch
[root@linuxea helm]# helm fetch stable/elasticsearch --version 1.13.3
[root@linuxea helm]# tar xf elasticsearch-1.13.3.tgz
tar: elasticsearch/Chart.yaml: implausibly old time stamp 1970-01-01 01:00:00
tar: elasticsearch/values.yaml: implausibly old time stamp 1970-01-01 01:00:00
tar: elasticsearch/templates/NOTES.txt: implausibly old time stamp 1970-01-01 01:00:00
tar: elasticsearch/templates/_helpers.tpl: implausibly old time stamp 1970-01-01 01:00:00
tar: elasticsearch/templates/client-deployment.yaml: implausibly old time stamp 1970-01-01 01:00:
Create a namespace
[root@linuxea elasticsearch]# kubectl create namespace efk
namespace/efk created
[root@linuxea elasticsearch]# kubectl get ns -n efk
NAME STATUS AGE
default Active 8d
efk Active 13s
Then install stable/elasticsearch, specifying the namespace and passing the values.yaml file with -f.
- In the values file, persistent storage needs to be disabled:
podDisruptionBudget:
  enabled: false
persistence:
  enabled: false
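If you would rather not edit values.yaml, the same overrides can usually be passed on the command line with --set; the key paths master.persistence.enabled and data.persistence.enabled are assumptions about this chart's value layout, so verify them against the values file first:

helm install --name els-1 --namespace=efk stable/elasticsearch \
  --set master.persistence.enabled=false \
  --set data.persistence.enabled=false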
Install
[root@linuxea elasticsearch]# helm install --name els-1 --namespace=efk -f ./values.yaml stable/elasticsearch
NAME: els-1
LAST DEPLOYED: Mon Nov 19 06:45:40 2018
NAMESPACE: efk
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME AGE
els-1-elasticsearch 1s
==> v1/ServiceAccount
els-1-elasticsearch-client 1s
els-1-elasticsearch-data 1s
els-1-elasticsearch-master 1s
==> v1/Service
els-1-elasticsearch-client 1s
els-1-elasticsearch-discovery 1s
==> v1beta1/Deployment
els-1-elasticsearch-client 1s
==> v1beta1/StatefulSet
els-1-elasticsearch-data 1s
els-1-elasticsearch-master 1s
==> v1/Pod(related)
Initial state
NAME READY STATUS RESTARTS AGE
els-1-elasticsearch-client-779495bbdc-5d22f 0/1 Init:0/1 0 1s
els-1-elasticsearch-client-779495bbdc-tzbps 0/1 Init:0/1 0 1s
els-1-elasticsearch-data-0 0/1 Init:0/2 0 1s
els-1-elasticsearch-master-0 0/1 Init:0/2 0 1s
The NOTES output; it can be printed again later with helm status (shown right after the NOTES)
NOTES:
The elasticsearch cluster has been installed.
Elasticsearch can be accessed:
* Within your cluster, at the following DNS name at port 9200:
els-1-elasticsearch-client.efk.svc
* From outside the cluster, run these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace efk -l "app=elasticsearch,component=client,release=els-1" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:9200 to use Elasticsearch"
kubectl port-forward --namespace efk $POD_NAME 9200:9200
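These NOTES are not lost after installation; they can be re-displayed at any time from the release:

helm status els-1      # re-prints the deployed resources and the NOTES shown above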
Once the images have been pulled and everything is ready, the pods start up
[root@linuxea ~]# kubectl get pods -n efk
NAME READY STATUS RESTARTS AGE
els-1-elasticsearch-client-779495bbdc-5d22f 0/1 Running 0 32s
els-1-elasticsearch-client-779495bbdc-tzbps 0/1 Running 0 32s
els-1-elasticsearch-data-0 0/1 Running 0 32s
els-1-elasticsearch-master-0 0/1 Running 0 32s
A short while later everything is Running and READY is full, that is, client, data, and master each have 2 replicas running
[root@linuxea ~]# kubectl get pods -n efk
NAME READY STATUS RESTARTS AGE
els-1-elasticsearch-client-779495bbdc-5d22f 1/1 Running 0 1m
els-1-elasticsearch-client-779495bbdc-tzbps 1/1 Running 0 1m
els-1-elasticsearch-data-0 1/1 Running 0 1m
els-1-elasticsearch-data-1 1/1 Running 0 47s
els-1-elasticsearch-master-0 1/1 Running 0 1m
els-1-elasticsearch-master-1 1/1 Running 0 53s
To verify, we pull a cirros image and check whether elasticsearch is working properly
[root@linuxea elasticsearch]# kubectl run cirror-$RANDOM --rm -it --image=cirros -- /bin/sh
If you don't see a command prompt, try pressing enter.
/#
Check whether the name resolves
/ # nslookup els-1-elasticsearch-client.efk.svc
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: els-1-elasticsearch-client.efk.svc
Address 1: 10.104.57.197 els-1-elasticsearch-client.efk.svc.cluster.local
Check whether port 9200 can be reached
/ # curl els-1-elasticsearch-client.efk.svc.cluster.local:9200
{
"name" : "els-1-elasticsearch-client-779495bbdc-tzbps",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "ROD_0h1vRiW_5POVBwz3Nw",
"version" : {
"number" : "6.4.3",
"build_flavor" : "oss",
"build_type" : "tar",
"build_hash" : "fe40335",
"build_date" : "2018-10-30T23:17:19.084789Z",
"build_snapshot" : false,
"lucene_version" : "7.4.0",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
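Cluster health can be checked from the same place; _cluster/health is a standard Elasticsearch endpoint, and with all pods up as above you would expect "status" : "green" and node counts matching the pods:

curl 'els-1-elasticsearch-client.efk.svc.cluster.local:9200/_cluster/health?pretty'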
_cat
/ # curl els-1-elasticsearch-client.efk.svc.cluster.local:9200/_cat/
=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/tasks
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/thread_pool/{thread_pools}
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}
/_cat/templates
And the nodes; in the node.role column m, d, and i stand for master, data, and ingest, and the * in the next column marks the elected master
/ # curl els-1-elasticsearch-client.efk.svc.cluster.local:9200/_cat/nodes
172.16.4.92 21 85 2 0.32 0.27 0.16 i - els-1-elasticsearch-client-779495bbdc-tzbps
172.16.5.55 19 76 2 0.08 0.11 0.12 mi * els-1-elasticsearch-master-1
172.16.3.109 21 40 5 0.15 0.13 0.10 di - els-1-elasticsearch-data-0
172.16.4.91 21 85 2 0.32 0.27 0.16 mi - els-1-elasticsearch-master-0
172.16.5.54 21 76 2 0.08 0.11 0.12 i - els-1-elasticsearch-client-779495bbdc-5d22f
172.16.4.93 21 85 1 0.32 0.27 0.16 di - els-1-elasticsearch-data-1
Or the indices (at this point nothing is returned, since no logs have been shipped yet)
/ # curl els-1-elasticsearch-client.efk.svc.cluster.local:9200/_cat/indices
Deploying fluentd
[root@linuxea helm]# helm fetch stable/fluentd-elasticsearch
[root@linuxea helm]# tar xf fluentd-elasticsearch-1.1.0.tgz
tar: fluentd-elasticsearch/Chart.yaml: implausibly old time stamp 1970-01-01 01:00:00
tar: fluentd-elasticsearch/values.yaml: implausibly old time stamp 1970-01-01 01:00:00
tar: fluentd-elasticsearch/templates/NOTES.txt: implausibly old time stamp 1970-01-01 01:00:00
tar: fluentd-elasticsearch/templates/_helpers.tpl: implausibly old time stamp 1970-01-01 01:00:00
tar: fluentd-elasticsearch/templates/clusterrole.yaml: implausibly old time stamp 1970-01-01 01:00:00
Change
elasticsearch:
  host: 'elasticsearch-client'
  port: 9200
  buffer_chunk_limit: 2M
  buffer_queue_limit: 8
to
elasticsearch:
  host: 'els-1-elasticsearch-client.efk.svc.cluster.local'
  port: 9200
  buffer_chunk_limit: 2M
  buffer_queue_limit: 8
The address els-1-elasticsearch-client.efk.svc.cluster.local is the cluster-internal Service address, not a pod address.
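That name resolves to the client Service's ClusterIP; it can be double-checked with:

kubectl get svc -n efk els-1-elasticsearch-client      # the ClusterIP here is what the DNS name points at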
Enable prometheus
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "24231"
-------------
service:
  type: ClusterIP
  ports:
  - name: "monitor-agent"
    port: 24231
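With the scrape annotations and the monitor-agent port in place, Prometheus can discover each fluentd pod. A quick manual check against one pod might look like the following; <fluentd-pod-ip> is a placeholder for whatever kubectl shows, and /metrics is the path fluent-plugin-prometheus normally exposes:

kubectl get pods -n efk -o wide | grep fluentd     # note a fluentd pod IP
curl http://<fluentd-pod-ip>:24231/metrics | head  # <fluentd-pod-ip> is a placeholder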
Taint tolerations (so the DaemonSet is also scheduled onto the master nodes)
tolerations:
- key: node-role.kubernetes.io/master
  operator: Exists
  effect: NoSchedule
Then install
[root@linuxea helm]# helm install --name fluentd1 --namespace=efk -f fluentd-elasticsearch/values.yaml stable/fluentd-elasticsearch
Watch until the pods are Running
[root@linuxea ~]# kubectl get pods -n efk -w
NAME READY STATUS RESTARTS AGE
els-1-elasticsearch-client-779495bbdc-5d22f 1/1 Running 0 1h
els-1-elasticsearch-client-779495bbdc-tzbps 1/1 Running 0 1h
els-1-elasticsearch-data-0 1/1 Running 0 1h
els-1-elasticsearch-data-1 1/1 Running 0 1h
els-1-elasticsearch-master-0 1/1 Running 0 1h
els-1-elasticsearch-master-1 1/1 Running 0 1h
fluentd1-fluentd-elasticsearch-28wnb 0/1 ContainerCreating 0 16s
fluentd1-fluentd-elasticsearch-77qmr 0/1 ContainerCreating 0 16s
fluentd1-fluentd-elasticsearch-885fc 0/1 ContainerCreating 0 16s
fluentd1-fluentd-elasticsearch-9kzfm 0/1 ContainerCreating 0 16s
fluentd1-fluentd-elasticsearch-lnbvg 0/1 ContainerCreating 0 16s
fluentd1-fluentd-elasticsearch-9kzfm 1/1 Running 0 2m
fluentd1-fluentd-elasticsearch-77qmr 1/1 Running 0 2m
fluentd1-fluentd-elasticsearch-28wnb 1/1 Running 0 2m
fluentd1-fluentd-elasticsearch-885fc 1/1 Running 0 2m
fluentd1-fluentd-elasticsearch-lnbvg 1/1 Running 0 3m
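Before looking at indices, it is worth confirming that a collector actually connected; the pod name below is taken from the listing above, and its log should show fluentd starting up without repeated connection errors to the host configured earlier:

kubectl logs -n efk fluentd1-fluentd-elasticsearch-9kzfm | tail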
Verify
Verify the indices; one logstash-YYYY.MM.DD index shows up per day
/ # curl els-1-elasticsearch-client.efk.svc.cluster.local:9200/_cat/indices
green open logstash-2018.11.15 trf07L62QHaAIG5oym73kw 5 1 12796 0 5.9mb 3mb
green open logstash-2018.11.16 thookgxbS86mNmbLPCeOQA 5 1 9964 0 4.8mb 2.4mb
green open logstash-2018.11.18 Y0pbmu7RSQizbvoH7V6Cig 5 1 9597 0 5.3mb 2.7mb
green open logstash-2018.11.17 pIhgyn-7TaeYOzLbH-dSJA 5 1 13302 0 9.8mb 4.8mb
green open logstash-2018.11.14 JirkQqsUSmqUnn0bRPVJLA 5 1 12402 0 5.8mb 2.8mb
green open logstash-2018.11.10 4zOpjsVFSMmF0hpY13jryg 5 1 179246 0 128.8mb 65.8mb
green open logstash-2018.11.11 ZF6V9DETQlCBsJPSMdc6ww 5 1 34778 0 15.7mb 7.8mb
green open logstash-2018.11.13 0qZgntkHTRiuK2_S3Rtvnw 5 1 12679 0 6.3mb 3.1mb
green open logstash-2018.11.09 8lSv0UvxQHWTPBZVx9EMVg 5 1 7229 0 7.3mb 4mb
green open logstash-2018.11.12 rJiKEdFdTzqE3ovKecn4kw 5 1 11983 0 5.3mb 2.6mb
green open logstash-2018.11.19 MhasgEdtS3KWR-KpS30E7A 5 1 38671 0 73.3mb 40mb
Deploying kibana
The kibana version has to match the es version
[root@linuxea helm]# helm fetch stable/kibana --version 0.18.0
[root@linuxea helm]# tar xf kibana-0.18.0.tgz
tar: kibana/Chart.yaml: implausibly old time stamp 1970-01-01 01:00:00
tar: kibana/values.yaml: implausibly old time stamp 1970-01-01 01:00:00
tar: kibana/templates/NOTES.txt: implausibly old time stamp 1970-01-01 01:00:00
tar: kibana/templates/_helpers.tpl: implausibly old time stamp 1970-01-01 01:00:00
Modify the elasticsearch.url address in values; the release name can be found with helm list and the address with helm status HELMNAME.
files:
  kibana.yml:
    ## Default Kibana configuration from kibana-docker.
    server.name: kibana
    server.host: "0"
    # elasticsearch.url: http://elasticsearch:9200
    elasticsearch.url: http://els-1-elasticsearch-client.efk.svc:9200
Use NodePort so it can be reached from outside the cluster
service:
  type: NodePort
  externalPort: 443
  internalPort: 5601
Also, if necessary, the image version here must match ElasticSearch
image:
  repository: "docker.elastic.co/kibana/kibana-oss"
  tag: "6.4.3"
  pullPolicy: "IfNotPresent"
Install
[root@linuxea helm]# helm install --name kibana1 --namespace=efk -f kibana/values.yaml stable/kibana --version 0.18.0
NAME: kibana1
LAST DEPLOYED: Mon Nov 19 08:13:34 2018
NAMESPACE: efk
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME AGE
kibana1 1s
==> v1/Service
kibana1 1s
==> v1beta1/Deployment
kibana1 1s
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
kibana1-578f8d68c7-dvq2z 0/1 ContainerCreating 0 1s
NOTES:
To verify that kibana1 has started, run:
kubectl --namespace=efk get pods -l "app=kibana"
Kibana can be accessed:
* From outside the cluster, run these commands in the same shell:
export NODE_PORT=$(kubectl get --namespace efk -o jsonpath="{.spec.ports[0].nodePort}" services kibana1)
export NODE_IP=$(kubectl get nodes --namespace efk -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
After a fairly long wait kibana1 is running, although due to insufficient resources some fluentd pods ended up Evicted
[root@linuxea helm]# kubectl get pods -n efk -o wide -w
NAME READY STATUS RESTARTS AGE IP NODE
els-1-elasticsearch-client-779495bbdc-9rg4x 1/1 Running 0 7m 172.16.3.124 linuxea.node-2.com
els-1-elasticsearch-client-779495bbdc-bhq2f 1/1 Running 0 7m 172.16.2.28 linuxea.node-1.com
els-1-elasticsearch-data-0 1/1 Running 0 7m 172.16.5.65 linuxea.node-4.com
els-1-elasticsearch-data-1 1/1 Running 0 6m 172.16.2.29 linuxea.node-1.com
els-1-elasticsearch-master-0 1/1 Running 0 7m 172.16.3.125 linuxea.node-2.com
els-1-elasticsearch-master-1 1/1 Running 0 6m 172.16.5.66 linuxea.node-4.com
els-1-elasticsearch-master-2 1/1 Running 0 5m 172.16.4.97 linuxea.node-3.com
fluentd1-fluentd-elasticsearch-2bllt 1/1 Running 0 3m 172.16.4.98 linuxea.node-3.com
fluentd1-fluentd-elasticsearch-7pkvl 1/1 Running 0 3m 172.16.2.30 linuxea.node-1.com
fluentd1-fluentd-elasticsearch-cnhk6 1/1 Running 0 3m 172.16.0.26 linuxea.master-1.com
fluentd1-fluentd-elasticsearch-mk9m2 1/1 Running 0 3m 172.16.5.67 linuxea.node-4.com
fluentd1-fluentd-elasticsearch-wm2kw 1/1 Running 0 3m 172.16.3.126 linuxea.node-2.com
kibana1-bfbbf89f6-4tkzb 0/1 ContainerCreating 0 45s <none> linuxea.node-2.com
kibana1-bfbbf89f6-4tkzb 1/1 Running 0 1m 172.16.3.127 linuxea.node-2.com
Access it through the NodePort
[root@linuxea helm]# kubectl get svc -n efk
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
fluentd1-fluentd-elasticsearch ClusterIP 10.106.221.12 <none> 24231/TCP 4m
kibana1 NodePort 10.98.70.188 <none> 443:32725/TCP 2m
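As a final check from outside the cluster, Kibana's status endpoint can be queried through the NodePort assigned above (32725); <node-ip> is a placeholder for any node's address:

curl -s http://<node-ip>:32725/api/status | head   # or simply open http://<node-ip>:32725 in a browser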