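As the earlier introduction to the pod lifecycle showed, a container is not ready for work merely because it is up. It first has to go through a series of configuration and warm-up steps before it can be used properly.
Liveness probe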
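There are three probe types: ExecAction, TCPSocketAction, and HTTPGetAction.
Container probes (introduced here with ExecAction) are tuned with the following fields:
failureThreshold: how many consecutive failures are tolerated. By default the probe is treated as failed after 3 failed attempts.
periodSeconds: the interval between probes, 10 seconds by default.
timeoutSeconds: the probe timeout, 1 second by default.
initialDelaySeconds: the delay before the first probe. A container should not be probed the moment it starts, because it still has initialization work to do and is not yet ready. A started container does not mean the process inside it is actually running, so the probe should wait until initialization has had time to complete. If this field is not set, probing begins as soon as the container starts, which can cause problems. By default Kubernetes sends traffic as soon as the process in the container has started, so a pod that is up but not yet ready can receive requests and return errors. The correct approach is to wait until the container is fully started and ready, and only then allow the Service to send traffic to the pod. The diagram from the pod lifecycle article still serves as the reference here.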
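To make the role of initialDelaySeconds concrete, here is a minimal sketch of a slow-starting container. The pod name probe-delay-demo, the file /tmp/ready, and the 20-second startup time are all assumptions made for illustration, not values from this article. The other three fields are written out with their default values.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-delay-demo          # hypothetical name
  namespace: default
spec:
  containers:
  - name: slow-start
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    # Simulated initialization: the health file only appears after 20 seconds
    command: ["/bin/sh", "-c", "sleep 20; touch /tmp/ready; sleep 3600"]
    livenessProbe:
      exec:
        command: ["test", "-e", "/tmp/ready"]
      initialDelaySeconds: 25     # wait past the assumed 20s startup before the first probe
      periodSeconds: 10           # default interval
      timeoutSeconds: 1           # default timeout
      failureThreshold: 3         # default number of consecutive failures tolerated
```

With initialDelaySeconds left unset, the first probes would likely run while /tmp/ready does not exist yet, and the container could be restarted before it ever finishes initializing.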
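ExecAction
ExecAction embeds a command. The command must exist inside the container. If it returns success (exit code 0), the container is considered alive; otherwise the probe fails.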
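Command probe
Edit the YAML file. imagePullPolicy is set to IfNotPresent, so the image is pulled only when it is not already present locally. The container command creates the file /linuxea, sleeps for 10 seconds, deletes the file, and then sleeps for 3600 seconds.
The livenessProbe runs test -e /linuxea, with an initial delay (initialDelaySeconds) of 1 second and a probe interval (periodSeconds) of 3 seconds.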
[root@linuxea linuxea]# cat liveness-exec-linuxea.yaml
apiVersion: v1
kind: Pod
metadata:
  name: linuxea.com
  namespace: default
spec:
  containers:
  - name: liveness-exec-linuxea
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c","touch /linuxea; sleep 10; rm -f /linuxea;sleep 3600"]
    livenessProbe:
      exec:
        command: ["test","-e","/linuxea"]
      initialDelaySeconds: 1
      periodSeconds: 3
# restartPolicy: OnFailure
Create the pod
[root@linuxea linuxea]# kubectl create -f liveness-exec-linuxea.yaml
pod/linuxea.com created
After creation the pod first shows ContainerCreating. During this phase the node downloads the image if it does not already have it.
[root@linuxea linuxea]# kubectl get pods
NAME             READY   STATUS              RESTARTS   AGE
client-linuxea   1/1     Running             0          3d
linuxea.com      0/1     ContainerCreating   0          7s
Then watch the status with kubectl get pods -w:
[root@linuxea linuxea]# kubectl get pods -w
NAME                             READY   STATUS    RESTARTS   AGE
client-linuxea                   1/1     Running   0          3d
linuxea.com                      1/1     Running   0          36s
nginx-linuxea-5786698598-dx2jr   1/1     Running   0          1d
nginx-linuxea-5786698598-stttv   1/1     Running   0          1d
nginx-linuxea-5786698598-t6lpx   1/1     Running   0          1d
pod-demo-linuxea                 3/3     Running   0          22h
linuxea.com                      1/1     Running   1          58s
linuxea.com                      1/1     Running   2          1m
linuxea.com                      1/1     Running   3          2m
linuxea.com                      1/1     Running   4          3m
linuxea.com                      1/1     Running   5          4m
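RESTARTS has now reached 5. Each time the container starts, /linuxea is deleted after 10 seconds, the probe then fails failureThreshold (3 by default) times in a row, and the kubelet kills the container. Because restartPolicy defaults to Always, the container is restarted every time, which adds up to five restarts so far.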
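The restarts seen here depend on the pod-level restartPolicy field. As a minimal sketch, assuming the same busybox container, setting it to Never should leave the container stopped once the liveness probe has failed, instead of restarting it. The pod name liveness-exec-never is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-never       # hypothetical name, not from the original manifest
  namespace: default
spec:
  restartPolicy: Never            # pod-level field: Always (default), OnFailure, or Never
  containers:
  - name: liveness-exec-linuxea
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh", "-c", "touch /linuxea; sleep 10; rm -f /linuxea; sleep 3600"]
    livenessProbe:
      exec:
        command: ["test", "-e", "/linuxea"]
      initialDelaySeconds: 1
      periodSeconds: 3
```

Note that restartPolicy belongs under spec, next to containers, not inside an individual container entry.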
[root@linuxea linuxea]# kubectl get pods -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP             NODE                 NOMINATED NODE
client-linuxea                   1/1     Running   0          3d    172.16.2.252   linuxea.node-2.com   <none>
linuxea.com                      1/1     Running   5          4m    172.16.3.28    linuxea.node-3.com   <none>
nginx-linuxea-5786698598-dx2jr   1/1     Running   0          1d    172.16.2.11    linuxea.node-2.com   <none>
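TCP probe
A TCP probe targets a host and port. tcpSocket and httpGet probes are configured in very similar ways. In the fragment below, the kubelet sends the first readiness tcpSocket probe 5 seconds after the container starts and tries to connect to port 8080 of the container in the pod. If the connection succeeds, the container is marked ready. The kubelet then repeats the check every 10 seconds (periodSeconds).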
    ports:
    - containerPort: 8080
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
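The fragment above only shows the container-level fields. As a rough sketch of how it fits into a complete manifest, the example below embeds the same readinessProbe in a pod. The pod name readiness-tcp-demo and the nginx:stable image are assumptions, and the port is changed to 80 because that is where the official nginx image listens by default.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-tcp-demo        # hypothetical name
  namespace: default
spec:
  containers:
  - name: readiness-tcp-demo
    image: nginx:stable           # assumption: official nginx image, listening on port 80
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 80                  # probe succeeds if a TCP connection can be opened
      initialDelaySeconds: 5
      periodSeconds: 10
```

Unlike a liveness probe, a failing readiness probe does not restart the container. The pod simply shows as not ready (0/1 in the READY column) and is removed from Service endpoints until the probe succeeds again.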
httpGet probe
apiVersion: v1
kind: Pod
metadata:
  name: linuxea.com-httpget
  namespace: default
spec:
  containers:
  - name: liveness-httpd-linuxea
    image: marksugar/nginx:1.14.a
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      httpGet:
        port: http
        path: /index.html
      initialDelaySeconds: 1
      periodSeconds: 3
# restartPolicy: OnFailure
Create it
[root@linuxea linuxea]# kubectl create -f liveness-httpget.yaml
pod/linuxea.com-httpget created
[root@linuxea linuxea]# kubectl get pods -o wide
NAME                             READY   STATUS    RESTARTS   AGE   IP             NODE                 NOMINATED NODE
client-linuxea                   1/1     Running   0          3d    172.16.2.252   linuxea.node-2.com   <none>
linuxea.com                      1/1     Running   37         1h    172.16.3.28    linuxea.node-3.com   <none>
linuxea.com-httpget              1/1     Running   0          15s   172.16.2.12    linuxea.node-2.com   <none>
nginx-linuxea-5786698598-dx2jr   1/1     Running   0          1d    172.16.2.11    linuxea.node-2.com   <none>
nginx-linuxea-5786698598-stttv   1/1     Running   0          1d    172.16.1.15    linuxea.node-1.com   <none>
nginx-linuxea-5786698598-t6lpx   1/1     Running   0          1d    172.16.3.25    linuxea.node-3.com   <none>
pod-demo-linuxea                 3/3     Running   0          1d    172.16.3.27    linuxea.node-3.com   <none>
[root@linuxea linuxea]# curl 172.16.2.12
linuxea-linuxea.com-httpget.com-127.0.0.1/8 172.16.2.12/24
[root@linuxea linuxea]# kubectl describe pods linuxea.com-httpget|egrep "Restart|Liveness"
Restart Count: 0
Liveness: http-get http://:http/index.html delay=1s timeout=1s period=3s #success=1 #failure=3
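The Liveness line in the describe output summarizes the probe settings, including the fields that were left at their defaults. As a sketch, the same probe with every field spelled out would look roughly like the fragment below. It belongs under the container entry, and the comments show which part of the describe line each field corresponds to.

```yaml
livenessProbe:
  httpGet:
    port: http                # named container port, shown as http://:http
    path: /index.html
  initialDelaySeconds: 1      # delay=1s
  timeoutSeconds: 1           # timeout=1s (default)
  periodSeconds: 3            # period=3s
  successThreshold: 1         # #success=1 (default, and must be 1 for liveness probes)
  failureThreshold: 3         # #failure=3 (default)
```

An httpGet probe counts any response status from 200 up to but not including 400 as success, so a 404 on /index.html is a failure.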
To simulate a failure, open a shell in the container with kubectl exec -it linuxea.com-httpget -- /bin/sh and delete index.html:
[root@linuxea linuxea]# kubectl exec -it linuxea.com-httpget -- /bin/sh
/ # ls /data/wwwroot/index.html
/data/wwwroot/index.html
/ # rm -rf /data/wwwroot/index.html
/ # command terminated with exit code 137
Requests now return 404. Check the probe status with kubectl describe pods linuxea.com-httpget|egrep "Restart|Liveness":
[root@linuxea linuxea]# kubectl describe pods linuxea.com-httpget|egrep "Restart|Liveness"
Restart Count: 0
Liveness: http-get http://:http/index.html delay=1s timeout=1s period=3s #success=1 #failure=3
Warning Unhealthy 7s (x3 over 13s) kubelet, linuxea.node-2.com Liveness probe failed: Get 404.html: stopped after 10 redirects
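The container is then restarted. A restart recreates the container from the image, so its filesystem is reset and the deleted file is back.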
[root@linuxea linuxea]# kubectl describe pods linuxea.com-httpget|egrep "Last|Restart|Liveness"
Last State: Terminated
Restart Count: 1
Liveness: http-get http://:http/index.html delay=1s timeout=1s period=3s #success=1 #failure=3
Warning Unhealthy 39s (x3 over 45s) kubelet, linuxea.node-2.com Liveness probe failed: Get 404.html: stopped after 10 redirects