9.1健康检查

健康检查

kubernetes通过健康检查机制判断pod是否正常运行。若pod为正常运行，kuebernetes将自动重启pod。

kubernetes的健康检查分别有以下几种策略：

默认策略
基于bash的策略
基于http的策略
基于tcp的策略

默认策略

kubernetes默认策略为：只要PID为1的程序不退出，则认为容器是正常的

在容器中查看运行的进程可通过如下命令：

$ ps -ef

提示：若想测试默认策略，你可以通过如下命令杀死PID为1的进程，观察pod是否重启
$ kill 1

基于bash的策略

bash 命令执行成功返回0，失败返回1，返回值可通过$?获取。演示如下

$ pwd
/
$ echo $?
0

$ cd /hello
bash: cd: /hello: No such file or directory
$ echo $?
1

基于bash命令的这种特性，kubernetes提供了基于bash的策略。

设置bash策略，可按照如下方式设置deployment：

livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - ps -ef|grep java|grep -v grep
  initialDelaySeconds：10            # 容器启动10s之后再执行
  periodSecond：10                    # 健康检查的间隔，每10s检查一次
  failureThreshold：2                # 允许失败2次，失败次数超过2次，pod将重启
  successThreshold：1                # 从错误到正确，正确执行1次，却认为pod正常
  timeoutSeconds：5                    # 命令的最长等待时间，执行命令时间超过5s，任务本次健康检查失败

基于http的策略

基于http的策略：kubernetes通过发起http请求，若http的响应值为200，则认为健康检查通过，否则失败。

设置http策略，可按照如下方式设置deployment：

livenessProbe:
  httpGet:
    path: /examples/index.html
    port: 8080
    scheme: HTTP
  initialDelaySeconds：10            # 容器启动10s之后再执行
  periodSecond：10                    # 健康检查的间隔，每10s检查一次
  failureThreshold：2                # 允许失败2次，失败次数超过2次，pod将重启
  successThreshold：1                # 从错误到正确，正确执行1次，却认为pod正常
  timeoutSeconds：5                    # 命令的最长等待时间，执行命令时间超过5s，任务本次健康检查失败

注意：设置initialDelaySeconds的值时，尽量大于pod的预计启动时间

基于TCP的健康检查

基于tcp的策略：kubernetes判断port是否开启，若port开启，则认为健康检查通过，否则失败。

livenessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds：10            # 容器启动10s之后再执行
  periodSecond：10                    # 健康检查的间隔，每10s检查一次
  failureThreshold：2                # 允许失败2次，失败次数超过2次，pod将重启
  successThreshold：1                # 从错误到正确，正确执行1次，却认为pod正常
  timeoutSeconds：5                    # 命令的最长等待时间，执行命令时间超过5s，任务本次健康检查失败

提示：查看端口使用情况可通过以下命令
netstat -ntlp

扩展

故障排查

若pod频繁重启pod，可将重启策略从always更改为never，这样即使健康检查为通过，pod也不会重启，从而方便排错

Previous8.4Label-小标签大作为 Next9.2Ready检查

Last updated 4 years ago

Was this helpful?