Image pushed to ECR fails to run on EKS: the resulting Pods are always 0/2

February 8, 2024

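While troubleshooting a containerized application deployment, PHP editor Strawberry ran into a case where an image stored in ECR (Amazon Elastic Container Registry) would not run on EKS (Amazon Elastic Kubernetes Service). The Deployment stays at 0/2 ready Pods, which means the containers never start or run. Possible causes include a problem with the image itself, a mistake in the container configuration, or restrictions in the network environment. The question and the answer that resolved it follow.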

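Question

I have tried almost everything to get this working, but I still cannot get my Pods into the available state.

I have a basic application written in Go.

I built the image with docker build --tag docker-gs-ping .
Then I ran it in a container with docker run --publish 8080:8080 docker-gs-ping

Next I wanted to store the image in Amazon ECR, so I created a repository in ECR.

After creating the repository, I tagged the image that exists locally.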

docker tag f49366b7f534 ****40312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest

f49366b7f534 is the ID of my local image. docker-gs-ping is the repository name in ECR.

Then I pushed the tagged image to ECR with this command.

docker push ****40312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest

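I am not sure whether the command above pushes the tagged local image or the most recent one, since there seems to be no way to name the particular image to push to ECR.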
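On the uncertainty above: `docker push` takes a full image reference (registry, repository, and tag) and uploads whichever local image currently carries that exact name, so the image is selected by the `docker tag` step rather than by recency. A minimal sketch of tagging with an explicit version and pushing follows. The account ID `123456789012`, the tag `v1.0.0`, and the helper names `ecr_image_uri` and `push_to_ecr` are placeholders for illustration, not values from the original post.

```shell
# Sketch only: account ID, region, and version tag are placeholder assumptions.

# Build the fully qualified ECR reference:
#   <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
ecr_image_uri() {
  local account_id="$1" region="$2" repo="$3" tag="$4"
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$account_id" "$region" "$repo" "$tag"
}

# Log in to ECR, tag a local image, and push exactly that reference.
# Defined but not called here; it needs docker and configured AWS credentials.
push_to_ecr() {
  local local_image="$1" account_id="$2" region="$3" repo="$4" tag="$5"
  local uri
  uri="$(ecr_image_uri "$account_id" "$region" "$repo" "$tag")"
  aws ecr get-login-password --region "$region" \
    | docker login --username AWS --password-stdin \
        "${account_id}.dkr.ecr.${region}.amazonaws.com" || return 1
  docker tag "$local_image" "$uri" || return 1
  docker push "$uri"
}
```

Under these assumptions, `push_to_ecr docker-gs-ping 123456789012 us-east-1 docker-gs-ping v1.0.0` would push the locally built `docker-gs-ping` image under a versioned tag instead of `latest`.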

The result so far:

After completing the steps above, I created a VPC and an EKS cluster using the following files and commands:

EKS stack:

---
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Amazon EKS Cluster'

Parameters:
  ClusterName:
    Type: String
    Default: my-eks-cluster
  NumberOfWorkerNodes:
    Type: Number
    Default: 1
  WorkerNodesInstanceType:
    Type: String
    Default: t2.micro
  KubernetesVersion:
    Type: String
    Default: 1.22

Resources:

  ###########################################
  ## Roles
  ###########################################
  EksRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: my.eks.cluster.role
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - eks.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      ManagedPolicyArns:
        - "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  EksNodeRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: my.eks.node.role
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - ec2.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      ManagedPolicyArns:
        - "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
        - "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
        - "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"

  ###########################################
  ## EKS Cluster
  ###########################################

  EksCluster:
    Type: AWS::EKS::Cluster
    Properties:
      Name: !Ref ClusterName
      Version: !Ref KubernetesVersion
      RoleArn: !GetAtt EksRole.Arn
      ResourcesVpcConfig:
        SecurityGroupIds:
          - !ImportValue ControlPlaneSecurityGroupId
        SubnetIds: !Split [ ',', !ImportValue PrivateSubnetIds ]

  EksNodegroup:
    Type: AWS::EKS::Nodegroup
    DependsOn: EksCluster
    Properties:
      ClusterName: !Ref ClusterName
      NodeRole: !GetAtt EksNodeRole.Arn
      ScalingConfig:
        MinSize:
          Ref: NumberOfWorkerNodes
        DesiredSize:
          Ref: NumberOfWorkerNodes
        MaxSize:
          Ref: NumberOfWorkerNodes
      Subnets: !Split [ ',', !ImportValue PrivateSubnetIds ]

Command: aws cloudformation create-stack --region us-east-1 --stack-name my-eks-cluster --capabilities CAPABILITY_NAMED_IAM --template-body file://eks-stack.yaml

EKS VPC YAML

---
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Amazon EKS VPC - Private and Public subnets'

Parameters:

  VpcBlock:
    Type: String
    Default: 192.168.0.0/16
    Description: The CIDR range for the VPC. This should be a valid private (RFC 1918) CIDR range.

  PublicSubnet01Block:
    Type: String
    Default: 192.168.0.0/18
    Description: CidrBlock for public subnet 01 within the VPC

  PublicSubnet02Block:
    Type: String
    Default: 192.168.64.0/18
    Description: CidrBlock for public subnet 02 within the VPC

  PrivateSubnet01Block:
    Type: String
    Default: 192.168.128.0/18
    Description: CidrBlock for private subnet 01 within the VPC

  PrivateSubnet02Block:
    Type: String
    Default: 192.168.192.0/18
    Description: CidrBlock for private subnet 02 within the VPC

Metadata:
  AWS::CloudFormation::Interface:
    ParameterGroups:
      -
        Label:
          default: "Worker Network Configuration"
        Parameters:
          - VpcBlock
          - PublicSubnet01Block
          - PublicSubnet02Block
          - PrivateSubnet01Block
          - PrivateSubnet02Block

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcBlock
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-VPC'

  InternetGateway:
    Type: "AWS::EC2::InternetGateway"

  VPCGatewayAttachment:
    Type: "AWS::EC2::VPCGatewayAttachment"
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC

  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Public Subnets
        - Key: Network
          Value: Public

  PrivateRouteTable01:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Private Subnet AZ1
        - Key: Network
          Value: Private01

  PrivateRouteTable02:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: Private Subnet AZ2
        - Key: Network
          Value: Private02

  PublicRoute:
    DependsOn: VPCGatewayAttachment
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway

  PrivateRoute01:
    DependsOn:
      - VPCGatewayAttachment
      - NatGateway01
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable01
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGateway01

  PrivateRoute02:
    DependsOn:
      - VPCGatewayAttachment
      - NatGateway02
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: !Ref PrivateRouteTable02
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGateway02

  NatGateway01:
    DependsOn:
      - NatGatewayEIP1
      - PublicSubnet01
      - VPCGatewayAttachment
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt 'NatGatewayEIP1.AllocationId'
      SubnetId: !Ref PublicSubnet01
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-NatGatewayAZ1'

  NatGateway02:
    DependsOn:
      - NatGatewayEIP2
      - PublicSubnet02
      - VPCGatewayAttachment
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt 'NatGatewayEIP2.AllocationId'
      SubnetId: !Ref PublicSubnet02
      Tags:
        - Key: Name
          Value: !Sub '${AWS::StackName}-NatGatewayAZ2'

  NatGatewayEIP1:
    DependsOn:
      - VPCGatewayAttachment
    Type: 'AWS::EC2::EIP'
    Properties:
      Domain: vpc

  NatGatewayEIP2:
    DependsOn:
      - VPCGatewayAttachment
    Type: 'AWS::EC2::EIP'
    Properties:
      Domain: vpc

  PublicSubnet01:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Subnet 01
    Properties:
      MapPublicIpOnLaunch: true
      AvailabilityZone:
        Fn::Select:
          - '0'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PublicSubnet01Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${AWS::StackName}-PublicSubnet01"
        - Key: kubernetes.io/role/elb
          Value: 1

  PublicSubnet02:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Subnet 02
    Properties:
      MapPublicIpOnLaunch: true
      AvailabilityZone:
        Fn::Select:
          - '1'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PublicSubnet02Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${AWS::StackName}-PublicSubnet02"
        - Key: kubernetes.io/role/elb
          Value: 1

  PrivateSubnet01:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Subnet 03
    Properties:
      AvailabilityZone:
        Fn::Select:
          - '0'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PrivateSubnet01Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${AWS::StackName}-PrivateSubnet01"
        - Key: kubernetes.io/role/internal-elb
          Value: 1

  PrivateSubnet02:
    Type: AWS::EC2::Subnet
    Metadata:
      Comment: Private Subnet 02
    Properties:
      AvailabilityZone:
        Fn::Select:
          - '1'
          - Fn::GetAZs:
              Ref: AWS::Region
      CidrBlock:
        Ref: PrivateSubnet02Block
      VpcId:
        Ref: VPC
      Tags:
        - Key: Name
          Value: !Sub "${AWS::StackName}-PrivateSubnet02"
        - Key: kubernetes.io/role/internal-elb
          Value: 1

  PublicSubnet01RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet01
      RouteTableId: !Ref PublicRouteTable

  PublicSubnet02RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet02
      RouteTableId: !Ref PublicRouteTable

  PrivateSubnet01RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet01
      RouteTableId: !Ref PrivateRouteTable01

  PrivateSubnet02RouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet02
      RouteTableId: !Ref PrivateRouteTable02

  ControlPlaneSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Cluster communication with worker nodes
      VpcId: !Ref VPC

Outputs:

  PublicSubnetIds:
    Description: Public Subnets IDs in the VPC
    Value: !Join [ ",", [ !Ref PublicSubnet01, !Ref PublicSubnet02 ] ]
    Export:
      Name: PublicSubnetIds

  PrivateSubnetIds:
    Description: Private Subnets IDs in the VPC
    Value: !Join [ ",", [ !Ref PrivateSubnet01, !Ref PrivateSubnet02 ] ]
    Export:
      Name: PrivateSubnetIds

  ControlPlaneSecurityGroupId:
    Description: Security group for the cluster control plane communication with worker nodes
    Value: !Ref ControlPlaneSecurityGroup
    Export:
      Name: ControlPlaneSecurityGroupId

  VpcId:
    Description: The VPC Id
    Value: !Ref VPC
    Export:
      Name: VpcId

Command: aws cloudformation create-stack --region us-east-1 --stack-name my-eks-vpc --template-body file://eks-vpc-stack.yaml
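Because the EKS stack imports the control plane security group ID and the private subnet IDs that the VPC stack exports, the VPC stack has to finish creating before the cluster stack is submitted. The sketch below shows that ordering with `aws cloudformation wait stack-create-complete`, and spells the IAM acknowledgement flag as `--capabilities CAPABILITY_NAMED_IAM`, which is the form the AWS CLI accepts. The `run` wrapper, the `DRY_RUN` variable, and the `create_stacks` function name are hypothetical helpers added for illustration; stack names, region, and file names are taken from the commands in the post.

```shell
# Sketch only: print each command when DRY_RUN is set, otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the VPC stack, wait for it, then create the EKS stack that
# imports the VPC stack's exported outputs.
create_stacks() {
  local region="us-east-1"
  run aws cloudformation create-stack --region "$region" \
    --stack-name my-eks-vpc \
    --template-body file://eks-vpc-stack.yaml || return 1
  run aws cloudformation wait stack-create-complete --region "$region" \
    --stack-name my-eks-vpc || return 1
  run aws cloudformation create-stack --region "$region" \
    --stack-name my-eks-cluster \
    --capabilities CAPABILITY_NAMED_IAM \
    --template-body file://eks-stack.yaml || return 1
  run aws cloudformation wait stack-create-complete --region "$region" \
    --stack-name my-eks-cluster
}
```

Running `DRY_RUN=1 create_stacks` prints the four commands in order without contacting AWS, which is a cheap way to review them before a real run.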

Result after the command:

Now I tried to deploy the deployment.yaml and service.yaml files.

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
        - name: new-container
          image: ****40312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest
          ports:
            - containerPort: 80

Command and result:

Now service.yaml

apiVersion: v1
kind: Service
metadata:
  name: helloworld
spec:
  type: LoadBalancer
  selector:
    app: helloworld
  ports:
    - name: http
      port: 80
      targetPort: 80

Command and result:

After all of this, when I run kubectl get deployments, I get the following result:

To debug, I ran kubectl describe pod helloworld and got the following:

C:\Users\visratna\GolandProjects\testaws>kubectl describe pod helloworld
Name:             helloworld-c6dc56598-jmpvr
Namespace:        default
Priority:         0
Service Account:  default
Node:             docker-desktop/192.168.65.4
Start Time:       Fri, 07 Jul 2023 22:22:18 +0530
Labels:           app=helloworld
                  pod-template-hash=c6dc56598
Annotations:      <none>
Status:           Pending
IP:               10.1.0.7
IPs:
  IP:           10.1.0.7
Controlled By:  ReplicaSet/helloworld-c6dc56598
Containers:
  new-container:
    Container ID:
    Image:          549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-sldvv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-sldvv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  23m                   default-scheduler  Successfully assigned default/helloworld-c6dc56598-jmpvr to docker-desktop
  Normal   Pulling    22m (x4 over 23m)     kubelet            Pulling image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest"
  Warning  Failed     22m (x4 over 23m)     kubelet            Failed to pull image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest": rpc error: code = Unknown desc = Error response from daemon: Head "https://549840312665.dkr.ecr.us-east-1.amazonaws.com/v2/docker-gs-ping/manifests/latest": no basic auth credentials
  Warning  Failed     22m (x4 over 23m)     kubelet            Error: ErrImagePull
  Warning  Failed     22m (x6 over 23m)     kubelet            Error: ImagePullBackOff
  Normal   BackOff    3m47s (x85 over 23m)  kubelet            Back-off pulling image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest"

Name:             helloworld-c6dc56598-r9b4d
Namespace:        default
Priority:         0
Service Account:  default
Node:             docker-desktop/192.168.65.4
Start Time:       Fri, 07 Jul 2023 22:22:18 +0530
Labels:           app=helloworld
                  pod-template-hash=c6dc56598
Annotations:      <none>
Status:           Pending
IP:               10.1.0.6
IPs:
  IP:           10.1.0.6
Controlled By:  ReplicaSet/helloworld-c6dc56598
Containers:
  new-container:
    Container ID:
    Image:          549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest
    Image ID:
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-84rw4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-84rw4:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  23m                   default-scheduler  Successfully assigned default/helloworld-c6dc56598-r9b4d to docker-desktop
  Normal   Pulling    22m (x4 over 23m)     kubelet            Pulling image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest"
  Warning  Failed     22m (x4 over 23m)     kubelet            Failed to pull image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest": rpc error: code = Unknown desc = Error response from daemon: Head "https://549840312665.dkr.ecr.us-east-1.amazonaws.com/v2/docker-gs-ping/manifests/latest": no basic auth credentials
  Warning  Failed     22m (x4 over 23m)     kubelet            Error: ErrImagePull
  Warning  Failed     22m (x6 over 23m)     kubelet            Error: ImagePullBackOff
  Normal   BackOff    3m43s (x86 over 23m)  kubelet            Back-off pulling image "549840312665.dkr.ecr.us-east-1.amazonaws.com/docker-gs-ping:latest"

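I have tried many of the solutions suggested on Stack Overflow, but none of them worked for me. Any suggestions on how I can get this working? Thank you very much in advance.

Solution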

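A few things. First, you should avoid using the latest tag; it is an anti-pattern. When you push an image to ECR, use a build tag or version number as the image tag. Second, you need to verify that your worker nodes have permission to pull images from ECR, specifically the AmazonEC2ContainerRegistryReadOnly policy. Without it, the kubelet cannot pull images from ECR. If the registry is in a different account from the cluster, you also need to create a repository [resource] policy. See https://docs.aws.amazon.com/AmazonECR/latest/userguide/repository-policies.html.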
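One detail in the `kubectl describe` output is worth checking alongside the IAM side: both Pods were scheduled on a node named `docker-desktop`, which suggests `kubectl` may have been pointed at a local Docker Desktop cluster rather than the EKS cluster. A local kubelet does not use the EKS node IAM role, which would be consistent with the `no basic auth credentials` error. The sketch below checks the active context and then lists the policies attached to the node role. It assumes the kubeconfig entry was created by `aws eks update-kubeconfig` without `--alias`, so that the context name is the cluster ARN; `is_eks_context` and `diagnose_pull` are hypothetical helper names, while the cluster and role names come from the templates in the question.

```shell
# Sketch only. Succeeds when a context name looks like an EKS cluster ARN,
# e.g. arn:aws:eks:us-east-1:123456789012:cluster/my-eks-cluster
is_eks_context() {
  case "$1" in
    arn:aws*:eks:*:cluster/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Defined but not called here; it needs kubectl and configured AWS credentials.
diagnose_pull() {
  local region="us-east-1" cluster="my-eks-cluster" node_role="my.eks.node.role"
  local ctx
  ctx="$(kubectl config current-context)" || return 1
  if ! is_eks_context "$ctx"; then
    echo "Context '$ctx' does not look like EKS; updating kubeconfig"
    aws eks update-kubeconfig --region "$region" --name "$cluster" || return 1
  fi
  # The nodes listed should be EC2 instances, not docker-desktop.
  kubectl get nodes -o wide || return 1
  # AmazonEC2ContainerRegistryReadOnly is expected to appear in this list.
  aws iam list-attached-role-policies --role-name "$node_role" \
    --query 'AttachedPolicies[].PolicyName' --output text
}
```

If the context check switches clusters, the Deployment and Service would need to be applied again against EKS, since the earlier `kubectl apply` would have targeted the local cluster.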
