AEWS 3주차 - EKS Storage, Managed Node Groups

AEWS

AEWS 3주차 - EKS Storage, Managed Node Groups

daniel00324 2025. 2. 17. 23:43

※ 본 게재 글은 gasida님의 'AEWS' 강의내용과 실습예제 및 AWS 공식사이트, 관련 Blog 등을 참고하여 작성하였습니다.

0. 실습환경 구성

▶ 구성도 : 2개의 VPC(EKS 배포, 운영용 구분), myeks-vpc의 public 에 EFS 추가

[ 구성 설명 ]

myeks-vpc 에 각기 AZ를 사용하는 퍼블릭/프라이빗 서브넷 배치
- EFS 스토리지 배포, 3개의 퍼블릭 서브넷에 네트워크 인터페이스 연동
- 로그밸런서 배포를 위한 퍼블릭/프라이빗 서브넷에 태그 설정 - Docs
- Amazon EKS optimized Amazon Linux 2023 accelerated AMIs now available - Link
operator-vpc 에 AZ1를 사용하는 퍼블릭/프라이빗 서브넷 배치 : 172.20.1.100 운영서버 EC2 배포
내부 통신을 위한 VPC Peering 배치

[1] AWS CloudFormation 을 통해 기본 실습 환경 배포

# yaml 파일 다운로드
curl -O https://s3.ap-northeast-2.amazonaws.com/cloudformation.cloudneta.net/K8S/myeks-3week.yaml

# 배포
# aws cloudformation deploy --template-file myeks-1week.yaml --stack-name mykops --parameter-overrides KeyName=<My SSH Keyname> SgIngressSshCidr=<My Home Public IP Address>/32 --region <리전>
예시) aws cloudformation deploy --template-file ~/Downloads/myeks-3week.yaml \
     --stack-name myeks --parameter-overrides KeyName=kp-gasida SgIngressSshCidr=$(curl -s ipinfo.io/ip)/32 --region ap-northeast-2

# CloudFormation 스택 배포 완료 후 운영서버 EC2 IP 출력
aws cloudformation describe-stacks --stack-name myeks --query 'Stacks[*].Outputs[*].OutputValue' --output text
예시) 3.35.137.31

# 운영서버 EC2 에 SSH 접속
예시) ssh ec2-user@3.35.137.31
ssh -i <ssh 키파일> ec2-user@$(aws cloudformation describe-stacks --stack-name myeks --query 'Stacks[*].Outputs[0].OutputValue' --output text)

[ 실행 결과 - 한 눈에 보기 ]

[2] eksctl 을 통해 EKS 배포

( ** aws configure 로 사전 AWS Account 접속 환경 구성 )

Step1. 변수 설정

export CLUSTER_NAME=myeks

# myeks-VPC/Subnet 정보 확인 및 변수 지정
export VPCID=$(aws ec2 describe-vpcs --filters "Name=tag:Name,Values=$CLUSTER_NAME-VPC" --query 'Vpcs[*].VpcId' --output text)
echo $VPCID

export PubSubnet1=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet1" --query "Subnets[0].[SubnetId]" --output text)
export PubSubnet2=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet2" --query "Subnets[0].[SubnetId]" --output text)
export PubSubnet3=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet3" --query "Subnets[0].[SubnetId]" --output text)
echo $PubSubnet1 $PubSubnet2 $PubSubnet3


#------------------ 
## SSHKEYNAME=<각자 자신의 SSH Keypair 이름>
SSHKEYNAME=kp-kyukim

Step2. myeks.yaml 파일 작성 : vpc/subnet 과 ssh 키 이름 수정 ( step1. 변수설정 참조 동적반영)

cat << EOF > myeks.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: myeks
  region: ap-northeast-2
  version: "1.31"

iam:
  withOIDC: true # enables the IAM OIDC provider as well as IRSA for the Amazon CNI plugin

  serviceAccounts: # service accounts to create in the cluster. See IAM Service Accounts
  - metadata:
      name: aws-load-balancer-controller
      namespace: kube-system
    wellKnownPolicies:
      awsLoadBalancerController: true

vpc:
  cidr: 192.168.0.0/16
  clusterEndpoints:
    privateAccess: true # if you only want to allow private access to the cluster
    publicAccess: true # if you want to allow public access to the cluster
  id: $VPCID
  subnets:
    public:
      ap-northeast-2a:
        az: ap-northeast-2a
        cidr: 192.168.1.0/24
        id: $PubSubnet1
      ap-northeast-2b:
        az: ap-northeast-2b
        cidr: 192.168.2.0/24
        id: $PubSubnet2
      ap-northeast-2c:
        az: ap-northeast-2c
        cidr: 192.168.3.0/24
        id: $PubSubnet3

addons:
  - name: vpc-cni # no version is specified so it deploys the default version
    version: latest # auto discovers the latest available
    attachPolicyARNs: # attach IAM policies to the add-on's service account
      - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
    configurationValues: |-
      enableNetworkPolicy: "true"

  - name: kube-proxy
    version: latest

  - name: coredns
    version: latest

  - name: metrics-server
    version: latest

managedNodeGroups:
- amiFamily: AmazonLinux2023
  desiredCapacity: 3
  iam:
    withAddonPolicies:
      certManager: true # Enable cert-manager
      externalDNS: true # Enable ExternalDNS
  instanceType: t3.medium
  preBootstrapCommands:
    # install additional packages
    - "dnf install nvme-cli links tree tcpdump sysstat ipvsadm ipset bind-utils htop -y"
  labels:
    alpha.eksctl.io/cluster-name: myeks
    alpha.eksctl.io/nodegroup-name: ng1
  maxPodsPerNode: 100
  maxSize: 3
  minSize: 3
  name: ng1
  ssh:
    allow: true
    publicKeyName: $SSHKEYNAME
  tags:
    alpha.eksctl.io/nodegroup-name: ng1
    alpha.eksctl.io/nodegroup-type: managed
  volumeIOPS: 3000
  volumeSize: 120
  volumeThroughput: 125
  volumeType: gp3
EOF

Step3. Yaml로 eks 배포

eksctl create cluster -f myeks.yaml --verbose 4

Step4. 배포 완료 후 기본정보 확인

EKS 관리 콘솔 확인 : Overview, Compute, Networking, Add-ons, Access
EKS 정보 확인 ( ** kubectl 환경 구성 : Link )

## aws eks update-kubeconfig --region [region-code] --name [my-cluster]
aws eks update-kubeconfig --region ap-northeast-2 --name myeks

#
kubectl cluster-info

# 네임스페이스 default 변경 적용
kubens default

# 
kubectl ctx
kubectl config rename-context "<각자 자신의 IAM User>@myeks.ap-northeast-2.eksctl.io" "eksworkshop"
kubectl config rename-context "admin@myeks.ap-northeast-2.eksctl.io" "eksworkshop"

#
kubectl get node --label-columns=node.kubernetes.io/instance-type,eks.amazonaws.com/capacityType,topology.kubernetes.io/zone
kubectl get node -v=6

#
kubectl get pod -A
kubectl get pdb -n kube-system

# 관리형 노드 그룹 확인
eksctl get nodegroup --cluster $CLUSTER_NAME
aws eks describe-nodegroup --cluster-name $CLUSTER_NAME --nodegroup-name ng1 | jq

# eks addon 확인
eksctl get addon --cluster $CLUSTER_NAME

# aws-load-balancer-controller를 위한 iam service account 생성 확인 : AWS IAM role bound to a Kubernetes service account
eksctl get iamserviceaccount --cluster $CLUSTER_NAME
NAMESPACE	NAME				ROLE ARN
kube-system	aws-load-balancer-controller	arn:aws:iam::911283464785:role/eksctl-myeks-addon-iamserviceaccount-kube-sys-Role1-S60JsHI62pHB

EC2 관리 콘솔 확인 : type, az, IP, ec2 instance profile → iam role 확인

[ 실행 결과 - 한 눈에 보기 ]

☞ eksctl 로 cluster 배포 시, eksctl 버전이 특정 버전 이하이면 하기와 같은 bug 발생!!

해결 방법은 eksctl 버전 업그레이드 이후, 다시 수행하면 된다. - 참조 링크

[3] 관리형 노드 그룹(EC2) 접속 및 노드 정보 확인 : max-pods

관리 콘솔 EC2 서비스 : 관리형 노드 그룹(EC2) 에 보안그룹 ID 확인
해당 보안그룹 inbound 에 자신의 집 공인 IP 추가 후 접속 확인

# 인스턴스 정보 확인 1
aws ec2 describe-instances --query "Reservations[*].Instances[*].{InstanceID:InstanceId, PublicIPAdd:PublicIpAddress, PrivateIPAdd:PrivateIpAddress, InstanceName:Tags[?Key=='Name']|[0].Value, Status:State.Name}" --filters Name=instance-state-name,Values=running --output table

# 인스턴스 정보 확인 2 : AZ, ID, 공인IP
aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=myeks-ng1-Node" \
    --query "Reservations[].Instances[].{InstanceID:InstanceId, PublicIP:PublicIpAddress, AZ:Placement.AvailabilityZone}" \
    --output table

# AZ1 배치된 EC2 공인 IP
aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=myeks-ng1-Node" "Name=availability-zone,Values=ap-northeast-2a" \
    --query 'Reservations[*].Instances[*].PublicIpAddress' \
    --output text

# AZ2 배치된 EC2 공인 IP
aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=myeks-ng1-Node" "Name=availability-zone,Values=ap-northeast-2b" \
    --query 'Reservations[*].Instances[*].PublicIpAddress' \
    --output text

# AZ3 배치된 EC2 공인 IP
aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=myeks-ng1-Node" "Name=availability-zone,Values=ap-northeast-2c" \
    --query 'Reservations[*].Instances[*].PublicIpAddress' \
    --output text

# EC2 공인 IP 변수 지정
export N1=$(aws ec2 describe-instances --filters "Name=tag:Name,Values=myeks-ng1-Node" "Name=availability-zone,Values=ap-northeast-2a" --query 'Reservations[*].Instances[*].PublicIpAddress' --output text)
export N2=$(aws ec2 describe-instances --filters "Name=tag:Name,Values=myeks-ng1-Node" "Name=availability-zone,Values=ap-northeast-2b" --query 'Reservations[*].Instances[*].PublicIpAddress' --output text)
export N3=$(aws ec2 describe-instances --filters "Name=tag:Name,Values=myeks-ng1-Node" "Name=availability-zone,Values=ap-northeast-2c" --query 'Reservations[*].Instances[*].PublicIpAddress' --output text)
echo $N1, $N2, $N3


# *remoteAccess* 포함된 보안그룹 ID
aws ec2 describe-security-groups --filters "Name=group-name,Values=*remoteAccess*" | jq
export MNSGID=$(aws ec2 describe-security-groups --filters "Name=group-name,Values=*remoteAccess*" --query 'SecurityGroups[*].GroupId' --output text)

# 해당 보안그룹 inbound 에 자신의 집 공인 IP 룰 추가
aws ec2 authorize-security-group-ingress --group-id $MNSGID --protocol '-1' --cidr $(curl -s ipinfo.io/ip)/32

# 해당 보안그룹 inbound 에 운영서버 내부 IP 룰 추가
aws ec2 authorize-security-group-ingress --group-id $MNSGID --protocol '-1' --cidr 172.20.1.100/32


# ping 테스트
ping -c 2 $N1
ping -c 2 $N2
ping -c 2 $N3

# 워커 노드 SSH 접속
ssh -i <SSH 키> -o StrictHostKeyChecking=no ec2-user@$N1 hostname
for i in $N1 $N2 $N3; do echo ">> node $i <<"; ssh -o StrictHostKeyChecking=no ec2-user@$i hostname; echo; done

ssh ec2-user@$N1
exit
ssh ec2-user@$N2
exit
ssh ec2-user@$N2
exit

[ 실행 결과 - 한 눈에 보기 ]

(옵션) AWS EC2 System Manager - Session Manager 로 관리형 노드 그룹(EC2) 접속

- ssm manager plug-In 설치 : Link

☞ WSL 환경에서 가이드 대로 plug-In 설치 후, 노드로 접근 테스트 함!!

노드 정보 확인 & max-pods 정보 확인

# 노드 기본 정보 확인
for i in $N1 $N2 $N3; do echo ">> node $i <<"; ssh ec2-user@$i hostnamectl; echo; done
for i in $N1 $N2 $N3; do echo ">> node $i <<"; ssh ec2-user@$i sudo ip -c addr; echo; done

# 
for i in $N1 $N2 $N3; do echo ">> node $i <<"; ssh ec2-user@$i lsblk; echo; done
for i in $N1 $N2 $N3; do echo ">> node $i <<"; ssh ec2-user@$i df -hT /; echo; done


# 스토리지클래스 및 CSI 노드 확인
kubectl get sc
kubectl describe sc gp2

kubectl get crd
kubectl get csinodes


# max-pods 정보 확인
kubectl describe node | grep Capacity: -A13
kubectl get nodes -o custom-columns="NAME:.metadata.name,MAXPODS:.status.capacity.pods"

# 노드에서 확인
## for i in $N1 $N2 $N3; do echo ">> node $i <<"; ssh ec2-user@$i cat /etc/eks/bootstrap.sh; echo; done
ssh ec2-user@$N1 sudo cat /etc/kubernetes/kubelet/config.json | jq
for i in $N1 $N2 $N3; do echo ">> node $i <<"; ssh ec2-user@$i sudo cat /etc/kubernetes/kubelet/config.json | grep maxPods; echo; done
for i in $N1 $N2 $N3; do echo ">> node $i <<"; ssh ec2-user@$i sudo cat /etc/kubernetes/kubelet/config.json.d/00-nodeadm.conf | grep maxPods; echo; done

(참고) If you don’t specify an AMI ID for the bootstrap.sh file included with Amazon EKS optimized Linux or Bottlerocket, managed node groups enforce a maximum number on the value of maxPods. For instances with less than 30 vCPUs, the maximum number is 110. For instances with greater than 30 vCPUs, the maximum number jumps to 250. These numbers are based on Kubernetes scalability thresholds and recommended settings by internal Amazon EKS scalability team testing. For more information, see the Amazon VPC CNI plugin increases pods per node limits blog post.

[ 실행 결과 - 한 눈에 보기 ]

[4] 운영서버 EC2 : eks kubeconfig 설정, EFS 마운트 테스트

Step1. [운영서버 EC2] eks kubeconfig 설정

# eks 설치한 iam 자격증명을 설정하기
aws configure
...

# get-caller-identity 확인
aws sts get-caller-identity --query Arn

# kubeconfig 생성
aws eks update-kubeconfig --name myeks --user-alias <위 출력된 자격증명 사용자>
aws eks update-kubeconfig --name myeks --user-alias admin

# 
kubectl cluster-info
kubectl ns default
kubectl get node -v6

Step2. [운영서버 EC2] EFS 마운트 테스트

# 현재 EFS 정보 확인
aws efs describe-file-systems | jq

# 파일 시스템 ID만 출력
aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text
fs-040469b8fab273469

# EFS 마운트 대상 정보 확인
aws efs describe-mount-targets --file-system-id $(aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text) | jq

# IP만 출력 : 
aws efs describe-mount-targets --file-system-id $(aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text) --query "MountTargets[*].IpAddress" --output text
192.168.2.102	192.168.1.71	192.168.3.184

# DNS 질의 : 안되는 이유가 무엇일까요?
# EFS 도메인 이름(예시) : fs-040469b8fab273469.efs.ap-northeast-2.amazonaws.com
dig +short $(aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text).efs.ap-northeast-2.amazonaws.com


# EFS 마운트 테스트
EFSIP1=<IP만 출력에서 아무 IP나 지정>
EFSIP1=192.168.1.71

df -hT
mkdir /mnt/myefs
mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport $EFSIP1:/ /mnt/myefs
findmnt -t nfs4
df -hT --type nfs4

# 파일 작성
nfsstat
echo "EKS Workshop" > /mnt/myefs/memo.txt
nfsstat
ls -l /mnt/myefs
cat /mnt/myefs/memo.txt

# EC2 재부팅 이후에도 mount 탑재가 될 수 있게 설정 해보자! : (힌트 : /etc/fstab)

[ 실행 결과 - 한 눈에 보기 ]

[5] EKS 배포 후 실습 편의를 위한 설정 : macOS, Windows(WSL2)

☞ Windows (WSL2 - Ubuntu) ⇒ 실습 완료 후 삭제 할 것!

# 실습 완료 후 삭제 할 것!
MyDomain=gasida.link # 각자 자신의 도메인 이름 입력
MyDnzHostedZoneId=$(aws route53 list-hosted-zones-by-name --dns-name "$MyDomain." --query "HostedZones[0].Id" --output text)

cat << EOF >> ~/.bashrc

# eksworkshop
export CLUSTER_NAME=myeks
export VPCID=$(aws ec2 describe-vpcs --filters "Name=tag:Name,Values=$CLUSTER_NAME-VPC" --query 'Vpcs[*].VpcId' --output text)
export PubSubnet1=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet1" --query "Subnets[0].[SubnetId]" --output text)
export PubSubnet2=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet2" --query "Subnets[0].[SubnetId]" --output text)
export PubSubnet3=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet3" --query "Subnets[0].[SubnetId]" --output text)
export N1=$(aws ec2 describe-instances --filters "Name=tag:Name,Values=$CLUSTER_NAME-ng1-Node" "Name=availability-zone,Values=ap-northeast-2a" --query 'Reservations[*].Instances[*].PublicIpAddress' --output text)
export N2=$(aws ec2 describe-instances --filters "Name=tag:Name,Values=$CLUSTER_NAME-ng1-Node" "Name=availability-zone,Values=ap-northeast-2b" --query 'Reservations[*].Instances[*].PublicIpAddress' --output text)
export N3=$(aws ec2 describe-instances --filters "Name=tag:Name,Values=$CLUSTER_NAME-ng1-Node" "Name=availability-zone,Values=ap-northeast-2c" --query 'Reservations[*].Instances[*].PublicIpAddress' --output text)
MyDomain=gasida.link # 각자 자신의 도메인 이름 입력
MyDnzHostedZoneId=$(aws route53 list-hosted-zones-by-name --dns-name "$MyDomain." --query "HostedZones[0].Id" --output text)
EOF

# [신규 터미널] 확인
echo $CLUSTER_NAME $VPCID $PubSubnet1 $PubSubnet2 $PubSubnet3
echo $N1 $N2 $N3 $MyDomain $MyDnzHostedZoneId
tail -n 12 ~/.bashrc

[6] AWS LoadBalancerController, ExternalDNS, kube-ops-view 설치

▶ 설치 ( 참고 : ACM 생성 및 ELB 연동 테스트 - Link )

# kube-ops-view
helm repo add geek-cookbook https://geek-cookbook.github.io/charts/
helm install kube-ops-view geek-cookbook/kube-ops-view --version 1.2.2 --set service.main.type=ClusterIP  --set env.TZ="Asia/Seoul" --namespace kube-system

# AWS LoadBalancerController
helm repo add eks https://aws.github.io/eks-charts
helm repo update
kubectl get sa -n kube-system aws-load-balancer-controller
helm install aws-load-balancer-controller eks/aws-load-balancer-controller -n kube-system --set clusterName=$CLUSTER_NAME \
  --set serviceAccount.create=false --set serviceAccount.name=aws-load-balancer-controller

# ExternalDNS
MyDomain=<자신의 도메인>
MyDomain=gasida.link
MyDnzHostedZoneId=$(aws route53 list-hosted-zones-by-name --dns-name "$MyDomain." --query "HostedZones[0].Id" --output text)
curl -s https://raw.githubusercontent.com/gasida/PKOS/main/aews/externaldns.yaml | MyDomain=$MyDomain MyDnzHostedZoneId=$MyDnzHostedZoneId envsubst | kubectl apply -f -

# 사용 리전의 인증서 ARN 확인 : 정상 상태 확인(만료 상태면 에러 발생!)
CERT_ARN=$(aws acm list-certificates --query 'CertificateSummaryList[].CertificateArn[]' --output text)
echo $CERT_ARN

# kubeopsview 용 Ingress 설정 : group 설정으로 1대의 ALB를 여러개의 ingress 에서 공용 사용
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/certificate-arn: $CERT_ARN
    alb.ingress.kubernetes.io/group.name: study
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}, {"HTTP":80}]'
    alb.ingress.kubernetes.io/load-balancer-name: myeks-ingress-alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/ssl-redirect: "443"
    alb.ingress.kubernetes.io/success-codes: 200-399
    alb.ingress.kubernetes.io/target-type: ip
  labels:
    app.kubernetes.io/name: kubeopsview
  name: kubeopsview
  namespace: kube-system
spec:
  ingressClassName: alb
  rules:
  - host: kubeopsview.$MyDomain
    http:
      paths:
      - backend:
          service:
            name: kube-ops-view
            port:
              number: 8080
        path: /
        pathType: Prefix
EOF

▶ 확인

# 설치된 파드 정보 확인
kubectl get pods -n kube-system

# service, ep, ingress 확인
kubectl get ingress,svc,ep -n kube-system

# Kube Ops View 접속 정보 확인 
echo -e "Kube Ops View URL = https://kubeopsview.$MyDomain/#scale=1.5"
open "https://kubeopsview.$MyDomain/#scale=1.5" # macOS

[ 실습 결과 - 한 눈에 보기 ]

1. 스토리지의 이해

▶ 배경 소개 - AWS-Blog

1. 임시 볼륨 (Ephemeral Volume) : Temporary filesystem, Volume - 링크

- 파드 내 존재하며, 동일 파드 내 Containers 간에 데이터를 공유할 수 있다.

- 임시 파일시스템은 container와 동일한 life-cycle을 갖으며, 임시 볼륨은 Pod와 동일한 Life-cycle을 갖는다. 결론적으로, 파드 내부의 데이터는 파드가 삭제되면 모두 삭제된다.

출처 : https://aws.amazon.com/ko/blogs/tech/persistent-storage-for-kubernetes/

2. 영구 볼륨 (Persistent-Volume)

- PV는 POD의 수명주기와 상관 없이, 데이터를 영구적으로 보관 및 제공해 줄 수 있다.

- POD는 PVC (PersistentVolumeClaim) 를 통해 PV를 연결하여 사용할 수 있다.

[ PV의 특-장점 ]

PV는 Pod의 수명 주기에 구속되지 않습니다: PV객체에 연결된 포드를 삭제해도 해당 PV는 유지됩니다.
앞의 설명은 포드 충돌 시에도 유효합니다: PV 객체는 장애시에도 유지되고 클러스터에서 제거되지 않습니다.
PV는 클러스터 전체에 적용됩니다: 클러스터의 모든 노드에서 실행 중인 모든 포드에서 연결할 수 있습니다.

[ K8S의 접근모드 3가지 ]

ReadWriteOnce: 볼륨은 동시에 하나의 노드에서만 읽기/쓰기를 허용합니다.
ReadOnlyMany: 볼륨은 동시에 여러 노드에서 읽기 전용 모드를 허용합니다.
ReadWriteMany: 볼륨은 동시에 여러 노드에서 읽기/쓰기를 허용합니다.

3. K8S 의 볼륨 마운트 기술 (2가지)

1) Static Provisioning : 사용자가 수동으로 PV, PVC를 만들어 파드에 볼륨을 연동하는 기술

2) Dynamic Provisioning : POD가 생성될 때, 자동으로 볼륨을 마운트하여 파드에 연결하는 기능

4. 볼륨 초기화 정책

- Reclaim Policy : PV의 사용이 만료되어 정리할 때 사용되는 정책으로 Retain(보존), Delete(삭제) 가 있다.

Delete 정책의 경우, EBS 볼륨도 삭제됨

5. CSI ( Container Storage Interface )

1) 등장 배경 : K8S 내부 소스코드에 존재하는 AWS EBS Provisioner는 Cluster 배포와 동일한 Life-Cycle을 가지고 있다. 따라서, provisioner의 신규 기능을 사용하고자 할 경우 별도 업그레이드를 해야 하는 제약사항이 발생한다. Kubernetes 개발자는 해당 관리를 위해 Kubernetes 내부에 내장된 provisioner (in-tree)를 모두 삭제하고, 별도의 controller Pod을 통해 동적 provisioning을 사용할 수 있도록 CSI를 만들게 되었다.

2) 활용 : CSI 를 사용하면, K8S 의 공통화된 CSI 인터페이스를 통해 다양한 프로바이더를 사용할 수 있다.

하기 그림은 일반적인 CSI driver의 구조입니다. AWS EBS CSI driver 역시 아래와 같은 구조를 가지는데, 오른쪽 StatefulSet 또는 Deployment로 배포된 controller Pod이 AWS API를 사용하여 실제 EBS volume을 생성하는 역할을 합니다. 왼쪽 DaemonSet으로 배포된 node Pod은 AWS API를 사용하여 Kubernetes node (EC2 instance)에 EBS volume을 attach 해줍니다.

▶ 실습

☞ POD 기본 저장소 동작확인 - Docs

Step1. redis POD 생성 및 데이터 입력 후, 재기동 시 동작 확인

# 모니터링
kubectl get pod -w

# redis 파드 생성
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: redis
    image: redis
EOF

# redis 파드 내에 파일 작성
kubectl exec -it redis -- pwd
kubectl exec -it redis -- sh -c "echo hello > /data/hello.txt"
kubectl exec -it redis -- cat /data/hello.txt

# ps 설치
kubectl exec -it redis -- sh -c "apt update && apt install procps -y"
kubectl exec -it redis -- ps aux

# redis 프로세스 강제 종료 : 파드가 어떻게 되나요? hint) restartPolicy
kubectl exec -it redis -- kill 1
kubectl get pod

# redis 파드 내에 파일 확인
kubectl exec -it redis -- cat /data/hello.txt
kubectl exec -it redis -- ls -l /data

# 파드 삭제
kubectl delete pod redis

[ 실행 결과 - 한 눈에 보기 ]

☞ 'POD 기본 저장소' 의 수명주기는 POD의 수명주기와 동일!! ( POD 재기동 시, 내용 사라진다!! )

☞ 'empty Dir' 동작확인 - Docs

# 모니터링
kubectl get pod -w

# redis 파드 생성
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: redis
    image: redis
    volumeMounts:
    - name: redis-storage
      mountPath: /data/redis
  volumes:
  - name: redis-storage
    emptyDir: {}
EOF

# redis 파드 내에 파일 작성
kubectl exec -it redis -- pwd
kubectl exec -it redis -- sh -c "echo hello > /data/redis/hello.txt"
kubectl exec -it redis -- cat /data/redis/hello.txt

# ps 설치
kubectl exec -it redis -- sh -c "apt update && apt install procps -y"
kubectl exec -it redis -- ps aux

# redis 프로세스 강제 종료 : 파드가 어떻게 되나요? hint) restartPolicy
kubectl exec -it redis -- kill 1
kubectl get pod

# redis 파드 내에 파일 확인
kubectl exec -it redis -- cat /data/redis/hello.txt
kubectl exec -it redis -- ls -l /data/redis

# 파드 삭제 후 파일 확인
kubectl delete pod redis
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: redis
    image: redis
    volumeMounts:
    - name: redis-storage
      mountPath: /data/redis
  volumes:
  - name: redis-storage
    emptyDir: {}
EOF

# redis 파드 내에 파일 확인
kubectl exec -it redis -- cat /data/redis/hello.txt
kubectl exec -it redis -- ls -l /data/redis

# 파드 삭제
kubectl delete pod redis

[ 실행 결과 - 한 눈에 보기 ]

☞ 호스트 Path 를 사용하는 PV/PVC : local-path-provisioner 스트리지 클래스 배포 - 링크

Step1. Local-path-Storage & Provisioner 설치 (생성)

# 배포
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.31/deploy/local-path-storage.yaml
...
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-path-config
  namespace: local-path-storage
data:
  config.json: |-
    {
            "nodePathMap":[
            {
                    "node":"DEFAULT_PATH_FOR_NON_LISTED_NODES",
                    "paths":["/opt/local-path-provisioner"]
            }
            ]
    }
  setup: |-
    #!/bin/sh
    set -eu
    mkdir -m 0777 -p "$VOL_DIR"
  teardown: |-
    #!/bin/sh
    set -eu
    rm -rf "$VOL_DIR"
...

# 확인
kubectl get-all -n local-path-storage
kubectl get pod -n local-path-storage -owide
kubectl describe cm -n local-path-storage local-path-config
kubectl get sc
kubectl get sc local-path
NAME         PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  34s

(참고) The provisioner supports automatic configuration reloading. Users can change the configuration using kubectl apply or kubectl edit with config map local-path-config

Step2. PV/PVC 를 사용하는 파드 생성

# PVC 생성
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: localpath-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
EOF

# PVC 확인
kubectl get pvc
kubectl describe pvc


# 파드 생성
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo \$(date -u) >> /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: localpath-claim
EOF

# 파드 확인
kubectl get pod,pv,pvc
kubectl describe pv    # Node Affinity 확인
kubectl exec -it app -- tail -f /data/out.txt
Thu Feb 13 09:33:53 UTC 2025
Thu Feb 13 09:33:58 UTC 2025
... 

# 워커노드 중 현재 파드가 배포되어 있다만, 아래 경로에 out.txt 파일 존재 확인
for node in $N1 $N2 $N3; do ssh ec2-user@$node tree /opt/local-path-provisioner; done
/opt/local-path-provisioner
└── pvc-f1615862-e4cd-47d0-b89c-8d0e99270678_default_localpath-claim
    └── out.txt

# 해당 워커노드 자체에서 out.txt 파일 확인 : 아래 굵은 부분은 각자 실습 환경에 따라 다름
ssh ec2-user@$N1 tail -f /opt/local-path-provisioner/pvc-f1615862-e4cd-47d0-b89c-8d0e99270678_default_localpath-claim/out.txt
...

Step3. 파드 삭제 후 파드 재생성해서 데이터 유지 되는지 확인

# 파드 삭제 후 PV/PVC 확인
kubectl delete pod app
kubectl get pod,pv,pvc
for node in $N1 $N2 $N3; do ssh ec2-user@$node tree /opt/local-path-provisioner; done

# 파드 다시 실행
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo \$(date -u) >> /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: localpath-claim
EOF
 
# 확인
kubectl exec -it app -- head /data/out.txt
kubectl exec -it app -- tail -f /data/out.txt

Step4. 다음 실습을 위해서 파드와 PVC 삭제

# 파드와 PVC 삭제 
kubectl delete pod app
kubectl get pv,pvc
kubectl delete pvc localpath-claim

# 확인
kubectl get pv
for node in $N1 $N2 $N3; do ssh ec2-user@$node tree /opt/local-path-provisioner; done

☞ Kubestr 모니터링 및 성능 측정 확인 (NVMe SSD) - 링크 Github 한글 CloudStorage

- Kubestr 이용한 성능 측정 - 링크 , Youtube , Blog ⇒ local-path 와 NFS 등 스토리지 클래스의 IOPS 차이를 확인해 보자!!

# [운영서버 EC2] kubestr 툴 다운로드 - Link
wget https://github.com/kastenhq/kubestr/releases/download/v0.4.48/kubestr_0.4.48_Linux_amd64.tar.gz
tar xvfz kubestr_0.4.48_Linux_amd64.tar.gz && mv kubestr /usr/local/bin/ && chmod +x /usr/local/bin/kubestr

# 스토리지클래스 점검
kubestr -h
kubestr

# 모니터링
watch 'kubectl get pod -owide;echo;kubectl get pv,pvc'

## 아래 서버 각각 접속 후 iostat 명령 실행 해두기 : 입출력(I/O) 통계 확인
ssh ec2-user@$N1
ssh ec2-user@$N2
ssh ec2-user@$N3
-----------------
iostat -xmdz 1
--------------------------------------------------------------
# rrqm/s : 초당 드라이버 요청 대기열에 들어가 병합된 읽기 요청 횟수
# wrqm/s : 초당 드라이버 요청 대기열에 들어가 병합된 쓰기 요청 횟수
# r/s : 초당 디스크 장치에 요청한 읽기 요청 횟수
# w/s : 초당 디스크 장치에 요청한 쓰기 요청 횟수
# rMB/s : 초당 디스크 장치에서 읽은 메가바이트 수
# wMB/s : 초당 디스크 장치에 쓴 메가바이트 수
# await : 가장 중요한 지표, 평균 응답 시간. 드라이버 요청 대기열에서 기다린 시간과 장치의 I/O 응답시간을 모두 포함 (단위: ms)
iostat -xmdz 1 -p xvdf
Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvdf              0.00     0.00 2637.93    0.00    10.30     0.00     8.00     6.01    2.28    2.28    0.00   0.33  86.21
--------------------------------------------------------------

# 랜덤 읽기 성능 테스트 수행 : 3분 정도 소요
# libaio 엔진과 다이렉트 I/O를 사용하여 고성능 스토리지의 랜덤 읽기 성능을 측정.
# OS 캐시를 사용하지 않고 직접 디스크 I/O 수행 (direct 플래그)
cat << EOF > fio-read.fio
[global]
ioengine=libaio
direct=1
bs=4k
runtime=120
time_based=1
iodepth=16
numjobs=4
group_reporting
size=1g
rw=randread
[read]
EOF
kubestr fio -f fio-read.fio -s local-path --size 10G  # size 미 지정시 기본 100G로 노드 Disk full 발생하니 유의
PVC created kubestr-fio-pvc-v5wzp
Pod created kubestr-fio-pod-24nqd
Running FIO test (fio-read.fio) on StorageClass (local-path) with a PVC of Size (10G)
Elapsed time- 2m33.103404647s
FIO test results:
  
FIO version - fio-3.36
Global options - ioengine=libaio verify= direct=1 gtod_reduce=

JobName: 
  blocksize= filesize= iodepth= rw=
read:
  IOPS=3023.671631 BW(KiB/s)=12094
  iops: min=2306 max=8972 avg=3025.259521
  bw(KiB/s): min=9224 max=35888 avg=12101.108398

Disk stats (read/write):
  nvme0n1: ios=362470/161 merge=0/55 ticks=6337974/3107 in_queue=6341080, util=96.498177%
  -  OK

# (참고) [NVMe] Read 평균 IOPS는 20300
kubestr fio -f fio-read.fio -s local-path --size 10G
...
read:
  IOPS=20300.531250 BW(KiB/s)=81202
  iops: min=17304 max=71653 avg=20309.919922
  bw(KiB/s): min=69216 max=286612 avg=81239.710938


# 랜덤 쓰기 성능 테스트 수행 : 5분 정도 소요
# numjobs=16, iodepth=16 : 총 16×16 = 256개의 I/O 요청이 동시에 발생
cat << EOF > fio-write.fio
[global]
ioengine=libaio
numjobs=16
iodepth=16
direct=1
bs=4k
runtime=120
time_based=1
size=1g
group_reporting
rw=randrw
rwmixread=0
rwmixwrite=100
[write]
EOF
kubestr fio -f fio-write.fio -s local-path --size 20G
...
write:
  IOPS=3024.619873 BW(KiB/s)=12098
  iops: min=1557 max=8682 avg=3024.849365
  bw(KiB/s): min=6231 max=34732 avg=12099.703125
...

[ 실행 결과 - 한 눈에 보기 ]

(참고) Linux 6.9 I/O Stack - Image , 한글소개

2. AWS EBS Controller

[ 내용 이해하기 - 요약 ]

- EKS에서는 다양한 Storage Class의 연동을 위해 CSI 인터페이스를 제공하고 있다. 이 중에 쿠버네티스 CSI 인터페이스에 맞춰 개발된 AWS CSI를 CSI 드라이버 라고 지칭한다.

- AWS CSI 드라이버는 크게 2개 구성요소가 있다.

1) AWS API를 호출하면서 AWS 스토리지를 관리하는 CSI-Controller

2) kubelet과 상호작용하면서 AWS스토리지를 pod에 마운트하는 CSI-Node

출처 : https://malwareanalysis.tistory.com/598

[ More ... ]

persistentvolume, persistentvolumeclaim의 accessModes는 ReadWriteOnce로 설정해야 합니다.- why?
1) EBS 호환성 측면 : 다수 노드 동시 접근금지
2) 데이터 안정성 측명 : a. 데이터의 일관성과 안정성 유지 b. 다중 노드에서 경합 방지
EBS스토리지 기본 설정이 동일 AZ에 있는 EC2 인스턴스(에 배포된 파드)에 연결해야 합니다.

참고 링크 :

1) 악분님 Blog

2) Volume (ebs-csi-controller) : EBS CSI driver 동작 : 볼륨 생성 및 파드에 볼륨 연결 - 링크 , Docs , EBS

[ 실습 ]

1) 설치 : Amazon EBS CSI driver as an Amazon EKS add-on - Parameters

# 아래는 aws-ebs-csi-driver 전체 버전 정보와 기본 설치 버전(True) 정보 확인
aws eks describe-addon-versions \
    --addon-name aws-ebs-csi-driver \
    --kubernetes-version 1.31 \
    --query "addons[].addonVersions[].[addonVersion, compatibilities[].defaultVersion]" \
    --output text

# ISRA 설정 : AWS관리형 정책 AmazonEBSCSIDriverPolicy 사용
eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster ${CLUSTER_NAME} \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve \
  --role-only \
  --role-name AmazonEKS_EBS_CSI_DriverRole

# ISRA 확인
eksctl get iamserviceaccount --cluster ${CLUSTER_NAME}
NAMESPACE	    NAME				            ROLE ARN
kube-system 	ebs-csi-controller-sa		arn:aws:iam::911283464785:role/AmazonEKS_EBS_CSI_DriverRole
...

# Amazon EBS CSI driver addon 배포(설치)
export ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)
eksctl create addon --name aws-ebs-csi-driver --cluster ${CLUSTER_NAME} --service-account-role-arn arn:aws:iam::${ACCOUNT_ID}:role/AmazonEKS_EBS_CSI_DriverRole --force
kubectl get sa -n kube-system ebs-csi-controller-sa -o yaml | head -5

# 확인
eksctl get addon --cluster ${CLUSTER_NAME}
kubectl get deploy,ds -l=app.kubernetes.io/name=aws-ebs-csi-driver -n kube-system
kubectl get pod -n kube-system -l 'app in (ebs-csi-controller,ebs-csi-node)'
kubectl get pod -n kube-system -l app.kubernetes.io/component=csi-driver

# ebs-csi-controller 파드에 6개 컨테이너 확인
kubectl get pod -n kube-system -l app=ebs-csi-controller -o jsonpath='{.items[0].spec.containers[*].name}' ; echo
ebs-plugin csi-provisioner csi-attacher csi-snapshotter csi-resizer liveness-probe

# csinodes 확인
kubectl api-resources | grep -i csi
kubectl get csinodes
kubectl describe csinodes
...
Name:               ip-192-168-1-104.ap-northeast-2.compute.internal
Labels:             <none>
Annotations:        storage.alpha.kubernetes.io/migrated-plugins:
                      kubernetes.io/aws-ebs,kubernetes.io/azure-disk,kubernetes.io/azure-file,kubernetes.io/cinder,kubernetes.io/gce-pd,kubernetes.io/portworx-v...
CreationTimestamp:  Sat, 15 Feb 2025 13:43:03 +0900
Spec:
  Drivers:
    ebs.csi.aws.com:
      Node ID:  i-01fe8eed1ead9cde5
      Allocatables:
        Count:        25
      Topology Keys:  [kubernetes.io/os topology.ebs.csi.aws.com/zone topology.kubernetes.io/zone]
Events:               <none>
...

kubectl get csidrivers
NAME              ATTACHREQUIRED   PODINFOONMOUNT   STORAGECAPACITY   TOKENREQUESTS   REQUIRESREPUBLISH   MODES        AGE
ebs.csi.aws.com   true             false            false             <unset>         false               Persistent   109s
efs.csi.aws.com   false            false            false             <unset>         false               Persistent   40m

kubectl describe csidrivers ebs.csi.aws.com


# (참고) 노드에 최대 EBS 부착 수량 변경
aws eks update-addon --cluster-name ${CLUSTER_NAME} --addon-name aws-ebs-csi-driver \
  --addon-version v1.39.0-eksbuild.1 --configuration-values '{
    "node": {
      "volumeAttachLimit": 31,
      "enableMetrics": true
    }
  }'
혹은
cat << EOF > node-attachments.yaml
"node":
  "volumeAttachLimit": 31
  "enableMetrics": true
EOF
aws eks update-addon --cluster-name ${CLUSTER_NAME} --addon-name aws-ebs-csi-driver \
  --addon-version v1.39.0-eksbuild.1 --configuration-values 'file://node-attachments.yaml'


## 확인
kubectl get ds -n kube-system ebs-csi-node -o yaml
...
      containers:
      - args:
        - node
        - --endpoint=$(CSI_ENDPOINT)
        - --csi-mount-point-prefix=/var/lib/kubelet/plugins/kubernetes.io/csi/ebs.csi.aws.com/
        - --volume-attach-limit=31
        - --logging-format=text
        - --v=2

kubectl describe csinodes
...
Spec:
  Drivers:
    ebs.csi.aws.com:
      Node ID:  i-0660cdc75451595ab
      Allocatables:
        Count:        31

[ 실행 결과 - 한 눈에 보기 ]

2) gp3 스토리지 클래스 생성 : AWS EBS 스토리지 클래스 파라미터 - 링크 , Parameters

# gp3 스토리지 클래스 생성
kubectl get sc
cat <<EOF | kubectl apply -f -
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
allowVolumeExpansion: true
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
  #iops: "5000"
  #throughput: "250"
  allowAutoIOPSPerGBIncrease: 'true'
  encrypted: 'true'
  fsType: xfs # 기본값이 ext4
EOF
kubectl get sc
kubectl describe sc gp3 | grep Parameters


# gp3 속도 테스트 : iops,bw 가 hostPath에서 테스트한 루트 볼륨(gp3)와 동일하여 아래 테스트도 동일함.
while true; do aws ec2 describe-volumes --filters Name=tag:ebs.csi.aws.com/cluster,Values=true --query "Volumes[].{VolumeId: VolumeId, VolumeType: VolumeType, InstanceId: Attachments[0].InstanceId, State: Attachments[0].State}" --output text; date; sleep 1; done
kubestr fio -f fio-read.fio -s gp3 --size 10G

3) PVC/PV POD 테스트

# 워커노드의 EBS 볼륨 확인 : tag(키/값) 필터링 - 링크
aws ec2 describe-volumes --filters Name=tag:Name,Values=$CLUSTER_NAME-ng1-Node --output table
aws ec2 describe-volumes --filters Name=tag:Name,Values=$CLUSTER_NAME-ng1-Node --query "Volumes[*].Attachments" | jq
aws ec2 describe-volumes --filters Name=tag:Name,Values=$CLUSTER_NAME-ng1-Node --query "Volumes[*].{ID:VolumeId,Tag:Tags}" | jq
aws ec2 describe-volumes --filters Name=tag:Name,Values=$CLUSTER_NAME-ng1-Node --query "Volumes[].[VolumeId, VolumeType, Attachments[].[InstanceId, State][]][]" | jq
aws ec2 describe-volumes --filters Name=tag:Name,Values=$CLUSTER_NAME-ng1-Node --query "Volumes[].{VolumeId: VolumeId, VolumeType: VolumeType, InstanceId: Attachments[0].InstanceId, State: Attachments[0].State}" | jq

# 워커노드에서 파드에 추가한 EBS 볼륨 확인
aws ec2 describe-volumes --filters Name=tag:ebs.csi.aws.com/cluster,Values=true --output table
aws ec2 describe-volumes --filters Name=tag:ebs.csi.aws.com/cluster,Values=true --query "Volumes[*].{ID:VolumeId,Tag:Tags}" | jq
aws ec2 describe-volumes --filters Name=tag:ebs.csi.aws.com/cluster,Values=true --query "Volumes[].{VolumeId: VolumeId, VolumeType: VolumeType, InstanceId: Attachments[0].InstanceId, State: Attachments[0].State}" | jq

# 워커노드에서 파드에 추가한 EBS 볼륨 모니터링
while true; do aws ec2 describe-volumes --filters Name=tag:ebs.csi.aws.com/cluster,Values=true --query "Volumes[].{VolumeId: VolumeId, VolumeType: VolumeType, InstanceId: Attachments[0].InstanceId, State: Attachments[0].State}" --output text; date; sleep 1; done

# PVC 생성
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
  storageClassName: gp3
EOF
kubectl get pvc,pv

# 파드 생성
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo \$(date -u) >> /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: ebs-claim
EOF

# PVC, 파드 확인
kubectl get pvc,pv,pod
kubectl get VolumeAttachment
kubectl df-pv

# 추가된 EBS 볼륨 상세 정보 확인 : AWS 관리콘솔 EC2(EBS)에서 확인
aws ec2 describe-volumes --volume-ids $(kubectl get pv -o jsonpath="{.items[0].spec.csi.volumeHandle}") | jq

# PV 상세 확인 : nodeAffinity 내용의 의미는?
kubectl get pv -o yaml
...
    nodeAffinity:
      required:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.ebs.csi.aws.com/zone
            operator: In
            values:
            - ap-northeast-2b
...

kubectl get node --label-columns=topology.ebs.csi.aws.com/zone,topology.k8s.aws/zone-id
kubectl describe node

# 파일 내용 추가 저장 확인
kubectl exec app -- tail -f /data/out.txt

## 파드 내에서 볼륨 정보 확인
kubectl exec -it app -- sh -c 'df -hT --type=overlay'
kubectl exec -it app -- sh -c 'df -hT --type=xfs'

[ 실행 결과 - 한 눈에 보기 ]

도전과제1 )

Example: Deploying WordPress and MySQL with Persistent Volumes : PVC/PV를 사용하여 워드프레스 배포해보기 - 링크

https://whchoi98.gitbook.io/k8s/eks-storage/stateful-container : 해당 링크의 EBS CSI 구성 실습 따라해보기
워드프레스 접속은 Ingress(ALB, group 활용) - Service(ClusterIP)로 설정할 것

도전과제2 )

Use multiple EBS volumes for containers : 파드를 위한 저장소를 신규 EBS 디스크 사용하게 설정 - 링크

Multiple EBS Volumes for Containers 를 인스턴스 스토어에서 사용하게 노드그룹 생성해서 사용 https://blog.montkim.com/eks-storage

3. AWS Volume SnapShots Controller

▶ Volumesnapshots 컨트롤러 설치 - 링크 VolumeSnapshot example Blog Docs

# Install Snapshot CRDs
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl get crd | grep snapshot
kubectl api-resources  | grep snapshot

# Install Common Snapshot Controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
kubectl get deploy -n kube-system snapshot-controller
kubectl get pod -n kube-system

# Install Snapshotclass
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-ebs-csi-driver/master/examples/kubernetes/snapshot/manifests/classes/snapshotclass.yaml
kubectl get vsclass # 혹은 volumesnapshotclasses
kubectl describe vsclass

▶ 사용 example Blog

# PVC 생성
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi
  storageClassName: gp3
EOF
kubectl get pvc,pv

# 파드 생성
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - name: app
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo \$(date -u) >> /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: persistent-storage
      mountPath: /data
  volumes:
  - name: persistent-storage
    persistentVolumeClaim:
      claimName: ebs-claim
EOF

# 파일 내용 추가 저장 확인
kubectl exec app -- tail -f /data/out.txt

# VolumeSnapshot 생성 : Create a VolumeSnapshot referencing the PersistentVolumeClaim name
# AWS 관리 콘솔 EBS 스냅샷 확인
cat <<EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ebs-volume-snapshot
spec:
  volumeSnapshotClassName: csi-aws-vsc
  source:
    persistentVolumeClaimName: ebs-claim
EOF

# VolumeSnapshot 확인
kubectl get volumesnapshot
kubectl get volumesnapshot ebs-volume-snapshot -o jsonpath={.status.boundVolumeSnapshotContentName} ; echo
kubectl describe volumesnapshot.snapshot.storage.k8s.io ebs-volume-snapshot
kubectl get volumesnapshotcontents

# VolumeSnapshot ID 확인 
kubectl get volumesnapshotcontents -o jsonpath='{.items[*].status.snapshotHandle}' ; echo

# AWS EBS 스냅샷 확인
aws ec2 describe-snapshots --owner-ids self | jq
aws ec2 describe-snapshots --owner-ids self --query 'Snapshots[]' --output table

# app & pvc 제거 : 강제로 장애 재현
kubectl delete pod app && kubectl delete pvc ebs-claim

[ 실행 결과 - 한 눈에 보기 ]

도전과제3) 볼륨 snapscheduler (정기적인 반복 스냅샷 생성) - Link , Blog1 , Blog2

도전과제4) 비동기방식으로 다른 쿠버 클러스터에 PV볼륨을 복제 : volsync - 링크

4. AWS EFS Controller

▶ EFS 파일시스템 확인 및 EFS Controller Addon 설치 - Docs , Github

[ 구성 아키텍쳐 ]

출처 : https://dev.to/awscommunity-asean/aws-eks-with-efs-csi-driver-and-irsa-using-cdk-dgc

[ 실습코드 ]

# EFS 정보 확인 
aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text

# 아래는 aws-efs-csi-driver 전체 버전 정보와 기본 설치 버전(True) 정보 확인
aws eks describe-addon-versions \
    --addon-name aws-efs-csi-driver \
    --kubernetes-version 1.31 \
    --query "addons[].addonVersions[].[addonVersion, compatibilities[].defaultVersion]" \
    --output text

# IAM 정책 생성
curl -s -O https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/docs/iam-policy-example.json
aws iam create-policy --policy-name AmazonEKS_EFS_CSI_Driver_Policy --policy-document file://iam-policy-example.json

# ISRA 설정 : 고객관리형 정책 AmazonEKS_EFS_CSI_Driver_Policy 사용
eksctl create iamserviceaccount \
  --name efs-csi-controller-sa \
  --namespace kube-system \
  --cluster ${CLUSTER_NAME} \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEFSCSIDriverPolicy \
  --approve \
  --role-only \
  --role-name AmazonEKS_EFS_CSI_DriverRole

# ISRA 확인
eksctl get iamserviceaccount --cluster ${CLUSTER_NAME}

# Amazon EFS CSI driver addon 배포(설치)
export ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)
eksctl create addon --name aws-efs-csi-driver --cluster ${CLUSTER_NAME} --service-account-role-arn arn:aws:iam::${ACCOUNT_ID}:role/AmazonEKS_EFS_CSI_DriverRole --force
kubectl get sa -n kube-system efs-csi-controller-sa -o yaml | head -5

# 확인
eksctl get addon --cluster ${CLUSTER_NAME}
kubectl get pod -n kube-system -l "app.kubernetes.io/name=aws-efs-csi-driver,app.kubernetes.io/instance=aws-efs-csi-driver"
kubectl get pod -n kube-system -l app=efs-csi-controller -o jsonpath='{.items[0].spec.containers[*].name}' ; echo
kubectl get csidrivers efs.csi.aws.com -o yaml

[ 실행 결과 - 한 눈에 보기 ]

▶ EFS 파일시스템을 파드가 사용하게 설정 : Add empty StorageClasses from static example - Workshop 링크

* EC2에서 실습

# 모니터링
watch 'kubectl get sc efs-sc; echo; kubectl get pv,pvc,pod'

# [운영 서버 EC2]
# 실습 코드 clone
git clone https://github.com/kubernetes-sigs/aws-efs-csi-driver.git /root/efs-csi
cd /root/efs-csi/examples/kubernetes/multiple_pods/specs && tree

# EFS 스토리지클래스 생성 및 확인
cat storageclass.yaml
kubectl apply -f storageclass.yaml
kubectl get sc efs-sc

# PV 생성 및 확인 : volumeHandle을 자신의 EFS 파일시스템ID로 변경
EfsFsId=$(aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text)
sed -i "s/fs-4af69aab/$EfsFsId/g" pv.yaml
cat pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-05699d3c12ef609e2

kubectl apply -f pv.yaml
kubectl get pv; kubectl describe pv

# PVC 생성 및 확인
cat claim.yaml
kubectl apply -f claim.yaml
kubectl get pvc

# 파드 생성 및 연동 : 파드 내에 /data 데이터는 EFS를 사용
# 추후에 파드1,2가 각기 다른 노드에 배포되게 추가해두자!
cat pod1.yaml pod2.yaml
kubectl apply -f pod1.yaml,pod2.yaml
kubectl df-pv

# 파드 정보 확인 : PV에 5Gi 와 파드 내에서 확인한 NFS4 볼륨 크리 8.0E의 차이는 무엇?
kubectl get pods
kubectl exec -ti app1 -- sh -c "df -hT -t nfs4"
kubectl exec -ti app2 -- sh -c "df -hT -t nfs4"
Filesystem           Type            Size      Used Available Use% Mounted on
127.0.0.1:/          nfs4            8.0E         0      8.0E   0% /data

# 공유 저장소 저장 동작 확인
tree /mnt/myefs              # 운영서버 EC2 에서 확인
tail -f /mnt/myefs/out1.txt  # 운영서버 EC2 에서 확인
tail -f /mnt/myefs/out2.txt  # 운영서버 EC2 에서 확인
kubectl exec -ti app1 -- tail -f /data/out1.txt
kubectl exec -ti app2 -- tail -f /data/out2.txt

[ 실행 결과 - 한 눈에 보기 ]

▶ EFS 파일시스템을 다수의 파드가 사용하게 설정 : Dynamic provisioning using EFS ← Fargate node는 현재 미지원 - Workshop , KrBlog

# 모니터링
watch 'kubectl get sc efs-sc; echo; kubectl get pv,pvc,pod'

# [운영 서버 EC2]
# EFS 스토리지클래스 생성 및 확인
curl -s -O https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/examples/kubernetes/dynamic_provisioning/specs/storageclass.yaml
cat storageclass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap #  The type of volume to be provisioned by Amazon EFS. Currently, only access point based provisioning is supported (efs-ap).
  fileSystemId: fs-92107410 # The file system under which the access point is created.
  directoryPerms: "700" # The directory permissions of the root directory created by the access point.
  gidRangeStart: "1000" # optional, The starting range of the Posix group ID to be applied onto the root directory of the access point. The default value is 50000.
  gidRangeEnd: "2000" # optional, The ending range of the Posix group ID. The default value is 7000000.
  basePath: "/dynamic_provisioning" # optional, The path on the file system under which the access point root directory is created. If the path isn't provided, the access points root directory is created under the root of the file system.
  subPathPattern: "${.PVC.namespace}/${.PVC.name}" # optional, A pattern that describes the subPath under which an access point should be created. So if the pattern were ${.PVC.namespace}/${PVC.name}, the PVC namespace is foo and the PVC name is pvc-123-456, and the basePath is /dynamic_provisioner the access point would be created at /dynamic_provisioner/foo/pvc-123-456
  ensureUniqueDirectory: "true" # optional # A boolean that ensures that, if set, a UUID is appended to the final element of any dynamically provisioned path, as in the above example. This can be turned off but this requires you as the administrator to ensure that your storage classes are set up correctly. Otherwise, it's possible that 2 pods could end up writing to the same directory by accident. Please think very carefully before setting this to false!
  reuseAccessPoint: "false" # optional
  
sed -i "s/fs-92107410/$EfsFsId/g" storageclass.yaml
kubectl apply -f storageclass.yaml
kubectl get sc efs-sc

# PVC/파드 생성 및 확인
curl -s -O https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/master/examples/kubernetes/dynamic_provisioning/specs/pod.yaml
cat pod.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: efs-app
spec:
  containers:
    - name: app
      image: centos
      command: ["/bin/sh"]
      args: ["-c", "while true; do echo $(date -u) >> /data/out; sleep 5; done"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: efs-claim

kubectl apply -f pod.yaml
kubectl get pvc,pv,pod

# PVC/PV 생성 로그 확인
kubectl krew install stern
kubectl stern -n kube-system -l app=efs-csi-controller -c csi-provisioner
혹은
kubectl logs  -n kube-system -l app=efs-csi-controller -c csi-provisioner -f

# 파드 정보 확인
kubectl exec -it efs-app -- sh -c "df -hT -t nfs4"
Filesystem           Type            Size      Used Available Use% Mounted on
127.0.0.1:/          nfs4            8.0E         0      8.0E   0% /data

# 공유 저장소 저장 동작 확인
tree /mnt/myefs              # 운영서버 EC2 에서 확인
kubectl exec efs-app -- bash -c "cat /data/out"
kubectl exec efs-app -- bash -c "ls -l /data/out"
kubectl exec efs-app -- bash -c "stat /data/"

▶ 자원 정리 :

# 쿠버네티스 리소스 삭제
kubectl delete pod app1 app2
kubectl delete pvc efs-claim && kubectl delete pv efs-pv && kubectl delete sc efs-sc

5. EKS Persistent Volumes for Instance Store & Add NodeGroup

▶ 신규 노드 그룹 ng2 생성 - Blog : c5d.large 의 EC2 인스턴스 스토어(임시 블록 스토리지) 설정 작업 - 링크 , NVMe SSD - 링크

데이터 손실 : 기본 디스크 드라이브 오류, 인스턴스가 중지됨, 인스턴스가 최대 절전 모드로 전환됨, 인스턴스가 종료됨

[ 실습 코드 ]

Step1. 노드그룹 생성 및 환경세팅

☞ 인스턴스 스토어는 EC2 스토리지(EBS) 정보에 출력되지는 않는다!!

# 인스턴스 스토어 볼륨이 있는 c5 모든 타입의 스토리지 크기
aws ec2 describe-instance-types \
 --filters "Name=instance-type,Values=c5*" "Name=instance-storage-supported,Values=true" \
 --query "InstanceTypes[].[InstanceType, InstanceStorageInfo.TotalSizeInGB]" \
 --output table
--------------------------
|  DescribeInstanceTypes |
+---------------+--------+
|  c5d.large    |  50    |
|  c5d.12xlarge |  1800  |
...

# 신규 노드 그룹 생성 전 정보 확인
eksctl create nodegroup --help
eksctl create nodegroup -c $CLUSTER_NAME -r ap-northeast-2 --subnet-ids "$PubSubnet1","$PubSubnet2","$PubSubnet3" --ssh-access \
  -n ng2 -t c5d.large -N 1 -m 1 -M 1 --node-volume-size=30 --node-labels disk=instancestore --max-pods-per-node 100 --dry-run > myng2.yaml

cat <<EOT > nvme.yaml
  preBootstrapCommands:
    - |
      # Install Tools
      yum install nvme-cli links tree jq tcpdump sysstat -y

      # Filesystem & Mount
      mkfs -t xfs /dev/nvme1n1
      mkdir /data
      mount /dev/nvme1n1 /data

      # Get disk UUID
      uuid=\$(blkid -o value -s UUID mount /dev/nvme1n1 /data) 

      # Mount the disk during a reboot
      echo /dev/nvme1n1 /data xfs defaults,noatime 0 2 >> /etc/fstab
EOT
sed -i -n -e '/volumeType/r nvme.yaml' -e '1,$p' myng2.yaml

#
export PubSubnet1=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet1" --query "Subnets[0].[SubnetId]" --output text)
export PubSubnet2=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet2" --query "Subnets[0].[SubnetId]" --output text)
export PubSubnet3=$(aws ec2 describe-subnets --filters Name=tag:Name,Values="$CLUSTER_NAME-Vpc1PublicSubnet3" --query "Subnets[0].[SubnetId]" --output text)
echo $PubSubnet1 $PubSubnet2 $PubSubnet3

# 
SSHKEYNAME=<각자 자신의 SSH Keypair 이름>
SSHKEYNAME=kp-gasida

Step2. myng2.yaml 파일 작성

cat << EOF > myng2.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: myeks
  region: ap-northeast-2
  version: "1.31"

managedNodeGroups:
- amiFamily: AmazonLinux2
  desiredCapacity: 1
  instanceType: c5d.large
  labels:
    alpha.eksctl.io/cluster-name: myeks
    alpha.eksctl.io/nodegroup-name: ng2
    disk: instancestore
  maxPodsPerNode: 110
  maxSize: 1
  minSize: 1
  name: ng2
  ssh:
    allow: true
    publicKeyName: $SSHKEYNAME
  subnets:
  - $PubSubnet1
  - $PubSubnet2
  - $PubSubnet3
  tags:
    alpha.eksctl.io/nodegroup-name: ng2
    alpha.eksctl.io/nodegroup-type: managed
  volumeIOPS: 3000
  volumeSize: 30
  volumeThroughput: 125
  volumeType: gp3
  preBootstrapCommands:
    - |
      # Install Tools
      yum install nvme-cli links tree jq tcpdump sysstat -y

      # Filesystem & Mount
      mkfs -t xfs /dev/nvme1n1
      mkdir /data
      mount /dev/nvme1n1 /data

      # Get disk UUID
      uuid=\$(blkid -o value -s UUID mount /dev/nvme1n1 /data) 

      # Mount the disk during a reboot
      echo /dev/nvme1n1 /data xfs defaults,noatime 0 2 >> /etc/fstab
EOF

Step3. 신규 노드그룹 생성

# 신규 노드 그룹 생성
eksctl create nodegroup -f myng2.yaml

# 확인
kubectl get node --label-columns=node.kubernetes.io/instance-type,eks.amazonaws.com/capacityType,topology.kubernetes.io/zone
kubectl get node -l disk=instancestore


# ng2 노드 그룹 *ng2-remoteAccess* 포함된 보안그룹 ID
aws ec2 describe-security-groups --filters "Name=group-name,Values=*ng2-remoteAccess*" | jq
export NG2SGID=$(aws ec2 describe-security-groups --filters "Name=group-name,Values=*ng2-remoteAccess*" --query 'SecurityGroups[*].GroupId' --output text)
aws ec2 authorize-security-group-ingress --group-id $NG2SGID --protocol '-1' --cidr $(curl -s ipinfo.io/ip)/32
aws ec2 authorize-security-group-ingress --group-id $NG2SGID --protocol '-1' --cidr 172.20.1.100/32


# 워커 노드 SSH 접속
N4=<각자 자신의 워커 노드4번 공인 IP 지정>
N4=3.37.44.222
ssh ec2-user@$N4 hostname

# 확인
ssh ec2-user@$N4 sudo nvme list
ssh ec2-user@$N4 sudo lsblk -e 7 -d
ssh ec2-user@$N4 sudo df -hT -t xfs
ssh ec2-user@$N4 sudo tree /data
ssh ec2-user@$N4 sudo cat /etc/fstab

# (옵션) max-pod 확인
kubectl describe node -l disk=instancestore | grep Allocatable: -A7

# (옵션) kubelet 데몬 파라미터 확인 : --max-pods=29 --max-pods=110
ssh ec2-user@$N4 cat /etc/eks/bootstrap.sh
ssh ec2-user@$N4 sudo ps -ef | grep kubelet
root        3012       1  0 06:50 ?        00:00:02 /usr/bin/kubelet --config /etc/kubernetes/kubelet/kubelet-config.json --kubeconfig /var/lib/kubelet/kubeconfig --container-runtime-endpoint unix:///run/containerd/containerd.sock --image-credential-provider-config /etc/eks/image-credential-provider/config.json --image-credential-provider-bin-dir /etc/eks/image-credential-provider --node-ip=192.168.2.228 --pod-infra-container-image=602401143452.dkr.ecr.ap-northeast-2.amazonaws.com/eks/pause:3.5 --v=2 --hostname-override=ip-192-168-2-228.ap-northeast-2.compute.internal --cloud-provider=external --node-labels=eks.amazonaws.com/sourceLaunchTemplateVersion=1,alpha.eksctl.io/cluster-name=myeks,alpha.eksctl.io/nodegroup-name=ng2,disk=instancestore,eks.amazonaws.com/nodegroup-image=ami-0fa05db9e3c145f63,eks.amazonaws.com/capacityType=ON_DEMAND,eks.amazonaws.com/nodegroup=ng2,eks.amazonaws.com/sourceLaunchTemplateId=lt-0955d0931c1d712c1 --max-pods=29 --max-pods=110

Step4. local-path 스토리지 클래스 재생성 : 패스 변경

# 기존 local-path 스토리지 클래스 삭제
kubectl delete -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.31/deploy/local-path-storage.yaml

#
curl -sL https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.31/deploy/local-path-storage.yaml | sed 's/opt/data/g' | kubectl apply -f -

kubectl describe cm -n local-path-storage local-path-config
...
        "nodePathMap":[
        {
                "node":"DEFAULT_PATH_FOR_NON_LISTED_NODES",
                "paths":["/data/local-path-provisioner"]
        }
        ]
...

# 모니터링
watch 'kubectl get pod -owide;echo;kubectl get pv,pvc'
ssh ec2-user@$N4 iostat -xmdz 1 -p nvme1n1

# [운영서버 EC2] Read 측정
kubestr fio -f fio-read.fio -s local-path --size 10G --nodeselector disk=instancestore
...
read:
  IOPS=20309.355469 BW(KiB/s)=81237
  iops: min=17392 max=93872 avg=20316.857422
  bw(KiB/s): min=69570 max=375488 avg=81268.023438

Disk stats (read/write):
  nvme1n1: ios=2432488/9 merge=0/3 ticks=7639891/23 in_queue=7639913, util=99.950768%
  -  OK

Step5. 자원 정리

# local-path 스토리지 클래스 삭제
kubectl delete -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.31/deploy/local-path-storage.yaml

# ng2 노드그룹 삭제
eksctl delete nodegroup -c $CLUSTER_NAME -n ng2

[ 실행 결과 - 한 눈에 보기 ]

참고 : 일반 EBS(기본값 3000 IOPS) vs 인스턴스 스토어 평균 IOPS 속도 비교 with kubestr ← 인스턴스 스토어가 7배 빠름

6. 노드 그룹

[ 내용 이해하기 - 요약 ]

☞ 노드그룹이란?

- Kubernetes 클러스터 내에서 워커 노드를 관리하기 위한 그룹을 나타냅니다. 이 그룹은 주로 EC2 인스턴스로 구성되며, EKS 클러스터 내에서 실행되는 컨테이너 워크로드를 처리합니다. Nodegroup은 특정 인스턴스 유형, 크기 및 구성으로 구성됩니다.

[ 사전지식 ]

▶ [운영서버 EC2] docker buildx 활성화 : Multi(or cross)-platform 빌드 - Link , Docs , Youtube

♣ docker buildx 는 Emulating 기법을 사용하여 이기종 아키텍쳐 간 이미지 생성이 가능하도록 해준다.

# 
arch
x86_64

# CPU Arch arm64v8 , riscv64 실행 시도
docker run --rm -it riscv64/ubuntu bash
docker run --rm -it arm64v8/ubuntu bash


# Extended build capabilities with BuildKit - List builder instances
docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS  BUILDKIT PLATFORMS
default * docker
  default default         running v0.12.5  linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386


# docker buildx 활성화 (멀티 아키텍처 빌드를 위해 필요)
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
docker images

docker buildx create --use --name mybuilder
docker buildx ls

# Buildx가 정상 동작하는지 확인
docker buildx inspect --bootstrap
...
Platforms: linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/arm64, linux/riscv64, linux/ppc64, linux/ppc64le, linux/s390x, linux/386, linux/arm/v7, linux/arm/v6
...

docker buildx ls
NAME/NODE    DRIVER/ENDPOINT             STATUS  BUILDKIT PLATFORMS
mybuilder *  docker-container
  mybuilder0 unix:///var/run/docker.sock running v0.19.0  linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/arm64, linux/riscv64, linux/ppc64, linux/ppc64le, linux/s390x, linux/386, linux/arm/v7, linux/arm/v6
default      docker
  default    default                     running v0.12.5  linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386, linux/arm64, linux/riscv64, linux/ppc64, linux/ppc64le, linux/s390x, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6

docker ps
CONTAINER ID   IMAGE                           COMMAND       CREATED              STATUS              PORTS     NAMES
fa8773b87c70   moby/buildkit:buildx-stable-1   "buildkitd"   About a minute ago   Up About a minute             buildx_buildkit_mybuilder0

▶ (샘플) 컨테이너 이미지 빌드 및 실행 - 윈도우PC(amd64)와 macOS(arm64)

#
mkdir myweb && cd myweb

# server.py 파일 작성
cat > server.py <<EOF
from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler
from datetime import datetime
import socket

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        
        now = datetime.now()
        hostname = socket.gethostname()
        response_string = now.strftime("The time is %-I:%M:%S %p, VERSION 0.0.1\n")
        response_string += f"Server hostname: {hostname}\n"
        self.wfile.write(bytes(response_string, "utf-8")) 

def startServer():
    try:
        server = ThreadingHTTPServer(('', 80), RequestHandler)
        print("Listening on " + ":".join(map(str, server.server_address)))
        server.serve_forever()
    except KeyboardInterrupt:
        server.shutdown()

if __name__ == "__main__":
    startServer()
EOF


# Dockerfile 생성
cat > Dockerfile <<EOF
FROM python:3.12
ENV PYTHONUNBUFFERED 1
COPY . /app
WORKDIR /app 
CMD python3 server.py
EOF

# 빌드, 실행 후 삭제
docker pull python:3.12
docker build -t myweb:1 -t myweb:latest .
docker images
docker run -d -p 8080:80 --name=timeserver myweb
curl http://localhost:8080
docker rm -f timeserver


# 멀티 플랫폼 빌드 후 푸시
docker images
docker login

DOCKERNAME=<도커허브 계정명>
DOCKERNAME=gasida

docker buildx build --platform linux/amd64,linux/arm64 --push --tag $DOCKERNAME/myweb:multi .
docker images
docker manifest inspect $DOCKERNAME/myweb:multi | jq
docker buildx imagetools inspect $DOCKERNAME/myweb:multi

# 컨테이너 실행 해보기 : 윈도우PC(amd64)와 macOS(arm64) 두 곳 모두 동일한 컨테이너 이미지 경로로 실행해보자!
docker ps
docker run -d -p 8080:80 --name=timeserver $DOCKERNAME/myweb:multi
docker ps

# 컨테이너 접속 및 로그 확인
curl http://localhost:8080
docker logs timeserver

# 컨테이너 이미지 내부에 파일 확인
docker exec -it timeserver ls -l

# 컨테이너 이미지 내부에 server.py 파일 확인
docker exec -it timeserver cat server.py

# 컨테이너 삭제
docker rm -f timeserver

6-1. Spot NodeGroup

▶ Spot instances 노드 그룹 (managed-spot) 생성 - Link , Blog

AWS 고객이 EC2 여유 용량 풀을 활용하여 엄청난 할인으로 EC2 인스턴스를 실행할 수 있습니다.
EC2에 용량이 다시 필요할 때 2분 알림으로 Spot Instances를 중단할 수 있습니다.
Kubernetes 워커 노드로 Spot Instances를 사용하는 것은 상태 비저장 API 엔드포인트, 일괄 처리, ML 학습 워크로드, Apache Spark를 사용한 빅데이터 ETL, 대기열 처리 애플리케이션, CI/CD 파이프라인과 같은 워크로드에 매우 인기 있는 사용 패턴입니다.
예를 들어 Kubernetes에서 상태 비저장 API 서비스를 실행하는 것은 Spot Instances를 워커 노드로 사용하기에 매우 적합합니다. Pod를 우아하게 종료할 수 있고 Spot Instances가 중단되면 다른 워커 노드에서 대체 Pod를 예약할 수 있기 때문입니다.

Instance type diversification - Link

# [운영서버 EC2] ec2-instance-selector 설치
curl -Lo ec2-instance-selector https://github.com/aws/amazon-ec2-instance-selector/releases/download/v2.4.1/ec2-instance-selector-`uname | tr '[:upper:]' '[:lower:]'`-amd64 && chmod +x ec2-instance-selector
mv ec2-instance-selector /usr/local/bin/
ec2-instance-selector --version

# 적절한 인스턴스 스펙 선택을 위한 도구 사용
ec2-instance-selector --vcpus 2 --memory 4 --gpus 0 --current-generation -a x86_64 --deny-list 't.*' --output table-wide
Instance Type   VCPUs   Mem (GiB)  Hypervisor  Current Gen  Hibernation Support  CPU Arch  Network Performance  ENIs    GPUs    GPU Mem (GiB)  GPU Info  On-Demand Price/Hr  Spot Price/Hr (30d avg)
-------------   -----   ---------  ----------  -----------  -------------------  --------  -------------------  ----    ----    -------------  --------  ------------------  -----------------------
c5.large        2       4          nitro       true         true                 x86_64    Up to 10 Gigabit     3       0       0              none      $0.096              $0.02837
c5a.large       2       4          nitro       true         false                x86_64    Up to 10 Gigabit     3       0       0              none      $0.086              $0.04022
c5d.large       2       4          nitro       true         true                 x86_64    Up to 10 Gigabit     3       0       0              none      $0.11               $0.03265
c6i.large       2       4          nitro       true         true                 x86_64    Up to 12.5 Gigabit   3       0       0              none      $0.096              $0.03425
c6id.large      2       4          nitro       true         true                 x86_64    Up to 12.5 Gigabit   3       0       0              none      $0.1155             $0.03172
c6in.large      2       4          nitro       true         true                 x86_64    Up to 25 Gigabit     3       0       0              none      $0.1281             $0.04267
c7i-flex.large  2       4          nitro       true         true                 x86_64    Up to 12.5 Gigabit   3       0       0              none      $0.09576            $0.02872
c7i.large       2       4          nitro       true         true                 x86_64    Up to 12.5 Gigabit   3       0       0              none      $0.1008             $0.02977

#Internally ec2-instance-selector is making calls to the DescribeInstanceTypes for the specific region and filtering the instances based on the criteria selected in the command line, in our case we filtered for instances that meet the following criteria:
- Instances with no GPUs
- of x86_64 Architecture (no ARM instances like A1 or m6g instances for example)
- Instances that have 2 vCPUs and 4 GB of RAM
- Instances of current generation (4th gen onwards)
- Instances that don’t meet the regular expression t.* to filter out burstable instance types

▶ Running a workload on Spot instances

#
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - name: busybox
    image: busybox
    command:
    - "/bin/sh"
    - "-c"
    - "while true; do date >> /home/pod-out.txt; cd /home; sync; sync; sleep 10; done"
  nodeSelector:
    eks.amazonaws.com/capacityType: SPOT
EOF

# 파드가 배포된 노드 정보 확인
kubectl get pod -owide

# 삭제
kubectl delete pod busybox

▶ (정보) Interruption Handling in EKS managed node groups with Spot capacity - Link

Spot 중단을 처리하기 위해 AWS Node Termination Handler와 같은 클러스터에 추가 자동화 도구를 설치할 필요가 없습니다. 관리형 노드 그룹은 사용자를 대신하여 Amazon EC2 Auto Scaling 그룹을 구성하고 다음과 같은 방식으로 Spot 중단을 처리합니다. To handle Spot interruptions, you do not need to install any extra automation tools on the cluster such as the AWS Node Termination Handler. A managed node group configures an Amazon EC2 Auto Scaling group on your behalf and handles the Spot interruption in following manner:
- Amazon EC2 Spot 용량 재조정은 Amazon EKS가 Spot 노드를 우아하게 비우고 재조정하여 Spot 노드가 중단 위험이 높을 때 애플리케이션 중단을 최소화할 수 있도록 활성화됩니다.
  - Amazon EC2 Spot Capacity Rebalancing is enabled so that Amazon EKS can gracefully drain and rebalance your Spot nodes to minimize application disruption when a Spot node is at elevated risk of interruption. For more information, see Amazon EC2 Auto Scaling Capacity Rebalancing in the Amazon EC2 Auto Scaling User Guide.
- 교체 Spot 노드가 부트스트랩되고 Kubernetes에서 Ready 상태가 되면 Amazon EKS는 재조정 권장 사항을 수신한 Spot 노드를 cordons하고 drains합니다. Spot 노드를 cordons하면 노드가 예약 불가능으로 표시되고 kube-scheduler가 해당 노드에서 새 포드를 예약하지 않습니다.
  - When a replacement Spot node is bootstrapped and in the Ready state on Kubernetes, Amazon EKS cordons and drains the Spot node that received the rebalance recommendation. Cordoning the Spot node ensures that the node is marked as unschedulable and kube-scheduler will not schedule any new pods on it. It also removes it from its list of healthy, active Spot nodes. Draining the Spot node ensures that running pods are evicted gracefully.
- 교체 Spot 노드가 준비 상태가 되기 전에 Spot 2분 중단 알림이 도착하면 Amazon EKS는 재균형 권장 사항을 받은 Spot 노드의 드레이닝을 시작합니다.
  - If a Spot two-minute interruption notice arrives before the replacement Spot node is in a Ready state, Amazon EKS starts draining the Spot node that received the rebalance recommendation.
이 프로세스는 Spot 중단이 도착할 때까지 교체 Spot 노드를 기다리는 것을 피하고, 대신 사전에 교체 노드를 조달하여 보류 중인 Pod의 스케줄링 시간을 최소화하는 데 도움이 됨. This process avoids waiting for replacement Spot node till Spot interruption arrives, instead it procures replacement in advance and helps in minimizing the scheduling time for pending pods.

6-2. BottleRocket AMI

[ 요약 ]

☞ BottleRocket AMI는 CIS 보안규정 level 별 준수할 수 있도록 만든 커스터마이징 AMI로 업무 특성 상 외부 보안기준에 맞추어 서비스를 제공해야 하는 경우, 사용하면 유용하다. 단점은, 기본적인 SSH 접근제한 등 일반적인 업무 상 어드민 or 관리자의 업무가 상당히(?!) 제약을 받는다.

▶ Bottlerocket 소개 : 컨테이너 실행을 위한 Linux 기반 운영 체제 - Home , Github , Docs , Docs2, Blog , Work , Youtube , Workshop

배경
- 현재 이용되는 컨테이너는 대부분 다양한 형식으로 패키징된 애플리케이션을 지원하도록 설계된 범용 운영 체제(OS)에서 실행됩니다. 그와 같은 운영 체제에는 수백 가지 패키지가 포함되어 있으며, 컨테이너화된 애플리케이션 하나를 실행하는 데 사용되는 패키지는 몇 개 되지 않더라도 자주 보안 및 유지 관리 업데이트를 해야 합니다. Bottlerocket은 보안에 중점을 두고, 컨테이너 호스팅에 필수적인 소프트웨어만 포함하므로 공격에 대한 노출 위험을 줄입니다. 여기에는 SELinux(Security-Enhanced Linux)가 함께 제공되어 추가 격리 모드를 지원하며 Linux 커널 기능의 일종인 Device Mapper의 진실성 대상(dm-verity)을 사용하므로 루트킷 기반 공격을 예방하는 데 도움이 됩니다. 이러한 보안 강화 기능 외에도 Bottlerocket 업데이트를 적용하여 원자 단위 방식으로 롤백하므로 업데이트 관리가 한층 간소화됩니다.
장점 - Kr
- 운영 비용 절감 및 관리 복잡성 감소로 가동 시간 증가 – Bottlerocket은 다른 Linux 배포판보다 리소스 공간이 작고 부팅 시간이 짧으며 보안 위협에 덜 취약합니다. Bottlerocket은 공간이 작아 스토리지, 컴퓨팅 및 네트워킹 리소스를 적게 사용하여 비용을 절감할 수 있습니다.
- 자동 OS 업데이트로 보안 강화 – Bottlerocket 업데이트는 필요한 경우 롤백할 수 있는 단일 단위로 적용됩니다. 이렇게 하면 시스템을 사용 불가 상태로 둘 수 있는 손상되거나 실패한 업데이트의 위험이 제거됩니다. Bottlerocket을 사용하면 보안 업데이트를 사용할 수 있는 즉시 중단을 최소화하는 방식으로 자동 적용하고 장애 발생 시 롤백할 수 있습니다.
- 프리미엄 지원 – AWS에서 제공하는 Amazon EC2 기반 Bottlerocket 빌드에는 Amazon EC2, Amazon EKS, Amazon ECR 등의 AWS 서비스에도 적용되는 동일한 AWS Support 플랜이 적용됩니다
- Bottlerocket의 설계 덕분에 OS 바이너리 업데이트와 보안 패치 주기를 분리하여 사전 페치된 이미지로 데이터 볼륨을 쉽게 연결할 수 있습니다.

고려 사항 - Docs
- Bottlerocket은 x86_64 및 arm64 프로세서가 있는 Amazon EC2 인스턴스를 지원합니다.
- Bottlerocket AMI를 Inferentia 칩이 있는 Amazon EC2 인스턴스와 함께 사용하는 것은 권장되지 않습니다.
- Bottlerocket 이미지에는 SSH 서버 또는 쉘이 포함되지 않습니다. 대체 액세스 방법을 사용하여 SSH를 허용할 수 있습니다.

[ 실습하기 ]

Case1. 노드 그룹 생성 및 노드 접속

Control / Admin Container 배치 참고 - Link

Step1. ng-br.yaml 파일 작성

cat << EOF > ng-br.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: myeks
  region: ap-northeast-2
  version: "1.31"

managedNodeGroups:
- name: ng-bottlerocket
  instanceType: m5.large
  amiFamily: Bottlerocket
  bottlerocket:
    enableAdminContainer: true
    settings:
      motd: "Hello, eksctl!"
  desiredCapacity: 1
  maxSize: 1
  minSize: 1
  labels:
    alpha.eksctl.io/cluster-name: myeks
    alpha.eksctl.io/nodegroup-name: ng-bottlerocket
    ami: bottlerocket
  subnets:
  - $PubSubnet1
  - $PubSubnet2
  - $PubSubnet3
  tags:
    alpha.eksctl.io/nodegroup-name: ng-bottlerocket
    alpha.eksctl.io/nodegroup-type: managed

- name: ng-bottlerocket-ssh
  instanceType: m5.large
  amiFamily: Bottlerocket
  desiredCapacity: 1
  maxSize: 1
  minSize: 1
  ssh:
    allow: true
    publicKeyName: $SSHKEYNAME
  labels:
    alpha.eksctl.io/cluster-name: myeks
    alpha.eksctl.io/nodegroup-name: ng-bottlerocket-ssh
    ami: bottlerocket
  subnets:
  - $PubSubnet1
  - $PubSubnet2
  - $PubSubnet3
  tags:
    alpha.eksctl.io/nodegroup-name: ng-bottlerocket-ssh
    alpha.eksctl.io/nodegroup-type: managed
EOF

Step2. 노드그룹 배포

#
cat ng-br.yaml
eksctl create nodegroup -f ng-br.yaml

# 노드의 OS 와 CRI 정보 등 확인
kubectl get node --label-columns=alpha.eksctl.io/nodegroup-name,ami,node.kubernetes.io/instance-type
kubectl get node -owide
NAME                                               STATUS   ROLES    AGE     VERSION               INTERNAL-IP     EXTERNAL-IP      OS-IMAGE                                KERNEL-VERSION                    CONTAINER-RUNTIME
ip-192-168-1-122.ap-northeast-2.compute.internal   Ready    <none>   7m11s   v1.31.4-eks-0f56d01   192.168.1.122   43.203.248.44    Bottlerocket OS 1.32.0 (aws-k8s-1.31)   6.1.124                           containerd://1.7.24+bottlerocket
ip-192-168-1-68.ap-northeast-2.compute.internal    Ready    <none>   157m    v1.31.5-eks-5d632ec   192.168.1.68    43.202.4.32      Amazon Linux 2023.6.20250203            6.1.127-135.201.amzn2023.x86_64   containerd://1.7.25
ip-192-168-2-27.ap-northeast-2.compute.internal    Ready    <none>   157m    v1.31.5-eks-5d632ec   192.168.2.27    16.184.10.127    Amazon Linux 2023.6.20250203            6.1.127-135.201.amzn2023.x86_64   containerd://1.7.25
ip-192-168-3-183.ap-northeast-2.compute.internal   Ready    <none>   157m    v1.31.5-eks-5d632ec   192.168.3.183   3.34.134.13      Amazon Linux 2023.6.20250203            6.1.127-135.201.amzn2023.x86_64   containerd://1.7.25
ip-192-168-3-245.ap-northeast-2.compute.internal   Ready    <none>   7m14s   v1.31.4-eks-0f56d01   192.168.3.245   43.200.172.226   Bottlerocket OS 1.32.0 (aws-k8s-1.31)   6.1.124                           containerd://1.7.24+bottlerocket

# 인스턴스 IP 확인
aws ec2 describe-instances --query "Reservations[*].Instances[*].{InstanceID:InstanceId, PublicIPAdd:PublicIpAddress, PrivateIPAdd:PrivateIpAddress, InstanceName:Tags[?Key=='Name']|[0].Value, Status:State.Name}" --filters Name=instance-state-name,Values=running --output table

#
BRNode1=<ng-bottlerocket EC2 유동공인 IP>
BRNode2=<ng-bottlerocket-ssh EC2 유동공인 IP>
BRNode1=43.203.248.44
BRNode2=43.200.172.226

Step3. SSH 접속 테스트

# SSH 접속 테스트
ssh $BRNode1
ssh $BRNode2
-----------------
# AL2 기반이여, 문제 해결을 위한 도구 포함.
# sudo sheltie 실행 시 you into a root shell in the Bottlerocket host's root filesystem.
whoami
pwd
ip -c a
lsblk
sudo yum install htop which -y
htop
which bash
which sh
ps
ps -ef
ps 1
sestatus
getenforce

sudo sheltie
whoami
pwd
ip -c a
lsblk
yum install jq -y
ps
ps -ef
ps 1
sestatus
getenforce

exit
exit
-----------------

Step4. AWS SSM 으로 노드 접속 및 apiclient 사용 - CIS , CIS-K8S

# ng-bottlerocket 인스턴스 ID 필터링
aws ec2 describe-instances --filters "Name=tag:eks:nodegroup-name,Values=ng-bottlerocket" | jq -r '.[][0]["Instances"][0]["InstanceId"]'

# Run the below command to create an SSM session with bottlerocket node
aws ssm start-session --target $(aws ec2 describe-instances --filters "Name=tag:eks:nodegroup-name,Values=ng-bottlerocket" | jq -r '.[][0]["Instances"][0]["InstanceId"]')
-----------------
# Bottlerocket의 Control Container 진입
# 이 컨테이너는 Bottlerocket API에 접근할 수 있도록 해주며, 이를 통해 시스템을 점검하고 설정을 변경할 수 있음.
# 이를 위해 apiclient 도구를 사용하는 것이 일반적입니다. 예) 시스템 점검 apiclient -u /settings | jq
# 고급 디버깅을 위해서는 Admin Container를 사용 : 호스트에 대한 root 접근을 허용
# Admin Container를 활성화한 후, SSH로 접근 가능.

# apiclient 사용
apiclient --help
apiclient -u /settings | jq
apiclient get | jq
apiclient get | grep motd

# CIS benchmark for Bottlerocket 
apiclient report cis
Benchmark name:  CIS Bottlerocket Benchmark
Version:         v1.0.0
Reference:       https://www.cisecurity.org/benchmark/bottlerocket
Benchmark level: 1
Start time:      2025-02-16T09:15:30.964119078Z

[SKIP] 1.2.1     Ensure software update repositories are configured (Manual)
[PASS] 1.3.1     Ensure dm-verity is configured (Automatic)
[PASS] 1.4.1     Ensure setuid programs do not create core dumps (Automatic)
[PASS] 1.4.2     Ensure address space layout randomization (ASLR) is enabled (Automatic)
[PASS] 1.4.3     Ensure unprivileged eBPF is disabled (Automatic)
[PASS] 1.5.1     Ensure SELinux is configured (Automatic)
[SKIP] 1.6       Ensure updates, patches, and additional security software are installed (Manual)
[PASS] 2.1.1.1   Ensure chrony is configured (Automatic)
[PASS] 3.2.5     Ensure broadcast ICMP requests are ignored (Automatic)
[PASS] 3.2.6     Ensure bogus ICMP responses are ignored (Automatic)
[PASS] 3.2.7     Ensure TCP SYN Cookies is enabled (Automatic)
[SKIP] 3.4.1.3   Ensure IPv4 outbound and established connections are configured (Manual)
[SKIP] 3.4.2.3   Ensure IPv6 outbound and established connections are configured (Manual)
[PASS] 4.1.1.1   Ensure journald is configured to write logs to persistent disk (Automatic)
[PASS] 4.1.2     Ensure permissions on journal files are configured (Automatic)

Passed:          11
Failed:          0
Skipped:         4
Total checks:    15

Compliance check result: PASS

# Level 2 checks 
apiclient report cis -l 2


# CIS Kubernetes benchmark : Level 1 of the CIS Benchmark. 
apiclient report cis-k8s


#
enable-admin-container
enter-admin-container
whoami
pwd
ip -c a
lsblk
sudo yum install htop -y
htop
ps
ps -ef
ps 1
sestatus
getenforce

sudo sheltie
whoami
pwd
ip -c a
lsblk
yum install jq -y
ps
ps -ef
ps 1
sestatus
getenforce
exit
exit

#
disable-admin-container
whoami
pwd
ip -c a
sudo yum install git -y
sestatus
getenforce
exit
-----------------

▶ Updating the OS - Blog , Link

# checks to see whether there is a new version of the installed variant
apiclient update check | jq

# downloads the update and verifies that it has been staged
apiclient update apply

# activates the update and reboots the system
apiclient reboot

▶ 파드 배포

#
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  terminationGracePeriodSeconds: 3
  containers:
  - name: busybox
    image: busybox
    command:
    - "/bin/sh"
    - "-c"
    - "while true; do date >> /home/pod-out.txt; cd /home; sync; sync; sleep 10; done"
  nodeSelector:
    ami: bottlerocket
EOF

# 파드가 배포된 노드 정보 확인
kubectl get pod -owide

#
kubectl exec -it busybox -- tail -f /home/pod-out.txt

# 삭제
kubectl delete pod busybox

[ 중요 - 자원정리 ]

- **`eksctl delete nodegroup -c $CLUSTER_NAME -n ng-bottlerocket`**
- **`eksctl delete nodegroup -c $CLUSTER_NAME -n ng-bottlerocket-ssh`**

[ 실습 후 자원정리 ]

(실습 했을 경우) AWS ECR 저장소 삭제
Amazon EKS 클러스터 삭제(10분 정도 소요): eksctl delete cluster --name $CLUSTER_NAME
(클러스터 삭제 완료 확인 후) AWS CloudFormation 스택 삭제 : aws cloudformation delete-stack --stack-name myeks
EKS 배포 후 실습 편의를 위한 변수 설정 삭제 : macOS : vi ~/.zshrc , Windows(WSL2) : vi ~/.bashrc

[ 마무리 ]

이번 과정을 통해 EKS에서 사용하는 볼륨 타입별 특징과 장/단점을 비교해 볼 수 있었다. 항상 느끼는 거지만, 인프라의 편리함을 추구하다 보면 보안과 비용 이슈라는 장벽을 마주치게 되고 해당 문제들을 넘어설 때 보다 안정적인 서비스를 갖출 수 있다는 것이다. 짧은 듯, 긴 과정을 거치면서 조금씩 EKS 환경에 대한 이해를 더해 가는 것 같다.

[ 참고 링크 ]

[ 개념 길라잡이 ]

[AWS Docs/EKS] Storage
[K8S Docs] Storage Example Ephemeral
[Workshop/EKS] Storage
[AWS Blog] Kubernetes를 위한 영구 스토리지 적용하기 - 링크
[Youtube] Kubecon Tutorial: What Went Wrong with My Persistent Data - 링크
Amazon EKS의 관리형 노드 그룹이 새로 업그레이드 전략에 최소한의 업데이트 전략을 지원 - Link
[책 추천] 스토리지를 중심으로 기본 이론 부터, 하이버바이저/쿠버네티스에 연동까지 약식(?)으로 정리 - Link

▶ SSM 동작원리 : gasida님 정리 : Blog

[ 환경 설치 관련 ]

▶ ssm manager plug-In 설치 : Link

저작자표시 비영리 변경금지 (새창열림)