Rocket🐾

 

 

Very, very, very important!

Starting with Karpenter v0.32, several resource names have changed, so keep that in mind while reading. The concepts are the same.
(e.g. Provisioners -> NodePools)

 

When you build a cluster with AWS EKS, there are several ways to set up the Data Plane (referred to as Nodes from here on).

There are Managed Node Groups and Fargate, and then there are AWS Auto Scaling and Karpenter, which let you manage those resources flexibly.

In this post I want to look at how Karpenter lets you manage Nodes more flexibly than AWS Auto Scaling.

Let's start with a quick overview.

Let's do the code!

 

What is Karpenter?


In this post, node == instance == server; they all mean the same thing.

Before looking at Karpenter, let's first look at AWS Auto Scaling.

AWS Auto Scaling lets the instances grouped into an Auto Scaling Group scale flexibly.

You can set the minimum and maximum number of running instances.

[Figure: AWS Auto Scaling]

์Œ์‹ ๋ฐฐ๋‹ฌ ์•ฑ์—์„œ ๋ฐค 10์‹œ์— ๋งŒ์› ํ• ์ธ๋˜๋Š” ์น˜ํ‚จ ํ• ์ธ ์ฟ ํฐ์„ 1์‹œ๊ฐ„ ๋™์•ˆ 1000 ์žฅ์„ ์„ ์ฐฉ์ˆœ์œผ๋กœ ๋ฐ›์„ ์ˆ˜ ์žˆ๋Š” ์ด๋ฒคํŠธ๋ฅผ ์ง„ํ–‰ํ•œ๋‹ค๊ณ  ๊ฐ€์ •ํ•ด๋ด…์‹œ๋‹ค.

 

๋ฐค 10์‹œ์ „์—๋Š” ํŠธ๋ž˜ํ”ฝ์ด ํ‰์˜จํ•˜๋‹ค๊ฐ€~ 10์‹œ๊ฐ€ ๋˜๋Š” ์ˆœ๊ฐ„ ์Œ์‹ ๋ฐฐ๋‹ฌ ์•ฑ์€ ํ‰์†Œ ๊ฐ™์ง€ ์•Š๋Š” ํŠธ๋ž˜ํ”ฝ์— ๋†€๋ผ ์ฃฝ์–ด๋ฒ„๋ฆด ๊ฒ๋‹ˆ๋‹ค.

์•ฑ์ด ์‹คํ–‰๋˜๊ณ  ์žˆ๋Š” ์„œ๋ฒ„์—๋Š” ์ด๋ฅผ ๊ฐ๋‹นํ•  ๋งŒํ•œ ๋ฆฌ์†Œ์Šค๊ฐ€ ์—†๋Š”๊ฑฐ์ฃ .

 

์ด๋•Œ AWS Auto Scaling  ์„ ์‚ฌ์šฉํ•˜๋ฉด ์ด๋Ÿฐ ์ƒํ™ฉ์— ๋Œ€์ฒ˜ํ•˜๊ธฐ ์ˆ˜์›”ํ•ฉ๋‹ˆ๋‹ค.

 

์•ฑ์ด ์‹คํ–‰๋˜๊ณ  ์žˆ๋Š” ์ธ์Šคํ„ด์Šค๋ฅผ Auto Scaling Group ์œผ๋กœ ๋ฌถ์–ด์„œ ์ตœ์†Œ ๊ฐฏ์ˆ˜ 1๊ฐœ, ์ตœ๋Œ€ ๊ฐฏ์ˆ˜ 3๊ฐœ๋ฅผ ์ง€์ •ํ•˜๊ณ  ์ธ์Šคํ„ด์Šค์˜ ๋ฆฌ์†Œ์‹ฑ์ด 50% ์ด์ƒ ์‚ฌ์šฉ๋˜๋ฉด ์ž๋™์œผ๋กœ ์Šค์ผ€์ผ ์•„์›ƒ์„ ํ•ด๋ฒ„๋ฆฌ๋ฉด ๋˜์ฃ .

 

๊ทธ๋Ÿผ ์•Œ์•„์„œ ์ธ์Šคํ„ด์Šค๊ฐ€ ์ฆ๊ฐ€ํ•˜๊ฒŒ ๋  ๊ฒƒ์ด๊ณ , ์ถ”๊ฐ€๋œ ์ธ์Šคํ„ด์Šค์— ๋Œ€ํ•ด์„œ๋Š” ๋กœ๋“œ ๋ฐธ๋Ÿฐ์‹ฑ ์ฒ˜๋ฆฌ๋ฅผ ํ•ด๋ฒ„๋ฆฌ๋ฉด ๋ฉ๋‹ˆ๋‹ค.

 

Karpenter ๋˜ํ•œ AWS Auto Scaling ์ฒ˜๋Ÿผ ์œ ๋™์ ์œผ๋กœ ์ธ์Šคํ„ด์Šค์˜ ๊ฐฏ์ˆ˜๋ฅผ ๋Š˜๋ฆฌ๊ณ , ์ค„์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋‹ค๋งŒ ์ฐจ์ด์ ์ด ์žˆ๋‹ค๋ฉด Karpenter ๊ฐ€ ํ›จ์”ฌ ์œ ์—ฐํ•˜๋‹ค๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.

[Figure: Karpenter]

In a Kubernetes environment, Karpenter detects pods that are stuck in the pending state because no node can accommodate them, and adds nodes automatically. Up to this point it looks similar to Auto Scaling, but Karpenter has a killer feature.

AWS Auto Scaling only works within the instance types the user specified, whereas Karpenter calculates the resources needed to run the pods and provisions suitable nodes on its own. On top of that, as in the figure above, Karpenter adds up the resources of pods spread across different nodes and reschedules them onto a single node.

Here is what that means. Say there are 2 instances that can each hold 4 pods, and 8 pods are already running fine.

Now we want to deploy 3 more pods. There is no instance left to place them on, so the pods go pending; Karpenter detects this and creates a new instance. Once the additional instance is provisioned, the pending pods are scheduled onto it.

But as mentioned above, one instance can hold 4 pods and we only deployed 3 more, so one pod's worth of resources is sitting idle. Resources translate directly into cost, so it is better to use them as fully as possible, right?

This is where Karpenter lets you use resources to the fullest. It does not forcibly fill the empty space; instead, it merges the 3 instances altogether. That is, it creates a new instance that can hold up to 11 pods and reschedules the pods onto it.
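The pod-count example above can be checked with a few lines of shell arithmetic. This is only a sketch of the numbers used in the text: it counts "pod slots" per node, whereas a real scheduler works with CPU and memory requests, and all variable names are made up for illustration.

```shell
# Numbers taken from the example in the text.
pods_per_node=4   # each fixed-size instance holds 4 pods
running_pods=8    # pods already running on 2 instances
new_pods=3        # pods that go pending

total_pods=$((running_pods + new_pods))

# Ceiling division: how many 4-pod instances are needed for all pods.
nodes_needed=$(( (total_pods + pods_per_node - 1) / pods_per_node ))

# Slots that stay empty when only fixed-size instances are available.
idle_slots=$(( nodes_needed * pods_per_node - total_pods ))

echo "fixed-size nodes: ${nodes_needed} nodes, ${idle_slots} idle slot(s)"
echo "consolidated:     1 node sized for ${total_pods} pods, 0 idle slots"
```

With fixed-size instances one slot is paid for but unused; the consolidated layout described in the text avoids that by sizing a single node for all 11 pods.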

 

I expected scale-out to be slow because of provisioning time, but it was faster than I thought.

Beyond this feature, Karpenter offers more advantages than you get with AWS Auto Scaling.

 

 

Let's use Karpenter


Now that we know what Karpenter is, let's install it in an EKS environment and try it out!

The environment used here is as follows:

  • EKS v1.23
  • Terraform v1.2.7

Terraform ์ด ์•„๋‹Œ ๋‹ค๋ฅธ ๋ฐฉ๋ฒ•์œผ๋กœ ์ง„ํ–‰ํ•˜๋ ค๋ฉด ์—ฌ๊ธฐ๋ฅผ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”!

Terraform ์œผ๋กœ ์ž‘์„ฑ๋œ Karpenter ์ฝ”๋“œ๋Š” ์—ฌ๊ธฐ๋ฅผ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”!

 

eks_managed_node_groups = {
    karpenter = {
      instance_types = ["t3.medium"]

      min_size     = 1
      max_size     = 2
      desired_size = 1

      iam_role_additional_policies = [
        # Required by Karpenter
        "arn:${local.partition}:iam::aws:policy/AmazonSSMManagedInstanceCore"
      ]
    }
  }

...

resource "aws_iam_instance_profile" "karpenter" {
  name = "KarpenterNodeInstanceProfile-${local.name}"
  role = module.eks.eks_managed_node_groups["karpenter"].iam_role_name
}

This creates the managed node group that will run Karpenter. Besides a managed node group, Karpenter can also be deployed on Fargate.

Deploying Karpenter onto nodes that Karpenter itself manages is not recommended.

In the code, the AWS managed policy AmazonSSMManagedInstanceCore is added under iam_role_additional_policies.

AmazonSSMManagedInstanceCore

Attaching this policy to an EC2 instance lets the instance use the core functionality of Systems Manager, for example Session Manager. Features that touch other services, such as CloudWatch or S3, need their own policies in addition. Because the policy grants only a minimal set of permissions to the instance, it is a good choice from a security standpoint.

EC2 instances in a managed node group are automatically assigned an IAM instance profile. Its default policies are AmazonEKSWorkerNodePolicy, AmazonEC2ContainerRegistryReadOnly, and AmazonEKS_CNI_Policy; AmazonSSMManagedInstanceCore is needed as well, so we add it.

aws_iam_instance_profile defines the IAM instance profile for the instances Karpenter creates.

 

node_security_group_additional_rules = {
    # Control plane invoke Karpenter webhook
    ingress_karpenter_webhook_tcp = {
      description                   = "Control plane invoke Karpenter webhook"
      protocol                      = "tcp"
      from_port                     = 8443
      to_port                       = 8443
      type                          = "ingress"
      source_cluster_security_group = true
    }
  }

 

This allows the cluster to call the Karpenter webhook.

When you create EKS with Terraform, a cluster security group and additional security groups are created by default.

Cluster security group: controls communication inside the cluster, including managed node groups and Fargate
Additional security group: lets you define rules beyond those of the cluster security group

The additional security groups are in turn split into one for the cluster and one for the nodes. In practice, any extra security group rules the cluster needs are added to these additional security groups. (I am not fully sure about this part.)

[Figure: rules created in the additional security group for nodes]

These rules are created because an instance launched by Karpenter cannot talk to the EKS cluster right away just because it exists.

One rule worth a closer look is the first one. Its description reads "Cluster API to node groups", and it is what lets the cluster send API traffic to instances created by Karpenter. In other words, it is the rule that makes communication with the kubelet possible.

If you look at the additional security group for the cluster instead of the one for nodes, you can see the same rule in the opposite direction.

[Figure: rules created in the additional security group for the cluster]

The first rule of the additional security group for the cluster is "Node groups to cluster API".

 

Before moving on: the source field of these rules holds a security group rather than a CIDR such as 192.168.0.10/24, so let me briefly explain what that means.

What it means to use another security group as the source of an inbound rule (see here)

When you specify another security group as the source in an AWS security group rule, resources attached to that other security group are allowed to send traffic.

For example, if the inbound rule for an EC2 instance uses the RDS security group as its source, DB instances attached to the RDS security group can send traffic to the EC2 instance. Compared with using IP addresses or CIDR blocks, the rule does not need to change when the resources in the security group change. Note that when you reference a security group in another AWS account you must also specify the account number, and that you cannot reference a security group in another region.

What it means to use the security group itself as the source of an inbound rule (see here)

When a security group specifies itself as the source, all resources attached to that security group can exchange traffic with each other.

For example, if the inbound rule for an EC2 instance uses its own security group as the source, other EC2 instances attached to the same security group can send traffic to it. The rule does not need to change when the resources' IP addresses change. Note that a self-reference must use the security group ID; a name or tag cannot be used.

In node_security_group_additional_rules you can see source_cluster_security_group = true. This sets the source of the rule to the security group that belongs to the cluster. In other words, by using the cluster security group as the source, the rule allows communication between the cluster and the nodes.

 

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 3.0"

  name = local.name
  cidr = "10.0.0.0/16"

  azs             = ["${local.region}a", "${local.region}b", "${local.region}c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  public_subnet_tags = {
    "kubernetes.io/cluster/${local.name}" = "shared"
    "kubernetes.io/role/elb"              = 1
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${local.name}" = "shared"
    "kubernetes.io/role/internal-elb"     = 1
    # Tags subnets for Karpenter auto-discovery
    "karpenter.sh/discovery" = local.name
  }

  tags = local.tags
}

Karpenter ๋Š” ํ”„๋ผ์ด๋น— ์„œ๋ธŒ๋„ท์—์„œ๋งŒ ๋™์ž‘ํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ์–ด๋Š ํ”„๋ผ์ด๋น— ์„œ๋ธŒ๋„ท์„ ์‚ฌ์šฉํ•  ๊ฒƒ์ธ์ง€ ์ง€์ •์„ ํ•ด์ฃผ์–ด์•ผํ•˜๊ธฐ ๋•Œ๋ฌธ์— private_subnet_tags ์— karpenter.sh/discovery ๋ฅผ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค. local.name ์€ eks ์ด๋ฆ„์ž…๋‹ˆ๋‹ค.

 

๋˜ํ•œ ์ด๋ ‡๊ฒŒ ์ถ”๊ฐ€๋œ ๋ถ€๋ถ„์€ ์ดํ›„ K8S Karpenter Provisioner ๋ฆฌ์†Œ์Šค์—์„œ subnetSelector ๋ถ€๋ถ„์—์„œ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค.

 

์ง€๊ธˆ๊นŒ์ง€ karpenter ๊ฐ€ ์ƒ์„ฑํ•œ ์ธ์Šคํ„ด์Šค์—์„œ ํ•„์š”ํ•œ ๋ถ€๋ถ„๋“ค์„ Terraform ์œผ๋กœ ์ƒ์„ฑํ–ˆ๋‹ค๋ฉด, ์ด๋ฒˆ์—๋Š” karpenter pod ์—์„œ ํ•„์š”ํ•œ ๋ถ€๋ถ„์„ ์„ค์ •ํ•ฉ๋‹ˆ๋‹ค.

 

karpenter pod ๋Š” ์œ„์—์„œ ์ง€์ •ํ•œ ๊ด€๋ฆฌํ˜• ๋…ธ๋“œ ๊ทธ๋ฃน์— ์ƒ์„ฑ์ด ๋ฉ๋‹ˆ๋‹ค. ์ดํ›„ ํ•„์š”ํ•œ ๋ฆฌ์†Œ์Šค๊ฐ€ ์žˆ์„ ๊ฒฝ์šฐ karpenter pod ์— ์˜ํ•ด ์ž‘์—…์ด ์‹คํ–‰์ด ๋˜๋Š”๋ฐ, karpenter pod ๋Š” AWS ์— ์ ‘๊ทผํ•  ์ž๊ฒฉ์ด ๋‹น์—ฐํžˆ ์—†์Šต๋‹ˆ๋‹ค. 

 

๋”ฐ๋ผ์„œ karpenter pod ๊ฐ€ AWS ์— ์ ‘๊ทผํ•  ์ˆ˜ ์žˆ๋„๋ก IRSA ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. (IRSA ๊ฐ€ ๊ถ๊ธˆํ•˜๋‹ค๋ฉด ์—ฌ๊ธฐ ์ฐธ๊ณ )

module "karpenter_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 4.21.1"

  role_name                          = "karpenter-controller-${local.name}"
  attach_karpenter_controller_policy = true

  karpenter_controller_cluster_id = module.eks.cluster_id
  karpenter_controller_ssm_parameter_arns = [
    "arn:${local.partition}:ssm:*:*:parameter/aws/service/*"
  ]
  karpenter_controller_node_iam_role_arns = [
    module.eks.eks_managed_node_groups["karpenter"].iam_role_arn
  ]

  oidc_providers = {
    ex = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["karpenter:karpenter"]
    }
  }
}

 

Terraform ์—์„œ ์„ค์ •์ด ํ•„์š”ํ•œ ๋ถ€๋ถ„์€ ๋ชจ๋‘ ๋๋‚ฌ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Ÿผ EKS ๋ฅผ ์ ์šฉ์„ ํ•˜๊ณ , ์ถ”๊ฐ€์ ์œผ๋กœ Helm ์„ ํ†ตํ•ด Karpenter ๋ฅผ ์„ค์น˜ํ•ด๋ณด์ฃ .

 

#!/bin/bash

set -e

KARPENTER_VERSION="v0.27.0"
CLUSTER_NAME="<cluster name>"
KARPENTER_IAM_ROLE_ARN="arn:aws:iam::<id>:role/karpenter-controller-<cluster name>"

# Log out of public ECR so the chart is pulled unauthenticated.
docker logout public.ecr.aws
# interruptionQueueName must be the name of an existing SQS queue;
# otherwise the interruption controller logs NonExistentQueue errors.
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" --namespace karpenter --create-namespace \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="${KARPENTER_IAM_ROLE_ARN}" \
  --set settings.aws.clusterName="${CLUSTER_NAME}" \
  --set settings.aws.defaultInstanceProfile="KarpenterNodeInstanceProfile-${CLUSTER_NAME}" \
  --set settings.aws.interruptionQueueName="${CLUSTER_NAME}" \
  --wait

 

๊ทธ๋Ÿผ ์ด์ œ Karpenter ๊ฐ€ ์„ฑ๊ณต์ ์œผ๋กœ ์„ค์น˜๋˜์—ˆ๋Š”์ง€ ํ•œ๋ฒˆ ํ™•์ธํ•ด๋ณด์ฃ .

 

์•„๋ž˜ ๋ฆฌ์†Œ์Šค๋ฅผ ๋ฐฐํฌํ•ด์ค๋‹ˆ๋‹ค.

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: provisioner
spec:
  # ์ƒ์„ฑํ•  ์ธ์Šคํ„ด์Šค ์œ ํ˜•์„ ๊ตฌ์ฒด์ ์œผ๋กœ ์ •์˜
  requirements:
    - key: "node.kubernetes.io/instance-type"
      operator: In
      values: [ "g4dn.xlarge" ]
    - key: "topology.kubernetes.io/zone"
      operator: In
      values: [ "ap-northeast-2a", "ap-northeast-2b", "ap-northeast-2c", "ap-northeast-2d" ]
    - key: "karpenter.sh/capacity-type"
      operator: In
      values: [ "on-demand"]
    - key: "kubernetes.io/arch"
      operator: In
      values: [ "amd64" ]
  
  # ์ƒ์„ฑํ•  ์ธ์Šคํ„ด์Šค์˜ ์ตœ๋Œ€ ๋ฆฌ์†Œ์Šค
  limits:
    resources:
      cpu: "100"
      memory: 100Gi
      nvidia.com/gpu: 16
  
  # Use Karpenter's consolidation mechanism
  # ttlSecondsAfterEmpty and consolidation cannot be used at the same time
  consolidation:
    enabled: true
  # ttlSecondsUntilExpired: 2592000 # 30 Days = 60 * 60 * 24 * 30 Seconds;      
  # ttlSecondsAfterEmpty: 30

  # ์ƒ์„ฑ๋œ ์ธ์Šคํ„ด์Šค(worker node)์— ์ง€์ •๋˜๋Š” label
  labels:
    role: ops
    provision: karpenter

  # ์ƒ์„ฑ๋œ ์ธ์Šคํ„ด์Šค(worker node)์— ์ง€์ •๋˜๋Š” taints
  taints:
    - key: nvidia.com/gpu
      value: "true"
      effect: NoSchedule

  provider:
    # ์ƒ์„ฑํ•œ ์ธ์Šคํ„ด์Šค์— ์–ด๋Š ๋ณด์•ˆ ๊ทธ๋ฃน์„ ์ ์šฉํ•  ๊ฒƒ์ธ์ง€ ๋ณด์•ˆ ๊ทธ๋ฃน์˜ ํƒœ๊ทธ๋กœ ์ง€์ •
    securityGroupSelector:
      Name: eks-node
      kubernetes.io/cluster/eks: owned
    
    # Select, by tag, which subnets to create instances in
    subnetSelector:
      karpenter.sh/discovery: eks
    
    # ์ƒ์„ฑ๋œ ์ธ์Šคํ„ด์Šค์˜ ํƒœ๊ทธ๋ฅผ ์ง€์ •  
    tags:
      karpenter.sh/discovery: eks
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            limits:
              nvidia.com/gpu: "1"
      tolerations:
        - key: nvidia.com/gpu
          value: "true"
          effect: NoSchedule
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions: # OR
                  - key: "topology.kubernetes.io/zone" # AND
                    operator: "In"
                    values: [ "ap-northeast-2a", "ap-northeast-2b", "ap-northeast-2c", "ap-northeast-2d" ]
                  - key: "karpenter.sh/capacity-type" # AND
                    operator: "In"
                    values: [ "on-demand" ]
                  - key: "kubernetes.io/arch" # AND
                    operator: In
                    values: [ "amd64" ]
                  - key: "node.kubernetes.io/instance-type" # AND
                    operator: In
                    values: [ "g4dn.xlarge" ]

The limits section sets the maximum total resources of the nodes Karpenter may create through this Provisioner.
For example, limits.resources.cpu: "100" caps the combined CPU of those nodes at 100 cores.
Likewise, limits.resources.memory: 100Gi caps their combined memory at 100Gi.
These limits keep Karpenter from creating too many nodes and driving up cost, and help it stay within appropriately sized nodes.
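To get a feel for what the limits in the Provisioner manifest allow, the rough calculation below divides each limit by the size of one g4dn.xlarge (4 vCPU, 16 GiB memory, 1 GPU). It is a back-of-the-envelope sketch that assumes the limits act as a hard ceiling on summed node capacity; how strictly Karpenter enforces the boundary can differ by version, and the variable names are made up.

```shell
# Limits from the Provisioner manifest.
limit_cpu=100
limit_mem_gi=100
limit_gpu=16

# Size of one g4dn.xlarge node.
node_cpu=4
node_mem_gi=16
node_gpu=1

# How many whole nodes fit under each individual limit.
by_cpu=$((limit_cpu / node_cpu))
by_mem=$((limit_mem_gi / node_mem_gi))
by_gpu=$((limit_gpu / node_gpu))

# The tightest limit decides the overall node count.
max_nodes=$by_cpu
if [ "$by_mem" -lt "$max_nodes" ]; then max_nodes=$by_mem; fi
if [ "$by_gpu" -lt "$max_nodes" ]; then max_nodes=$by_gpu; fi

echo "by cpu: ${by_cpu}, by memory: ${by_mem}, by gpu: ${by_gpu}"
echo "approximate node ceiling: ${max_nodes}"
```

Under these assumptions the memory limit is the one that binds first, well before the GPU limit of 16 is reached.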

 

 

The option to pay attention to is ttlSecondsAfterEmpty, which is tied to Karpenter's core feature.

Besides ttlSecondsAfterEmpty, the related options are ttlSecondsUntilExpired and consolidation.

The difference between ttlSecondsAfterEmpty and ttlSecondsUntilExpired in Karpenter is as follows.

ttlSecondsAfterEmpty sets how long Karpenter waits before terminating a node once it is empty. The feature is disabled if no value is defined.

ttlSecondsUntilExpired sets the maximum age at which Karpenter terminates a node. The feature is disabled if no value is defined. It is used when you want old instances to be replaced with up-to-date ones automatically.

 

About consolidation

In Karpenter, consolidation moves workloads to other nodes in order to raise node resource utilization and cut cost. Karpenter offers two mechanisms for consolidation, both enabled through consolidation.enabled.

Node deletion: Karpenter checks whether all pods on a node can be moved to other nodes, and if so, deletes that node.

Node replacement: Karpenter creates a better-fitting node with a different instance type or capacity type, moves the pods from the existing inefficient node to the new node, and then deletes the existing node.

Consolidation improves how efficiently the cluster uses its capacity, and it works across a mix of capacity types such as spot and on-demand instances.

 

Can consolidation and ttlSecondsAfterEmpty be used together?

consolidation and ttlSecondsAfterEmpty cannot be used together.

The two features delete nodes in mutually exclusive ways. consolidation creates a new node where needed, moves the pods off the existing node, and then deletes that node.

ttlSecondsAfterEmpty, on the other hand, deletes a node as soon as it has been empty for the configured time.

Using both at once could therefore cause conflicts, so Karpenter restricts each Provisioner to setting only one of consolidation and ttlSecondsAfterEmpty.

 

karpenter pod ์˜ ๋กœ๊ทธ๋ฅผ ํ™•์ธํ•ด๋ณด์ฃ .

2023-03-16T06:35:03.384Z	INFO	controller.provisioner	computed new node(s) to fit pod(s)	{"commit": "dc3af1a", "nodes": 1, "pods": 1}
2023-03-16T06:35:03.384Z	INFO	controller.provisioner	launching machine with 1 pods requesting {"cpu":"1155m","memory":"120Mi","pods":"6"} from types t3.xlarge, t3.medium, t3.large	{"commit": "dc3af1a", "provisioner": "karpenter-provisioner"}
2023-03-16T06:35:03.509Z	DEBUG	controller.provisioner.cloudprovider	discovered security groups	{"commit": "dc3af1a", "provisioner": "karpenter-provisioner", "security-groups": ["sg-0b874fd0cb8b74351"]}
2023-03-16T06:35:03.708Z	DEBUG	controller.provisioner.cloudprovider	created launch template	{"commit": "dc3af1a", "provisioner": "karpenter-provisioner", "launch-template-name": "Karpenter-csg-sd-dev-eks-8333440626470183087", "launch-template-id": "lt-084967c0de086c425"}
2023-03-16T06:35:05.780Z	INFO	controller.provisioner.cloudprovider	launched new instance	{"commit": "dc3af1a", "provisioner": "karpenter-provisioner", "id": "i-0c206dbe4cb2d08e1", "hostname": "ip-10-146-5-13.ap-northeast-2.compute.internal", "instance-type": "t3.medium", "zone": "ap-northeast-2c", "capacity-type": "on-demand"}
2023-03-16T06:36:12.275Z	ERROR	controller.interruption	getting messages from queue, discovering queue url, fetching queue url, AWS.SimpleQueueService.NonExistentQueue: The specified queue does not exist for this wsdl version.

๋ฌธ์ œ ์—†์ด ์ž˜ ์‹คํ–‰๋˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

 

์ดํ›„ deployment ์˜ replica ๋ฅผ 0 ์œผ๋กœ ์„ค์ •ํ•˜๋ฉด ttlSeocondsAfterEmpty: 30 ๋กœ ํ–ˆ๊ธฐ ๋•Œ๋ฌธ์— ํ•ด๋‹น ์ธ์Šคํ„ด์Šค๋Š” 30์ดˆ ๋’ค ์‚ญ์ œ๋ฉ๋‹ˆ๋‹ค.

 

 

Troubleshooting


You can follow the logs with k logs -f -n karpenter karpenter-xxxx-xxxx (k being an alias for kubectl).

eks:DescribeCluster permission problem

Update the iam-role-for-service-accounts-eks Terraform module to the latest version.

The error occurs because older versions of the module do not include this permission.

Instances created by Karpenter stay NotReady

If the status never changes to Ready no matter how long you wait, check the security group of the created instance. There must be a rule that allows communication between the cluster and the Karpenter webhook.

Check which security group was selected by securityGroupSelector, and which rules it allows.

If the security group is fine, add the instance profile that Karpenter uses (that is, its IAM role) to the aws-auth ConfigMap.
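A mapRoles entry for a node role typically looks like the one generated below. This is a sketch with placeholder values: the account ID, the role name, and the file name aws-auth-maproles.yaml are hypothetical, and the role to use is the IAM role behind the instance profile that Karpenter assigns to its nodes.

```shell
# Placeholder values; replace them with your own account ID and node role name.
ACCOUNT_ID="111122223333"
NODE_ROLE_NAME="karpenter-node-role-example"

# Write the mapRoles entry to a local file for review.
cat > aws-auth-maproles.yaml <<EOF
- rolearn: arn:aws:iam::${ACCOUNT_ID}:role/${NODE_ROLE_NAME}
  username: system:node:{{EC2PrivateDNSName}}
  groups:
    - system:bootstrappers
    - system:nodes
EOF

cat aws-auth-maproles.yaml
```

The entry belongs under mapRoles in the aws-auth ConfigMap, which can be opened with kubectl edit configmap aws-auth -n kube-system. Nodes whose role is missing there cannot register with the cluster and stay NotReady.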

 

 

Wrapping up


That covers installing Karpenter on EKS and checking that it works.

I had never looked closely at the security groups that are created when EKS is built with Terraform, but installing Karpenter showed me which security groups are created and which rules make communication between the cluster and the nodes possible.

It also looks like Karpenter can make spending considerably more efficient.

That's it for today!

 

 

 

๋ฐ˜์‘ํ˜•
profile on loading

Loading...