Kubernetes Pattern: Managed Lifecycle

근묵자흑

Luuuuu 2025. 8. 23. 13:24

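In a cloud-native environment, managing a container's lifecycle takes more than "start" and "stop". This post takes an in-depth look at the Kubernetes Managed Lifecycle pattern through real code and tests.

Why does Managed Lifecycle matter?

Consider the situations you run into in real production environments:

Data loss during a rolling update: the Pod is terminated before in-flight requests finish
Cold start: a Java application receives traffic before it is ready
Failed state persistence: an abrupt shutdown leaves data inconsistent
Connection leaks: connections to external services are not released properly

The Managed Lifecycle pattern addresses exactly these problems.

Analyzing Managed Lifecycle hands-on

The hands-on code for this chapter is available in the GitHub repository.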

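Container lifecycle events in detail

1. PostStart Hook - the key to container initialization

The PostStart hook runs immediately after the container is created. The important point is that it runs asynchronously with respect to ENTRYPOINT, so there is no guarantee that it runs before the main process starts.

Implementation example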

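apiVersion: v1
kind: Pod
metadata:
  name: poststart-demo
spec:
  containers:
  - name: app
    image: busybox:1.35
    # Assumption: the original omits the main command. Without one, the busybox
    # default shell exits right away, so a long-running placeholder is used here.
    command:
    - /bin/sh
    - -c
    - 'echo "Main process started at $(date)"; sleep 3600'
    lifecycle:
      postStart:
        exec:
          command:
          - /bin/sh
          - -c
          - |
            echo "PostStart started: $(date)"
            # Cache warm-up
            for i in $(seq 1 5); do
              echo "Loading cache entry $i..."
              sleep 1
            done
            # Service registry registration
            echo "Registering service: $(hostname)"
            touch /tmp/app-initialized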

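Analysis of the test results

Important findings confirmed by an actual test run:

=== Main log ===
Main process started: 2025-08-23 03:13:00
Main process running - 1s: 2025-08-23 03:13:00

=== PostStart log ===
PostStart started: 2025-08-23 03:13:00  # same timestamp!
PostStart running - 1s: 2025-08-23 03:13:00

Key points:

PostStart and the main process start within the same second (the log timestamps have one-second resolution)
The two processes run in parallel
⚠️ If PostStart fails, the container is killed and then restarted according to the Pod's restartPolicy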
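
Because the hook and the main process run in parallel, a main process that depends on PostStart work has to wait for it explicitly. Below is a minimal Go sketch of that idea, assuming the hook signals completion by creating a marker file, as the example above does with /tmp/app-initialized. The waitForFile helper, the demo path, and the timeout values are hypothetical and not part of the original code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// waitForFile polls until path exists or timeout elapses.
func waitForFile(path string, timeout, poll time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if _, err := os.Stat(path); err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("%s not present after %s", path, timeout)
		}
		time.Sleep(poll)
	}
}

func main() {
	// Demo only: a goroutine stands in for the PostStart hook and creates
	// the marker shortly after startup. In a Pod the path would be the
	// file the hook touches, e.g. /tmp/app-initialized.
	marker := filepath.Join(os.TempDir(), "app-initialized-demo")
	os.Remove(marker)
	go func() {
		time.Sleep(200 * time.Millisecond)
		os.WriteFile(marker, nil, 0o644)
	}()

	err := waitForFile(marker, 2*time.Second, 50*time.Millisecond)
	fmt.Println("wait result:", err)
}
```

Polling a marker file keeps the hook and the application decoupled: the hook only needs a shell, and the application decides how long it is willing to wait.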

2. PreStop Hook - where graceful termination begins

The PreStop hook is a blocking call that runs before the SIGTERM signal is sent.

Implementation example

lifecycle:
  preStop:
    exec:
      command:
      - /bin/sh
      - -c
      - |
        echo "PreStop started: $(date)"

        # 1. Make the health check fail (assumes the probe checks /tmp/ready)
        rm -f /tmp/ready

        # 2. Remove from the load balancer
        echo "Deregistering from the service registry..."
        sleep 3

        # 3. Drain active connections (simulated countdown)
        CONNECTIONS=5
        while [ $CONNECTIONS -gt 0 ]; do
          echo "Active connections: $CONNECTIONS"
          sleep 1
          CONNECTIONS=$((CONNECTIONS - 1))
        done

        echo "PreStop finished: $(date)"

Termination sequence diagram

sequenceDiagram
    participant K as Kubernetes
    participant P as Pod
    participant A as Application

    K->>P: Delete command
    P->>A: Run PreStop Hook
    Note over A: Perform cleanup<br/>(synchronous)
    A-->>P: PreStop finished
    P->>A: Send SIGTERM
    Note over A: Graceful Shutdown
    A-->>P: Process exits
    Note over K,P: Once terminationGracePeriodSeconds has elapsed
    K->>P: SIGKILL (forced termination)

3. Handling SIGTERM - an actual Go implementation

A production-level Go application example:

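package main

import (
    "context"
    "errors"
    "log"
    "net/http"
    "os"
    "os/signal"
    "sync/atomic"
    "syscall"
    "time"
)

// activeConnections counts in-flight requests.
var activeConnections atomic.Int64

// connectionCounter tracks in-flight requests around the wrapped handler.
func connectionCounter(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        activeConnections.Add(1)
        defer activeConnections.Add(-1)
        next.ServeHTTP(w, r)
    })
}

func main() {
    // Configure the HTTP server
    mux := http.NewServeMux()
    mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("ok\n"))
    })
    server := &http.Server{
        Addr:    ":8080",
        Handler: connectionCounter(mux),
    }

    // Register for SIGTERM (and SIGINT for local runs)
    sigterm := make(chan os.Signal, 1)
    signal.Notify(sigterm, syscall.SIGTERM, syscall.SIGINT)

    // Start the server
    go func() {
        log.Println("Starting server...")
        if err := server.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
            log.Fatalf("server error: %v", err)
        }
    }()

    // Wait for SIGTERM
    <-sigterm
    log.Println("SIGTERM received, starting graceful shutdown...")

    // Graceful shutdown, bounded by a 30-second timeout
    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    if err := gracefulShutdown(ctx, server); err != nil {
        log.Printf("graceful shutdown incomplete: %v", err)
    }
}

func gracefulShutdown(ctx context.Context, server *http.Server) error {
    // 1. Stop accepting new requests. Shutdown closes the listeners first,
    //    then waits for in-flight requests until ctx expires.
    log.Println("No longer accepting new requests...")
    done := make(chan error, 1)
    go func() { done <- server.Shutdown(ctx) }()

    // 2. Report active connections while Shutdown waits for them
    ticker := time.NewTicker(1 * time.Second)
    defer ticker.Stop()

    for {
        select {
        case err := <-done:
            // 3. Server stopped (err is non-nil if ctx expired first)
            return err
        case <-ticker.C:
            log.Printf("Waiting for %d active connection(s)...", activeConnections.Load())
        }
    }
}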

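Test results and conclusions

Test environment

Platform: Minikube on macOS
Kubernetes: v1.28

1. PostStart Hook test results

Scenario: verify the relative timing of the main container and PostStart

Results:

Simultaneous start confirmed (03:13:00)
Asynchronous, parallel execution confirmed
PostStart ran for 10 seconds and the main process for 20 seconds, independently of each other

Practical tips: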

# Incorrect usage - depends on the main process
postStart:
  exec:
    command: ["curl", "http://localhost:8080/init"]  # May fail!

# Correct usage - independent initialization
postStart:
  exec:
    command:
    - sh
    - -c
    - |
      # Initialize external dependencies
      curl -X POST http://service-registry/register
      # Cache warm-up
      /app/cache-warmer.sh
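
The first snippet can fail because the hook may fire before the main process is listening on port 8080. If an initialization step really has to talk to the main process, one option is to retry it instead of calling once. Below is a minimal Go sketch of such a retry loop. The retry helper, the errNotReady value, and the attempt counts are hypothetical, and the failing function only stands in for a call to the local endpoint.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady stands in for "connection refused" from a server that is
// not listening yet.
var errNotReady = errors.New("not ready")

// retry calls fn up to attempts times, sleeping delay between failures.
// It returns nil on the first success, otherwise the last error.
func retry(attempts int, delay time.Duration, fn func() error) error {
	if attempts < 1 {
		return errors.New("attempts must be at least 1")
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

func main() {
	calls := 0
	// Simulated dependency on the main process: fails until the third call.
	err := retry(5, 100*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Printf("calls=%d err=%v\n", calls, err)
}
```

Keep the total retry time short: a hook that keeps retrying delays the container from reaching the Running state.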

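2. PreStop Hook test results

Scenario: verify the PreStop → SIGTERM order when a Pod is deleted

Results:

PreStop runs first (blocking)
SIGTERM is sent after PreStop completes
Total termination time was 30 seconds (within the grace period)

Practical tips: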

spec:
  terminationGracePeriodSeconds: 60  # Enough time for PreStop plus SIGTERM handling
  containers:
  - name: app
    lifecycle:
      preStop:
        exec:
          command:
          - sh
          - -c
          - |
            # 1. Make the health check fail immediately
            touch /tmp/terminating

            # 2. Stop incoming traffic (important!)
            sleep 5  # Wait for the load balancer to update

            # 3. Let in-flight requests finish
            while [ "$(netstat -an | grep -c ESTABLISHED)" -gt 0 ]; do
              sleep 1
            done
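
Touching /tmp/terminating only takes the Pod out of rotation if the readiness endpoint actually looks at that file. Below is a minimal Go sketch of such an endpoint, assuming the application serves its readiness probe over HTTP. The readinessStatus and readyHandler names and the /ready path are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
)

// readinessStatus returns 503 once the marker file exists and 200 otherwise.
func readinessStatus(marker string) int {
	if _, err := os.Stat(marker); err == nil {
		return http.StatusServiceUnavailable
	}
	return http.StatusOK
}

// readyHandler serves the readiness endpoint based on the marker file.
func readyHandler(marker string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(readinessStatus(marker))
	}
}

func main() {
	// Exercise the handler in-process instead of binding a port.
	// A real server would register it, e.g. mux.Handle("/ready", h).
	h := readyHandler("/tmp/terminating")
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/ready", nil))
	fmt.Println("GET /ready ->", rec.Code)
}
```

The file check costs a single stat call per probe, so it fits comfortably inside a probe period of a few seconds.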

3. Full lifecycle integration test

Scenario: Init Container → PostStart → Running → PreStop → SIGTERM

Resulting timeline:

Time	Event	Status
T+0s	Init Container starts	Pending
T+5s	Init Container completes	PodInitializing
T+6s	Main Container + PostStart start	ContainerCreating
T+10s	PostStart completes	Running
T+15s	Readiness Probe succeeds	Ready
T+60s	Pod delete command	Terminating
T+61s	PreStop starts	Terminating
T+70s	PreStop completes, SIGTERM sent	Terminating
T+75s	Graceful Shutdown completes	Terminated
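
The phase durations implied by this timeline can be worked out directly from the offsets. The Go sketch below copies the offsets from the table and computes a few intervals. The event type, the shortened event labels, and the elapsed helper are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// event is one row of the timeline: an offset from T+0s and a label.
type event struct {
	at   time.Duration
	name string
}

// timeline copies the offsets from the table above.
var timeline = []event{
	{0 * time.Second, "init start"},
	{5 * time.Second, "init done"},
	{6 * time.Second, "main + PostStart start"},
	{10 * time.Second, "PostStart done"},
	{15 * time.Second, "ready"},
	{60 * time.Second, "delete"},
	{61 * time.Second, "PreStop start"},
	{70 * time.Second, "PreStop done + SIGTERM"},
	{75 * time.Second, "terminated"},
}

// elapsed returns the time between two named events.
// ok is false if either event is missing.
func elapsed(events []event, from, to string) (d time.Duration, ok bool) {
	var start, end time.Duration
	var haveStart, haveEnd bool
	for _, e := range events {
		switch e.name {
		case from:
			start, haveStart = e.at, true
		case to:
			end, haveEnd = e.at, true
		}
	}
	if !haveStart || !haveEnd {
		return 0, false
	}
	return end - start, true
}

func main() {
	pairs := [][2]string{
		{"init start", "ready"},
		{"PreStop start", "PreStop done + SIGTERM"},
		{"PreStop done + SIGTERM", "terminated"},
		{"delete", "terminated"},
	}
	for _, p := range pairs {
		d, _ := elapsed(timeline, p[0], p[1])
		fmt.Printf("%s -> %s: %s\n", p[0], p[1], d)
	}
}
```

With the table's numbers, PreStop accounts for 9 of the 15 seconds between the delete command and termination, and the Pod needs 15 seconds from the start of the init container to become Ready.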

Production practices

1. Zero-downtime deployment checklist

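apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
spec:
  replicas: 3
  # selector and template labels are required by apps/v1;
  # the label value here is a placeholder
  selector:
    matchLabels:
      app: production-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0  # Zero downtime!
  template:
    metadata:
      labels:
        app: production-app
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: app
        # The container image is omitted in the original
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 3

        lifecycle:
          preStop:
            exec:
              command:
              - sh
              - -c
              - |
                # Block traffic (assumes /ready starts failing once this file exists)
                touch /tmp/terminating
                # Wait for the load balancer to update
                sleep 10
                # Drain connections
                /app/drain-connections.sh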

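Common mistakes and their solutions

Mistake 1: depending on the main process in PostStart

# May fail
postStart:
  exec:
    command: ["curl", "http://localhost:8080/ready"]

Solution: PostStart must be able to run independently of the main process

Mistake 2: insufficient grace period

# SIGKILL arrives before PreStop completes
terminationGracePeriodSeconds: 10
lifecycle:
  preStop:
    exec:
      command: ["sleep", "30"]  # Fails!

Solution: size the grace period to cover PreStop plus SIGTERM handling time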
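
The grace period countdown starts when termination begins, so the PreStop hook and the SIGTERM handler share the same budget. The Go sketch below shows the arithmetic. The function names are hypothetical, and the 30-second shutdown timeout in the second example is an assumed value.

```go
package main

import (
	"fmt"
	"time"
)

// remainingForSigterm returns the time left for SIGTERM handling after
// preStop has run. A negative value means preStop itself is cut short.
func remainingForSigterm(grace, preStop time.Duration) time.Duration {
	return grace - preStop
}

// fitsGracePeriod reports whether preStop plus SIGTERM handling fits
// within the grace period.
func fitsGracePeriod(grace, preStop, sigtermHandling time.Duration) bool {
	return preStop+sigtermHandling <= grace
}

func main() {
	// The failing configuration above: 10s grace period, 30s preStop.
	fmt.Println(remainingForSigterm(10*time.Second, 30*time.Second))

	// 60s grace period, 30s preStop, 30s shutdown timeout (assumed values).
	fmt.Println(fitsGracePeriod(60*time.Second, 30*time.Second, 30*time.Second))
}
```

For the failing configuration the remaining budget is -20s, which is why the hook is killed before it finishes. The second configuration fits exactly, with no slack left over.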

Mistake 3: using PreStop without a Readiness Probe

# Traffic keeps arriving while PreStop runs
lifecycle:
  preStop:
    exec:
      command: ["/cleanup.sh"]
# No readinessProbe!

Solution: define a readinessProbe and make it fail from within PreStop

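Implementation guide by real-world scenario

Scenario 1: microservice mesh environment

lifecycle:
  postStart:
    exec:
      command:
      - sh
      - -c
      - |
        # Wait for the Istio sidecar
        until curl -fsI http://localhost:15021/healthz/ready; do
          sleep 1
        done
        # Labeled "service registration" in the original; port 15000 is the Envoy admin interface
        curl -X POST http://localhost:15000/clusters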
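
The same wait-until-ready loop can live inside the application instead of a shell hook. Below is a minimal Go sketch, assuming the sidecar exposes an HTTP readiness endpoint like the one polled above. The waitForReady helper and the timeout values are hypothetical, and newFlakyServer exists only to simulate a sidecar that needs a moment to come up.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// waitForReady sends HEAD requests to url until one returns a status
// below 400 or timeout elapses (roughly what `curl -fsI` in a loop does).
func waitForReady(url string, timeout, interval time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodHead, url, nil)
		if err != nil {
			return err
		}
		resp, err := http.DefaultClient.Do(req)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode < 400 {
				return nil
			}
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("%s not ready after %s", url, timeout)
		case <-time.After(interval):
		}
	}
}

// newFlakyServer simulates a sidecar that answers 503 for the first
// `failures` requests and 200 afterwards.
func newFlakyServer(failures int32) *httptest.Server {
	var n atomic.Int32
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if n.Add(1) <= failures {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newFlakyServer(2)
	defer srv.Close()

	// In a mesh Pod the URL would be the sidecar endpoint,
	// e.g. http://localhost:15021/healthz/ready.
	err := waitForReady(srv.URL, 2*time.Second, 50*time.Millisecond)
	fmt.Println("sidecar wait result:", err)
}
```

Unlike the shell loop, this version has an upper bound, so a sidecar that never becomes ready results in an error rather than a hook that blocks indefinitely.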

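Scenario 2: stateful application

# Fragment: selector, serviceName, and the container image are omitted in the original
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
spec:
  podManagementPolicy: Parallel
  template:
    spec:
      terminationGracePeriodSeconds: 120
      containers:
      - name: db
        lifecycle:
          preStop:
            exec:
              command:
              - sh
              - -c
              - |
                # Stop replication (STOP REPLICA on MySQL 8.0.22 and later)
                mysql -e "STOP SLAVE"
                # Request a slow InnoDB shutdown (full purge and change buffer merge)
                mysql -e "SET GLOBAL innodb_fast_shutdown=0"
                # Close and reopen the log files
                mysql -e "FLUSH LOGS"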
Scenario 3: batch job

apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: processor
        lifecycle:
          preStop:
            exec:
              command:
              - sh
              - -c
              - |
                # Save a checkpoint
                /app/save-checkpoint.sh
                # Upload partial results
                /app/upload-partial.sh
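
A checkpoint is only useful if it cannot be left half-written when the Pod is killed. The contents of /app/save-checkpoint.sh are not shown in this post, so the Go sketch below is an assumption about what such a step could look like: it writes to a temporary file and renames it into place. The checkpoint struct, its field, and the file path are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// checkpoint is a hypothetical progress record for the batch job.
type checkpoint struct {
	LastOffset int64 `json:"last_offset"`
}

// saveCheckpoint writes cp to path atomically: the data goes to a temp
// file in the same directory, which is then renamed over path.
func saveCheckpoint(path string, cp checkpoint) error {
	data, err := json.Marshal(cp)
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), ".checkpoint-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// loadCheckpoint reads a checkpoint previously written by saveCheckpoint.
func loadCheckpoint(path string) (checkpoint, error) {
	var cp checkpoint
	data, err := os.ReadFile(path)
	if err != nil {
		return cp, err
	}
	err = json.Unmarshal(data, &cp)
	return cp, err
}

func main() {
	path := filepath.Join(os.TempDir(), "job-checkpoint.json")
	if err := saveCheckpoint(path, checkpoint{LastOffset: 4200}); err != nil {
		fmt.Println("save failed:", err)
		return
	}
	cp, err := loadCheckpoint(path)
	fmt.Println("restored offset:", cp.LastOffset, "err:", err)
}
```

Because the rename happens only after the data is synced, a reader sees either the previous checkpoint or the new one, even if SIGKILL arrives in the middle of the write.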

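Key takeaways

PostStart is for initialization, PreStop is for cleanup: keep the two roles clearly separated
Asynchronous vs. synchronous: PostStart runs asynchronously, PreStop runs synchronously
Allow enough grace period: account for PreStop plus SIGTERM handling time
Use health probes: they are the key to traffic control
Automate the tests: include lifecycle scenarios in CI/CD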
