My Pod Got OOMKilled And I Took It Personally

It was 3am. The pager went off. My pod had been killed—murdered, really—by the OOM killer. No warning. No goodbye. Just gone.

The Autopsy

resources:
  requests:
    memory: '128Mi'
  limits:
    memory: '128Mi' # hubris

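I had set requests equal to limits. “Best practice,” they said. “Predictable behavior,” they said. They didn’t mention that my app would casually allocate 200MB during a garbage collection spike. That is well past a 128Mi limit, and a container that goes over its memory limit gets OOM-killed, no matter how much free memory the node has.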
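
If you are running your own autopsy, the cause of death is normally recorded in the pod's status, under the container's last terminated state. Here is a rough sketch of the relevant fragment of `kubectl get pod <pod-name> -o yaml`. The container name `app` is made up for illustration, and a real status has many more fields around it.

```yaml
# Illustrative fragment only; "app" is a hypothetical container name.
status:
  containerStatuses:
    - name: app
      lastState:
        terminated:
          reason: OOMKilled   # the container exceeded its memory limit
          exitCode: 137       # 128 + 9, i.e. the process received SIGKILL
```

The `reason` field is the one to look for. Exit code 137 on its own only tells you the process got a SIGKILL, not who sent it.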

The Stages of Grief

  1. Denial: “The metrics must be wrong”
  2. Anger: “Who wrote this garbage collector”
  3. Bargaining: “What if I just set limits to 10Gi”
  4. Depression: stares at Grafana dashboard
  5. Acceptance: “Memory is a social construct”

The Fix

resources:
  requests:
    memory: '128Mi'
  limits:
    memory: '512Mi' # wisdom

Sometimes you just need room to breathe.
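
For context, here is a minimal sketch of where that block sits in a full Pod manifest. Only the `resources` values come from the fix above. The pod name, container name, and image are placeholders I made up, so swap in your own.

```yaml
# Minimal sketch; metadata.name, the container name, and the image are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: registry.example.com/my-app:1.0   # placeholder image
      resources:
        requests:
          memory: '128Mi'   # what the scheduler reserves for the container on a node
        limits:
          memory: '512Mi'   # hard ceiling; going over it gets the container OOM-killed
```

One trade-off worth knowing: with requests below limits the pod lands in the Burstable QoS class, and when a node runs short on memory, pods using more than they requested tend to be evicted earlier. The breathing room is real, but it is borrowed.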

“The cloud is just someone else’s computer, waiting to kill your processes.” — Ancient DevOps Proverb