
🍕 SLIs, SLOs, and Error Budgets Explained with Pizza Delivery

It’s Friday night.

You’re hungry.

You open your favorite pizza app and place an order.

The app promises:

“Delivery in about 30 minutes.”

And you think:

Nice.

But what happens if the pizza arrives in 31 minutes?

Do you panic?

Do you call the police?

Of course not.

Because everyone understands something important:

Perfect service doesn’t exist. Reliable service does.

And that’s exactly what SLIs, SLOs, and Error Budgets are about.


The Pizza Reality

Let’s imagine a pizza place trying to deliver perfectly every time.

Every order arrives:

  • exactly on time
  • never late
  • never wrong

Sounds amazing, right?

It would also mean:

  • refusing orders during busy hours
  • stopping new menu experiments
  • hiring far too many drivers

In other words: innovation stops.

The restaurant would be reliable…

but it would never improve.

That’s why real systems accept a tiny amount of imperfection.


Step 1: The SLI — Measuring the Experience

SLI = Service Level Indicator

This is simply a measurement of how things are going.

For pizza delivery, an SLI might be:

  • Percentage of pizzas delivered within 30 minutes
  • Percentage of correct orders
  • Percentage of warm pizzas

SLIs answer the question:

“How are we actually performing?”

It’s just a metric.

No promises yet.
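To make that concrete, here is a minimal sketch of how the first SLI could be computed. The function name `on_time_sli` and the sample delivery times are made up for illustration; a real pizza place (or service) would pull these numbers from its own order logs.

```python
def on_time_sli(delivery_minutes, threshold_minutes=30):
    """Return the fraction of deliveries that arrived within the threshold."""
    if not delivery_minutes:
        raise ValueError("need at least one delivery to compute an SLI")
    # A delivery counts as "good" if it took 30 minutes or less.
    good = sum(1 for minutes in delivery_minutes if minutes <= threshold_minutes)
    return good / len(delivery_minutes)


# Hypothetical delivery times (in minutes) for one evening.
deliveries = [22, 28, 31, 25, 30, 45, 19, 27, 29, 26]

sli = on_time_sli(deliveries)
print(f"On-time SLI: {sli:.0%}")  # 8 of 10 on time -> 80%
```

The pattern is always the same: good events divided by total events.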


Step 2: The SLO — The Promise

SLO = Service Level Objective

Now the pizza place sets a target for that measurement:

“95% of pizzas will arrive within 30 minutes.”

That’s the SLO.

Not 100%.

Because 100% reliability would require:

  • stopping the kitchen when it gets busy
  • rejecting orders
  • never trying new recipes

Instead, the restaurant sets a realistic target.

Good enough to keep customers happy.

Flexible enough to keep improving.
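In code, an SLO is little more than a target to compare the measurement against. The sketch below assumes the 95% target from above; the function name `meets_slo` and the order counts are hypothetical.

```python
SLO_TARGET = 0.95  # 95% of pizzas within 30 minutes


def meets_slo(on_time_orders, total_orders, target=SLO_TARGET):
    """Return True if the measured on-time ratio reaches the target."""
    if total_orders <= 0:
        raise ValueError("total_orders must be positive")
    return on_time_orders / total_orders >= target


# Hypothetical weeks with 1,000 orders each.
print(meets_slo(960, 1000))  # 96% on time -> True
print(meets_slo(940, 1000))  # 94% on time -> False
```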


Step 3: The Error Budget — The Freedom to Experiment

Here’s where things get interesting.

If the SLO is 95%, that means 5% of orders are allowed to be late.

That 5% is called the error budget.

It’s not a failure.

It’s intentional breathing room.

The restaurant can use that budget to:

  • test new pizza ovens
  • hire new drivers
  • try a new routing system

Sometimes experiments cause small delays.

And that’s okay.

Because the error budget exists specifically for that.
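The arithmetic is simple enough to sketch. This example assumes a hypothetical month with 1,000 orders and uses whole-number percentages to keep the math exact; the function name `error_budget` is made up for illustration.

```python
def error_budget(total_orders, late_orders, slo_percent=95):
    """Return (allowed_late, remaining) for a given SLO percentage."""
    # With a 95% SLO, 5% of orders are allowed to be late.
    allowed_late = total_orders * (100 - slo_percent) // 100
    remaining = allowed_late - late_orders
    return allowed_late, remaining


# Hypothetical month: 1,000 orders, 20 of them late so far.
allowed, remaining = error_budget(total_orders=1000, late_orders=20)
print(f"Allowed late orders: {allowed}")  # 50
print(f"Budget remaining: {remaining}")   # 30
```

A negative remaining value means the budget is overspent and the SLO has been missed for that period.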


Why This Matters for Engineering Teams

Without error budgets, teams are pushed toward perfection.

Perfection sounds nice.

But in practice it leads to:

  • fear of deployments
  • endless caution
  • burned-out engineers

Error budgets create balance.

If the service is healthy, teams can move fast.

If the service becomes unreliable, teams slow down and focus on stability.

It’s a governor on risk, not a punishment.
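One way a team might turn that into a rule is sketched below. The 25% cutoff and the policy labels are assumptions chosen for illustration, not a standard; each team picks its own thresholds.

```python
def release_policy(allowed_late, late_so_far, caution_threshold=0.25):
    """Suggest how much risk to take based on the error budget left."""
    if allowed_late <= 0:
        # No budget at all (for example, a 100% target): nothing to spend.
        return "freeze experiments"
    remaining_fraction = (allowed_late - late_so_far) / allowed_late
    if remaining_fraction <= 0:
        return "freeze experiments"  # budget spent: focus on stability
    if remaining_fraction < caution_threshold:
        return "slow down"           # budget nearly gone: be careful
    return "move fast"               # plenty of budget: keep shipping


# Hypothetical month with 50 late orders allowed.
print(release_policy(50, 20))  # 60% of budget left -> move fast
print(release_policy(50, 45))  # 10% of budget left -> slow down
print(release_policy(50, 55))  # budget overspent   -> freeze experiments
```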


The Pizza Lesson

Imagine a pizza place promising:

“100% of pizzas arrive within 30 minutes forever.”

You would know that’s unrealistic.

Good engineering teams know that too.

Instead they say:

“We’ll be reliable almost all the time.”

And when something goes wrong, they learn and improve.


What This Means in Real Life

Remember this:

  • SLI = How we measure the service
  • SLO = The reliability promise
  • Error Budget = The allowed imperfection

Together they create something powerful:

A system that is reliable for users

and sustainable for engineers.

Because perfect systems don’t exist.

But good pizza and good reliability are close enough.

