Skip to content

Runbook: Stripe Webhook Failures

Trigger: Stripe events are failing, delayed, or billing state is not syncing after checkout. Impact: Subscription state, expert review payment state, and post-checkout UX can drift from Stripe reality.

Quick triage

  1. Confirm whether the failure is at webhook delivery time or after a successful webhook.
  2. Check backend health:
    bash
    curl -fsS http://localhost:7000/actuator/health
  3. Check whether the issue is isolated to one checkout or affects all recent events.
  4. Inspect backend logs around POST /api/v1/billing/stripe/webhook.

Common failure classes

400 Missing Stripe signature header

Check:

  • whether Stripe is calling the correct endpoint
  • whether another proxy or test harness is stripping Stripe-Signature
  • whether the request is actually coming from Stripe and not a manual replay without headers

Signature verification failure

Check:

  • the configured Stripe webhook secret
  • whether test-mode events are being sent to a live-mode secret or the reverse
  • whether the environment was rotated without updating the running backend

Duplicate or replayed events

Duplicates are expected occasionally. The backend should no-op them based on Stripe event ID persistence.

Check:

  • whether the duplicate was harmlessly ignored
  • whether the original event was ever processed successfully
  • whether event persistence is failing and causing repeated work

Checkout finished but billing state did not update

Check:

  • whether checkout.session.completed or checkout.session.async_payment_succeeded arrived
  • whether the user also reached the client-side checkout confirmation flow
  • whether the stuck object is a subscription, payment record, or expert review order

Useful related routes:

  • POST /api/v1/billing/checkout/confirm
  • GET /api/v1/billing/expert-review/orders/by-checkout-session?sessionId=...

Checkout expired unexpectedly

Check:

  • whether Stripe sent checkout.session.expired
  • whether the checkout URL was abandoned for too long
  • whether pending expert review orders were correctly marked canceled

Event coverage to verify

The current backend explicitly handles:

  • checkout.session.completed
  • checkout.session.async_payment_succeeded
  • checkout.session.expired
  • customer.subscription.created
  • customer.subscription.updated
  • customer.subscription.deleted

If the failing event type is outside this set, treat it as a coverage gap rather than an outage in an already-supported handler.

Escalation

If paid checkout state is drifting for more than 5 minutes:

  1. Page the backend owner
  2. Capture affected Stripe session IDs and internal order or payment IDs
  3. Record whether the failure is delivery, signature validation, dedupe, or handler coverage

Post-action

  1. Link the incident to Billing and Stripe.
  2. Update docs if a newly observed event type or failure mode was missing from the reference.