Stripe Webhook Handling for API Billing: Payment Failures, Renewals, and the Race Conditions That Will Break Your Key State
The payment integration isn't done when you ship the checkout flow. The hard part is webhooks: asynchronous events from Stripe that tell you when subscriptions renew, payments fail, and customers cancel. If you handle these incorrectly, you'll have users who paid but can't access the API, or users who stopped paying but still have access.
This is the state machine problem at the center of subscription billing.
The events that matter for API billing
Stripe fires dozens of event types. For a metered API billing model, you need to handle five:
customer.subscription.created → new subscriber, activate key tier
customer.subscription.updated → plan change, update rate limits
customer.subscription.deleted → cancellation, downgrade to free tier
invoice.payment_succeeded → recurring renewal, extend access
invoice.payment_failed → billing failure, implement grace period
Everything else (payment_intent, charge, coupon events) is noise for this use case. Be explicit about what you handle; silently drop everything else.
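One way to make the handled set explicit is a dispatch table keyed on event type. The sketch below is only an illustration: the handler names, the `calls` log, and the boolean return value of `dispatch_event` are assumptions made for this example, not part of Stripe's library.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

Handler = Callable[[dict], Awaitable[None]]

# Records which (hypothetical) handler ran, so the sketch is runnable on its own.
calls: List[Tuple[str, str]] = []


def make_handler(name: str) -> Handler:
    """Build a stand-in handler that only logs its invocation."""
    async def handler(event: dict) -> None:
        calls.append((name, event["id"]))
    return handler


# The five event types this article handles; everything else is ignored.
HANDLERS: Dict[str, Handler] = {
    "customer.subscription.created": make_handler("activate_key_tier"),
    "customer.subscription.updated": make_handler("update_rate_limits"),
    "customer.subscription.deleted": make_handler("downgrade_to_free"),
    "invoice.payment_succeeded": make_handler("extend_access"),
    "invoice.payment_failed": make_handler("start_grace_period"),
}


async def dispatch_event(event: dict) -> bool:
    """Run the handler for a known event type; return False for ignored types."""
    handler = HANDLERS.get(event["type"])
    if handler is None:
        return False  # Not one of the five: drop it without raising
    await handler(event)
    return True
```

Ignored events should still get a 2xx response from the endpoint; otherwise Stripe keeps retrying events you never intended to handle.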
The idempotency requirement
Stripe delivers webhooks at least once — not exactly once. Network failures, Stripe's retry logic, and your own 5xx responses mean the same event may arrive 2-3 times. Every handler must be idempotent.
The standard pattern: atomically claim the Stripe event ID before processing, and skip any event whose ID has already been claimed:
async def handle_webhook(payload: bytes, signature: str) -> None:
    event = stripe.Webhook.construct_event(payload, signature, WEBHOOK_SECRET)
    key = f"webhook:processed:{event['id']}"
    # Atomically claim the event: SET NX succeeds only for the first delivery
    # (TTL: 24h to prevent infinite storage)
    if not await redis.set(key, "1", nx=True, ex=86400):
        return  # Idempotent: already handled, or another worker is handling it
    try:
        await dispatch_event(event)
    except Exception:
        # Release the claim so a later Stripe retry can process the event
        await redis.delete(key)
        raise
This is not optional. Without idempotency, any event you processed but failed to acknowledge (a timeout, a crash after the write, a 5xx during a brief outage) gets processed a second time when Stripe retries it.
The race condition between webhook delivery and your database
Stripe fires customer.subscription.created within seconds of checkout completion. Your user is redirected back to your success page at roughly the same time. The race: your success-page handler and your webhook handler both try to activate the same subscription, in no guaranteed order.
The pattern that prevents this: use upsert logic in your subscription table, keyed on the Stripe subscription ID:
INSERT INTO subscriptions (stripe_subscription_id, customer_id, status, tier, key_hash)
VALUES ($1, $2, $3, $4, $5)
ON CONFLICT (stripe_subscription_id)
DO UPDATE SET status = EXCLUDED.status, tier = EXCLUDED.tier;
Whichever handler wins, the result is the same. No duplicate rows, no double-activation, no race.
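To see the convergence concretely, here is a small sketch that runs the same upsert twice against an in-memory SQLite database, which accepts the same ON CONFLICT ... DO UPDATE form (SQLite 3.24 or later). The `activate` helper, the table definition, and the sample IDs are assumptions for illustration; a production schema would differ.

```python
import sqlite3

# Same statement as above, with SQLite's "?" placeholders instead of $1..$5.
UPSERT = """
INSERT INTO subscriptions (stripe_subscription_id, customer_id, status, tier, key_hash)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (stripe_subscription_id)
DO UPDATE SET status = excluded.status, tier = excluded.tier
"""


def activate(conn: sqlite3.Connection, sub_id: str, customer_id: str,
             tier: str, key_hash: str) -> None:
    """Hypothetical helper called by both the success page and the webhook."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (sub_id, customer_id, "active", tier, key_hash))


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE subscriptions (
        stripe_subscription_id TEXT PRIMARY KEY,
        customer_id TEXT NOT NULL,
        status TEXT NOT NULL,
        tier TEXT NOT NULL,
        key_hash TEXT NOT NULL
    )
""")

# Both code paths fire for the same checkout, in either order.
activate(conn, "sub_123", "cus_abc", "pro", "hash_from_success_page")
activate(conn, "sub_123", "cus_abc", "pro", "hash_from_webhook")

rows = conn.execute(
    "SELECT stripe_subscription_id, status, tier, key_hash FROM subscriptions"
).fetchall()
```

After both calls there is one row, and its key_hash is the one written by whichever caller arrived first, because the DO UPDATE clause touches only status and tier.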
Payment failure handling: grace periods are not optional
When invoice.payment_failed fires, your instinct might be to immediately revoke API access. Don't. Stripe's retry logic will attempt payment 3-4 times over a configurable window (typically 7-14 days). If you cut access on the first failure, you'll enrage customers who had a temporary card issue.
The correct model:
async def on_payment_failed(subscription_id: str, attempt_count: int) -> None:
    if attempt_count == 1:
        # First failure: warning email, no access change
        await send_billing_warning_email(subscription_id)
    elif attempt_count == 2:
        # Second failure: rate-limit to free tier level
        await downgrade_to_grace_tier(subscription_id)
    elif attempt_count >= 3:
        # Final failure: full downgrade
        await downgrade_to_free(subscription_id)
        await send_subscription_ended_email(subscription_id)
The grace tier is a free-tier rate limit applied to a paid-tier key — the user can still make API calls, but at reduced throughput. This preserves the relationship while the billing issue resolves, and it's what prevents the support ticket that says "you cancelled my subscription without warning."
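The escalation steps key off the attempt number, which can be read from the invoice in the event payload: the Invoice object carries an `attempt_count` field. The following is a minimal sketch of that mapping as a pure function; the `FailureAction` enum and the function name are hypothetical, and the thresholds simply mirror the handler above.

```python
from enum import Enum


class FailureAction(Enum):
    WARN = "warn"              # email only, no access change
    GRACE_TIER = "grace_tier"  # keep the key, apply free-tier rate limits
    DOWNGRADE = "downgrade"    # full downgrade to the free tier


def action_for_failed_invoice(event: dict) -> FailureAction:
    """Map an invoice.payment_failed event to an escalation step."""
    invoice = event["data"]["object"]
    # attempt_count: number of payment attempts made for this invoice.
    attempt_count = invoice.get("attempt_count", 1)
    if attempt_count <= 1:
        return FailureAction.WARN
    if attempt_count == 2:
        return FailureAction.GRACE_TIER
    return FailureAction.DOWNGRADE
```

Keeping the decision separate from the side effects (emails, tier changes) makes the thresholds easy to unit-test and to adjust if your retry schedule uses more attempts.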
Subscription renewal and the token counter reset
For metered subscriptions, Stripe resets the usage meter at each billing cycle boundary. Your internal state needs to match. When invoice.payment_succeeded fires for a renewal:
- Verify the subscription is still active
- Extend the valid_until timestamp on the API key
- Reset any daily/monthly call counters to zero
- Update the subscription item ID (this changes on plan upgrades — see below)
The subscription item ID is the crucial one. When a customer upgrades from PRO to ULTRA, the subscription can end up with a new subscription item and a new ID. If you keep reporting usage to the old subscription item ID, Stripe won't credit the right plan. Always take the subscription item ID from Stripe (the webhook payload or a fresh API fetch), not from your database cache.
import asyncio
from datetime import datetime, timezone

async def on_payment_succeeded(event: dict) -> None:
    invoice = event["data"]["object"]
    subscription_id = invoice["subscription"]
    # Fetch fresh subscription from Stripe (not from cache).
    # The call is blocking, so run it in a worker thread to keep the event loop free.
    subscription = await asyncio.to_thread(stripe.Subscription.retrieve, subscription_id)
    # Get the current subscription item ID (may have changed on plan upgrade)
    subscription_item_id = subscription["items"]["data"][0]["id"]
    # Update our records with the current item ID
    await update_subscription(subscription_id, {
        "subscription_item_id": subscription_item_id,
        # Stripe timestamps are Unix seconds; store them as timezone-aware UTC
        "valid_until": datetime.fromtimestamp(subscription["current_period_end"], tz=timezone.utc),
        "status": "active",
    })
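invoice.payment_succeeded also fires for the first invoice of a new subscription, so the counter reset needs a way to tell renewals apart. One option is the invoice's `billing_reason` field, which is `subscription_cycle` for a recurring renewal and `subscription_create` for the initial invoice. A rough sketch, where `is_renewal`, `reset_counters`, and the counter names are hypothetical:

```python
from typing import Dict


def is_renewal(invoice: dict) -> bool:
    """True when a paid invoice represents a recurring billing-cycle renewal."""
    return invoice.get("billing_reason") == "subscription_cycle"


def reset_counters(counters: Dict[str, int]) -> Dict[str, int]:
    """Return a copy of the usage counters with every value set to zero."""
    return {name: 0 for name in counters}


def counters_after_payment(invoice: dict, counters: Dict[str, int]) -> Dict[str, int]:
    """Reset counters on renewal; leave them untouched for any other invoice."""
    return reset_counters(counters) if is_renewal(invoice) else dict(counters)
```

Whether plan-change invoices (`subscription_update`) should also reset counters is a product decision this sketch deliberately leaves alone.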
The cancellation path
customer.subscription.deleted fires on cancellation. Two modes:
- Immediate cancellation: access ends at the moment of cancellation
- Cancel at period end: Stripe fires this event at the actual end of the paid period
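Cancel-at-period-end is the right choice for most API billing models. In that mode Stripe sends customer.subscription.updated with cancel_at_period_end set to true when the customer cancels, and customer.subscription.deleted only when the paid period actually ends. Either way, by the time the deleted event arrives the subscription is over, so the handler can revoke without inspecting cancel_at_period_end: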
async def on_subscription_deleted(subscription: dict) -> None:
    # For cancel-at-period-end, this fires when the period actually ends
    # The subscription is fully over: downgrade immediately
    await downgrade_key_to_free(subscription["id"])
    await send_cancellation_confirmation_email(subscription["id"])
Don't try to preserve some kind of "partial period" access. When Stripe fires subscription.deleted, the period is over.
Webhook endpoint security
Never process a webhook without verifying the Stripe signature. Each webhook request carries a signature header (Stripe-Signature) that you verify against your endpoint's signing secret, using the raw, unparsed request body. Skipping this means anyone can POST to your webhook URL and manipulate billing state.
try:
    event = stripe.Webhook.construct_event(
        payload,
        request.headers.get("Stripe-Signature"),
        STRIPE_WEBHOOK_SECRET
    )
except stripe.error.SignatureVerificationError:
    raise HTTPException(status_code=400, detail="Invalid webhook signature")
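For intuition about what that verification does, the scheme can be reproduced with the standard library: the header carries a timestamp (`t=`) and one or more `v1=` signatures, each an HMAC-SHA256 of `"{timestamp}.{raw body}"` keyed with the endpoint secret. The sketch below is for understanding and testing only; `sign`, `verify_signature`, and the five-minute tolerance are choices made for this example, and production code should keep using the library call above.

```python
import hashlib
import hmac
import time
from typing import List, Optional


def sign(payload: bytes, timestamp: int, secret: str) -> str:
    """Compute the v1 signature for a payload at a given timestamp."""
    signed_payload = f"{timestamp}.".encode() + payload
    return hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()


def verify_signature(payload: bytes, header: str, secret: str,
                     tolerance: int = 300, now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature style header against the raw request body."""
    timestamp: Optional[str] = None
    candidates: List[str] = []
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False  # malformed header
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False  # outside the tolerance window: possible replay
    expected = sign(payload, int(timestamp), secret).encode()
    # Constant-time comparison against every v1 signature in the header
    return any(hmac.compare_digest(expected, c.encode()) for c in candidates)
```

The timestamp is part of the signed data, which is why a captured request cannot simply be replayed later with a fresh `t=` value.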
Return 200 quickly. Stripe retries any delivery that doesn't get a 2xx response, and a handler that takes too long to respond counts as a failed delivery too (plan around a timeout on the order of 30 seconds). Put long-running work (email sending, propagation to downstream systems) in a background task:
@router.post("/webhooks/stripe")
async def stripe_webhook(request: Request, background_tasks: BackgroundTasks):
    payload = await request.body()
    # Verify + basic parsing (fast)
    event = verify_and_parse(payload, request.headers)
    # Delegate heavy work to background
    background_tasks.add_task(process_stripe_event, event)
    return {"received": True}
The state machine you're actually building
At its core, Stripe webhook handling is a state machine. API key access tier is a function of subscription state, and subscription state changes via Stripe events. Drawing this out explicitly helps:
[no subscription] → subscription.created → [active: paid tier]
[active: paid tier] → invoice.payment_failed → [grace: free tier limits]
[grace] → invoice.payment_succeeded → [active: paid tier]
[grace] → subscription.deleted → [inactive: free tier]
[active] → subscription.updated → [active: new tier]
[active] → subscription.deleted → [inactive: free tier]
Every arrow is a webhook. Every state is a row in your subscriptions table. The bugs happen when the state machine is implicit — when the transitions are scattered across handlers without a clear model of what state means. Make it explicit, test the transitions with Stripe's test mode event triggers (stripe trigger invoice.payment_failed), and the billing layer will be boring in exactly the way you want it to be.
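One way to make the model explicit in code is a transition table that mirrors the arrows above. This is a minimal sketch: the `State` enum, the `TRANSITIONS` dict, and the choice to treat unlisted (state, event) pairs as no-ops are assumptions for illustration, not a prescribed design.

```python
from enum import Enum
from typing import Dict, Tuple


class State(Enum):
    NONE = "no subscription"
    ACTIVE = "active: paid tier"
    GRACE = "grace: free tier limits"
    INACTIVE = "inactive: free tier"


# One entry per arrow in the diagram above.
TRANSITIONS: Dict[Tuple[State, str], State] = {
    (State.NONE, "customer.subscription.created"): State.ACTIVE,
    (State.ACTIVE, "invoice.payment_failed"): State.GRACE,
    (State.GRACE, "invoice.payment_succeeded"): State.ACTIVE,
    (State.GRACE, "customer.subscription.deleted"): State.INACTIVE,
    (State.ACTIVE, "customer.subscription.updated"): State.ACTIVE,
    (State.ACTIVE, "customer.subscription.deleted"): State.INACTIVE,
}


def next_state(state: State, event_type: str) -> State:
    """Return the new state, or the current state if the event does not apply."""
    return TRANSITIONS.get((state, event_type), state)
```

With the table in one place, each transition becomes a one-line unit test, and a duplicate delivery of the same event lands on the same state it produced the first time.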
Next: free-to-paid conversion architecture — specifically, why the conversion surface for B2A is different from B2C, and what to put on the upgrade path when it's an agent hitting your 429, not a human.