DEV Community

sravan27
I keep finding the same Stripe webhook bugs in SaaS launches

Most early SaaS billing bugs are not in Stripe Checkout itself. They are in the glue around it:

  • trusting the success redirect instead of the signed webhook
  • parsing JSON before signature verification
  • missing idempotency for retried events
  • echoing signature-verifier error details back to unauthenticated callers of the webhook route
  • updating subscription state without a replay/audit trail
  • letting "Pro" access drift from the payment source of truth
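
To make the second and fourth items concrete, here is a minimal sketch of verifying the signature over the raw request bytes before any JSON parsing, and answering every failure with the same generic message. It follows Stripe's documented scheme (a `Stripe-Signature` header with `t=` and `v1=` parts, HMAC-SHA256 over `timestamp.raw_body`). The function names, the 300-second tolerance, and the response strings are assumptions for illustration; in a real handler the official library's `stripe.Webhook.construct_event` does this work.

```python
import hashlib
import hmac
import json
import time
from typing import Optional, Tuple

TOLERANCE_SECONDS = 300  # assumption: reject timestamps more than five minutes off


def verify_stripe_signature(raw_body: bytes, header: str, secret: str,
                            now: Optional[float] = None) -> bool:
    """Return True only if the header carries a valid v1 signature for raw_body."""
    timestamp = None
    candidates = []
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not candidates:
        return False
    try:
        signed_at = int(timestamp)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if abs(current - signed_at) > TOLERANCE_SECONDS:
        return False  # too old or too far in the future: treat as a replay
    # The signature covers the exact bytes received, not a re-serialized object.
    signed_payload = timestamp.encode() + b"." + raw_body
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)


def handle_webhook(raw_body: bytes, header: str, secret: str,
                   now: Optional[float] = None) -> Tuple[int, str]:
    """Hypothetical route body: verify first, parse second, never echo details."""
    if not verify_stripe_signature(raw_body, header, secret, now):
        return 400, "invalid request"  # same message for every failure mode
    try:
        event = json.loads(raw_body)
    except ValueError:
        return 400, "invalid request"
    if not isinstance(event, dict):
        return 400, "invalid request"
    return 200, str(event.get("type", ""))
```

The important property is ordering: if a framework has already parsed and re-serialized the body, the bytes no longer match what was signed, and verification fails for legitimate events.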

Over the last few days I have been shipping small public fixes around exactly this class of problem.

Recent examples:

The pattern is boring in the best possible way: payment systems should be boring.

The 48-hour version

For a small SaaS that is about to turn on paid plans, I can take a bounded payment assurance sprint:

  • inspect Checkout / webhook / subscription state flow
  • verify signed webhook handling and raw-body behavior
  • add idempotency around Stripe retry events
  • ensure subscription status and entitlement state have one source of truth
  • add a small regression test or smoke script
  • leave a deploy/runbook note so the next failure is diagnosable

Fixed scopes I am taking:

  • $2,000 / 48 hours: one payment path hardened and documented
  • $5,000 / 5 days: full launch pass across Checkout, webhook, subscription mirror, Pro gate, pricing page handoff, and smoke test

I am not dropping a checkout link into a blog post. If you have a live Stripe/Supabase/Cloudflare/Vercel billing path and want me to take the first sprint, reply with:

  1. repo or relevant code paths
  2. what payment state should unlock
  3. current deploy target
  4. whether test-mode Stripe keys/webhook secret are ready

I will send a fixed scope and payment link only if it is a fit.

GitHub: https://github.com/sravan27

Top comments (2)

Mihir kanzariya

Good list. The one I see most often that does not get talked about enough: mishandling customer.subscription.updated events during plan changes.

When a customer upgrades or downgrades mid-cycle, Stripe fires customer.subscription.updated with the new plan details. But the invoice for the prorated amount comes separately as invoice.paid. If your webhook handler updates entitlements on subscription.updated but the prorated payment fails (card declined, insufficient funds), the user gets the upgraded plan without actually paying for it. The fix is to gate entitlement changes on successful payment confirmation, not on subscription status changes alone.
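
One way to sketch that gating, assuming the plan change produces an invoice that is paid or fails promptly: record the requested plan as pending on `customer.subscription.updated`, and only promote it when `invoice.paid` arrives. The `apply_event` function and the in-memory `state` dict are hypothetical stand-ins for a real handler and database, and the event dicts are trimmed to the fields used here.

```python
def apply_event(state, event):
    """Update one customer's entitlement record from a single webhook event."""
    obj = event["data"]["object"]
    customer = obj["customer"]
    record = state.setdefault(customer, {"plan": None, "pending_plan": None})
    kind = event["type"]
    if kind == "customer.subscription.updated":
        # Remember what was requested, but do not grant it yet.
        record["pending_plan"] = obj["items"]["data"][0]["price"]["id"]
    elif kind == "invoice.paid":
        # Payment confirmed: promote the pending plan, if there is one.
        if record["pending_plan"] is not None:
            record["plan"] = record["pending_plan"]
            record["pending_plan"] = None
    elif kind == "invoice.payment_failed":
        # Payment failed: drop the pending change and keep the current plan.
        record["pending_plan"] = None
    return record
```

Stripe does not guarantee delivery order, so a production version would also need to cope with `invoice.paid` arriving before the subscription update, for example by re-fetching the subscription from the API instead of trusting the sequence of events.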

Another one specific to anyone doing attribution or referral tracking through Stripe: client_reference_id in Checkout Sessions is your friend, but it disappears if you only listen to invoice.paid without linking it back to the original checkout.session.completed event. If you need to know WHO referred a customer (for affiliate commissions, partner tracking, etc.), you have to capture that reference at checkout completion and store it, because later subscription events will not carry it.
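
A small sketch of that capture step, under the assumption that the Stripe customer ID is a good enough join key between the checkout session and later invoices. The `referrals` dict and both function names are hypothetical; a real system would persist this in its own database.

```python
def record_checkout(referrals, event):
    """Store the referrer once, at checkout completion, keyed by customer ID."""
    if event["type"] != "checkout.session.completed":
        return
    session = event["data"]["object"]
    reference = session.get("client_reference_id")
    customer = session.get("customer")
    if reference and customer:
        # setdefault keeps the first attribution if the customer checks out again.
        referrals.setdefault(customer, reference)


def referrer_for_invoice(referrals, event):
    """Look up who referred the customer on a later invoice event, or None."""
    invoice = event["data"]["object"]
    return referrals.get(invoice.get("customer"))
```

Whether the first or the latest referrer should win is a business decision; this sketch picks first-touch only to keep the example short.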

The idempotency point is huge too. Stripe explicitly says they may send the same event more than once. I have seen production systems double-credit accounts or double-count referred customers because the handler did not check if it already processed a given event ID.
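
A minimal sketch of that check, using SQLite from the Python standard library: the event ID goes into a table with a primary key, in the same transaction as the side effect, so a redelivered event fails the insert and changes nothing. The table names, `open_store`, and `process_once` are made up for this example, and "credit the customer" stands in for whatever the handler actually does.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create the two hypothetical tables used by this sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events ("
        "event_id TEXT PRIMARY KEY, event_type TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS credits ("
        "customer TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def process_once(conn, event, credit_amount):
    """Apply the side effect at most once per event ID. Returns True if applied."""
    customer = event["data"]["object"]["customer"]
    try:
        with conn:  # one transaction: the marker and the credit commit together
            conn.execute(
                "INSERT INTO processed_events (event_id, event_type) VALUES (?, ?)",
                (event["id"], event["type"]),
            )
            updated = conn.execute(
                "UPDATE credits SET balance = balance + ? WHERE customer = ?",
                (credit_amount, customer),
            )
            if updated.rowcount == 0:
                conn.execute(
                    "INSERT INTO credits (customer, balance) VALUES (?, ?)",
                    (customer, credit_amount),
                )
    except sqlite3.IntegrityError:
        return False  # event ID already recorded: this is a redelivery
    return True
```

Keeping the marker and the side effect in one transaction matters: if the credit fails, the marker rolls back too, so a later retry from Stripe can still succeed.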

Mihir kanzariya

The "trusting the success redirect" one is the most dangerous because it feels correct. The Checkout session has a success_url, the user lands on /thank-you, the app grants access. Then someone bookmarks that URL or hits it directly and gets Pro for free. Seen it happen in production.

The fix people reach for first -- checking the Checkout session status in the success page handler -- is better but still fragile. If your webhook handler and your success-page handler both write subscription state, you now have two writers racing. The webhook might arrive before the redirect, or 30 seconds after, or (if your endpoint was down) not until the next retry window. If the success-page handler already wrote "active" and the webhook handler also writes "active," fine. But if either one also sets metadata, plan details, or trial end dates, you get inconsistent state depending on which one won last.

One pattern that's held up well: the success page handler does NOT write state. It polls or waits for the webhook-written state to appear (with a short timeout and a "we're activating your account" spinner). The webhook is the single writer. If the webhook never arrives within ~30 seconds, you show a "your payment is confirmed, we're finishing setup" message and let the webhook handler do the actual provisioning whenever it lands. One source of truth, no race.
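
The read-only wait can be sketched as a small polling loop. `wait_for_activation` and its parameters are hypothetical; `read_status` stands for whatever reads the webhook-written subscription row, and the `sleep` and `clock` arguments exist only so the loop can be tested without real delays.

```python
import time


def wait_for_activation(read_status, timeout_seconds=30.0, interval_seconds=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll webhook-written state; never write it. Returns 'active' or 'pending'."""
    deadline = clock() + timeout_seconds
    while True:
        if read_status() == "active":
            return "active"
        if clock() >= deadline:
            # Give up waiting, not provisioning: the webhook still does the work.
            return "pending"
        sleep(interval_seconds)
```

On "pending" the page shows the "we're finishing setup" message. In most web apps the loop would live in the browser, polling a status endpoint, so that no server request is held open for 30 seconds.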

The idempotency point is also underrated. Stripe retries with the same event ID, but most handlers don't check whether they already processed that ID. On a slow endpoint, you can get the same invoice.paid event three times and create three billing records.