@gsingal
Created May 1, 2026 15:24
PostHog experiment v5 launch plan — execute today (for Mahmoud + Ahmed)

PostHog Experiment v5 — Execute Today

Goal: Get the redesigned pricing-abc-pro experiment (v5) live today. The current v4 dashboard shows wrong rates for every metric — abandonment + completion > 100% for some variants — and we cannot make a Day-14 ship decision on it. v5 fixes the underlying event instrumentation, redesigns the metric set, and resets the experiment.

Owners today:

  • Mahmoud — review + merge two FE PRs ([A2] + [A3])
  • Ahmed — review + merge one BE PR ([A1]), then deploy + verify + run launch
  • Gaurav — available for questions; will not act on these PRs

Why we have to do this

v4 has three independent bugs, each of which corrupts the data:

  1. checkout_abandoned fires on every page navigation via a beforeunload handler. A user who reloads the payment page three times then completes the purchase is counted as 3 abandonments + 1 completion. Result: abandonment + completion > 100% on some variants.
  2. begin_checkout fires 1.7× per user because of mid-checkout reloads, plan switches, and Stripe retries. The headline funnel is robust to this, but Tier-2 plan-mix ratios in HogQL count events directly — they're wrong.
  3. add_to_cart is functionally redundant with begin_checkout (clicking a plan card immediately navigates to the payment page). Audit data: 62 add_to_cart users == 62 begin_checkout users (with 4 begin_checkout-only). The metric set deliberately drops it.
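As a rough illustration of bug 1, the sketch below shows how a per-user abandonment rate and a per-user completion rate can add up to more than 100% once a reloading buyer emits both events. The event log, the user ids, the `purchase` event name, and the helper names are all made-up assumptions for illustration, not audit data or the dashboard's actual metric definitions.

```javascript
// Hypothetical event log: one entry per fired event (user ids are made up).
const events = [
  // u1 reloads the payment page three times, then pays.
  { user: "u1", event: "begin_checkout" },
  { user: "u1", event: "checkout_abandoned" },
  { user: "u1", event: "checkout_abandoned" },
  { user: "u1", event: "checkout_abandoned" },
  { user: "u1", event: "purchase" }, // "purchase" is an assumed event name
  // u2 pays without reloading.
  { user: "u2", event: "begin_checkout" },
  { user: "u2", event: "purchase" },
  // u3 genuinely abandons.
  { user: "u3", event: "begin_checkout" },
  { user: "u3", event: "checkout_abandoned" },
];

// Number of distinct users who fired a given event at least once.
function distinctUsers(log, eventName) {
  return new Set(log.filter((e) => e.event === eventName).map((e) => e.user)).size;
}

// Both rates share the same denominator: users who began checkout.
function rates(log) {
  const started = distinctUsers(log, "begin_checkout");
  return {
    abandonment: distinctUsers(log, "checkout_abandoned") / started,
    completion: distinctUsers(log, "purchase") / started,
  };
}

const r = rates(events);
console.log(r.abandonment + r.completion > 1); // true: u1 is counted on both sides
```

With a server-derived event that only fires when no purchase follows, u1 would no longer appear in the abandonment numerator.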

The launch script is configured all-or-nothing in a single REST PATCH. If we flip the experiment definition to v5 while production still has the old event semantics, we silently start collecting data with the wrong shape, and that bias stays invisible until the Day-7 audit catches it. That would force a v6 reset (reset #4 in 6 weeks). So we have to land all three code fixes, deploy, and verify before running the launch.


The four-stage gate sequence

[A] Code merged to master  →  [B] Code deployed to production  →  [C] v5 launch script runs  →  [D] v5 live, Day-7/Day-14 audits

Each arrow is a hard gate. Skipping or reordering any of them produces broken data.


[A] Merge to master — 3 PRs

All three PRs target master, all are labeled ready-for-review, and none has outstanding Codex findings on HEAD.

[A1] PR #4461 — server-derived checkout_abandoned — Ahmed

  • URL: https://github.com/rotatingroom/rr/pull/4461
  • Scope: 9 files / +860/-49
  • Base: feature/posthog-v5 (single integration branch for the v5 batch)
  • What it does:
    • Removes the beforeunload handler from Payment.vue.
    • Adds app/Jobs/EvaluateCheckoutAbandonment.php — a Laravel job that fires 24h after begin_checkout. Checks if the user has a Subscription created/updated in window. If not, fires checkout_abandoned server-side via posthog-php with manually-set $feature/<flag> enrichment.
    • Adds app/Http/Controllers/AnalyticsController.php with POST /api/internal/analytics/schedule-abandonment-check. Auth-gated, listing-ownership guarded, server-derives begin_checkout_at from now() (does not trust the client timestamp).
    • 5 rounds of Codex review addressed: detect swap-and-invoice upgrades; bound lookup to 24h window; preserve null variant end-to-end; verify listing ownership; fail fast when queue connection is sync.
    • Net effect: kills the beforeunload misfire that was producing abandonment + completion > 100%.
  • Review focus: the Job's userPurchasedInWindow() Subscription query; the Controller's listing-ownership guard; the sync-queue 503 fail-fast.
  • Why this is gating: without it, the v5 dashboard's % Checkout Abandonment metric remains uninterpretable.
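The decision the job makes can be sketched in a few lines. The real implementation is a Laravel job in PHP; this JavaScript version only illustrates the logic described above. The function names, the subscription object shape, the `listing_id` property, and the choice to omit the flag property when the variant is null are all assumptions. Only the 24h window, the `checkout_abandoned` event name, and the `source = 'server-derived'` property come from this plan.

```javascript
const WINDOW_MS = 24 * 60 * 60 * 1000; // 24h lookup window described in the PR

// True when no subscription was created or updated inside
// [beginCheckoutAt, beginCheckoutAt + 24h]. Subscription shape is hypothetical.
function isAbandoned(beginCheckoutAt, subscriptions) {
  const start = beginCheckoutAt.getTime();
  const end = start + WINDOW_MS;
  const purchasedInWindow = subscriptions.some((s) =>
    [s.createdAt, s.updatedAt].some((t) => {
      const ms = t.getTime();
      return ms >= start && ms <= end;
    })
  );
  return !purchasedInWindow;
}

// Builds an illustrative server-side event payload with manual flag enrichment.
function buildAbandonedEvent(userId, listingId, flagKey, variant) {
  const properties = { source: "server-derived", listing_id: listingId };
  // Assumption: a null variant is never coerced to a default; the property is omitted.
  if (variant !== null) properties[`$feature/${flagKey}`] = variant;
  return { distinctId: String(userId), event: "checkout_abandoned", properties };
}

const began = new Date("2026-05-01T10:00:00Z");
console.log(isAbandoned(began, [])); // true: no subscription at all
```

Keeping the window check as a pure function makes the boundary cases (a subscription updated just inside or just outside the 24h window) easy to unit-test separately from the queue and PostHog plumbing.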

[A2] PR #4457 — drop add_to_cart PostHog capture — Mahmoud

  • URL: https://github.com/rotatingroom/rr/pull/4457
  • Scope: 4 files / +10/-12 (tiny)
  • Base: master (rebased from feature/posthog-v5; previous polluted history was force-pushed away)
  • What it does:
    • Deletes posthog.capture('add_to_cart', ...) calls from Plans.vue and PlansVariantC.vue (one line each).
    • Removes the plan_selected → add_to_cart PostHog alias from useAnalytics.js.
    • Keeps $gtag.event('add_to_cart', ...) — GA4's funnel still uses this natively.
    • Doc note in ANALYTICS_EVENT_STANDARDS.md.
  • Review focus: confirm GA4 emission is preserved; confirm no other code reads the old PostHog add_to_cart event.
  • Why this is gating: the v5 metric spec deliberately excludes add_to_cart. If production still emits it, the dashboard will show events the metric definition doesn't expect.

[A3] PR #4460 — sessionStorage dedup of begin_checkout — Mahmoud

  • URL: https://github.com/rotatingroom/rr/pull/4460
  • Scope: 3 files / +437/-23 (most of +437 is a Node-based test)
  • Base: master (rebased from feature/posthog-v5; previous polluted history was force-pushed away)
  • What it does:
    • Adds resources/js/composables/useCheckoutEnter.js — sessionStorage guard keyed by rr_checkout_enter_v1:<user|anon>:<listing>. Storage operations are wrapped narrowly so a misbehaving tracker never breaks payment.
    • Wires the composable into Payment.vue, replacing the direct analytics.track('checkout_started', ...) call inside whenResolved().then().
    • Re-firing on a fresh session is intentional (a user who returns next day starts a new attempt).
    • Adds a Node-based unit test for the dedup logic.
  • Review focus: the sessionStorage guard wrapping; the per-(user,listing) key shape; behavior when sessionStorage is unavailable (private mode).
  • Why this is gating: without dedup, Tier-2 plan-mix HogQL ratios count events directly, not users — they'd be biased by the 1.7× event-per-user inflation.
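For reviewers, here is a minimal sketch of what a guard of this kind could look like. It is not the code in the PR: the function names are hypothetical, the storage object is passed in so the logic can be tested in Node, and firing the event when storage is unavailable (fail open) is an assumption, since the behavior in private mode is exactly what the review should confirm. Only the key shape comes from the PR description.

```javascript
const KEY_PREFIX = "rr_checkout_enter_v1";

// Key shape from the PR description: rr_checkout_enter_v1:<user|anon>:<listing>
function checkoutKey(userId, listingId) {
  return `${KEY_PREFIX}:${userId ?? "anon"}:${listingId}`;
}

// Fires `fire` at most once per (user, listing) per session.
// Returns true if the event was fired, false if it was deduplicated.
function trackCheckoutEnterOnce(storage, userId, listingId, fire) {
  const key = checkoutKey(userId, listingId);
  let alreadySeen = false;
  try {
    alreadySeen = storage.getItem(key) !== null;
  } catch (err) {
    // Storage unavailable (e.g. private mode): assumed to fail open and fire.
  }
  if (alreadySeen) return false;
  try {
    storage.setItem(key, "1");
  } catch (err) {
    // Ignore write failures: a misbehaving tracker must never break payment.
  }
  fire();
  return true;
}
```

In the browser the first argument would be `window.sessionStorage`; only the two storage calls are wrapped, so an error thrown by the tracking call itself still surfaces to the caller.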

[B] Deploy to production — Ahmed

Hard-gated on [A]. Once all three are merged to master, they ride the next deployment branch (Tue/Thu cycle per CLAUDE.md). For v5 launch today, this means: cut a fresh deployment/2026-05-01 from master, push to production, verify.

Verification queries (run after deploy)

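Run these against PostHog within 1-2 hours of the deploy; B3(b) is the exception, since the first server-derived event cannot arrive until 24h+ after deploy. All three must pass before [C].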

B1 — add_to_cart removal

SELECT count() AS posthog_atc_events,
       countIf(properties.`$lib` = 'web') AS web_atc_events
FROM events
WHERE event = 'add_to_cart'
  AND timestamp >= '<deploy_time>'
  AND properties.`$host` = 'rotatingroom.com';

Expected: web_atc_events should be 0 (or near-0; some long-lived sessions may still have the old bundle cached). GA4 still gets the event via $gtag.event — only the PostHog-side capture is removed.

B2 — begin_checkout dedup

SELECT count() AS events,
       count(DISTINCT person_id) AS users,
       round(count() * 1.0 / count(DISTINCT person_id), 2) AS events_per_user
FROM events
WHERE event = 'begin_checkout'
  AND timestamp >= '<deploy_time>'
  AND properties.`$host` = 'rotatingroom.com';

Expected: events_per_user drops from 1.7 → ~1.0. Some residual >1.0 is fine (different sessions on the same listing, or different listings).

B3 — checkout_abandoned server-derived

Three checks here.

(a) Client-side emission is gone:

SELECT count()
FROM events
WHERE event = 'checkout_abandoned'
  AND timestamp >= '<deploy_time>'
  AND properties.`$host` = 'rotatingroom.com'
  AND properties.`$lib` = 'web';

Expected: 0 within ~1h post-deploy.

(b) Server-derived events arrive 24h+ after deploy:

SELECT count()
FROM events
WHERE event = 'checkout_abandoned'
  AND timestamp >= '<deploy_time + 25h>'
  AND properties.source = 'server-derived';

Expected: > 0 starting 24h after the first post-deploy begin_checkout.

(c) Queue worker health:

  • Horizon dashboard shows EvaluateCheckoutAbandonment jobs being processed without failure.
  • No Rollbar items mentioning EvaluateCheckoutAbandonment.

If B3(b) returns 0 by Day-2 post-deploy, the queue worker is broken or the endpoint isn't being called — investigate before [C].


[C] Run v5 launch script — Ahmed

Status: the launch script does not exist yet. Gaurav is offering to spawn an agent today to build scripts/posthog-v5-launch.sh and scripts/posthog-v5-baseline-capture.js in parallel with [A]/[B] reviews. So by the time [B] verification passes, [C] should be ready to run.

Pre-launch checks (run sequentially, all must pass)

C1 — Capture v4 baseline (rollback safety).

source ~/.secrets
node scripts/posthog-v5-baseline-capture.js

Writes docs/data/2026-05-01-v4-final-snapshot.json. Required so we can restore v4's metric set if the launch in [C] partially fails.

C2 — SRM check. Chi-squared on $feature_flag_called distinct users by variant against the configured 20/30/30/20 rollout. If p < 0.01, abort — there's a bucketing issue that needs investigation before any reset.

SELECT properties.`$feature/pricing-abc-pro-v3` AS variant,
       count(DISTINCT person_id) AS users
FROM events
WHERE event = '$feature_flag_called'
  AND properties.`$feature_flag` = 'pricing-abc-pro-v3'
  AND timestamp >= '2026-04-24 08:35:00'
  AND properties.`$host` = 'rotatingroom.com'
GROUP BY variant;
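Once the query returns per-variant user counts, the chi-squared test itself is a few lines. The sketch below assumes four variants in the same order as the 20/30/30/20 split and compares the statistic against 11.345, the chi-squared critical value for 3 degrees of freedom at alpha = 0.01, rather than computing an exact p-value. The function names and the sample counts are made up for illustration.

```javascript
// Critical value of the chi-squared distribution for df = 3 at alpha = 0.01.
// Only valid for exactly four variants.
const CHI2_CRIT_DF3_P01 = 11.345;

// Pearson chi-squared statistic of observed counts against expected shares.
function chiSquared(observed, expectedShares) {
  const total = observed.reduce((a, b) => a + b, 0);
  return observed.reduce((sum, o, i) => {
    const e = total * expectedShares[i];
    return sum + (o - e) ** 2 / e;
  }, 0);
}

// True when the statistic exceeds the critical value, i.e. p < 0.01: abort.
function srmDetected(observed, expectedShares = [0.2, 0.3, 0.3, 0.2]) {
  return chiSquared(observed, expectedShares) > CHI2_CRIT_DF3_P01;
}

// Made-up counts, listed in the same variant order as the configured split.
console.log(srmDetected([200, 300, 300, 200])); // false: matches the split exactly
console.log(srmDetected([300, 250, 250, 200])); // true: statistic is about 66.7
```

The expected shares must be listed in the same variant order as the counts taken from the query result; a mismatch in ordering would produce a false alarm.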

C3 — Server-side $feature/<flag> attribution validation. Manually dispatch EvaluateCheckoutAbandonment for a known test user with a known variant. Verify in PostHog event explorer that the resulting checkout_abandoned event has $feature/pricing-abc-pro-v3 populated.

php artisan tinker --execute='
  use App\Jobs\EvaluateCheckoutAbandonment;
  use Carbon\CarbonImmutable;
  EvaluateCheckoutAbandonment::dispatchSync(
    userId: <test_user_id>,
    listingId: <test_listing_id>,
    beginCheckoutAt: CarbonImmutable::now()->subHours(25),
    featureFlagVariant: "control"
  );
'

If the event arrives in PostHog without $feature/pricing-abc-pro-v3, fall back to Tier-4 HogQL-only abandonment (do not include % Checkout Abandonment as a native metric in the v5 spec).

Run the launch

bash scripts/posthog-v5-launch.sh --dry-run   # preview
bash scripts/posthog-v5-launch.sh             # run for real

The script (idempotent, re-runnable):

  1. Resets experiment 367162 via POST /reset/ — clears the Bayesian prior, sets start_date = now().
  2. Replaces the metric set with the v5 spec via PATCH /experiments/367162/. Tier-1 headline = Plan→Purchase Conversion (unchanged from v4); Tier-2 = LPV (KM + Stripe Mean), plan-mix ratios with two denominators; Tier-3 guardrails = Free Plan, Payment Failures; Tier-4 diagnostics = abandonment, switching, etc.
  3. Re-pins the description with the catastrophe rule, LTV table, and a link to the corrected-rates HogQL insight.
  4. Runs validation: experiment-get, asserts metric names match the spec.
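The validation in step 4 could be as simple as a set comparison between the spec's metric names and the names read back from the experiment. This is a sketch under assumptions: the launch script does not exist yet, the helper names are hypothetical, and how the actual names are extracted from the experiment-get response is left out because that response shape is not described here.

```javascript
// Returns the names present in only one of the two lists.
function diffMetricNames(expected, actual) {
  const expectedSet = new Set(expected);
  const actualSet = new Set(actual);
  return {
    missing: expected.filter((name) => !actualSet.has(name)),
    unexpected: actual.filter((name) => !expectedSet.has(name)),
  };
}

// Throws with a readable message when the live metric set drifts from the spec.
function assertMetricNames(expected, actual) {
  const { missing, unexpected } = diffMetricNames(expected, actual);
  if (missing.length > 0 || unexpected.length > 0) {
    throw new Error(
      `metric mismatch: missing=[${missing.join(", ")}] unexpected=[${unexpected.join(", ")}]`
    );
  }
}

// Illustrative subset of names only; the full v5 spec lives in the plan.
const specNames = ["Plan→Purchase Conversion", "% Checkout Abandonment"];
assertMetricNames(specNames, ["% Checkout Abandonment", "Plan→Purchase Conversion"]);
```

Comparing as sets keeps the check order-independent, which also keeps a re-run of the idempotent script from failing on a reordered metric list.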

Post-launch (within 30 minutes)


[D] v5 live — audits scheduled

Day-7 audit: 2026-05-08. Run R0-R8 from /posthog-experiments skill. Day-14 ship decision: 2026-05-15. Catastrophe rule on Plan→Purchase Conversion (point estimate ≤ -15% relative or 95% CI lower ≤ -25%).
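The catastrophe rule reduces to a two-condition check. In this sketch both inputs are assumed to be relative lifts versus control, expressed as fractions (so -0.15 means -15%); the function name and that interpretation of the CI bound are assumptions rather than the audit skill's actual definition.

```javascript
const POINT_ESTIMATE_FLOOR = -0.15; // point estimate <= -15% relative
const CI_LOWER_FLOOR = -0.25; // 95% CI lower bound <= -25%

// True when either condition of the catastrophe rule is met.
function catastropheTriggered(pointEstimate, ciLower95) {
  return pointEstimate <= POINT_ESTIMATE_FLOOR || ciLower95 <= CI_LOWER_FLOOR;
}

console.log(catastropheTriggered(-0.16, -0.2)); // true: point estimate breach
console.log(catastropheTriggered(-0.05, -0.26)); // true: CI lower bound breach
console.log(catastropheTriggered(-0.05, -0.1)); // false
```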

These audits are Gaurav's responsibility; mentioned here so the timeline is clear.


Checklist for today

Mahmoud

  • Review #4457 (atc-deprecate). 4 files, +10/-12 lines. ~5 min review.
  • Review #4460 (bc-dedup). 3 files, sessionStorage composable + Vue wire-in + Node test. ~15 min review.
  • Approve and merge both. Order doesn't matter — independent files.

Ahmed

  • Review #4461 (checkout-abandoned). 9 files, Job + Controller + Service + tests. ~30 min review.
  • Approve and merge.
  • Once all 3 are on master: cut deployment/2026-05-01 from master and deploy to production (forward-integrate master into the deployment branch first per CLAUDE.md).
  • Run B1, B2, B3 verification queries (1-2 hours post-deploy for B1+B2; B3(b) waits 24h+ for first server-derived event).
  • Slack-confirm to Gaurav when [B] verification passes.
  • Coordinate with Gaurav to run [C] launch script.

Gaurav (non-blocking)

  • Spawn agent in parallel to build scripts/posthog-v5-launch.sh and scripts/posthog-v5-baseline-capture.js from the plan's Task 1 + Task 6 specs. Ready by the time [B] verification passes.
  • Day-7 audit on 2026-05-08.
  • Day-14 ship decision on 2026-05-15.

References


Why today

Each day v4 keeps running:

  • The dashboard continues to show wrong rates that confuse anyone who looks at it.
  • The Day-3 audit numbers we have are only directional and become more stale.
  • The sequential-testing penalty grows: v5 is already reset #3, which we are accepting, but each day of delay compounds the penalty.
  • The Day-14 decision window is a moving target — every day v5 doesn't launch is a day Day-14 slips.

If [A] reviews are clean, [A]+[B]+[C] is achievable in a single day.
