Message Liquidity Diagnostics + Evidence-Backed June Shortlist
Run 2026-05-20 against the prod read replica. Pairs with the ideation menu at docs/strategy/2026-05-18-message-liquidity-ideation.md.
TL;DR
Three SQL diagnostics against the ideation menu confirm one theme: lister responsiveness is the binding constraint on message liquidity, not seeker inquiry volume. Over half of all inquiries get no lister reply within 72h, and listers who ignore their inquiries churn ~12pts worse.
The strategic insight: that 12pt churn gap is best attacked rehabilitatively, not punitively. Making non-replying listers aware of their waiting inbox (nudge them to respond → they get hits → they see RR working → they churn less) is positive-sum for both sides. Routing demand away from non-repliers is the fallback, not the first move, because non-replying is likely a symptom of "I think RR is dead" and writing them off confirms that belief. This reframes the in-flight P1 unread-reminder work from a messaging-volume lever into a churn-reduction lever aimed at the exact cohort the data flags.
One negative finding: favorites/wishlist ideas are NOT cheap. There is no favorites table in prod, so the 3 menu ideas that assumed one would need the feature built first.
Diagnostic 1 — Quotes Per Request (lister reply rate)
Thumbtack-style marketplace-health metric. Pure SQL, runnable weekly.
Of 26,684 inquiries in the last 90 days, only 46.7% (12,463) received any lister reply within 72h. More than half of all inquiries meet silence inside 3 days.
At the seeker level: of 7,444 seekers, only 20.9% (1,554) got 3+ lister replies across all their inquiries within 72h.
Read: the funnel does not leak at "seeker sends a message." It leaks at "lister replies." Roughly 79% of seekers do not get a healthy response volume. QPR (% of seekers getting 3+ replies in 72h) is a strong weekly leading indicator and should be tracked.
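A minimal sketch of how the two Diagnostic 1 figures could be computed from per-inquiry records. The record shape (`seeker_id`, `sent_at`, `first_lister_reply_at`) and the function name are assumptions for illustration, not the production query. It also assumes each inquiry counts at most once toward a seeker's reply total.

```python
from collections import defaultdict
from datetime import datetime, timedelta

REPLY_WINDOW = timedelta(hours=72)


def qpr_metrics(inquiries, min_replies=3):
    """inquiries: iterable of (seeker_id, sent_at, first_lister_reply_at or None).

    Returns a pair:
      - share of inquiries with a lister reply inside the 72h window
      - share of seekers with at least `min_replies` replied inquiries (QPR)
    """
    total = 0
    replied = 0
    replies_per_seeker = defaultdict(int)
    for seeker_id, sent_at, reply_at in inquiries:
        total += 1
        replies_per_seeker[seeker_id] += 0  # keep seekers with zero replies in the denominator
        if reply_at is None:
            continue
        if timedelta(0) <= reply_at - sent_at <= REPLY_WINDOW:
            replied += 1
            replies_per_seeker[seeker_id] += 1
    if total == 0:
        return 0.0, 0.0
    healthy = sum(1 for n in replies_per_seeker.values() if n >= min_replies)
    return replied / total, healthy / len(replies_per_seeker)
```

Run weekly over a trailing 90-day window, the first value corresponds to the 46.7% any-reply figure and the second to the 20.9% QPR figure.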
Diagnostic 2 tests the V4.4 thesis: should we route demand AWAY from listers who do not reply?
Cohort: listers who received >=1 inquiry 60-90 days ago. Retention proxy: still has an active Stripe subscription today.
| Lister cohort | Listers | Still paying | Retention |
|---|---|---|---|
| Replied to none of their inquiries | 437 | 295 | 67.5% |
| Replied to all | 377 | 289 | 76.7% |
| Replied to some | 411 | 326 | 79.3% |
Read: listers who got inquiries but ignored them all retain ~12pts worse than listers who engage. The "leaky bucket" cohort (437 listers) is real and measurable. Two opposite first moves follow from the same finding:
Option B (rehabilitative — primary): make the non-repliers aware of their waiting inbox. Surface "you have N inquiries waiting for a reply" to exactly this cohort. The likely causal story is that non-replying is a symptom, not a cause — a lister who isn't replying probably believes RR isn't working for them. If they respond, they get hits, they see RR working, and they are less likely to churn. This is positive-sum: better for the lister (retention) AND the seeker (finally gets a reply). This is precisely what P1 unread reminders (in flight, #5007) does — which reframes P1 from a "messaging volume" lever into a churn-reduction lever aimed at the exact cohort the data flags.
Option A (punitive — fallback): route demand away from chronic non-repliers (V4.4) so seeker attention is not wasted. The risk: if non-replying is a symptom of "I think RR is dead," routing demand away confirms that belief and accelerates churn. So A should only fire for listers who got the Option-B awareness nudge and still did not respond — not as the first move.
Sequencing: B first (rehabilitate via awareness), A as the fallback for non-responders to B. They are not mutually exclusive, but B is the higher-value, positive-sum lever and should lead.
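As an arithmetic check on the cohort figures, the sketch below recomputes retention from the raw counts and the gap between non-repliers and each engaged group. The counts come from the Diagnostic 2 table. The pooled "engaged" rate is an illustration added here, not a figure from the diagnostics.

```python
# (listers, still_paying) per cohort, copied from the Diagnostic 2 table
COHORTS = {
    "replied_to_none": (437, 295),
    "replied_to_all": (377, 289),
    "replied_to_some": (411, 326),
}


def retention_table(cohorts=COHORTS):
    """Retention per cohort, plus a pooled rate for the two engaged cohorts."""
    rates = {name: paying / listers for name, (listers, paying) in cohorts.items()}
    engaged = [cohorts["replied_to_all"], cohorts["replied_to_some"]]
    rates["engaged_pooled"] = sum(p for _, p in engaged) / sum(n for n, _ in engaged)
    return rates


def gap_pts(rates, against):
    """Percentage-point gap between `against` and the non-replier cohort."""
    return round(100 * (rates[against] - rates["replied_to_none"]), 1)


rates = retention_table()
print(gap_pts(rates, "replied_to_all"))   # 9.2
print(gap_pts(rates, "engaged_pooled"))   # 10.5
print(gap_pts(rates, "replied_to_some"))  # 11.8
```

The ~12pt figure matches the gap against the "replied to some" cohort; against all engaged listers pooled, the gap is about 10.5pts. The direction of the finding is the same either way.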
Diagnostic 3 (negative) — Favorites ideas are not cheap
There is no favorites / wishlist / saved-listing table in prod. Three menu ideas assumed it exists and are therefore NOT measurable or buildable in a 1-engineer week:
V4.2 save-to-view ranking signal
V3.6 favorite → 72h soft nudge
#10 favorite as soft inquiry trigger
These require building the favorites feature first. Reclassify from "cheap test" to "needs foundation."
Evidence-backed June shortlist (proposed)
The diagnostics point at one coherent theme. Proposed 3-5 bets, ordered by evidence strength × fit with current team capacity (Ahmed BE, Mahmoud FE, Felicia design):
P1 lister unread reminders (in flight, #5007, Ahmed): the primary rehabilitative lever. Makes non-replying listers aware of their waiting inbox; lifts the 46.7% reply rate AND attacks the 12pt churn gap in the lowest-retention cohort. Positive-sum for lister + seeker. Evidence: Diagnostics 1 + 2. Worth strengthening the issue's framing/KPI to make churn-reduction (not just message volume) an explicit success metric.
#1 first-message auto-reply + visible response clock: de-risks the seeker side of the same gap ("did this go through?"). Cheapest high-impact bet in the menu. Complements P1. Evidence: Diagnostic 1; Houzz/Etsy/Airbnb prior art.
V4.7 QPR as the weekly leading indicator: ties the above together; pure SQL, near-zero cost. Add to the tier-2 dashboard once defined. Baseline: 46.7% any-reply-72h, 20.9% seekers with 3+ replies.
F4 one-click "I'm Interested" (in flight, #5008, Mahmoud): V2 conversion lever; orthogonal to responsiveness but already staffed.
V4.4 demand re-routing: FALLBACK only. Route demand away from chronic non-repliers who got the P1 awareness nudge and still did not respond. Not a first move (see Diagnostic 2). Ranking-weight change in RoomRecommendationService, downstream of P1.
Held for a later cycle (not June):
#2 Your-Turn cap — contrarian (reduces raw message count), hard to A/B cleanly, hard to explain. Strong prior art (Hinge) but higher risk.
#3 Special-Offer reply primitive — the one genuinely new conversion lever, but higher build effort; revisit once responsiveness is addressed.
Favorites-dependent ideas — blocked on building favorites first.
"Lister replied" = a chat_messages row in the conversation where user_id = conversations.owner_id.
All queries filter conversations.exclude_from_metrics = false.
Retention proxy = subscriptions.stripe_status = 'active' today; this is a current-state proxy, not point-in-time, so it slightly understates churn for very recent cohorts (acceptable for a directional read).
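The definitions above can be sketched as a single query. This is a hedged illustration against an in-memory SQLite database, not the query that was run on the replica. The join column `chat_messages.conversation_id`, the use of the first non-owner message as the inquiry timestamp, the text timestamp format, and the helper name `reply_rate_72h` are all assumptions. SQLite's `datetime(..., '+72 hours')` would need to become interval arithmetic on another engine.

```python
import sqlite3

# Assumed minimal schema: only the columns the definitions above mention,
# plus conversation_id as the (hypothetical) join key.
SCHEMA = """
CREATE TABLE conversations (
    id INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL,
    exclude_from_metrics INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE chat_messages (
    id INTEGER PRIMARY KEY,
    conversation_id INTEGER NOT NULL REFERENCES conversations(id),
    user_id INTEGER NOT NULL,
    created_at TEXT NOT NULL  -- 'YYYY-MM-DD HH:MM:SS'
);
"""

REPLY_RATE_SQL = """
WITH inquiry AS (
    -- one inquiry per conversation: the first message not sent by the owner
    SELECT c.id AS conversation_id,
           c.owner_id,
           MIN(m.created_at) AS sent_at
    FROM conversations c
    JOIN chat_messages m ON m.conversation_id = c.id
    WHERE c.exclude_from_metrics = 0
      AND m.user_id <> c.owner_id
    GROUP BY c.id, c.owner_id
),
flagged AS (
    -- "lister replied" = a message from the owner within 72h of the inquiry
    SELECT EXISTS (
        SELECT 1
        FROM chat_messages r
        WHERE r.conversation_id = i.conversation_id
          AND r.user_id = i.owner_id
          AND r.created_at > i.sent_at
          AND r.created_at <= datetime(i.sent_at, '+72 hours')
    ) AS replied
    FROM inquiry i
)
SELECT AVG(replied) FROM flagged
"""


def reply_rate_72h(conn):
    """Share of counted inquiries with a lister reply inside 72h (None if no inquiries)."""
    (rate,) = conn.execute(REPLY_RATE_SQL).fetchone()
    return rate
```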
Written 2026-05-20 by research + synthesis agent. Decision-grade input to the end-of-May message-liquidity shortlist (queued, waiting id=10). Extends the 26-idea ideation menu (docs/strategy/2026-05-18-message-liquidity-ideation.md), the SQL diagnostics (docs/strategy/2026-05-20-message-liquidity-diagnostics.md), and the F/P/M/PR/R catalogue in docs/plans/2026-04-04-messaging-strategy.md. KPI structure follows docs/KPI_MEASUREMENT_GUIDE.md Part I (the Five Questions). All external citations are real and verified to resolve as of 2026-05-20; where no impact number was found, that is stated explicitly.
How to read this
North star: % of paid listings receiving ≥1 message within 14 days. Listings that hit it churn ~50% less. The Q2 tier-2 primary ("Demand Distribution, 14d") sits at 53.1%, target >60%.
The binding constraint (settled, from diagnostics, not re-litigated here): lister responsiveness, not seeker volume. Only 46.7% of inquiries get any lister reply within 72h; only 20.9% of seekers get 3+ replies (QPR baseline). Non-replying listers retain 67.5% vs ~77-79% for engagers (a ~12pt gap). The reframe is rehabilitative (make non-repliers aware of their waiting inbox) over punitive (route demand away).
EV/cost rubric (applied consistently per the handoff):
Impact H = plausibly moves the 14d-message rate or the 46.7% reply rate by a meaningful margin, backed by diagnostics or strong prior-art numbers. M = moves one vector sub-metric with moderate evidence. L = marginal/speculative.
Cost L = pure SQL / config / copy / single-component UI, no new infra. Cost M = new events/fields + a feature surface. Cost H = new tables, ML, real-time infra, or cross-system work.
Penalize anything needing the favorites table (does not exist in prod) or new ML.
Proxy classification (Q4): direct (the metric is the outcome), validated (cite the analysis), weak (name the gap + what a Phase-0 check would prove).
1. TL;DR, the 5 highest-EV/cost bets across all vectors
These are the cross-vector top picks. Three are cheap (ship and measure in a 1-engineer week or less); two extend in-flight work rather than starting cold.
V2-A: Guest inquiry (no account to send the first message). Highest single conversion lever RR has. GA4 shows the registration gate kills 54% of seekers who reach it (~334 lost inquiries/week). Prior art: forced account creation drives ~24% of cart abandonment (Baymard via Shopify); optional accounts lift conversion 10-30%. This is strategy-doc F1 and deserves to be the V2 anchor. Impact H / Cost M. (Mahmoud's F4 #5008 one-click "I'm Interested" is adjacent but solves a different step.)
V3-A / P1: Lister "you have N inquiries waiting" awareness (in flight, #5007, Ahmed). The primary rehabilitative lever. Directly attacks the 46.7% reply rate AND the 12pt churn gap in the exact cohort the data flags. Reframe its success metric from "messages" to reply-rate + 14d retention of the non-replier cohort. Impact H / Cost L (largely built).
V2-B: First-message auto-confirm + visible response-time clock. De-risks the seeker side of the same gap ("did this go through?"). Airbnb: sub-1h response = +25% conversion and +16% impressions; Houzz: 5-min response = 100x reach. Cheapest high-impact behavioral hack in the catalogue. Impact H / Cost L-M.
V4-A: QPR (quotes-per-request) as the weekly leading indicator. Pure SQL, near-zero cost, ties every other bet together. Baseline 46.7% any-reply-72h / 20.9% seekers-with-3+-replies. Not a feature, but the measurement spine for the whole initiative. Impact M (instrumentation) / Cost L.
V3-B: Inquiry-prefill from a seeker's prior conversation. Turns the 5th inquiry from 60 seconds of typing into 5 seconds of confirming, compounding messages-per-seeker. SQL-readable today, render is pure UI, no new infra. Impact M-H / Cost L-M.
Relation to in-flight work: P1 (#5007) and F4 (#5008) are already staffed. This catalog's job is to (a) supply V1 and V2 sections the menu skipped, (b) impose EV/cost ranking, and (c) attach a falsifiable KPI to every item. The cheapest net-new wins are V2-A guest inquiry, V2-B auto-confirm clock, and V4-A QPR.
V1, Increase site visitors
Where the leverage is for RR specifically. V1 is mostly not design and mostly not a 1-engineer-week lever; it is the long-cycle SEO/content/paid machine. For a furnished-mid-term-rental marketplace serving medical-pro travelers on 1-12mo assignments, the durable visitor engine is programmatic long-tail location + profession pages ("furnished housing for travel nurses in [city]"), because the audience searches by assignment city and credential, not by brand. The honest caveat: more visitors only move the north star if the responsiveness constraint is also addressed, otherwise we pour seekers into a leaky bucket. V1 is therefore a parallel, longer-horizon track, not the June lever. Two tier-2 monitoring targets already cover the measurement (TN Messages, NB-Organic Impact), so V1 items below are framed to feed those.
V1.1 Programmatic city × profession landing pages
What it is: Auto-generated, indexable pages for combinations like "furnished rooms for travel nurses in Houston," each templated from existing inventory + city data, targeting long-tail organic queries.
Apartments.com / Zillow dominate rental organic via location-templated pages; competing requires the same template-at-scale play (source: https://gracker.ai/case-studies/zillow).
Expected value: M for the north star directly (visitors are upstream of messages and gated by the responsiveness constraint), H for the TN-Messages and NB-Organic tier-2 monitoring targets. Compounds over quarters, not weeks.
Cost/complexity: H. New page templates, sitemap/indexing work, thin-content risk management, and content QA. Cross-system (SEO + content + eng).
EV/cost rank within V1: 2.
Hypothesis (Q2): If we ship templated city × profession pages, we expect NB-organic clicks and TN-keyword impressions to rise over 2-3 quarters, because we capture long-tail assignment-city searches RR does not currently rank for.
Success measure (Q3+Q4): NB-organic conversions/week and TN messagers/week (GSC + PostHog, weekly cadence), target: lift TN messagers/wk above the current ~2/wk activation floor and grow NB-organic conversions above the 12-week mean. Proxy: validated, NB-organic conversions and TN messagers are existing tier-2 monitoring metrics tied to D (Demand) and S (Supply); the gap is attribution latency (SEO compounds slowly).
Cheap test: Ship 10-20 pages for the highest-volume assignment cities, watch GSC impressions/clicks for 60-90 days before scaling. Not a 1-week win; it is a 1-quarter bet. Needs build first for the templating.
V1.2 Blog → marketplace conversion tightening (existing content, not new traffic)
What it is: Add inline "find furnished rooms in [city] near [hospital]" CTAs and search-prefill links inside high-traffic blog posts, converting existing organic readers into searchers/messagers rather than chasing new visitors.
Prior art:
Blog impressions already run ~13,795/week (internal, KPI guide tier-2 sub-metric). The traffic exists; the gap is converting it.
Expected value: M. Converts traffic RR already has; cheaper than net-new acquisition. Bounded by current blog volume.
Cost/complexity: L-M. Copy + CTA components in existing blog templates; small instrumentation for click-through.
EV/cost rank within V1: 1 (best EV/cost in V1 because it monetizes existing traffic).
Hypothesis (Q2): If we add city/hospital-targeted CTAs to top blog posts, we expect blog → search-session and blog → message conversion to rise, because readers already have intent and we shorten the path.
Success measure (Q3+Q4): NB-organic messages/week from blog landing sessions (PostHog, weekly), target: a measurable lift over the pre-CTA blog-session baseline. Proxy: weak, blog→message attribution needs the landing-page → message funnel instrumented; Phase-0 check: confirm PostHog can stitch blog landing → message in a session.
Cheap test: Add CTAs to the top 5 blog posts by impressions, compare blog-session → message rate vs the rest over 30 days. Mostly copy + a component; close to a 1-week win.
V1.3 Seeker-posted housing requests (reverse marketplace)
What it is: Let seekers post a public "I need furnished housing in [city], [dates], [budget]" request; these are indexable and also routed to matching listers (the reverse-marketplace flow).
Expected value: M-H. Pulls supply toward demand and creates indexable demand pages, but RR's long-cycle search (vs Thumbtack's instant jobs) blunts the synchronous-matching benefit; treat the lister-routing half as the value, the SEO half as a bonus.
Cost/complexity: H. New request entity, routing/matching, lister notification surface, moderation. This is strategy-doc Pillar 5; large.
EV/cost rank within V1: 3.
Hypothesis (Q2): If seekers can post housing requests and we route them to matching listers, we expect lister-initiated first messages to appear (a flow that does not exist today), because we give listers qualified demand without the seeker having to find them.
Success measure (Q3+Q4): lister-initiated conversations/week from housing requests (SQL on conversations with a request-source tag, weekly), target: establish a baseline then grow it. Proxy: validated for QPR (Thumbtack's own metric moved 2x); weak for RR's slower cycle (Phase-0: does request → lister-reply happen within 72h at RR's cadence?).
Cheap test: Manual concierge pilot first, collect 20 housing requests via a form, hand-route to listers, measure reply rate. Needs build first for the automated version.
V2, Increase visitor → message conversion
Where the leverage is for RR specifically. This is the densest cheap-win vector, because RR's own funnel data localizes the leak precisely: the registration gate loses 54% of seekers who reach it (~334 inquiries/week) and mobile drops 43% between listing-view and inquiry-start (GA4, strategy-doc §A2). For a price-sensitive, time-constrained medical-pro audience on mobile between shifts, every required field and every mid-flow account wall is a direct message loss. The menu skipped V2 entirely; it should not have. Note F4 #5008 (one-click "I'm Interested," Mahmoud) is in flight and covers the CTA-affordance step, but not the account wall or the response-anxiety problem.
V2.1 Guest inquiry (no account required to send the first message)
What it is: Let a seeker send a first inquiry with name + email inline, deferring account creation to after the message is sent (or making it optional). Strategy-doc F1.
RR-internal: 54% drop at the registration step, ~334 inquiries/week lost (strategy-doc §A2 GA4).
Expected value: H. This is the single largest measured leak in RR's inquiry funnel and the prior art is unusually consistent. Directly grows messages-per-visitor, which feeds the north star.
Cost/complexity: M. Touches the messaging modal + auth flow + verification gating (messages currently held pending verification). Real but bounded; no new tables. Watch the fraud/verification interaction (T tier-1 guardrail).
EV/cost rank within V2: 1.
Hypothesis (Q2): If seekers can send the first inquiry without an upfront account, we expect inquiry-submit rate to rise by a large margin (toward recovering the 54% registration-gate loss), because we remove the highest-friction step in the measured funnel.
Success measure (Q3+Q4): inquiry-submit rate = inquiry_submitted / inquiry_modal_opened (PostHog, once the Phase-0 funnel events ship; SQL chat_messages first-message count as interim), weekly cadence, target: recover a meaningful share of the ~334/week. Guardrail: fraud contact rate (T) must not rise. Proxy: direct, inquiry submission is the conversion we are measuring.
Cheap test: A/B the guest-inquiry path behind a flag on a traffic slice; compare first-message creation rate. The Phase-0 inquiry-funnel events (inquiry_modal_opened, inquiry_submitted) are the unblocker and are already P0 in the strategy doc. Mostly buildable in a sprint.
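A minimal sketch of the V2.1 success measure, inquiry_submitted / inquiry_modal_opened, split by experiment variant. The event names come from the Phase-0 funnel events named above; the `(variant, event_name)` record shape and the variant labels are assumptions for illustration, not the PostHog export format.

```python
from collections import Counter


def inquiry_submit_rate(events):
    """events: iterable of (variant, event_name) pairs.

    Returns {variant: inquiry_submitted / inquiry_modal_opened} for every
    variant that has at least one modal open.
    """
    opened = Counter()
    submitted = Counter()
    for variant, name in events:
        if name == "inquiry_modal_opened":
            opened[variant] += 1
        elif name == "inquiry_submitted":
            submitted[variant] += 1
    return {variant: submitted[variant] / n for variant, n in opened.items() if n}
```

The fraud contact rate guardrail (T) would be read alongside this per variant. It is left out here because its event definition is not specified above.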
V2.2 First-message auto-confirm + visible response-time clock
What it is: On a seeker's first inquiry, immediately show "Sent. [Lister] usually replies within [X]h," with a visible countdown on the seeker's conversation view. Kills "did this go through?" anxiety and sets a soft lister deadline.
Expected value: H. Cheapest high-impact behavioral hack. Attacks the exact "ghosted → seeker quits → lister churns" loop the diagnostics flag, from the seeker side; complements P1's lister side.
Cost/complexity: L-M. Auto-confirm copy is trivial; the response-time figure can start as a lister-aggregate from chat_messages timestamps (no new infra). The countdown UI is one component.
EV/cost rank within V2: 2.
Hypothesis (Q2): If seekers see immediate confirmation + an expected reply window, we expect fewer abandoned single-message threads and higher progression to a two-message exchange, because confirmation removes the "is this dead?" doubt that drives churn.
Success measure (Q3+Q4): inquiry → two-message-thread rate (SQL: conversations reaching message #3, i.e. seeker + lister + seeker, per chat_messages), weekly, target: lift over the pre-feature baseline. Proxy: validated, Airbnb/Houzz/Etsy all tie response-visibility to conversion; the RR-specific gap is whether a displayed clock (vs actual fast response) moves behavior (Phase-0: A/B the clock-only variant).
Cheap test: Ship auto-confirm + a lister-aggregate "usually replies within Xh" line to a cohort, compare two-message-thread rate vs control via existing chat_messages. Close to a 1-week win.
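One way the lister-aggregate "usually replies within Xh" figure could be derived from message timestamps is sketched below. Using the median (rather than the mean) and requiring a minimum sample size before showing a figure are assumptions made here, as are the record shape and function name.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median


def usual_reply_hours(reply_pairs, min_samples=3):
    """reply_pairs: iterable of (lister_id, inquiry_at, first_reply_at).

    Returns {lister_id: median hours to first reply} for listers with at
    least `min_samples` answered inquiries; other listers get no figure.
    """
    hours_by_lister = defaultdict(list)
    for lister_id, inquiry_at, reply_at in reply_pairs:
        hours = (reply_at - inquiry_at).total_seconds() / 3600
        if hours >= 0:  # ignore out-of-order timestamps
            hours_by_lister[lister_id].append(hours)
    return {
        lister_id: median(hours)
        for lister_id, hours in hours_by_lister.items()
        if len(hours) >= min_samples
    }
```

A median keeps one very slow reply from inflating the displayed window. Note that this figure only covers inquiries that were answered, so it says nothing about a lister's reply rate.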
V2.3 Listing-page trust block (verified lister + recent activity)
What it is: A compact trust panel on the listing page: "Verified lister," "responded to an inquiry [N]h ago," and (where present) profile photo, lowering the seeker's perceived risk of messaging.
OfferUp TruYou: verification badge demonstrably increases willingness to transact ("the community prefers to interact with people who have completed optional verifications"); no public conversion-lift % found (source: https://help.offerup.com/hc/en-us/articles/360031993072-TruYou-verification-program, "no published impact data found" for a specific lift number).
Expected value: M. Trust is a real lever for a medical-pro audience wary of housing scams (RR's tier-1 Trust outcome is live), but it nudges conversion at the margin rather than recovering a 54%-sized leak.
Cost/complexity: L-M. RR already has verification + chat_messages timestamps; this is surfacing existing data on the listing page (single component).
EV/cost rank within V2: 3.
Hypothesis (Q2): If listing pages show verified-lister + recent-activity signals, we expect a higher listing-view → message rate, because seekers feel safer and more confident the lister is reachable.
Success measure (Q3+Q4): listing-view → message rate (PostHog room_view → first message, weekly), target: lift on listings showing the block vs not. Proxy: validated (trust→transaction is well-documented) but weak on magnitude for RR (Phase-0: A/B the block, measure view→message delta).
Cheap test: Render the block on a slice of listings (data exists), A/B view→message rate over 2-3 weeks.
V2.4 Mobile inquiry as a 30-second tappable flow
What it is: On mobile, replace the multiline textarea + date typing with chip-select dates + 3 tappable message templates, removing the friction behind the 43% mobile view→inquiry gap.
RR-internal: mobile is 47% of listing views but only 27% of inquiry submits, a 43% relative gap ((47 - 27) / 47 ≈ 0.43) (strategy-doc §A2).
Expected value: M-H. The mobile gap is large and measured; closing even part of it is meaningful. Bounded by mobile traffic share.
Cost/complexity: M. New mobile inquiry UI + the Phase-0 funnel events to measure it. No new tables.
EV/cost rank within V2: 4.
Hypothesis (Q2): If the mobile inquiry flow becomes tappable and short, we expect the mobile view→inquiry gap to narrow toward desktop levels, because we remove typing friction on small screens.
Success measure (Q3+Q4): mobile inquiry-submit rate vs desktop (PostHog funnel by device, weekly), target: narrow the 43% gap. Proxy: direct.
Cheap test: Ship behind a mobile-only flag, compare inquiry_submitted by device. Needs the Phase-0 funnel events first to measure cleanly.
V2.5 Tappable starter-message templates
What it is: Offer 2-3 tappable starter messages ("Hi, I'm a [profession] on assignment [dates], is this still available?") so seekers do not face a blank box.
Etsy/marketplace norm of suggested first messages reduces blank-box drop-off (directional; no isolated RR-applicable % found, "no published impact data found").
Expected value: M. Reduces blank-box friction; smaller than the account-wall lever but stacks with V2.1/V2.4.
Cost/complexity: L. Copy + a chip component in the modal.
EV/cost rank within V2: 5.
Hypothesis (Q2): If seekers get tappable starter messages, we expect higher inquiry-form completion, because we remove the cognitive cost of composing.
Cheap test: A/B templates vs blank box; mostly copy, near 1-week.
V3, Increase messages per visitor
Where the leverage is for RR specifically. V3 is about getting each visitor (seeker or lister) to start more conversations. For RR, the highest-value V3 move is the rehabilitative lister-awareness lever (P1). It is technically a "prompt," but its mechanism is unlocking the 437-lister non-replier cohort, which the diagnostics prove is the binding constraint. Beyond that, V3 leverage is in compounding seeker messages (prefill, similar-listings-up-front) and structured lister reply primitives. Favorites-dependent menu items (V3.6, comment-on-favorite) are reclassified "needs foundation" because no favorites table exists in prod.
V3.1 Lister "you have N inquiries waiting" awareness (P1, in flight #5007)
What it is: Email/in-app nudge to listers with unanswered inquiries: "You have N messages waiting." The primary rehabilitative lever from the diagnostics.
RR-internal: non-repliers retain 67.5% vs 76.7-79.3% (diagnostics Diagnostic 2).
Expected value: H. Attacks both the 46.7% reply rate and the 12pt churn gap in the exact flagged cohort. Positive-sum (lister retention + seeker reply).
Cost/complexity: L. Largely built (#5007); the lever is reminder email/in-app on existing data.
EV/cost rank within V3: 1.
Hypothesis (Q2): If we make non-replying listers aware of their waiting inbox, we expect the 72h reply rate to rise and the non-replier cohort's 14d retention gap to shrink, because non-replying is a symptom of "RR feels dead" that awareness reverses.
Success measure (Q3+Q4): (a) 72h any-reply rate among nudged listers (SQL chat_messages, weekly) target: lift above 46.7%; (b) 14d/30d retention of the nudged non-replier cohort vs unnudged (SQL subscriptions, monthly) target: close part of the 12pt gap. Proxy: validated, Diagnostic 2 establishes the retention link directly. Reframe note: the issue's success metric should be retention + reply-rate, not raw message volume.
Cheap test: Already in flight; measure reply-rate and cohort retention pre/post via SQL replica.
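A hedged sketch of how the "N waiting" count per lister could be derived, for reasoning about the nudged cohort. This is not the #5007 implementation. The conversation record shape is an assumption, and "waiting" is taken here to mean a conversation with at least one message and none from its owner.

```python
from collections import Counter


def waiting_counts(conversations):
    """conversations: iterable of (owner_id, sender_ids), where sender_ids
    lists the user id of each message in the conversation.

    Returns a Counter {owner_id: number of conversations with no owner reply}.
    """
    counts = Counter()
    for owner_id, sender_ids in conversations:
        if sender_ids and owner_id not in sender_ids:
            counts[owner_id] += 1
    return counts


def nudge_copy(n):
    """Render the awareness line for a lister with n waiting inquiries."""
    noun = "message" if n == 1 else "messages"
    return f"You have {n} {noun} waiting."
```

Listers absent from the counter have nothing waiting and would not be nudged. The same counter, snapshotted before the send, identifies the nudged cohort for the retention comparison in the success measure.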
V3.2 Inquiry-prefill from a seeker's prior conversation
What it is: After a seeker's first inquiry, subsequent inquiry forms auto-populate profession/institution/dates/blurb (editable). Strategy/menu V3.4, Zillow portable-application analog.
Expected value: M-H. The multi-listing bottleneck is form fatigue, not interest; reducing the 5th inquiry to a confirm should compound messages-per-seeker.
Cost/complexity: L-M. Prior inquiry content is SQL-readable; render is UI. No new infra.
EV/cost rank within V3: 2.
Hypothesis (Q2): If repeat inquiries are prefilled, we expect messages-per-active-seeker to rise, because we cut the per-message typing cost to near zero.
Success measure (Q3+Q4): 7-day messages-per-active-seeker (SQL chat_messages per distinct seeker, weekly), target: lift over baseline. Proxy: direct.
Cheap test: Prefill from last inquiry's chat_messages.text, measure messages-per-seeker on a slice. SQL + UI, near 1-week.
V3.3 Similar-listings bar moved onto the listing page (pre-inquiry)
What it is: Show "travelers who viewed this also viewed" on the listing detail page itself (not only post-inquiry), turning one page-view into 2-3 messageable options. Menu #6 / M3 extension.
RR already has post-inquiry similar-sublets (M3); moving it earlier reuses the mechanic.
Expected value: M. Multiplies messageable options per session; bounded by inventory density in a city.
Cost/complexity: L. Existing similar-listings logic rendered higher in the page.
EV/cost rank within V3: 3.
Hypothesis (Q2): If similar listings appear before the inquiry, we expect more listings messaged per session, because seekers get low-cost branching options at the decision moment.
Success measure (Q3+Q4): listings-messaged per search session (PostHog session → distinct conversations, weekly), target: lift over baseline. Proxy: direct.
Cheap test: Render the existing component on the listing page for a slice; A/B listings-per-session. Near 1-week.
V3.4 Structured "Special Offer" reply primitive for listers
What it is: Let a lister attach a structured offer ("$1,150/mo for your 8-week stay, expires in 24h") to a reply, not just prose. Menu V3.5.
Reverb's offer mechanic: listings allowing offers sell faster; offers drive a large share of orders (source: https://reverb.com/page/making-offers).
Expected value: M-H. The only genuinely new conversion (price-anchoring) lever, and it forces the lister to commit a number before the seeker drifts. Held-for-later in the diagnostics because of build effort, but high ceiling.
Cost/complexity: M-H. New structured-offer event + (eventually) table; can start text-only rendered as a card.
EV/cost rank within V3: 4.
Hypothesis (Q2): If listers can send structured offers, we expect higher thread-progression after a lister reply, because a concrete price + deadline is a stronger next-step than prose.
Success measure (Q3+Q4): thread-progression rate after a lister offer vs after a text reply (SQL chat_messages, weekly), target: offer threads progress at a higher rate. Proxy: validated (Reverb) but weak for RR's mid-term context (Phase-0: do offers correlate with deal-close signals at RR?).
Cheap test: Ship text-only "$X/mo for Y dates through Z" rendered as a card, compare progression. Then decide on the structured table. Partial build first.
V3.5 Stalled-thread "no reply in 48h → 1-click broadcast to 5 similar" action
What it is: When a seeker's inquiry goes unanswered 48h, replace the dead-end with an action button: "Message 5 similar rooms now." Menu #8, turns the P4 nudge into an action.
RR already has broadcast (rate-limited 15/wk) to reuse.
Expected value: M. Converts a churn moment (ghosted seeker) into 5 new conversations; depends on inventory density.
Cost/complexity: L-M. Reuses broadcast; needs a stalled-thread trigger + button.
EV/cost rank within V3: 5.
Hypothesis (Q2): If ghosted seekers get a 1-click broadcast option, we expect higher messages-per-seeker and fewer total drop-offs, because we redirect frustration into action.
Success measure (Q3+Q4): stalled-thread → new-conversation conversion (SQL, weekly), target: a meaningful share of stalled threads spawn ≥1 new conversation. Proxy: direct.
Cheap test: Trigger on 48h-silent conversations, render the button, measure click → new conversation. Near 1-week given broadcast exists.
V3.6 Recent-activity freshness badge on the listing card
What it is: A per-card "Lister active / replied in the last day" indicator (distinct from a lister-aggregate response badge). Menu V3.7.
Expected value: M. Tells the seeker "I will get a reply if I message now," nudging card CTR. Marginal alone, stacks with V2.3.
Cost/complexity: L. Render from chat_messages.created_at per listing.
EV/cost rank within V3: 6.
Hypothesis (Q2): If cards show recent-reply activity, we expect higher card → message rate on active listings, because seekers prioritize reachable listers.
Success measure (Q3+Q4): card CTR / card → message rate with vs without badge (PostHog, weekly), target: lift on badged cards. Proxy: direct.
Cheap test: Render badge for cards with a reply in the last 24h; A/B CTR. Near 1-week.
V3.7 (Held) Favorite → 72h soft nudge with prefilled message
What it is: Nudge a seeker who favorited but did not message within 72h, with a prefilled intro. Menu V3.6 / #10.
Cost/complexity: H (needs foundation). No favorites/wishlist table exists in prod (diagnostics Diagnostic 3). Build favorites first.
EV/cost rank within V3: 7 (deprioritized: foundation dependency).
Hypothesis (Q2): If favoriting existed and we nudged non-messaging favoriters, we would expect favorite → message conversion to rise.
Success measure (Q3+Q4): favorite → message-within-7d rate (SQL once the table exists, weekly). Proxy: weak, favoriting as message-intent is unproven at RR (Phase-0 requires the table first).
Cheap test: Needs build first (favorites table).
V4, Redistribute messages
Where the leverage is for RR specifically. V4 reallocates a fixed pool of seeker attention toward listings that currently get none. The diagnostics settle the sequencing: rehabilitate non-repliers first (V3.1/P1), route demand away only as a fallback for listers who got the awareness nudge and still did not respond. The cheapest, highest-leverage V4 item is not a feature at all but the QPR measurement spine that tells us whether any of this is working. ML-heavy ranking and real-time presence are penalized on cost.
V4.1 QPR (quotes-per-request) as the weekly leading indicator
What it is: A standing weekly metric: % of seekers whose inquiries got 3+ lister replies within 72h, plus % of inquiries getting any reply. The Thumbtack QPR analog.
Expected value: M (as instrumentation, H as an enabler). Not a behavior change, but the spine that makes every other bet falsifiable. Baseline already computed: 46.7% any-reply, 20.9% 3+-replies.
Cost/complexity: L. Pure SQL on chat_messages + conversations; add to tier-2 dashboard.
EV/cost rank within V4: 1.
Hypothesis (Q2): If we track QPR weekly, we expect to detect whether V2/V3 work moves the reply funnel, because QPR is the leading indicator upstream of 14d demand distribution.
Success measure (Q3+Q4): QPR weekly (SQL), target: move 20.9% (3+ replies) and 46.7% (any reply) upward as interventions ship. Proxy: direct (it is the leading-indicator definition).
Cheap test: Already runnable; productionize the query on the dashboard. <1 day.
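The two QPR numbers can be sketched in a few lines before the query is productionized. This is a minimal illustration, not the production SQL: the input tuple shape, the constant names, and the choice to count at most one reply per inquiry are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

REPLY_WINDOW = timedelta(hours=72)  # reply window from the metric definition
HEALTHY_REPLIES = 3                 # "3+ lister replies" threshold


def qpr(inquiries):
    """Compute both QPR rates.

    inquiries: iterable of (seeker_id, sent_at, first_lister_reply_at or None).
    Assumption: each inquiry contributes at most one reply to the seeker's count.
    """
    total = 0
    replied = 0
    replies_per_seeker = defaultdict(int)
    for seeker_id, sent_at, reply_at in inquiries:
        total += 1
        replies_per_seeker[seeker_id] += 0  # register seekers with zero replies
        if reply_at is not None and timedelta(0) <= reply_at - sent_at <= REPLY_WINDOW:
            replied += 1
            replies_per_seeker[seeker_id] += 1
    seekers = len(replies_per_seeker)
    healthy = sum(1 for n in replies_per_seeker.values() if n >= HEALTHY_REPLIES)
    return {
        "any_reply_rate": replied / total if total else 0.0,
        "seeker_qpr": healthy / seekers if seekers else 0.0,
    }
```

Seekers with no replies are kept in the denominator on purpose, since the 20.9% baseline is a share of all seekers, not of seekers who heard back.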
V4.2 Two-message-confirmed "responded" status for badges + ranking
What it is: Define "responded" as the conversation reaching message #3 (seeker replied to the lister's first reply), not "lister typed anything." Powers honest badges/ranking. Menu V4.1.
Expected value: M. Prevents "yes available" canned-spam gaming and aligns metrics with real conversation depth; mostly a metric-integrity win.
Cost/complexity: L. Server-side metric redefinition + backfill; no user-visible change to validate.
EV/cost rank within V4: 2.
Hypothesis (Q2): If "responded" requires a two-message turn, we expect badge/ranking signals to better predict real conversations, because we stop rewarding empty replies.
Success measure (Q3+Q4): correlation of the new "responded" flag with downstream deal/retention signals vs the old flag (SQL backfill, one-time + monthly), target: stronger correlation. Proxy: validated (Airbnb uses conversion not response-count).
Cheap test: Recompute on the replica, compare distributions and retention correlation. <1 week, no UI.
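One way to read the two-message-confirmed definition is as a turn sequence: seeker opens, lister replies, seeker replies again. The sketch below assumes each conversation can be reduced to a chronological list of sender roles; the function name and the role strings are hypothetical.

```python
def is_responded(senders):
    """Return True once the seeker has replied to the lister's first reply.

    senders: chronological list of "seeker" / "lister" roles, one per message.
    Assumption: extra messages from the same side do not advance the turn,
    so seeker, seeker, lister is not yet "responded".
    """
    seeker_opened = False
    lister_replied = False
    for role in senders:
        if not seeker_opened:
            seeker_opened = role == "seeker"
        elif not lister_replied:
            lister_replied = role == "lister"
        elif role == "seeker":
            return True
    return False
```

This follows the turn structure rather than a literal message count, so a lister who sends two canned replies in a row still needs the seeker to answer before the flag flips.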
V4.3 Demand routing away from chronic non-repliers (fallback, gated behind P1)
What it is: Down-rank chronic non-repliers in search, but only after they got the P1 awareness nudge and still did not respond. Menu V4.4, strategy-doc M1 refined.
Prior art:
Uber-style supply rebalancing by health, inverted (route away from unhealthy supply) (conceptual; source: strategy-doc Pillar 4 / Uber comparison).
Expected value: M, but sequenced last. The diagnostics warn that routing away can confirm "RR is dead" and accelerate churn if used as a first move. Value only as a fallback after rehabilitation fails.
Cost/complexity: M. Ranking-weight change in RoomRecommendationService::calculateRoomPoints() + a "got-nudge-and-still-silent" cohort definition.
EV/cost rank within V4: 4 (intentionally below QPR and the metric-integrity items; gated behind P1).
Hypothesis (Q2): If we down-rank only post-nudge chronic non-repliers, we expect seeker attention to shift to responsive listers without accelerating non-replier churn, because we exhaust rehabilitation first.
Success measure (Q3+Q4): (a) 14d-message rate of the listings that gain ranking (SQL, weekly) target: rise; (b) guardrail: non-replier-cohort churn must not worsen vs the P1-only group (SQL, monthly). Proxy: validated for the routing mechanic; weak on the churn-guardrail (Phase-0: confirm post-nudge non-responders are genuinely inactive, not symptom-driven).
Cheap test: Bucketize listers {inquiry-zero, low-no-reply, low-replying, high} on the replica, predict 14d churn by bucket; only then wire the weight. Gated behind P1 shipping first.
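The bucketing step of the cheap test could look like the sketch below. The bucket names come from the text; the high-volume cut-off of 5 inquiries and the input tuple shape are assumptions, and the function reports observed churn per bucket rather than fitting a predictive model.

```python
from collections import defaultdict

HIGH_VOLUME_MIN = 5  # assumed cut-off between "low" and "high" inquiry volume


def bucket(inquiries, replied):
    """Assign a lister to one of the four buckets named in the cheap test."""
    if inquiries == 0:
        return "inquiry-zero"
    if inquiries >= HIGH_VOLUME_MIN:
        return "high"
    return "low-replying" if replied > 0 else "low-no-reply"


def churn_by_bucket(listers):
    """listers: iterable of (inquiries, replied_inquiries, churned_within_14d)."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [churned, total]
    for inquiries, replied, churned in listers:
        entry = counts[bucket(inquiries, replied)]
        entry[0] += int(churned)
        entry[1] += 1
    return {name: churned / total for name, (churned, total) in counts.items()}
```

If low-no-reply does not churn visibly worse than low-replying on the replica, the case for wiring the ranking weight weakens.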
V4.4 Wishlist saves as a ranking signal (Held, needs favorites)
What it is: Use save-to-view ratio as a ranking input; high-save-low-message listings get a price/photo-fix signal instead. Menu V4.2.
What it is: Soft-demote listings whose available_from is >60 days stale with no update, and prompt the lister to refresh or remove. Menu V5.9 (cross-listed here for redistribution effect).
Expected value: M. Removes dead inventory from the front of search, raising message-to-active-supply ratio. Depends on how much stale inventory exists.
Cost/complexity: L-M. Replica query to size the cohort; ranking-weight + prompt to fix.
EV/cost rank within V4: 3.
Hypothesis (Q2): If stale listings are demoted and prompted, we expect either fresher front-of-search inventory or lister updates, both raising the messageable-supply ratio.
Success measure (Q3+Q4): share of search-eligible listings with stale dates (SQL, weekly) target: drop; messages-to-active-supply ratio target: rise. Proxy: direct for freshness; validated for recency→engagement (Apartments.com/Vinted).
Cheap test: Replica query to size the stale cohort; if >5% of search-eligible listings are stale, the cleanup is high-leverage. <1 week to diagnose.
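The sizing check can be sketched as follows. It assumes one reading of "stale": available_from is more than 60 days in the past and the listing has not been updated in the last 60 days either. The function names and the (available_from, last_updated) input shape are hypothetical.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=60)  # staleness threshold from the item definition
HIGH_LEVERAGE_SHARE = 0.05        # ">5% of search-eligible listings" decision rule


def is_stale(available_from, last_updated, today):
    """Assumed definition: both dates are older than the 60-day cut-off."""
    cutoff = today - STALE_AFTER
    return available_from < cutoff and last_updated < cutoff


def stale_share(listings, today):
    """listings: iterable of (available_from, last_updated) for search-eligible listings."""
    listings = list(listings)
    if not listings:
        return 0.0
    stale = sum(is_stale(a, u, today) for a, u in listings)
    return stale / len(listings)


def cleanup_is_high_leverage(listings, today):
    return stale_share(listings, today) > HIGH_LEVERAGE_SHARE
```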
V5, Make listings more attractive
Where the leverage is for RR specifically. V5 fixes the supply side of the leak: listings that get views but no inquiries (the "overpricing/poor-photos → no inquiries → churn" loop from strategy-doc Pillar 6). For a subscription-lister base of small landlords/homeowners (not service pros), the cheapest wins prescribe a fix using data RR already has (z-scores #2729, view counts, photo counts) rather than coaching in the abstract. ML pricing and CV photo-scoring are penalized on cost; the prescriptive-nudge versions are not.
V5.1 Inquiry-deficit prescription to the lister ("47 views, 0 inquiries, try these 3 fixes")
What it is: Tell the lister explicitly when they have high views + zero inquiries, with 1-click fixes (lower price by $X, add photo Y, widen dates). Menu #7, the prescriptive twist on M1.
Expected value: H. Directly targets the view-no-inquiry loop with data RR already computes; the prescription (not just the diagnosis) is the differentiator.
Cost/complexity: L-M. Data exists (views, z-scores #2729); a dashboard widget + email. No new infra.
EV/cost rank within V5: 1.
Hypothesis (Q2): If listers with views-but-no-inquiries get a prescribed fix, we expect higher listing-edit rates and subsequent inquiries, because we convert a vague "it's not working" into a concrete action.
Success measure (Q3+Q4): (a) lister edit-rate within 7d of the prescription (SQL rooms.updated_at, weekly); (b) 14d inquiry rate on edited listings vs unedited (SQL, biweekly). Target: edited listings reach materially higher inquiry rates. Proxy: validated, view-no-inquiry → churn is established (Pillar 6); the gap is whether a prescription (vs a diagnosis) drives edits (Phase-0: A/B prescription vs plain stats).
Cheap test: Single-shot email to a sample of view-rich/inquiry-zero listings, measure edit-rate + downstream inquiries over 14d. SQL + email infra exists; near 1-week.
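A rule-based version of the prescription might look like the sketch below. Every threshold here is an assumption for illustration (the view floor, the price z-score cut-off, the minimum date window); only the <5 photos flag is borrowed from V5.3. The ListingStats fields are hypothetical names, not the actual schema.

```python
from typing import NamedTuple

MIN_VIEWS = 30         # assumed floor for "high views"
PRICE_Z_HIGH = 1.0     # assumed: price z-score above this counts as overpriced
MIN_PHOTOS = 5         # mirrors the <5 photos flag in V5.3
MIN_WINDOW_DAYS = 30   # assumed: shorter availability windows get "widen dates"


class ListingStats(NamedTuple):
    views: int
    inquiries: int
    price_z: float
    photo_count: int
    window_days: int


def prescribe(stats):
    """Return the fixes to suggest, or [] if the listing is not view-rich/inquiry-zero."""
    if stats.views < MIN_VIEWS or stats.inquiries > 0:
        return []
    fixes = []
    if stats.price_z > PRICE_Z_HIGH:
        fixes.append("lower price")
    if stats.photo_count < MIN_PHOTOS:
        fixes.append("add photos")
    if stats.window_days < MIN_WINDOW_DAYS:
        fixes.append("widen dates")
    return fixes
```

Keeping the rules this explicit makes the Phase-0 A/B (prescription vs plain stats) easy to reason about, because each suggested fix traces back to one input.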
V5.2 Competitive "similar nearby listings averaged N inquiries" widget
What it is: Show the lister 5 anonymized nearby comparables with their inquiry counts: "Yours: 0. Similar 5 averaged 3 this week. See what's different." Menu V5.2.
Hypothesis (Q2): If listers see they underperform comparable nearby listings, we expect higher edit rates, because social comparison motivates correction more than abstract advice.
Success measure (Q3+Q4): lister edit-rate within 7d of seeing the widget (SQL, weekly), target: lift over no-widget baseline. Proxy: validated (peer-comparison behavior) but weak on RR magnitude (Phase-0 A/B).
Cheap test: Render the widget for a slice; compare 7d edit-rate. Near 1-week.
V5.3 Photo-completeness prompt ("add a bathroom / window-view photo")
What it is: Flag listings with <5 photos or all-similar-angle photos and prompt specific additions. Menu V5.6.
Expected value: M-H. Photo quality has unusually strong, consistent prior-art impact on inquiries; the prompt is cheap.
Cost/complexity: L for the count-based prompt (SQL on rooms.photos); avoid the CV-scoring version (Cost H, penalized).
EV/cost rank within V5: 3.
Hypothesis (Q2): If under-photographed listings get a specific photo prompt, we expect more photos added and higher 14d inquiry rates, because photo completeness is a proven inquiry driver.
Cheap test: Replica query bucketing listings by photo count vs 14d inquiry rate; if a clear cutoff exists, ship the count-based prompt. Diagnosis <1 week.
V5.4 Listing-completeness micro-prompts on the lister dashboard
What it is: Weekly "your listing is missing: pet policy, check-in instructions, recent photos" prompts for live listers, not just at creation. Menu V5.1.
Hypothesis (Q2): If live listers are re-prompted to complete missing fields, we expect higher field completion and inquiry rates, because complete listings rank and convert better.
Expected value: M. Better cold-start descriptions for a niche audience; bounded, mostly a quality nudge.
Cost/complexity: L. Form copy + prompt chips.
EV/cost rank within V5: 5.
Hypothesis (Q2): If the description form offers profession-specific prompts, we expect more complete, audience-relevant descriptions and a small inquiry-rate lift, because guided writing beats a blank box.
Success measure (Q3+Q4): description completion rate + length + 14d inquiry rate on prompted vs blank (SQL/PostHog, biweekly), target: lift. Proxy: weak, description quality → inquiries is plausible but unmeasured at RR (Phase-0 A/B).
V5.6 Free, rate-limited "refresh / bump" lever for listers
What it is: A free (subscription-tier, 1 per 7 days) "bump" that re-fires the similar-sublets email to recent matching searchers, giving listers an agency lever. Menu V5.4.
Expected value: M. Gives listers a lever they lack; risk of email fatigue if not rate-limited.
Cost/complexity: M. New listing_bumped event + cron re-trigger of the existing similar-sublets job.
EV/cost rank within V5: 6.
Hypothesis (Q2): If listers can bump their listing to recent matching searchers, we expect a short-term inquiry spike per bump, because we re-expose the listing to warm demand.
Success measure (Q3+Q4): bump → inquiry-within-72h rate (SQL, weekly), target: a meaningful per-bump inquiry rate without raising email unsubscribe rate. Proxy: direct for inquiries; guardrail on unsubscribe.
Cheap test: Ship for a tier slice, single button; measure bump → 72h inquiry. Partial build (event + cron re-trigger).
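The "1 per 7 days" limit can be enforced with a simple cooldown check before the listing_bumped event is recorded. This sketch assumes a rolling 7-day window measured from the last bump (not a calendar week); the function names are hypothetical.

```python
from datetime import datetime, timedelta

BUMP_COOLDOWN = timedelta(days=7)  # "1 per 7 days", read as a rolling window


def can_bump(last_bumped_at, now):
    """A lister may bump if they never have, or the cooldown has fully elapsed."""
    return last_bumped_at is None or now - last_bumped_at >= BUMP_COOLDOWN


def next_bump_at(last_bumped_at, now):
    """Earliest time the bump button should re-enable."""
    if can_bump(last_bumped_at, now):
        return now
    return last_bumped_at + BUMP_COOLDOWN
```

Returning the next eligible time lets the single-button UI show when the lever comes back, which keeps the rate limit from reading as a broken button.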
Cross-vector ranked shortlist (top items by EV/cost)
| # | Item | Vector | Impact | Cost | Cheap-test or build? | Note |
|---|------|--------|--------|------|----------------------|------|
| 1 | Guest inquiry (no account for first message) | V2.1 | H | M | Build (sprint) + Phase-0 events | Recovers the measured 54% registration-gate loss; strategy-doc F1 |
| 2 | Lister "N inquiries waiting" awareness (P1 #5007) | V3.1 | H | L | In flight | Reframe KPI to retention + reply-rate |
| 3 | First-message auto-confirm + response clock | V2.2 | H | L-M | Cheap test (~1 wk) | Seeker-side of the ghosting loop |
| 4 | QPR weekly leading indicator | V4.1 | M (enabler) | L | Cheap test (<1 day) | Measurement spine; baseline 46.7% / 20.9% |
| 5 | Inquiry-prefill from prior conversation | V3.2 | M-H | L-M | Cheap test (~1 wk) | Compounds messages-per-seeker |
| 6 | Inquiry-deficit prescription to lister | V5.1 | H | L-M | Cheap test (~1 wk) | Prescribe, do not just diagnose |
| 7 | Competitive "nearby averaged N inquiries" widget | V5.2 | M-H | L | Cheap test (~1 wk) | Peer-comparison motivation |
| 8 | Photo-completeness prompt | V5.3 | M-H | L | Diagnose <1 wk | Strongest prior-art impact numbers |
| 9 | Similar-listings bar on listing page (pre-inquiry) | V3.3 | M | L | Cheap test (~1 wk) | Reuses M3 component |
| 10 | Two-message-confirmed "responded" status | V4.2 | M | L | Cheap test (<1 wk, no UI) | Metric integrity; anti-gaming |
Items intentionally below the line: V2.3 trust block (M/L-M, good but marginal vs the 54% leak), V5.4 completeness prompts (incremental on V5.1/V5.3), V3.4 special-offer primitive (high ceiling but M-H cost), V4.3 demand routing (gated behind P1, fallback only), V1.x (longer-horizon track). Favorites-dependent items (V3.7, V4.4) are excluded pending the foundation.
What to ignore and why (extends the menu's list)
Per-lead credit charging (Bark/Thumbtack). RR's lister economy is subscription-based; small landlords are not service pros with margins to absorb $1.42-$2.35/lead. Adds friction, craters volume. Borrow the lead-quality preview (anonymized profession + dates), not the payment (source: https://www.beltstack.com/lead-generation/compare/bark-vs-thumbtack).
Paid bumps as revenue (OfferUp/Vinted). Use the bump mechanic (V5.6) free; charging subscription listers again for visibility erodes trust.
Poshmark reciprocal "share to followers." Assumes sellers care about peers' inventory; RR homeowners have no analog social interest. Does not transfer.
Reverb continuous-watch buyer pursuit. Assumes high-frequency browsing; RR seekers visit in bursts then disappear for weeks. Real-time presence pings (menu V4.3) are at best a once-per-session ping, not a continuous-watch model, and the build cost (real-time infra) is Cost H for uncertain payoff. Deprioritize the real-time-presence idea on cost grounds.
Roomi hard message caps (5/day). Directly contradicts the Q2 more-messages goal. Skip.
Net-new ML pricing / CV photo-scoring. Penalized on cost. The prescriptive, count-based, z-score-based versions (V5.1, V5.3) get ~80% of the value at Cost L. Build the ML version only if the cheap versions validate the lever.
Hinge "Your Turn" unanswered-thread cap (menu #2/V3.2). Contrarian (reduces raw message count), hard to A/B cleanly, hard to explain to RR's audience. Strong prior art but held for a later cycle per the diagnostics; the rehabilitative P1 lever achieves the same "unblock dead inventory" goal more legibly.
Favorites-dependent ideas. No favorites table in prod. Reclassified "needs foundation," not "cheap." Decide whether to build favorites as a foundation investment before any of these.
Open questions for the team
Verification + guest inquiry interaction (Gaurav/Ahmed). V2.1 removes the account wall, but messages are currently held pending verification (a fraud control). Can guest inquiries be delivered to listers immediately with verification deferred, or does that reopen the fraud surface the tier-1 Trust guardrail protects? This gates the single highest-EV item.
Phase-0 inquiry-funnel events (Ahmed). V2.1, V2.4, V2.5 all need inquiry_modal_opened / inquiry_submitted / inquiry_form_abandoned (already P0 in the strategy doc) to measure cleanly. Confirm these ship before/with the V2 work, else we are flying blind on the biggest leak.
P1 (#5007) success-metric reframe (Gaurav). Diagnostics argue P1 is a churn-reduction lever, not just a message-volume lever. Should its registered KPI be the non-replier cohort's 14d/30d retention + 72h reply rate, not raw messages? This changes what "P1 worked" means.
Favorites as a foundation bet (Gaurav). Three menu/catalog items (V3.7, V4.4, plus future wishlist signals) are blocked on a favorites table. Is favorites worth building as standalone foundation, or do we route around it entirely this cycle?
V1 horizon expectation (Megan/Gaurav). Programmatic SEO (V1.1) is a 1-quarter-plus bet, not a June lever. Is V1 in scope for this initiative, or owned by the separate SEO/content track (tier-2 NB-Organic / TN monitoring)?
Demand-routing sequencing guardrail (Gaurav/Ahmed). V4.3 must fire only after P1 fails to rehabilitate a lister. What is the operational definition of "P1 nudge sent and still silent," and how long do we wait before down-ranking?
answering questions specifically from Slack to me.
Why listers go quiet:
room no longer available, and they don't think they NEED to reply (super annoying)
lister unaware they received messages, i.e. maybe going to spam, bounced, etc.
traveler doesn't "match" their requirements, so they don't respond, e.g. too short of a stay, bringing pets, gender preference.
traveling and still have the room active
listers replying, but from the wrong email, or to support or an unrelated address, so the reply never gets logged in our system.
I think we TRY to nudge hosts initially, but ultimately if a host is a non-replier, we route traffic away from them - or do something where the listing goes dormant and host has to re-engage to resume traffic.
One idea from traveler doesn't match... I wonder if we could add a way for hosts to leave feedback of leads to help us understand what's wrong with certain travelers to provide better leads.
IE: - length of stay mismatch