GeoComply Data Pipeline Investigation: Poker domain to Snowflake architecture, CloudFront geo root cause, IDENTITY_LOGIN lineage

First-Time vs Repeat Player Tracking: Analysis

Problem Statement

Every geo-location metric today is a blended average. A 15% overall verification failure rate could hide a 40% failure rate among new players (catastrophic for growth) behind a 5% rate among returning players (normal friction). The current event stream has no way to distinguish first-time players from returning players.

When the canary goes from 5% to 100%, 95% of active players hit geo for the first time. Without segmentation, operational dashboards become unreadable.

Two systems, two purposes

Geo events need to exist in two systems because they serve different audiences asking different questions at different timescales.

Loki (operational monitoring): "Is the geo flow healthy right now?"

| Question | Who asks | Timescale | Example |
|---|---|---|---|
| Is the verification failure rate spiking? | On-call engineer | Minutes | Alert: failure rate > 20% for 5 minutes |
| Is GeoComply SDK latency degrading? | On-call engineer | Minutes | P95 jumps from 3s to 30s |
| Are users being signed out mid-session? | On-call engineer | Hours | Sign-out count exceeds 10/hr |
| Is the canary group reaching the lobby? | Feature owner | Hours | Lobby load rate drops below 70% |
| Is geo causing more friction than control? | Feature owner | Hours | Friction delta exceeds 15pp |

These are rate/count/quantile queries over sliding time windows. They trigger alerts. They answer "is something broken?" The audience is engineers. Aggregate rates are sufficient - individual player precision is not needed.

Snowflake (analytics): "What is geo doing to our players over time?"

| Question | Who asks | Timescale | Why Loki can't answer it |
|---|---|---|---|
| D7 retention for players whose first session included geo? | Product manager | Weeks | Requires joining first event with session 7 days later |
| How many unique players failed verification this week? | Product manager | Days | Loki counts events, not distinct players |
| Did a specific player complete all funnel steps? | Support engineer | Per-session | Requires joining events by sessionId |
| What % of new signups never reach the lobby? | Growth team | Weeks | Requires correlating registration with absence of lobby events |
| Does geo friction affect high-value players differently? | Analytics team | Months | Requires joining geo events with player value segment |
| Geo completion rate by US state? | Compliance/legal | Months | No GROUP BY with distinct counts in LogQL |
| Are canary group players depositing less than control? | Product manager | Weeks | Requires joining geo assignment with transaction data |

These are SQL queries with joins, window functions, distinct counts, and cross-domain data. They inform product decisions. The audience is PMs, analysts, compliance. Data needs to be precise at the individual player level.

The fundamental difference:

  • Loki answers: "What's happening?" (aggregate, real-time, operational)
  • Snowflake answers: "What happened, to whom, and what was the impact?" (per-player, historical, analytical)

The same underlying event (e.g., a verification failure) needs to exist in both systems, but for different reasons. In Loki it increments a counter on a dashboard. In Snowflake it's a row tied to a specific player that can be joined with their registration date, deposit history, and future sessions to determine whether that failure caused them to churn.

Separation of Concerns

With the two-system framing in mind, analysis identified two distinct problems being conflated:

1. Geo-specific: "Is the geo flow working for first-timers?"

This is about the geo verification system's behaviour. Belongs in geo instrumentation.

Dimensions that are legitimately geo-specific:

| Dimension | Type | Description |
|---|---|---|
| isFirstGeoVerification | boolean | Has this player ever completed geo verification? |
| geoVerifyCount | integer | How many successful geo verifications has this player completed? |
| verificationSequence | integer | Nth verification attempt within this session (distinguishes fresh attempts from retries) |

Source: Server-side flag (first_geo_verification_at in Auth0 user_metadata or player DB column). Read from existing getUser() call in runGeoGate(). Written back on first successful verification.
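The read/derive/write-back logic above can be sketched as follows. The metadata field names (first_geo_verification_at, geo_verify_count) come from the text; the type shapes and helper names are illustrative, not the actual runGeoGate() implementation.

```typescript
// Hypothetical slice of the Auth0 user_metadata (or player DB row) we care about.
interface GeoUserMetadata {
  first_geo_verification_at?: string; // ISO timestamp; absent until first success
  geo_verify_count?: number;
}

interface GeoContextFlags {
  isFirstGeoVerification: boolean;
  geoVerifyCount: number;
}

// Derive the per-player flags from metadata already fetched by getUser().
function deriveGeoFlags(meta: GeoUserMetadata): GeoContextFlags {
  return {
    isFirstGeoVerification: meta.first_geo_verification_at === undefined,
    geoVerifyCount: meta.geo_verify_count ?? 0,
  };
}

// Idempotent write-back after a successful verification: the timestamp is
// only set once, the counter always increments.
function markVerified(meta: GeoUserMetadata, now: Date): GeoUserMetadata {
  return {
    first_geo_verification_at:
      meta.first_geo_verification_at ?? now.toISOString(),
    geo_verify_count: (meta.geo_verify_count ?? 0) + 1,
  };
}
```

The write-once timestamp is what makes the flag safe to recompute on every login: replaying markVerified never moves the first-verification date.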

2. Platform-wide: "Is this a new player? Are they coming back?"

This is about player acquisition and retention. Belongs in platform analytics, not geo instrumentation.

Dimensions that are NOT geo-specific (should live in LogManager or platform analytics):

| Dimension | Type | Why it's platform-wide |
|---|---|---|
| playerTenureDays | integer | Useful on every event in the system, not just geo |
| isNewAccount | boolean | Same |
| daysSinceLastVisit | integer | Session-level attribute |
| sessionNumber | integer | Same |

Analyses that need a data warehouse, not Loki:

| Analysis | Why Loki can't do it |
|---|---|
| D1/D7/D30 retention | Requires cross-session join by playerId |
| Cohort analysis (signup week, first-geo week) | Requires grouping by first-event date + tracking subsequent events |
| Unique player counts | No COUNT(DISTINCT) in LogQL |
| Per-session funnel (did player X complete all steps?) | Requires sessionId-based join |
| "Ghost registrations" (registered but never reached lobby) | Requires correlating registration events with absence of lobby events |
| A/B retention (geo group vs control group D7 return rate) | Cross-session, cross-day correlation |

What Loki Can Handle (Geo-Specific, Operational)

With isFirstGeoVerification on geo events, these Loki queries become possible:

```logql
# First-time geo verification failure rate
count_over_time({...} |= `geo.verification.completed` |= `"isFirstGeoVerification":true` |= `"outcome":"failed"` [1h])
/
count_over_time({...} |= `geo.verification.completed` |= `"isFirstGeoVerification":true` [1h])

# P95 latency for first-time vs repeat
quantile_over_time(0.95, {...} |= `geo.verification.completed` |= `"isFirstGeoVerification":true` | regexp ... | unwrap durationMs [1h])
```

Alerts can fire on: "first-time failure rate > 30% for 5 minutes" (different threshold from repeat).

Data Pipeline: Poker Domain to Snowflake

Current architecture (from pok-infra investigation)

Two separate data paths exist. They do not overlap.

Path A: Application logs (operational monitoring)

Browser -> Game Client Server -> CloudWatch -> Kinesis Firehose -> Grafana Cloud Loki
                                                    |
                                                    v (failures only)
                                               S3 fallback bucket
  • Firehose buffering: 1MB / 60s
  • Lambda filter drops events older than 7 days
  • Labels added: vgw_team=pok, pok_environment=${env}
  • CloudWatch subscription filters cover: /${env}/ecs/*, game-server EC2 logs, Auth0 event logs
  • Geo analytics events travel this path. They land in Loki and are queryable via LogQL.
  • This path does NOT feed Snowflake.

Path B: Domain event stores (analytics/warehouse)

Domain Event Stores -> ECS "eventstore-snowflake" projector services -> S3 Data Landing Zones -> Snowflake
  • S3 DLZ buckets defined in pok-infra/data/landing-zones.tf:
    • ${env}-game-dlz (game domain)
    • ${env}-transaction-dlz (transactions)
    • ${env}-customer-dlz (regulated customer data)
    • ${env}-player-account-dlz (player accounts)
    • ${env}-observability-dlz (ALB access logs, synthetics)
  • ECS projector services (deployed by other repos, not pok-infra):
    • user-eventstore-snowflake-pm
    • game-eventstore-snowflake-pm
    • cdd-eventstore-snowflake
  • These project structured domain events into S3 for Snowflake ingestion
  • README in pok-infra/data/ explicitly states: "used to collect and hold data to be consumed by Snowflake"

Path C: Database replication (DNA Data Lake team)

MySQL databases (site, system) -> VPC peering -> DNA Data Lake AWS account
  • Direct MySQL read replicas, not log-based
  • QC and prod only
  • Managed by the DNA team

Key gap

Geo analytics events (Path A) do not reach Snowflake (Path B). They exist only in Loki, which cannot support retention, cohort, or unique player analysis.

To get geo events into Snowflake, one of these would be needed:

  1. New eventstore projector: A geo-eventstore-snowflake service that reads geo events and writes to observability-dlz or a new geo-dlz S3 bucket
  2. Firehose S3 backup: Use the existing CloudWatch -> Firehose pipeline's source-record backup to land records in S3 (a Firehose stream delivers to a single primary destination, but supports S3 backup of source records alongside it)
  3. CloudWatch export: A separate subscription filter sending geo-tagged logs to a new Firehose -> S3 pipeline
  4. Loki export: Query Loki and dump results to S3 on a schedule (fragile, not recommended)

The existing "eventstore-snowflake" pattern (option 1) is the established convention. The geo feature doesn't have an event store today - its events are log lines, not domain events. This is a fundamental architectural mismatch that needs a decision.

Game client analytics integrations (from gp-game-client investigation)

| Integration | Type | Data flow | Feeds Snowflake? |
|---|---|---|---|
| LogManager | Application logging | Browser -> Express /log -> stdout -> CloudWatch -> Firehose -> Loki | No |
| FullStory | Session recording | Browser -> fullstory.com | No |
| Google Analytics (Classic) | Page/event tracking | Browser -> google-analytics.com | No (legacy ga.js, deprecated) |
| Google Tag Manager | Conversion tracking | Browser -> GTM -> configured destinations | Possibly (GTM config is external) |
| Braze Web SDK | Customer engagement | Browser -> sdk.iad-01.braze.com | No |

Note: Sumo Logic has been phased out. The log path is now solely: Browser -> LogManager -> /log endpoint -> stdout -> CloudWatch -> Firehose -> Loki.

No game client integration feeds Snowflake directly. The Snowflake data comes from domain event stores via separate ECS projector services.

The architectural gap

Geo analytics events are log lines (structured text in CloudWatch/Loki). The Snowflake pipeline consumes domain events (from event stores via projector services). These are fundamentally different:

| | Log lines (geo events) | Domain events (event stores) |
|---|---|---|
| Format | Structured JSON in a log line | First-class events in an event store |
| Path to Loki | CloudWatch -> Firehose -> Loki | N/A |
| Path to Snowflake | None | Event store -> ECS projector -> S3 DLZ -> Snowflake |
| Queryable by | LogQL (rates, counts, quantiles) | SQL (joins, window functions, distinct counts) |
| Retention analysis | Not possible | Possible |
| Unique player counts | Not possible | Possible |

To get geo analytics data into Snowflake, you'd need to bridge this gap. Options:

  1. Emit geo events to an event store (not just logs). The existing emitGeoEvent abstraction could write to both LogManager (for Loki) and a domain event store (for Snowflake). Requires a new event store and projector.
  2. Add S3 as a Firehose destination alongside Loki. Filter for [geo-analytics] tagged events. Snowpipe ingests from S3. Simpler but mixes log data into the warehouse.
  3. ETL from Sumo Logic to Snowflake. Not viable: Sumo Logic has been phased out (see note above), so there is no Sumo data to export.
  4. GTM dataLayer push. GTM can route events to BigQuery or other destinations. Would require adding geo events to the GTM dataLayer. Non-standard for this type of data.
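Option 1 above hinges on the emitGeoEvent abstraction fanning out to both sinks. A minimal sketch of that fan-out, with a hypothetical GeoSink interface (the real emitGeoEvent signature and event shape may differ):

```typescript
// Hypothetical event shape; real geo analytics events carry more fields.
interface GeoEvent {
  name: string; // e.g. "geo.verification.completed"
  sessionId: string;
  payload: Record<string, unknown>;
}

// A sink is anything that accepts an event: LogManager (-> Loki) today,
// a domain event store or Kafka producer (-> Snowflake) tomorrow.
type GeoSink = (event: GeoEvent) => void;

function makeEmitGeoEvent(sinks: GeoSink[]): (event: GeoEvent) => void {
  return (event) => {
    for (const sink of sinks) {
      try {
        sink(event); // one failing sink must not block the others
      } catch {
        // swallow: analytics must never break the game flow
      }
    }
  };
}
```

The isolation per sink matters: if the future Snowflake-bound sink is down, the Loki path (and the player's session) should be unaffected.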

VGW data platform (from GitHub org investigation)

Snowflake is the primary analytics warehouse across VGW. The infrastructure is mature:

Global Poker specific repos (not checked out locally):

| Repo | Purpose |
|---|---|
| VGW/pok-snowflake | IaC for Global Poker's Snowflake tenant |
| VGW/dna-helix-global-poker-data-streams | Kinesis streams for Global Poker |
| VGW/dna-global-poker-cdc-pipeline | CDC pipeline between poker domain and DNA |
| VGW/dna-poker-analytics | Poker analytics |
| VGW/analytics-poker | Poker analytics |

Platform-level data infrastructure:

| Repo | Purpose |
|---|---|
| VGW/dataeng-snowflake-platform | Common Snowflake framework, Terraform-managed |
| VGW/dataeng-aws-platform | AWS infra for data engineering |
| VGW/domain-snowflake-template | Template repo for new domain Snowflake tenants |
| VGW/dna-airflow-etl | Airflow container for ETL |
| VGW/dna-data-pipelines | Data pipelines |
| VGW/datalake-events-forwarder | Forwards events from source to CDC/DNA Kafka |
| VGW/gam-clickstream-collector | Clickstream collector service |

17+ domain-specific Snowflake repos exist (pok, chu, pay, pam, cam, mar, gap, aml, etc.), each managing their own tenant via Terraform. Two team naming conventions coexist: legacy "DnA" (Data and Analytics) and modern "Data Engineering" (dataeng-).

How poker domain events reach Snowflake today

The definitive path, confirmed across 5 repos:

PostgreSQL Event Store (Aurora)
  -> ECS Process Manager (pok-eventstore-datapipeline Docker image)
    -> Kinesis Firehose
      -> S3 Data Landing Zone bucket
        -> SQS notification -> Snowpipe (auto_ingest)
          -> RAW schema (JSON parsed to columns)
            -> Scheduled dedup tasks -> CLEANSED schema
              -> Streams + scheduled tasks -> CURATED schema (business-ready)

Five projector services run this pattern:

| ECS Service | Source DB | S3 DLZ Bucket | Events |
|---|---|---|---|
| user-eventstore-snowflake-pm | aurora-pg-user | {env}-customer-dlz/user-eventstore/ | Registration, profile, status changes, identity |
| game-eventstore-snowflake-pm | aurora-pg-game | {env}-game-dlz/v2_casino/ | Casino/slots game events |
| cdd-eventstore-snowflake | aurora-pg-cdd | {env}-customer-dlz/cdd-eventstore/ | KYC/AML verification events |
| store-eventstore-snowflake-pm | aurora-pg-store | {env}-transaction-dlz/v2_store/ | Purchase/store events |
| player-account-snowflake-pm | player-account DB | {env}-player-account-dlz/player-items/ | Wallet, player items |

Projectors are domain-agnostic - they project ALL events from their PostgreSQL event store, not specific types. Each event has: position, streamType, streamId, eventType, version, payload, metadata.
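The per-event fields listed above can be sketched as an envelope type. This is a reconstruction from the field list, with assumed types and an illustrative example, not the projectors' actual schema:

```typescript
// Sketch of the domain-event envelope the projectors emit, based on the
// field list above; exact names/types in the real stores may differ.
interface EventEnvelope<P = Record<string, unknown>> {
  position: number;   // global position in the event store
  streamType: string; // e.g. "USER_LOGINS"
  streamId: string;   // aggregate id, e.g. the player's authId
  eventType: string;  // e.g. "LOGGED_IN"
  version: number;    // position within the individual stream
  payload: P;         // event-specific body; lands in Snowflake RAW as JSON
  metadata: Record<string, unknown>;
}

// Illustrative instance (values are made up).
const example: EventEnvelope = {
  position: 1042,
  streamType: "USER_LOGINS",
  streamId: "auth0|abc123",
  eventType: "LOGGED_IN",
  version: 3,
  payload: { platform: "web", ip: "203.0.113.7" },
  metadata: { source: "pok-user" },
};
```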

Separately, Kafka topics also feed Snowflake via a Kafka Connect connector (connect-pok-events-{env} consumer group, managed by Data Engineering). 7 active topics carry game server events (table events, tournament events, achievements, purchases, leaderboard events).

What's already in Snowflake that matters to us

| Snowflake Table | Key Fields | Relevance |
|---|---|---|
| CLEANSED.USER_EVENTSTORE_LOGGED_IN | email, platform, time, IP, authId, userAgent, provider, cookies, utmParams | Login events with device/platform info |
| CURATED.IDENTITY_LOGIN | accountId, geoLocationCountryCode, geoLocationSubdivisionCode, geoLocationProviderName, geoLocationProviderVersion, geoLocationSourceType, deviceIdentifier | Already has geo fields - provider name/version suggest GeoComply or similar |
| CURATED.CUSTOMER_ATTRIBUTES_OUTPUT | registrationDate, lastLogInDate, valueSegmentTier | Player tenure, last activity |
| CURATED.ACCOUNT_ACTIVITY_SUMMARY | firstLoginDate, lastLoginDate, firstPlayDate, lastPlayDate, firstPurchaseDate, totalPurchaseAmount | Retention-ready fields (but sessionCount30Days is NULL/unpopulated) |
| CLEANSED.USER_EVENTSTORE_ACCOUNT_CREATED | email, facebookId, loginType, screenname, userId, utmParams | Registration events |
| CLEANSED.USER_EVENTSTORE_REGISTRATION_STARTED | email, IP, platform, provider, userAgent | Registration funnel start |

Key insight: CURATED.IDENTITY_LOGIN has GEO_LOCATION_* columns but they are all hardcoded NULL. The columns were added as a schema-forward placeholder. The Snowflake team anticipated geo data would arrive but no source has been integrated yet. The AML documentation states these fields would come from "TMX (ThreatMetrix) data which is currently unavailable in Snowflake." GeoComply verification data would be the first real geo data in the poker Snowflake tenant - filling a recognized gap.

What's NOT in Snowflake

  • Browser-side geo verification funnel events (notice shown/dismissed, verification started/completed, lobby loaded)
  • GeoComply-specific outcomes (success/failed/restricted, failure reasons, restriction types)
  • Verification latency (durationMs)
  • Post-login monitoring events (location checks, sign-outs)
  • Canary group assignment (inGroup)
  • Any clickstream/analytics events from the game client browser

Revised data flow picture

                                          ┌─────────────────────────────┐
                                          │     Snowflake (analytics)    │
                                          │     VGW/pok-snowflake        │
                                          └──────────────▲──────────────┘
                                                         │
                                    ┌────────────────────┤
                                    │                    │
                              S3 DLZ buckets    CDC pipeline
                              (game, txn, etc)  (dna-global-poker-cdc-pipeline)
                                    ▲                    ▲
                                    │                    │
                          ECS projector           Kinesis streams
                          services                (dna-helix-global-poker-
                          (*-eventstore-           data-streams)
                           snowflake-pm)                 ▲
                                    ▲                    │
                                    │                    │
                              Event Stores          MSK (Kafka)
                              (domain events)       (pok-infra/data/msk.tf)
                                    ▲                    ▲
                                    │                    │
                              ┌─────┴────────────────────┴─────┐
                              │     Application Services        │
                              │  (poker, payments, accounts)    │
                              └────────────────────────────────┘

Meanwhile, geo analytics events go here (separate, no Snowflake path):

  Browser -> LogManager -> /log endpoint -> stdout -> CloudWatch -> Firehose -> Loki

(The former SumoLogger -> Sumo Logic branch is gone; Sumo Logic has been phased out.)

What this means for geo analytics + Snowflake

The poker domain already has a well-established path to Snowflake: domain events flow through event stores and/or Kafka/Kinesis into S3 DLZ buckets, then into the pok-snowflake tenant. The infrastructure exists and is maintained by the DnA/Data Engineering team.

Geo analytics events are currently log lines, not domain events. They exist only in Loki (operational); the Sumo Logic path has been retired. To get them into Snowflake, the options in order of alignment with existing patterns:

  1. Publish geo events to MSK (Kafka). The game client server already has access to the MSK cluster. Add a Kafka producer alongside the LogManager sink. The existing CDC pipeline or a new consumer picks them up and writes to S3 DLZ -> Snowflake. Most aligned with VGW architecture.

  2. Write geo events to an event store. Create a geo event store. The existing eventstore-snowflake projector pattern handles the rest. Cleanest domain modeling but more engineering work.

  3. Add an S3 sink to the Firehose pipeline. The existing CloudWatch -> Firehose already processes all ECS logs. Add S3 as a second destination, filter for [geo-analytics] tagged events. Data Engineering team ingests from S3. Cheapest to implement but mixes log data into the warehouse.

  4. Sumo Logic -> Snowflake export. No longer applicable: Sumo Logic has been phased out, so this path is off the table.

Reference implementation: gam-clickstream-collector

The VGW/gam-clickstream-collector (Kotlin/Spring Boot, owned by GSE/Games team) is a working example of browser analytics -> Kafka -> Snowflake:

Game Client -> postMessage -> GAP IFrame -> HTTP POST /public/v1/clickstream
  -> Collector (validates, wraps in EventEnvelope)
    -> Core MSK Kafka [clickstream-events topic]
      -> Snowflake connector -> GAP_CLICKSTREAM_EVENTS table

Architecture:

  • HTTP-in, Kafka-out (no direct Snowflake dependency)
  • Unauthenticated public endpoint (fire-and-forget analytics, Sev-3/Tier-4)
  • Events validated via kotlinx.serialization with polymorphic discriminator (actionName)
  • Wrapped in EventEnvelope (from event-store-core library) before Kafka publish
  • Kubernetes on EKS, deployed via Helm + ArgoCD
  • Currently handles one event type (PlayEnded for slot games) but explicitly designed for extension

Why it doesn't fit geo directly:

  • Common fields (gameSessionId, gameInstanceId, gameId, coinType) are all required and game-context specific. Geo events happen before/outside a game session.
  • Owned by GSE team, not poker team.
  • Events land in GAP_CLICKSTREAM_EVENTS table, not poker's Snowflake tenant.

What to take from it:

  • The pattern is proven: thin HTTP collector -> Kafka -> Snowflake connector.
  • The poker domain has its own MSK cluster (pok-msk-{env}) and Snowflake tenant (pok-snowflake).
  • The same pattern could be applied: either a standalone pok-clickstream-collector service, or the existing game client server could publish to the poker MSK cluster from the /log endpoint (or a new /analytics endpoint).
  • ADR-0001 in the repo documents the architecture decision process well and could serve as a template.
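The thin collector itself is small. A framework-free sketch of the HTTP-in, Kafka-out shape (the real gam-clickstream-collector is Kotlin/Spring Boot; the topic name follows its diagram, but the Publish signature and event shape here are assumptions, not its contract):

```typescript
interface ClickstreamEvent {
  actionName: string; // polymorphic discriminator, e.g. "PlayEnded"
  occurredAt: string;
  payload: Record<string, unknown>;
}

// Stand-in for a Kafka producer send (e.g. kafkajs against the poker MSK cluster).
type Publish = (topic: string, key: string, value: string) => void;

// Validate, then publish; return an HTTP-style status code.
function handleClickstream(body: unknown, publish: Publish): number {
  const event = body as Partial<ClickstreamEvent>;
  // Reject events missing the discriminator before they reach Kafka.
  if (typeof event?.actionName !== "string" || event.actionName.length === 0) {
    return 400;
  }
  publish("clickstream-events", event.actionName, JSON.stringify(event));
  return 202; // fire-and-forget: accepted, not yet in Snowflake
}
```

The collector never talks to Snowflake directly; the Snowflake connector consumes the topic independently, which is what keeps the endpoint fire-and-forget.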

Options for getting geo events into Snowflake

Option A: Publish to the user event store (most aligned)

Geo verification is a user-level event. The user-eventstore-snowflake-pm already projects ALL events from aurora-pg-user to S3 -> Snowpipe -> Snowflake. If geo verification events were written to the user event store (as new event types like GeoVerificationCompleted, GeoVerificationFailed), they'd flow to Snowflake automatically through the existing pipeline with zero new infrastructure.

Requires: The geo verification flow (or a server-side handler) writes events to the user PostgreSQL event store. The pok-user service owns this.

Pros: Uses established pattern. No new pipelines. Data lands alongside login/registration events in the same Snowflake schema. The pok-snowflake team adds CLEANSED/CURATED views.

Cons: Browser-side events (notice shown/dismissed, lobby loaded) don't naturally belong in a server-side event store. Would need a server endpoint to receive them. Only captures events that reach the server (drops if the user closes the browser before the event fires).

Option B: Publish to Kafka (new topic)

Add a new Kafka topic (e.g., pok.app-event.gameclient.geo-analytics.v1.eu-west-1) on the poker MSK cluster. The existing Kafka Connect connect-pok-events-{env} connector (or a new one) ingests into Snowflake.

Requires: Kafka producer in the game client server (doesn't exist today - only the Java game server publishes to Kafka). A new Snowflake connector or extending the existing one.

Pros: Clean separation. The game client server could publish from the /log endpoint. Kafka is the modern event bus for the domain.

Cons: Game client server (Node.js/Express) has no Kafka producer today. Adding one means new dependency (kafkajs + MSK IAM auth). Connector setup involves Data Engineering.

Option C: Write directly to S3 DLZ

The game client server writes geo analytics events as JSON files to the {env}-customer-dlz or {env}-observability-dlz S3 bucket. Snowpipe auto-ingests.

Requires: S3 write permissions for the game client ECS task. A new Snowpipe in pok-snowflake for the geo analytics path.

Pros: Simplest. No Kafka, no event store. S3 + Snowpipe is battle-tested. The DLZ buckets already exist.

Cons: Bypasses the event store and Kafka patterns. Batch-oriented (files, not real-time). Need to manage file partitioning, naming, and batching.
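The partitioning/naming/batching concern from the cons above is mostly mechanical. A sketch, assuming an hour-partitioned layout and NDJSON files (the prefix and key layout are illustrative, not the established DLZ convention):

```typescript
// Build a Snowpipe-friendly object key: hour-partitioned, one file per batch.
function dlzKey(prefix: string, batchedAt: Date, batchId: string): string {
  const y = batchedAt.getUTCFullYear();
  const m = String(batchedAt.getUTCMonth() + 1).padStart(2, "0");
  const d = String(batchedAt.getUTCDate()).padStart(2, "0");
  const h = String(batchedAt.getUTCHours()).padStart(2, "0");
  return `${prefix}/year=${y}/month=${m}/day=${d}/hour=${h}/${batchId}.ndjson`;
}

// One JSON object per line: a shape Snowflake's JSON loaders handle well.
function toNdjson(events: Array<Record<string, unknown>>): string {
  return events.map((e) => JSON.stringify(e)).join("\n");
}
```

The game client ECS task would buffer events, then PutObject the NDJSON body at dlzKey(...); Snowpipe's auto_ingest picks it up from the S3 notification.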

Recommendation: Option A for verification outcomes, Option C for browser-side funnel events.

Verification success/failure/restricted outcomes are domain events that belong in the user event store. Browser-side funnel events (notice shown, dismissed, lobby loaded) are analytics/observability data that fit better as batched writes to S3. Both paths feed Snowflake via existing infrastructure.

Existing data we can leverage today (no new pipelines needed)

Even without getting geo events into Snowflake, we can answer SOME of the first-time vs repeat questions using data already there:

| Question | How to answer with existing Snowflake data |
|---|---|
| Is this player new? | CUSTOMER_ATTRIBUTES_OUTPUT.REGISTRATION_DATE gives account age |
| Has this player logged in before? | ACCOUNT_ACTIVITY_SUMMARY.FIRST_LOGIN_DATE vs current date |
| What's the player's value segment? | CUSTOMER_ATTRIBUTES_OUTPUT.VALUE_SEGMENT_TIER |
| Has this player been geo-verified before? | IDENTITY_LOGIN.GEO_LOCATION_PROVIDER_NAME would answer this if populated - but as resolved below, these columns are currently hardcoded NULL |
| Player retention after geo rollout? | Compare ACCOUNT_ACTIVITY_SUMMARY first/last login dates across the canary rollout period |

Critically: IDENTITY_LOGIN was designed to capture geo provider info at the CDS/identity level. The next section traces what actually populates these columns - the answer turned out to be nothing: they are hardcoded NULL.

IDENTITY_LOGIN geo fields: resolved

All five GEO_LOCATION columns are hardcoded NULL. Traced the full lineage:

S3: {env}-customer-dlz/user-eventstore/
  -> Snowpipe -> RAW.USER_EVENTSTORE (PAYLOAD variant contains full JSON)
    -> Task (hourly) -> CLEANSED.USER_EVENTSTORE_LOGGED_IN (extracts email, platform, IP, authId, etc. but NO geo fields)
      -> Stream -> Task (hourly) -> CURATED.IDENTITY_LOGIN (hardcodes all 5 geo columns as NULL)

Source events are Auth0 LOGGED_IN events with STREAM_TYPE = 'USER_LOGINS'. The payload contains email, platform, time, ip, authId, userAgent, provider, cookies, utmParams but no geo data.

The AML docs state: geo fields would come from TMX (ThreatMetrix), "currently unavailable in Snowflake." The DMF null-count checks for these columns are commented out (team knows they're NULL).

Implication: GeoComply data from our instrumentation would be the first real geo data in the poker Snowflake tenant. The schema is already waiting for it. The IDENTITY_LOGIN task could be updated to pull from a new source (either the LOGGED_IN payload if enriched with geo data, or a new event type like GEO_VERIFICATION_COMPLETED).

pok-user event store: resolved

The pok-user domain (VGW/pok-user, Kotlin) publishes 15 event types to a PostgreSQL event store. The LOGGED_IN event is triggered by an Auth0 post-login action calling POST /audit/login/:id. Its payload:

authId, time, provider, platform, utmParams, userAgent, ip, cookies, email

Raw IP is captured but no geo resolution happens. No GeoComply, no geo-enrichment, no CDS integration. The USER_LOGINS stream has no known consumers within pok-user.

To get GeoComply data into Snowflake via this path, either:

  • Enrich LOGGED_IN: Add geo fields to the Auth0 post-login action payload (would require changes to both the Auth0 action and pok-user's AuditUserLoginRequest)
  • New event type: Add a GEO_VERIFICATION_COMPLETED event type to the user event store. Publish from a new endpoint called by the game client after verification. Flows through the existing user-eventstore-snowflake-pm -> S3 -> Snowpipe pipeline automatically.

Why CloudFront geo data doesn't reach Snowflake today

Three independent geo sources exist at login time. None make it into the LOGGED_IN event:

| Source | Where it lives | Why it's not in LOGGED_IN |
|---|---|---|
| CloudFront headers (viewer_country, viewer_country_region) | Rendered into page as JS globals -> sent to Auth0 as query params -> persisted to user_metadata.cookies by persist-parameters action -> embedded in JWT by enrich-token-metadata action | Filtered out by post-audit-login-event's cookie allow-list (line 118 of pok-auth0/auth0/src/actions/post-audit-login-event.ts). The allow-list only passes marketing cookies (_fbp, _ga, ECID, etc.). |
| Auth0 geoip (event.request.geoip.countryCode, subdivisionCode) | Available in every Auth0 action automatically (Auth0's own IP-based geo) | Never read. No Auth0 action accesses event.request.geoip at all. |
| Raw IP (event.request.ip) | Sent as ip field in the audit POST body | Stored in the event but not geo-resolved. Just a raw IP string. |

The full CloudFront geo path:

Browser request -> CloudFront adds viewer_country/viewer_country_region headers
  -> Express server reads headers, renders as JS globals in index.html
    -> Browser sends to Auth0 as /authorize query params
      -> Auth0 persist-parameters action saves to user_metadata.cookies
        -> Auth0 enrich-token-metadata action reads cookies -> JWT claims
          -> Browser reads JWT claims via fetchUserRegion() -> jwtGeo in analytics events

But separately:

Auth0 post-audit-login-event action -> POST /audit/login/:id to pok-user
  -> Builds cookies object from user_metadata.cookies
    -> Filters through allow-list (line 118)
      -> viewer_country and viewer_country_region are NOT in the allow-list
        -> Lost. Never reaches LOGGED_IN event. Never reaches Snowflake.

Supporting multiple geo sources

When GeoComply goes live, there will be three potential geo sources per login:

| Source | Accuracy | Availability | Legal standing |
|---|---|---|---|
| CloudFront (IP-based, CDN edge) | Approximate (ISP hub, not actual location) | All users, every request | Not sufficient for regulatory compliance |
| Auth0 geoip (IP-based, Auth0 edge) | Approximate (same as CloudFront, different resolver) | All users, every login | Not sufficient for regulatory compliance |
| GeoComply SDK (device-level verification) | Precise (GPS, WiFi, cell tower triangulation) | Canary group only (expanding to 100%) | Legally required for regulated markets |

The IDENTITY_LOGIN table already has GEO_LOCATION_SOURCE_TYPE and GEO_LOCATION_PROVIDER_NAME columns designed for this. The recommended approach:

Short term (fix the existing gap):

  • Add viewer_country and viewer_country_region to the cookie allow-list in post-audit-login-event.ts. This is a one-line change in pok-auth0. The CloudFront geo data that's already being captured and JWT-embedded would flow through to the LOGGED_IN event, then to Snowflake via the existing pipeline.
  • Alternatively, add event.request.geoip fields as dedicated top-level fields in the audit POST body. Cleaner than overloading the cookies map.
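The allow-list change can be sketched as below. This is an illustrative reconstruction - the real list at line 118 of post-audit-login-event.ts is longer, and the helper name is hypothetical:

```typescript
// Illustrative cookie allow-list; the real one in post-audit-login-event.ts
// carries more marketing cookies than shown here.
const COOKIE_ALLOW_LIST = new Set([
  "_fbp",
  "_ga",
  "ECID",
  // The proposed fix: let the CloudFront geo values through to LOGGED_IN.
  "viewer_country",
  "viewer_country_region",
]);

// Keep only allow-listed cookies before building the audit POST body.
function filterCookies(
  cookies: Record<string, string>
): Record<string, string> {
  return Object.fromEntries(
    Object.entries(cookies).filter(([name]) => COOKIE_ALLOW_LIST.has(name))
  );
}
```

Without the two viewer_* entries, this filter is exactly where the CloudFront geo data dies today.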

Medium term (GeoComply integration):

  • Add a new event type GEO_VERIFICATION_COMPLETED to the pok-user event store with: outcome, subdivision, country, providerName ("GeoComply"), providerVersion, sourceType ("sdk"), durationMs, restrictionType.
  • The game client calls a new pok-user endpoint after verification.
  • Flows through existing user-eventstore-snowflake-pm -> S3 -> Snowpipe -> Snowflake pipeline automatically.
  • The pok-snowflake IDENTITY_LOGIN task gets updated to populate the geo columns from this new event type instead of hardcoding NULL.
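The proposed event payload, mirroring the IDENTITY_LOGIN geo columns, could look like this. Field names follow the bullet above but are a sketch, not a settled schema:

```typescript
// Proposed GEO_VERIFICATION_COMPLETED payload (sketch, not a settled schema).
interface GeoVerificationCompleted {
  outcome: "success" | "failed" | "restricted";
  country: string;                // ISO 3166-1 alpha-2
  subdivision: string | null;     // e.g. "NJ"
  providerName: "GeoComply";
  providerVersion: string;
  sourceType: "sdk";              // distinguishes from IP-based sources
  durationMs: number;
  restrictionType: string | null; // set when outcome === "restricted"
}

// Illustrative instance (values are made up).
const sample: GeoVerificationCompleted = {
  outcome: "success",
  country: "US",
  subdivision: "NJ",
  providerName: "GeoComply",
  providerVersion: "2.x", // placeholder, not a real SDK version
  sourceType: "sdk",
  durationMs: 3200,
  restrictionType: null,
};
```

providerName/sourceType map directly onto GEO_LOCATION_PROVIDER_NAME and GEO_LOCATION_SOURCE_TYPE, so the IDENTITY_LOGIN task could populate its columns with a straightforward projection.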

Result: Each login has an IP-based geo reading (CloudFront or Auth0, always available) and optionally a precise GeoComply reading (canary group, then all users). Both are queryable in Snowflake with the source identified.

Ownership summary

| Component | Owner | Change needed |
|---|---|---|
| pok-auth0 (Auth0 actions) | Our team | Add viewer_country to allow-list, or add geoip fields |
| pok-user (event store) | Our team | New event type + endpoint for GeoComply data |
| pok-infra (monitoring) | Our team | Already done (Loki dashboards + alerts) |
| gp-game-client (browser) | Our team | Already done (geo analytics events via LogManager) |
| pok-snowflake (warehouse) | Data Engineering / pok-snowflake team | Update IDENTITY_LOGIN task to populate geo columns |

Open questions

  • Should we fix the CloudFront geo gap (one-line allow-list change) as a quick win before the larger GeoComply integration?
  • For the GEO_VERIFICATION_COMPLETED event: should it go on the USER_LOGINS stream (alongside LOGGED_IN) or the USER_AUDIT stream?
  • Do we want both IP-based and SDK-based geo readings per login, or just the highest-fidelity source available?
  • Who owns the pok-snowflake repo, the CDS schema, and the IDENTITY_LOGIN task?
  • Is the connect-pok-events-{env} Kafka Connect connector managed by Data Engineering? Can it ingest from a new topic?
  • For retention baselines, can we query ACCOUNT_ACTIVITY_SUMMARY now (first/last login dates exist even without geo data)?
  • Should we propose updating the IDENTITY_LOGIN task to populate the existing NULL columns, or create new dedicated geo tables?

Implementation Plan

Phase 1: Geo-specific dimensions (no backend changes)

  • Add verificationSequence (sessionStorage counter) to verification events
  • Add hasLocalVerifyHistory (localStorage check, supplementary signal only) to geo.rollout.evaluated
  • Update simulator and dashboard with new fields
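The verificationSequence counter from Phase 1 is a few lines. A sketch with an injectable storage interface so it runs against sessionStorage in the browser or a stub elsewhere (the storage key name is hypothetical):

```typescript
// Minimal storage interface, compatible with the browser's sessionStorage.
interface KVStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const SEQ_KEY = "geo.verificationSequence"; // hypothetical key name

// Returns 1 for the first attempt this session, 2 for the first retry, etc.
// sessionStorage scoping gives the "within this session" semantics for free.
function nextVerificationSequence(store: KVStore): number {
  const next = Number(store.getItem(SEQ_KEY) ?? "0") + 1;
  store.setItem(SEQ_KEY, String(next));
  return next;
}
```

In the browser this would be called as nextVerificationSequence(window.sessionStorage) just before emitting each verification event.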

Phase 2: Server-side geo flag (one backend change)

  • Add first_geo_verification_at to Auth0 user_metadata
  • Read in runGeoGate() from existing getUser() response
  • Write back on first successful verification (idempotent)
  • Add isFirstGeoVerification and geoVerifyCount to setGeoContext()
  • Segmented dashboard panels and alerts

Phase 3: Canary-to-100% transition

  • Add geoRolloutPhase dimension ("canary_5", "canary_25", "ga")
  • Pre-populate first_geo_verification_at for all canary-period verifiers
  • First-time user funnel row in dashboard with separate alert thresholds

Separate initiative: Platform analytics

  • Cross-cutting player context (accountAgeDays, isNewPlayer) in LogManager playerDetails
  • Event export pipeline to Snowflake for retention/cohort analysis
  • Not owned by geo feature team

Behavioural Science Insights

Key findings from behavioural analysis (relevant to geo-specific work):

  • Friction budget asymmetry: First-time players have ~50-80 units of friction tolerance vs ~150-300 for returning players. A verification failure that barely registers for a returning player is budget-destroying for a new one.
  • Habituation timeline: Prompt dismissal time drops 40-60% by visits 2-3. The geo step becomes "invisible" by visit 8-15. Track geoVerifyCount to verify this empirically.
  • Prompt-to-dismiss latency is the single best predictor of friction sensitivity. Consider adding this as a computed field on geo.notice.dismissed (timestamp delta from geo.notice.shown).
  • Canary-to-100% cohort: Loyal players encountering new friction react more negatively than brand-new players (loss aversion). Consider a one-time explanatory interstitial.
  • Restricted market detection should happen as early as possible. Every step between "I want to play" and "you can't" is brand damage.
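The prompt-to-dismiss latency mentioned above is a simple timestamp delta. A sketch with an injectable clock; event names follow the text, but the helper is hypothetical:

```typescript
// Track when geo.notice.shown fired so geo.notice.dismissed can carry the delta.
function makeNoticeTimer(now: () => number = Date.now) {
  let shownAt: number | null = null;
  return {
    onNoticeShown(): void {
      shownAt = now();
    },
    // The computed field for geo.notice.dismissed, or null if we never saw
    // the corresponding geo.notice.shown in this page lifetime.
    dismissLatencyMs(): number | null {
      return shownAt === null ? null : now() - shownAt;
    },
  };
}
```

Attaching the result as a field on geo.notice.dismissed keeps the latency queryable in Loki without joining the two events.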

Open Questions

  1. How does poker domain data currently reach Snowflake? Resolved above: event stores -> ECS projectors -> S3 DLZ -> Snowpipe, plus Kafka Connect for game server topics (see "How poker domain events reach Snowflake today").
  2. Who owns the platform-wide analytics infrastructure at VGW? Largely resolved: the DnA / Data Engineering team (see "VGW data platform").
  3. Can first_geo_verification_at be added to Auth0 user_metadata without a backend deployment? (Management API write from the SPA, or does it need a server endpoint?)
  4. Should the one-time geo.player.first_verification event be emitted? (useful for simple Loki counting but adds a new event to the schema)
  5. What's the timeline for canary-to-100% rollout? (determines urgency of Phase 2/3)