Created
March 24, 2026 09:29
| name: SNOWPLOW_UNIFIED_VIEWS | |
| description: | | |
| Snowplow Unified Digital (Web) is a dbt package that converts raw Snowplow behavioral events into three analytics-ready entities: views, sessions, and users. It standardizes and enriches event data (e.g., content, device, geo, traffic attribution) and computes engagement metrics such as engaged time and scroll depth, producing a reliable canonical layer for product analytics, funnel/journey reporting, experimentation, and ML/LLM use. | |
| The core tables relate hierarchically: | |
| `snowplow_unified_views` has one row per content view (page/screen), capturing view-level context and engagement; | |
| `snowplow_unified_sessions` aggregates all activity within a visit into one row per session, rolling up view and event behaviors into session metrics; | |
| `snowplow_unified_users` aggregates sessions into one row per user, providing user traits and lifetime/rolling metrics, including optional identity stitching (e.g., `stitched_user_id`) across devices and login states. | |
| Together, these tables replace repeated event-level aggregation with consistent, governed definitions and cost-efficient incremental models, making behavioral analysis and AI consumption simpler and more trustworthy. | |
| Value notes: | |
| - Snowplow-generated identifiers (pageview id, domain user id, session id) are UUIDs - examples: 00112233-4455-6677-8899-aabbccddeeff, 00000000-1111-2222-3333-444455556666 | |
| - The company's internal user ID can take any format the company defines. | |
| - Timestamp columns follow ISO-8601-style datetime format: yyyy-MM-dd'T'HH:mm:ss.SSSZ - examples: 2025-01-01T10:10:10.010+0000, 2025-01-01T20:32:45.276+0000 | |
| {{YOUR_BUSINESS_DESCRIPTION_HERE}} | |
| Example Co is a company established in 2000. Key industry is A. Key products are A, B, C. | |
| The company uses Snowplow to track web behaviour on the following websites: | |
| - your_app -- https://example.com -- Example Co's main website. Example Co sells A, B, C and uses this website as an e-commerce platform. | |
| - your_app_support -- https://support.example.com -- Example Co's support website. Example Co uses the support website to help customers use its products. | |
| tables: | |
| - name: SNOWPLOW_UNIFIED_SESSIONS | |
| description: This table stores information about user sessions, including session metadata, user information, device and browser details, geographic location, marketing campaign data, and engagement metrics, providing a comprehensive view of user behavior and interactions with a website or application. | |
| base_table: | |
| database: {{YOUR_DATABASE}} | |
| schema: DERIVED # update this if you use a non-default schema | |
| table: SNOWPLOW_UNIFIED_SESSIONS | |
| dimensions: | |
| - name: APP_ID | |
| description: The application or system that initiated the Snowplow session. | |
| expr: APP_ID | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - website | |
| - docs | |
| - name: DEFAULT_CHANNEL_GROUP | |
| description: The channel through which the user initially entered the website, categorized into groups such as paid advertising, email campaigns, or unassigned sources. | |
| expr: DEFAULT_CHANNEL_GROUP | |
| data_type: VARCHAR(25) | |
| sample_values: | |
| - Unassigned | |
| - Paid Search | |
| - name: DEVICE_CATEGORY | |
| description: The type of device used by the user during the session, categorized as Phone, Tablet, or Desktop. | |
| expr: DEVICE_CATEGORY | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - Phone | |
| - Tablet | |
| - Desktop | |
| - name: DEVICE_IDENTIFIER | |
| description: Unique identifier for the device used to access the application or website, allowing for tracking of user behavior across sessions on the same device. | |
| expr: DEVICE_IDENTIFIER | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - 00112233-4455-6677-8899-aabbccddeeff | |
| - 00000000-1111-2222-3333-444455556666 | |
| - name: DEVICE_SESSION_INDEX | |
| description: The sequential index of the session on the device (e.g., 2 for the device's second session), allowing a user's sessions on the same device to be ordered and counted. | |
| expr: DEVICE_SESSION_INDEX | |
| data_type: NUMBER(38,0) | |
| sample_values: | |
| - '2' | |
| - '10' | |
| - '42' | |
| - name: ENGAGED_TIME_IN_S | |
| description: The total amount of time (in seconds) the user was actively engaged with the website or application during the session. | |
| expr: ENGAGED_TIME_IN_S | |
| data_type: NUMBER(22,0) | |
| sample_values: | |
| - '40' | |
| - '60' | |
| - '70' | |
| - name: EVENT_COUNTS | |
| description: This column stores the count of different types of events that occurred during a user's session, including page pings, application errors, link clicks, and page views, providing insight into user behavior and application performance. | |
| expr: EVENT_COUNTS | |
| data_type: VARIANT | |
| sample_values: | |
| - |- | |
| { | |
| "page_ping": 9 | |
| } | |
| - |- | |
| { | |
| "event": 2, | |
| "page_ping": 12, | |
| "page_view": 2 | |
| } | |
| - |- | |
| { | |
| "event": 3, | |
| "link_click": 1, | |
| "page_ping": 3, | |
| "page_view": 3 | |
| } | |
| - name: FIRST_EVENT_NAME | |
| description: The type of event that triggered the start of the session. | |
| expr: FIRST_EVENT_NAME | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - page_ping | |
| - page_view | |
| - name: FIRST_GEO_CITY | |
| description: The city where the user's session originated from, based on their IP address. | |
| expr: FIRST_GEO_CITY | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - London | |
| - New York | |
| - name: FIRST_GEO_COUNTRY | |
| description: The two-letter country code (ISO 3166-1 alpha-2) of the country where the user's session originated, based on geolocation data. | |
| expr: FIRST_GEO_COUNTRY | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - GB | |
| - US | |
| - name: FIRST_GEO_COUNTRY_NAME | |
| description: The full name of the country where the user's session originated, based on geolocation data. | |
| expr: FIRST_GEO_COUNTRY_NAME | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - United Kingdom of Great Britain and Northern Ireland | |
| - Canada | |
| - name: FIRST_PAGE_TITLE | |
| description: The title of the first page visited by a user during a session, providing insight into the entry point of the user's journey on the website. | |
| expr: FIRST_PAGE_TITLE | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - Example Page | |
| - Blog Example Title | |
| - name: FIRST_PAGE_URL | |
| description: The URL of the first page visited by a user during a session. | |
| expr: FIRST_PAGE_URL | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - https://example.com | |
| - https://example.com/a/b/c | |
| - https://blog.example.com/categories | |
| - name: FIRST_PAGE_URLHOST | |
| description: The domain of the first page visited by the user during the session. | |
| expr: FIRST_PAGE_URLHOST | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - example.com | |
| - blog.example.com | |
| - name: FIRST_PAGE_URLPATH | |
| description: The URL path of the first page visited by the user during the session. | |
| expr: FIRST_PAGE_URLPATH | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - / | |
| - /a/b/c | |
| - /categories | |
| - name: FIRST_PAGE_URLQUERY | |
| description: The URL query string of the first page visited by a user during a session, which may contain parameters such as campaign IDs, keywords, and source information. | |
| expr: FIRST_PAGE_URLQUERY | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - utm_campaign=website&utm_medium=email | |
| - hidden=false&rows=10&search= | |
| - name: IS_ENGAGED | |
| description: 'Indicates whether the user was actively engaged with the application during the session, meaning they performed at least one of the following actions: scroll, click, or key press.' | |
| expr: IS_ENGAGED | |
| data_type: BOOLEAN | |
| sample_values: | |
| - 'TRUE' | |
| - 'FALSE' | |
| - name: LAST_EVENT_NAME | |
| description: The name of the last event that occurred during a user's session. | |
| expr: LAST_EVENT_NAME | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - page_view | |
| - name: LAST_GEO_CITY | |
| description: The city where the user was located at the last event of the session, as determined by geolocation data. | |
| expr: LAST_GEO_CITY | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - London | |
| - New York | |
| - name: LAST_GEO_COUNTRY | |
| description: The two-letter country code (ISO 3166-1 alpha-2) of the country where the user was located at the last event of the session, based on their geolocation. | |
| expr: LAST_GEO_COUNTRY | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - US | |
| - GB | |
| - name: LAST_GEO_COUNTRY_NAME | |
| description: The full name of the country where the user was located at the last event of the session, as determined by their IP address. | |
| expr: LAST_GEO_COUNTRY_NAME | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - Singapore | |
| - United States of America | |
| - name: LAST_PAGE_TITLE | |
| description: The title of the last page visited by a user during a session. | |
| expr: LAST_PAGE_TITLE | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - Example Title | |
| - Blog Post Example Title | |
| - name: LAST_PAGE_URL | |
| description: The URL of the last page visited by the user during the session. | |
| expr: LAST_PAGE_URL | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - https://example.com | |
| - https://example.com/a/b/c | |
| - https://blog.example.com/categories | |
| - name: LAST_PAGE_URLHOST | |
| description: The URL host of the last page visited by the user during the session. | |
| expr: LAST_PAGE_URLHOST | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - example.com | |
| - blog.example.com | |
| - name: LAST_PAGE_URLPATH | |
| description: The URL path of the last page visited by the user during the session. | |
| expr: LAST_PAGE_URLPATH | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - / | |
| - /a/b/c | |
| - /categories | |
| - name: LAST_PAGE_URLQUERY | |
| description: The URL query parameters of the last page visited by the user during a session, including UTM tracking codes and Google Ads click IDs. | |
| expr: LAST_PAGE_URLQUERY | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - utm_campaign=website&utm_medium=email | |
| - hidden=false&rows=10&search= | |
| - name: MKT_CAMPAIGN | |
| description: The marketing campaign that drove the user to the website. | |
| expr: MKT_CAMPAIGN | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - example_campaign | |
| - utm_campaign_example_2025 | |
| - name: MKT_CONTENT | |
| description: A marketing content identifier, i.e. a code associated with a specific piece of marketing content or campaign. | |
| expr: MKT_CONTENT | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - '123' | |
| - 'blog-article-123' | |
| - name: MKT_MEDIUM | |
| description: The marketing medium through which a user accessed the website, such as a specific advertising campaign, social media platform, or email newsletter. | |
| expr: MKT_MEDIUM | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - name: MKT_NETWORK | |
| description: The marketing network through which the user accessed the website or application, such as Google or Microsoft. | |
| expr: MKT_NETWORK | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - Microsoft | |
| - name: MKT_SOURCE | |
| description: The marketing source that initiated the user's session, such as a Pardot campaign, email, or GitHub referral. | |
| expr: MKT_SOURCE | |
| data_type: VARCHAR(16777216) | |
| sample_values: | |
| - github | |
| - name: MKT_SOURCE_PLATFORM | |
| description: The platform through which a marketing campaign was sourced, such as Facebook, Google, or Email. | |
| expr: MKT_SOURCE_PLATFORM | |
| data_type: VARCHAR(16777216) | |
| - name: MKT_TERM | |
| description: The marketing term or keyword that trig... |
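The value notes in the model description (UUID identifiers, ISO-8601-style timestamps) and the `EVENT_COUNTS` samples can be checked on the client side before values are used downstream. The following is a minimal Python sketch, assuming the values arrive as strings shaped exactly like the sample values above. The function names are hypothetical and are not part of the Snowplow package or of this semantic model, and the `strptime` pattern is an assumed equivalent of the documented `yyyy-MM-dd'T'HH:mm:ss.SSSZ` format.

```python
import json
import uuid
from datetime import datetime

# Assumed strptime equivalent of the documented pattern yyyy-MM-dd'T'HH:mm:ss.SSSZ.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_snowplow_timestamp(value: str) -> datetime:
    """Parse a string such as 2025-01-01T10:10:10.010+0000 into an aware datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def is_canonical_uuid(value: str) -> bool:
    """Return True if value is a hyphenated UUID string like the sample identifiers."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def event_count(event_counts_json: str, event_name: str) -> int:
    """Read one event type's count from an EVENT_COUNTS value serialized as JSON.

    Event types missing from the object are treated as zero (an assumption).
    """
    return int(json.loads(event_counts_json).get(event_name, 0))
```

For example, `event_count('{"page_ping": 9}', "page_view")` yields `0` under the missing-means-zero assumption, and `is_canonical_uuid` rejects identifiers that are not hyphenated UUIDs, which is expected for a company-defined user ID.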