Created
April 29, 2022 21:40
-
-
Save mistercrunch/3de2ac9a19843295fc77029a218d2d42 to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{# | |
Given a source table and a DATETIME or DATE column, this macro returns | |
a DATE `YYYY-MM-DD` pointing the the most recent available day. | |
For streaming tables that come in impredictable batches, the previous_day | |
flag when on will very likely point to the latest FULL day as we can | |
safely assume that if we have received event for a given day, the | |
previous one is complete. | |
#} | |
{%- macro latest_dt(source_table, dttm_column="ds", previous_day=False) -%} | |
{%- set max_partition = none -%} | |
{%- if execute -%} | |
{%- call statement('max_partition_date_query', fetch_result=True) -%} | |
SELECT CAST(MAX({{ dttm_column }}) AS DATE) AS max_partition_date | |
FROM {{ source_table }} | |
WHERE {{ dttm_column }} >= CAST('1970-01-01' AS DATE) | |
{%- endcall -%} | |
{%- set result = load_result('max_partition_date_query') -%} | |
{% if result %} | |
{%- set max_partition = result['data'][0][0] -%} | |
{%- endif -%} | |
{%- endif -%} | |
{%- if max_partition -%} | |
{%- if previous_day-%} | |
{%- set max_partition = max_partition - modules.datetime.timedelta(1) -%} | |
{{- max_partition.strftime('%Y-%m-%d') -}} | |
{%- else -%} | |
{{- max_partition.strftime('%Y-%m-%d') -}} | |
{%- endif -%} | |
{%- endif -%} | |
{%- endmacro -%} | |
~ |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{# | |
Given a source table that is partitionned, this macro returns | |
a query pointing to the latest partition | |
Note that this could be improved to better support different column | |
types with proper casting. We assume a DATETIME column here. | |
Also note that we may need to wait on a more directed signal from the | |
previous ETL phase that insures that ALL workspaces have been synced | |
#} | |
{% macro latest_partition(source_table, dttm_column="ds", previous_day=False) %} | |
{%- set max_partition = latest_dt(source_table, dttm_column, previous_day) -%} | |
SELECT * FROM {{ source_table }} | |
{% if max_partition -%} | |
WHERE {{ dttm_column }} = '{{ max_partition }}' | |
{% else -%} | |
WHERE {{ dttm_column }} > '1970-01-01' | |
{% endif %} | |
{% endmacro %} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment