Last active
March 22, 2018 17:22
watermark = 1 hour
First batch (max event time = null):
2017-06-07 10:00:00.000
StateStore will store 2017-06-07 10:00:00.000
Second batch (max event time = 2017-06-07 10:00:00.000):
2017-06-07 11:00:00.000
StateStore will store 2017-06-07 10:00:00.000 and 2017-06-07 11:00:00.000
StateStore will evict rows <= max event time - 1 hour (2017-06-07 09:00:00.000)
Third batch (max event time = 2017-06-07 11:00:00.000):
2017-06-07 12:00:00.000
StateStore will store 2017-06-07 10:00:00.000, 2017-06-07 11:00:00.000 and 2017-06-07 12:00:00.000
StateStore will evict rows <= max event time - 1 hour (2017-06-07 10:00:00.000)
So now you can see 2017-06-07 10:00:00.000 in the output.
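
The three batches above can be replayed with a small plain-Python sketch. It does not use Spark at all; the function name simulate and the list standing in for the StateStore are hypothetical, and the only rule it models is the one described above: each batch evicts (and emits) rows <= the max event time of the previous batches minus the watermark delay.

```python
from datetime import datetime, timedelta

def simulate(batches, delay):
    """Toy model of the walkthrough; 'simulate' is a hypothetical helper, not a Spark API."""
    state = []             # rows currently held in the simulated state store
    max_event_time = None  # max event time seen in *previous* batches
    emitted_per_batch = []
    for batch in batches:
        # New rows are stored first.
        state.extend(batch)
        emitted = []
        if max_event_time is not None:
            # Watermark = max event time of earlier batches minus the delay.
            watermark = max_event_time - delay
            emitted = sorted(t for t in state if t <= watermark)
            state = [t for t in state if t > watermark]
        emitted_per_batch.append(emitted)
        # Only after the batch is processed does its data move the max event time.
        if batch:
            newest = max(batch)
            if max_event_time is None or newest > max_event_time:
                max_event_time = newest
    return emitted_per_batch

batches = [
    [datetime(2017, 6, 7, 10)],
    [datetime(2017, 6, 7, 11)],
    [datetime(2017, 6, 7, 12)],
]
result = simulate(batches, timedelta(hours=1))
for number, rows in enumerate(result, start=1):
    print(f"batch {number} emits: {[str(r) for r in rows]}")
```

Under these assumptions the 10:00 row is emitted only in the third batch, because the watermark used by a batch lags one batch behind the data that advances it.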
Is it possible to shorten the time until data is output by using a negative delayThreshold for the watermark? In my use case, I assign current_timestamp() to eventTime, so there is no point in waiting for late data to arrive: there is none.