Deterministic completion of a batch (frontier) #338

@pauljamescleary

Is there a way in differential to deterministically know that a time is complete?

For example, suppose you have `input.filter(...).join(...).x`. You submit data at time 1 and it is all filtered out, so the `x` operator (which could be anything) never receives the data. However, I still want to know that the input at time 1 is "complete", i.e. done processing.

With some experimentation, it appears that the operators downstream of `filter` in this example are indeed scheduled. I was able to create a simple operator that did essentially nothing and verify that it was called; its input was simply empty (as expected). However, I haven't been able to find any indication in `SharedProgress` that time 1 is finished.

I believe I could rig up a left join of the output against the input to determine when the time is complete, but that feels rather heavy-handed.

Is there an operator or technique that I haven't seen that will let me know for sure that a time is done, even after input stops at filters and joins?

Thanks
