Skip to content

Conversation

@cretz
Copy link
Member

@cretz cretz commented Nov 4, 2022

What was changed

  • Altered async context manager-ness of Worker to cancel the current task akin to what https://docs.python.org/3/library/asyncio-task.html#asyncio.timeout does. However, that is only in 3.11 and relies on https://docs.python.org/3/library/asyncio-task.html#asyncio.Task.uncancel to prevent inadvertent nesting/re-raise. We don't get that benefit, but I have observed no issues with the current implementation yet.
  • Added Worker.is_running and Worker.is_shutdown. We could have a "status"/"state" enum like TS, but this was simpler for now and doesn't keep us from doing so in the future if someone needs to know, for example, it's in the process of shutting down.
  • Altered Worker.run to be cancel-safe and to re-raise any polling errors returned by core
  • Altered Worker.shutdown to be idempotent and even be able to be called before run starts to alleviate issues like panic if Run and Stop race sdk-go#868, Worker fatal error can cause double worker stop sdk-go#903, etc
  • Added on_fatal_error callback that users can register. This doesn't have much value since we throw out of run anyways, but the nice thing is this one is called before we start the shutdown process if they want to do anything (I can't imagine what though).

Core does not have a way to cause an immediate fatal error (everything is retried for a minute) so I mock core in my tests, but I have confirmed that this does actually propagate the error after a little while if, say, you delete the namespace with the operator service.

Checklist

  1. Closes Propagate fatal worker errors #25

@cretz cretz requested a review from a team November 4, 2022 14:36
"""Same as :py:meth:`shutdown` for use by ``async with``."""
await self.shutdown()
@property
def is_shutdown(self) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: A touch more precise

Suggested change
def is_shutdown(self) -> bool:
def has_finished_shutdown(self) -> bool:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok w/ lack of precision in the name so long as doc is clear

Comment on lines 405 to 406
# Cancel the shutdown task (safe if already done)
tasks[0].cancel()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... is it possible to use a dict with names or a named struct field or something for the tasks rather than a list where the indices matter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need in a local function like this IMO. I need this as a sequence for how it's used. If this variable escaped this function, for sure.

…error

# Conflicts:
#	tests/worker/test_worker.py
@cretz cretz merged commit 6b9f554 into temporalio:main Nov 7, 2022
@cretz cretz deleted the worker-fatal-error branch November 7, 2022 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Propagate fatal worker errors

2 participants