Skip to content

Conversation

@sandreim
Copy link
Collaborator

There are 2 changes:

  • Allow to_orchestra_rx to be handled in a Overseer task.
    Reasoning: the Polkadot overseer handles both events and job spawners in the same select loop. This can create deadlocks in subsystems that use ctx.spawn() to create a job and wait for it, only if the same subsystem has a full channel and an external messages needs to be passed to it from the overseer select loop.
  • Allow external injection of messages over the unbounded queues of orchestrated subsystems
    Not sure this is the best API to expose this, happy to hear alternatives

@sandreim sandreim requested review from drahnr and vstakhov September 15, 2023 14:50
Copy link
Contributor

@vstakhov vstakhov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to bump the middle version with this change, as we are going to change the existing API.

/// Send sink for `Message`s to be sent to a subsystem.
pub tx_bounded: crate::metered::MeteredSender<MessagePacket<Message>>,
/// Unbounded send sink for `Message`s to be sent to a subsystem.
pub tx_unbounded: crate::metered::UnboundedMeteredSender<MessagePacket<Message>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


/// Gather running subsystems' outbound streams into one.
to_orchestra_rx: #support_crate ::stream::Fuse<
to_orchestra_rx: Option<#support_crate ::stream::Fuse<
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did this become an Option?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been using this here: paritytech/polkadot-sdk@291eefe This seemed to be the path of least resistance.

Copy link
Collaborator

@drahnr drahnr Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the reason is you need to move the receiver to another future task, and therefore - since stop and all the other associated functions require self (non referential) as arguments, you have to be able to retain Self as type (no unpacking) due to the above as well.

I am not a particular fan of using Option<_>, it obfuscates the rationale. Note that with the Option<_> the previous implementation of spawning tasks becomes rather awkward, since one always has to check if rx_* is Some before polling. I'd say either we add a new type, or we create an <OrchestraName>Inner type that is only the part that is being spawned and one then can unpack the <OrchestraName> into the inner type plus receivers.

Not sure there is a better alternative from the top of my head.

@drahnr
Copy link
Collaborator

drahnr commented Sep 18, 2023

* **Allow external injection of messages over the unbounded queues of orchestrated subsystems**
  Not sure this is the best API to expose this, happy to hear alternatives

This I do understand.

* **Allow `to_orchestra_rx ` to be handled in a `Overseer` task.**
  Reasoning: the Polkadot overseer handles both events and job spawners in the same select loop. This can create deadlocks in  subsystems that use ctx.spawn() to create a job and wait for it, only if the same subsystem has a full channel and an external messages needs to be passed to it from the overseer select loop.

I am missing context here. Whatever long-blocking operation in the subsystems message processing loop is going to be an issue. I don't quite understand how spawn is specific here? I don't quite get how the change enables that. Could you elaborate or add an example under examples?

@drahnr
Copy link
Collaborator

drahnr commented Nov 17, 2023

Ping :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants