Skip to content
This repository was archived by the owner on Nov 15, 2023. It is now read-only.

Conversation

@gnunicorn
Copy link
Contributor

While investigating the CPU spikes at import, I noticed that we are requesting a block twice after it as announced. Turns out that we are tracking the queue_best_number twice, once in the importqueue itself and once prior to that and because of the order in items are called, they are not in sync at that particular moment after we have received the latest block and before the handover to the importqueue - but exactly at that moment do we check whether we should download further blocks against the importqueue, which doesn't yet know of the new block. And we happen to request the same block again - but only twice because right after the request we successfully import the block and the next response is actually ignored.

The fix is simple: we need to check the local value not the remote value. This PR does that and the logs show no double imports or requests anymore.

@gnunicorn gnunicorn requested a review from svyatonik December 5, 2018 14:17
@gnunicorn gnunicorn added A0-please_review Pull request needs code review. I3-bug The node fails to follow expected behavior. I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. and removed I3-bug The node fails to follow expected behavior. labels Dec 5, 2018
@gnunicorn gnunicorn changed the title Fix Double BlockRequest Fix double BlockRequest after announcement Dec 5, 2018
@svyatonik
Copy link
Contributor

iirc, initially the reason behind two counters was:

  • the counter#1 in import_queue.rs was about max block that is currently scheduled for import;
  • the counter#2 in sync.rs was about max block that is already imported.
    So when block was imported by the ImportQueue, counter#2 was increased. I guess this isn't true now (after [frontport] Fixed sync stalling when import queue is full #691) && counters must be merged (not a part of this PR).

That said, if I understand correctly, the whole ::std::cmp::max(...) could be replaced back with the peer.common_number, since peer.common_number should always be lte than self.best_queued_number (best block# that we have ever seen).

@gnunicorn
Copy link
Contributor Author

Well, this problem occurs now because, because of other constrains we had to move import_queue.import_blocks(self, protocol, (origin, new_blocks)); out of that function call and it now only takes places after maintain_sync (and subsequently download_now) is being called. Therefore, arguably a different fix could also be to move maintain_sync only into the call for block_imported (which also makes more sense to me, as something actually happened). But I am not sure whether that has unwanted consequences...

@rphmeier
Copy link
Contributor

rphmeier commented Dec 5, 2018

cc @arkpar

@rphmeier rphmeier requested a review from arkpar December 5, 2018 14:58
Copy link
Contributor

@svyatonik svyatonik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO maintain_sync is called where it should be (but probably better ask @arkpar about that). I'm happy with the fix, but still suggest you to remove the changed line at all, because peer.common_number should always be <= import_status.best_importing_number. Feel free to merge as-is if you're unsure.

@arkpar
Copy link
Member

arkpar commented Dec 5, 2018

Looks correct to me

@arkpar arkpar added A8-looksgood and removed A0-please_review Pull request needs code review. labels Dec 5, 2018
@rphmeier
Copy link
Contributor

rphmeier commented Dec 6, 2018

This seems to have broken syncing with fork and ancestry search tests.

@gnunicorn
Copy link
Contributor Author

Yeah, I noticed at least test is failing with this, so we might have to find a different approach after all..

@arkpar
Copy link
Member

arkpar commented Dec 7, 2018

Superseded by #1235

@arkpar arkpar closed this Dec 7, 2018
@bkchr bkchr deleted the ben-fix-double-blockrequest branch December 8, 2018 19:43
lamafab pushed a commit to lamafab/substrate that referenced this pull request Jun 16, 2020
* companion pr for paritytech#6235

* ???

* nah doesn't work for my branch

* bump Cargo.lock

* bump kusama spec versin

* bump kusama spec version

* revert to master

* bump Cargo.lock

Co-authored-by: Bastian Köcher <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants