Conversation

@hodgesds hodgesds commented Jun 5, 2025

Backwards sticky scheduling ensures the task thrashes all CPUs like a bull in a china shop before it matures and settles down to realize its true potential by staying sticky to its roots. Or some BS like that...

Performance tests:
Full o' BS (with the --bs flag):

Wakeup Latencies percentiles (usec) runtime 50 (s) (672946 total samples)
          50.0th: 8          (166788 samples)
          90.0th: 11         (133733 samples)
        * 99.0th: 16         (56243 samples)
          99.9th: 88         (5823 samples)
          min=1, max=18418
Request Latencies percentiles (usec) runtime 50 (s) (674857 total samples)
          50.0th: 6632       (207407 samples)
          90.0th: 10480      (260129 samples)
        * 99.0th: 12208      (60946 samples)
          99.9th: 12944      (5506 samples)
          min=5810, max=299844
RPS percentiles (requests) runtime 50 (s) (51 total samples)
          20.0th: 13488      (13 samples)
        * 50.0th: 13616      (18 samples)
          90.0th: 13680      (16 samples)
          min=9189, max=13735
current rps: 13490.15

No BS:

Wakeup Latencies percentiles (usec) runtime 30 (s) (407935 total samples)
          50.0th: 8          (99090 samples)
          90.0th: 11         (81299 samples)
        * 99.0th: 16         (33058 samples)
          99.9th: 28         (2502 samples)
          min=1, max=997
Request Latencies percentiles (usec) runtime 30 (s) (408757 total samples)
          50.0th: 6632       (118699 samples)
          90.0th: 10480      (162838 samples)
        * 99.0th: 12240      (36212 samples)
          99.9th: 12880      (3600 samples)
          min=5820, max=19542
RPS percentiles (requests) runtime 30 (s) (31 total samples)
          20.0th: 13552      (8 samples)
        * 50.0th: 13648      (11 samples)
          90.0th: 13712      (9 samples)
          min=13327, max=13752
average rps: 13625.23

Doesn't seem to add much BS, should be safe to run in production!

@etsal etsal self-requested a review June 5, 2025 02:26

etsal commented Jun 5, 2025

IIUC this tracks where the task has been and tries to reuse the same CPUs? Not very clear on the & operations in the code because it looks like ->bs is a mask and nr_cpus/cpu is a number, but I might be missing something.

@hodgesds hodgesds force-pushed the p2dq-backwards-sticky branch 2 times, most recently from 7c4b055 to 08dbe60 Compare June 5, 2025 03:01

hodgesds commented Jun 5, 2025

IIUC this tracks where the task has been and tries to reuse the same CPUs? Not very clear on the & operations in the code because it looks like ->bs is a mask and nr_cpus/cpu is a number, but I might be missing something.

Whoops, looks like I accidentally added some BS... should be good now!

@hodgesds hodgesds force-pushed the p2dq-backwards-sticky branch from 08dbe60 to 8c6a1f3 Compare June 5, 2025 03:23
bpf_for(i, 0, nr_cpus)
	if (taskc->bs_mask == 0 || (taskc->bs_mask & i) != i) {
		if (i == nr_cpus - 1)
			taskc->bs_mask &= nr_cpus;
Contributor

I am a bit confused about the idea of bs_mask. Is it safe/correct to do a bitwise operation between a bitmask (bs_mask) and plain integer numbers (nr_cpus and task_cpu)? It would be a good idea to add some comments about it.

Contributor Author

It's probably not super safe, and I think the &= should be an |=. After thinking about this more, though, it probably belongs in scx_chaos. The right way to do this in p2dq would be to put this into a scx_bitmap_t, since the task_ctx in p2dq is arena-allocated.


arighi commented Jun 5, 2025

Backwards sticky scheduling ensures the task thrashes all CPUs like a bull in a china shop before it matures and settles down to realize its true potential by staying sticky to its roots. Or some BS like that...

New logo of scx_p2dq confirmed:

[image attachment]

@hodgesds hodgesds force-pushed the p2dq-backwards-sticky branch from 8c6a1f3 to a165d1f Compare June 5, 2025 09:50
Backwards sticky scheduling ensures the task thrashes all CPUs like a
bull in a china shop before it matures and settles down to realize its
true potential by staying sticky to its roots. Or some BS like that...

Signed-off-by: Daniel Hodges <[email protected]>
@hodgesds hodgesds force-pushed the p2dq-backwards-sticky branch from a165d1f to 96e11ae Compare June 5, 2025 17:10
@hodgesds hodgesds force-pushed the p2dq-backwards-sticky branch from 96e11ae to 1f0bd16 Compare June 5, 2025 18:32