@@ -649,20 +649,24 @@ are added in all the versions (`v1alpha1`, `v1beta1`, and `v1beta2`).
649649The following display shows the two new fields along with the updated
650650description for the ` AssuredConcurrencyShares `  field, in ` v1beta2 ` .
651651
652+ **Note**: currently this design does not use the `Priority` field for
653+ anything.  We need to either use it for something or take it out of
654+ the design.
655+ 
652656``` go 
653657type  LimitedPriorityLevelConfiguration  struct  {
654658  ...
655659  //  `assuredConcurrencyShares` (ACS) contributes to the computation of the
656-   //  NominalConcurrencyLimit (NCL ) of this level.
660+   //  NominalConcurrencyLimit (NominalCL ) of this level.
657661  //  This is the number of execution seats available at this priority level.
658662  //  This is used both for requests dispatched from
659663  //  this priority level as well as requests dispatched from higher priority
660664  //  levels borrowing seats from this level.  This does not limit dispatching from
661665  //  this priority level that borrows seats from lower priority levels (those lower
662-   //  levels do that).  The server's concurrency limit (SCL ) is divided among the
666+   //  levels do that).  The server's concurrency limit (ServerCL ) is divided among the
663667  //  Limited priority levels in proportion to their ACS values:
664668  // 
665-   //  NCL (i)  = ceil( SCL  * ACS(i) / sum_acs )
669+   //  NominalCL (i)  = ceil( ServerCL  * ACS(i) / sum_acs )
666670  //  sum_acs = sum[limited priority level k] ACS(k)
667671  // 
668672  //  Bigger numbers mean a larger nominal concurrency limit, at the expense
@@ -671,13 +675,13 @@ type LimitedPriorityLevelConfiguration struct {
671675  //  +optional
672676  AssuredConcurrencyShares  int32 
673677
674-   //  `borrowablePercent` prescribes the fraction of the level's NCL  that
678+   //  `borrowablePercent` prescribes the fraction of the level's NominalCL  that
675679  //  can be borrowed by higher priority levels.  The value of this
676680  //  field must be between 0 and 100, inclusive, and it defaults to 0.
677681  //  The number of seats that higher levels can borrow from this level, known
678-   //  as this level's BorrowableConcurrencyLimit (BCL ), is defined as follows.
682+   //  as this level's BorrowableConcurrencyLimit (BorrowableCL ), is defined as follows.
679683  // 
680-   //  BCL (i) = round( NCL (i) * borrowablePercent(i)/100.0 )
684+   //  BorrowableCL (i) = round( NominalCL (i) * borrowablePercent(i)/100.0 )
681685  // 
682686  //  +optional
683687  BorrowablePercent  int32 
@@ -730,59 +734,82 @@ when the priority field holds zero.
730734|  global-default  |   20 |   50 |   9900 | 
731735|  catch-all       |    5 |    0 |  10000 | 
732736
733- Each priority level has two concurrency limits: its
734- NominalConcurrencyLimit (NCL) as defined above by configuration, and a
735- CurrentConcurrencyLimit (CCL) that is used in dispatching requests.
736- The CCLs are adjusted periodically, based on configuration, the
737- current situation at adjustment time, and recent observations.  The
738- "borrowing" resides in the differences between CCL and NCL.  A
739- priority level's CCL can go as low as NCL-BCL; the upper limit is
740- imposed only by how many seats are available for borrowing from other
741- priority levels.  The sum of the CCLs, like the sum of the NCLs, is
742- always equal to the server's concurrency limit (SCL).  These CCLs are
743- floating-point values, because the adjustment logic below is
744- incremental.  The actual limits used in dispatching are the result of
745- rounding these floating-point numbers to their nearest integer.
737+ Each non-exempt priority level ` i `  has two concurrency limits: its
738+ NominalConcurrencyLimit (` NominalCL(i) ` ) as defined above by
739+ configuration, and a CurrentConcurrencyLimit (` CurrentCL(i) ` ) that is
740+ used in dispatching requests.  The CurrentCLs are adjusted
741+ periodically, based on configuration, the current situation at
742+ adjustment time, and recent observations.  The "borrowing" resides in
743+ the differences between CurrentCL and NominalCL.  There is a lower
744+ bound on each non-exempt priority level's CurrentCL: `MinCL(i) =
745+ NominalCL(i) - BorrowableCL(i)`; the upper limit is imposed only by
746+ how many seats are available for borrowing from other priority levels.
747+ The sum of the CurrentCLs, like the sum of the NominalCLs, is always
748+ equal to the server's concurrency limit (ServerCL).
746749
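As a rough illustration of how the configured limits relate, the following sketch derives `NominalCL`, `BorrowableCL`, and `MinCL` from the formulas quoted above. The type and function names (`levelConfig`, `limits`, `deriveLimits`) and the sample values are hypothetical, and the use of `math.Round` (which rounds halves away from zero) is an assumption, since the text only says `round`.

```go
package main

import (
	"fmt"
	"math"
)

// levelConfig is a hypothetical stand-in for the two configured fields used here.
type levelConfig struct {
	Name              string
	ACS               int32 // AssuredConcurrencyShares
	BorrowablePercent int32 // expected to be in 0..100
}

// limits holds the per-level limits derived from configuration.
type limits struct {
	NominalCL, BorrowableCL, MinCL int
}

// deriveLimits applies the formulas from the API comments:
//
//	NominalCL(i)    = ceil( ServerCL * ACS(i) / sum_acs )
//	BorrowableCL(i) = round( NominalCL(i) * borrowablePercent(i)/100.0 )
//	MinCL(i)        = NominalCL(i) - BorrowableCL(i)
func deriveLimits(serverCL int, levels []levelConfig) []limits {
	sumACS := 0.0
	for _, l := range levels {
		sumACS += float64(l.ACS)
	}
	out := make([]limits, len(levels))
	for i, l := range levels {
		nominal := int(math.Ceil(float64(serverCL) * float64(l.ACS) / sumACS))
		// Assumption: ties round away from zero, as math.Round does.
		borrowable := int(math.Round(float64(nominal) * float64(l.BorrowablePercent) / 100.0))
		out[i] = limits{NominalCL: nominal, BorrowableCL: borrowable, MinCL: nominal - borrowable}
	}
	return out
}

func main() {
	// Made-up sample configuration, not taken from the defaults table.
	levels := []levelConfig{
		{Name: "a", ACS: 30, BorrowablePercent: 50},
		{Name: "b", ACS: 10, BorrowablePercent: 0},
	}
	for i, l := range deriveLimits(100, levels) {
		fmt.Printf("%s: %+v\n", levels[i].Name, l)
	}
}
```

With these sample inputs, level `a` gets a NominalCL of 75, a BorrowableCL of 38, and therefore a MinCL of 37, while level `b` lends nothing and keeps MinCL equal to its NominalCL of 25.
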
747750Dispatching is done independently for each priority level.  Whenever
748751(1) a non-exempt priority level's number of occupied seats is zero or
749- below the level's rounded CCL and (2) that priority level has a
750- non-empty queue, it is time to dispatch another request for service.
751- The Fair Queuing for Server Requests algorithm below is used to pick a
752- non-empty queue at that priority level.  Then the request at the head
753- of that queue is dispatched if possible.
754- 
755- Every 10 seconds, all the CCLs are adjusted.  The adjustments take
756- into account high watermarks of seat demand.  A priority level's seat
757- demand is the sum of its occupied seats and the number of seats in the
758- queued requests.  Each priority level has two high watermarks: a
759- short-term one M1 and a long-term one M2.  During an adjustment
760- period, M1 is updated to track the maximum seat demand seen during
761- that adjustment period.  At the end of every adjustment period, M2 is
762- set to ` max(M1, A*M2 + (1-A)*M1) `  and M1 is set to the current seat
763- demand.  That is, M2 jumps up to M1 if that is higher (so a spike in
764- demand gets an immediate response at adjustment time), otherwise
765- exponentially drifts down toward M1 with a parameter A; 0.9 might be a
766- good value for A.
767- 
768- The adjustment logic takes the M2 values as desired targets to aim
769- toward and adjusts the CCL values in two steps.  The first step aims
770- to equalize the "pressure" on the priority levels.  Define each
771- priority level's pressure ` P = max(NCL-BCL, M2) - CCL ` .  Let ` PAvg `  be
772- the result of averaging P over the priority levels.  The first step
773- adjusts each CCL by adding ` B * (P - PAvg) ` .  We use a coefficient B
774- --- for which 0.25 might be a good value --- so that this step goes
775- only part way toward its target.  Such damping is commonly done in
776- controllers.
777- 
778- The second step corrects for any lower bounds violations.  There are
779- two lower bounds: one imposed by the limit on borrowable seats (BCL),
780- and one imposed by priority levels that wish to reclaim borrowed seats
781- due to recent load.  For every priority level where `CCL <
782- max(NCL-BCL, min(NCL, M2))`, CCL gets increased to that lower bound.
783- Whenever there are such increases, there must also be priority levels
784- for which ` CCL - NCL > 0 ` .  The seats for the former are taken from
785- the latter, in proportion to the latter difference.
752+ below the level's CurrentCL and (2) that priority level has a
753+ non-empty queue, it is time to consider dispatching another request
754+ for service.  The Fair Queuing for Server Requests algorithm below is
755+ used to pick a non-empty queue at that priority level.  Then the
756+ request at the head of that queue is dispatched if possible.
757+ 
758+ Every 10 seconds, all the CurrentCLs are adjusted.  We do smoothing on
759+ the inputs to the adjustment logic in order to dampen control
760+ gyrations, in a way that lets a priority level reclaim lent seats at
761+ the nearest adjustment time.  The adjustments take into account the
762+ high watermark ` HighSD(i) ` , time-weighted average ` AvgSD(i) ` , and
763+ time-weighted population standard deviation ` StDevSD(i) `  of each
764+ priority level ` i ` 's seat demand over the just-concluded adjustment
765+ period.  A priority level's seat demand at any given moment is the sum
766+ of its occupied seats and the number of seats in the queued requests.
767+ We also define ` EnvelopeSD(i) = AvgSD(i) + StDevSD(i) ` .  The
768+ adjustment logic is driven by a quantity called smoothed seat demand
769+ (` SmoothSD(i) ` ), which does an exponential averaging of EnvelopeSD
770+ values using a coefficient A in the range (0,1) and immediately tracks
771+ EnvelopeSD when it exceeds SmoothSD.  The rule for updating priority
772+ level ` i ` 's SmoothSD at the end of an adjustment period is
773+ `SmoothSD(i) := max( EnvelopeSD(i), A* SmoothSD(i) + (1-A)* EnvelopeSD(i)
774+ )` .  The command line flag  ` --seat-demand-history-fraction` with a
775+ default value of 0.9 configures A.
776+ 
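The smoothing rule can be sketched as a small pure function. This is only an illustration of the update described above; the function name `updateSmoothSD` and the sample numbers are hypothetical, and how the averages and standard deviations are gathered is out of scope here.

```go
package main

import (
	"fmt"
	"math"
)

// updateSmoothSD returns the new SmoothSD for one priority level at the end
// of an adjustment period, given the previous SmoothSD, the period's AvgSD
// and StDevSD, and the smoothing coefficient a in the range (0,1).
func updateSmoothSD(smoothSD, avgSD, stDevSD, a float64) float64 {
	envelopeSD := avgSD + stDevSD
	// Jump up immediately when the envelope exceeds the smoothed value;
	// otherwise drift down toward the envelope exponentially.
	return math.Max(envelopeSD, a*smoothSD+(1-a)*envelopeSD)
}

func main() {
	// Demand spike: the envelope (12) exceeds the old SmoothSD (4).
	fmt.Println(updateSmoothSD(4, 10, 2, 0.9))
	// Demand drop: the envelope (4) is below the old SmoothSD (10).
	fmt.Printf("%.2f\n", updateSmoothSD(10, 3, 1, 0.9))
}
```

In the first call the result tracks the envelope right away; in the second the result moves only one tenth of the way from the old SmoothSD toward the envelope, because A is 0.9.
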
777+ For adjusting the CurrentCL values, each non-exempt priority level ` i ` 
778+ has a lower bound (` MinCurrentCL(i) ` ) for the new value.  It is simply
779+ HighSD clipped by the configured concurrency limits: `MinCurrentCL(i)
780+ = max( MinCL(i), min( NominalCL(i), HighSD(i) ) )`.
781+ 
782+ If MinCurrentCL(i) = NominalCL(i) for every non-exempt priority level
783+ i then there is no wiggle room.  No priority level is willing to lend
784+ any seats.  The new CurrentCL values must equal the NominalCL values.
785+ Otherwise there is wiggle room and the adjustment proceeds as follows.
786+ 
787+ The priority levels would all be fairly happy if we set CurrentCL =
788+ SmoothSD for each.  We clip that by the lower bound just shown, taking
789+ ` Target(i) = max(SmoothSD(i), MinCurrentCL(i)) `  as a first-order
790+ target for each non-exempt priority level ` i ` .
791+ 
792+ Sadly, the sum of the Target values --- let's name that TargetSum ---
793+ is not necessarily equal to ServerCL, and scaling the Target values
794+ to fix that can push some below their MinCurrentCL bound.  If
795+ we had only the first of those two problems then we could set each
796+ CurrentCL(i) to FairFrac * Target(i) where FairFrac = ServerCL /
797+ TargetSum.  This would share the gain or pain in equal proportion
798+ among the priority levels.  Taking the lower bounds into account means
799+ finding the FairFrac value that solves the following conditions, for
800+ all the non-exempt priority levels ` i ` , and also makes the CurrentCL
801+ values sum to ServerCL.  For this step we let the CurrentCL values be
802+ floating-point numbers, not necessarily integers.
803+ 
804+ ``` 
805+ CurrentCL(i) = FairFrac * Target(i)  if FairFrac * Target(i) >= MinCurrentCL(i) 
806+ CurrentCL(i) = MinCurrentCL(i)       if FairFrac * Target(i) <= MinCurrentCL(i) 
807+ ``` 
808+ 
809+ This is the mirror image of the max-min fairness problem and can be
810+ solved with the same sort of algorithm, taking O(N log N) time and
811+ O(N) space.  After finding the floating point CurrentCL solutions,
812+ each one is rounded to the nearest integer to use in dispatching.
786813
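One way to solve for FairFrac is sketched below, under stated assumptions. It sorts the levels by the ratio `MinCurrentCL(i) / Target(i)`, which is the FairFrac value below which a level is held at its lower bound, and then pins levels to their lower bound one at a time until the remaining levels can share the remaining seats proportionally. The `level` struct, the `adjust` function, and the sample numbers are hypothetical; this is not necessarily the algorithm the implementation uses.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// level carries the per-level inputs to one adjustment (hypothetical struct).
type level struct {
	NominalCL, MinCL float64
	HighSD, SmoothSD float64
}

// adjust returns FairFrac and the floating-point CurrentCL values.
func adjust(serverCL float64, levels []level) (float64, []float64) {
	n := len(levels)
	minCur := make([]float64, n) // MinCurrentCL(i)
	target := make([]float64, n) // Target(i)
	order := make([]int, n)
	remTarget := 0.0
	for i, l := range levels {
		minCur[i] = math.Max(l.MinCL, math.Min(l.NominalCL, l.HighSD))
		target[i] = math.Max(l.SmoothSD, minCur[i])
		remTarget += target[i]
		order[i] = i
	}
	// ratio(i) is the FairFrac below which level i sits at its lower bound.
	ratio := func(i int) float64 {
		if target[i] == 0 {
			return 0 // Target >= MinCurrentCL, so both are zero here.
		}
		return minCur[i] / target[i]
	}
	// Visit levels from the most tightly bound to the least.
	sort.Slice(order, func(a, b int) bool { return ratio(order[a]) > ratio(order[b]) })

	remSeats := serverCL
	fairFrac := 0.0
	for _, i := range order {
		if remTarget <= 0 {
			break // nothing left to share proportionally
		}
		fairFrac = remSeats / remTarget
		if fairFrac >= ratio(i) {
			break // every remaining level is above its lower bound
		}
		// Pin level i to its lower bound and share the rest among the others.
		remSeats -= minCur[i]
		remTarget -= target[i]
	}
	cur := make([]float64, n)
	for i := range cur {
		cur[i] = math.Max(fairFrac*target[i], minCur[i])
	}
	return fairFrac, cur
}

func main() {
	levels := []level{
		{NominalCL: 50, MinCL: 25, HighSD: 10, SmoothSD: 5},
		{NominalCL: 30, MinCL: 30, HighSD: 80, SmoothSD: 60},
		{NominalCL: 20, MinCL: 10, HighSD: 40, SmoothSD: 40},
	}
	fairFrac, cur := adjust(100, levels)
	fmt.Println(fairFrac, cur)
	for _, c := range cur {
		fmt.Println(math.Round(c)) // integer limit used in dispatching
	}
}
```

In this made-up example the Target values are 25, 60, and 40, which sum to 125. A plain scaling by 100/125 would push the first level below its MinCurrentCL of 25, so that level is pinned at 25 and the other two share the remaining 75 seats with FairFrac = 0.75, giving CurrentCL values of 25, 45, and 30.
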
787814### Fair Queuing for Server Requests  
788815
@@ -1917,7 +1944,7 @@ others, at any given time this may compute for some priority level(s)
19171944an assured concurrency value that is lower than the number currently 
19181945executing.  In these situations the total number allowed to execute 
19191946will temporarily exceed the apiserver's configured concurrency limit 
1920- (`SCL `) and will settle down to the configured limit as requests 
1947+ (`ServerCL `) and will settle down to the configured limit as requests 
19211948complete their service. 
19221949
19231950### Default Behavior
@@ -1991,6 +2018,17 @@ This KEP adds the following metrics.
19912018- apiserver_dispatched_requests (count, broken down by priority, FlowSchema) 
19922019- apiserver_wait_duration (histogram, broken down by priority, FlowSchema) 
19932020- apiserver_service_duration (histogram, broken down by priority, FlowSchema) 
2021+ - ` apiserver_flowcontrol_request_concurrency_limit` (gauge of NominalCL, broken down by priority) 
2022+ - ` apiserver_flowcontrol_request_min_concurrency_limit` (gauge of MinCL, broken down by priority) 
2023+ - ` apiserver_flowcontrol_request_current_concurrency_limit` (gauge of CurrentCL, broken down by priority) 
2024+ - ` apiserver_flowcontrol_demand_seats` (timing ratio histogram of seat demand / NominalCL, broken down by priority) 
2025+ - ` apiserver_flowcontrol_demand_seats_high_water_mark` (gauge of HighSD, broken down by priority) 
2026+ - ` apiserver_flowcontrol_demand_seats_average` (gauge of AvgSD, broken down by priority) 
2027+ - ` apiserver_flowcontrol_demand_seats_stdev` (gauge of StDevSD, broken down by priority) 
2028+ - ` apiserver_flowcontrol_envelope_seats` (gauge of EnvelopeSD, broken down by priority) 
2029+ - ` apiserver_flowcontrol_smoothed_demand_seats` (gauge of SmoothSD, broken down by priority) 
2030+ - ` apiserver_flowcontrol_target_seats` (gauge of Target, broken down by priority) 
2031+ - ` apiserver_flowcontrol_seat_fair_frac` (gauge of FairFrac) 
19942032
19952033### Testing
19962034