fix(calling): fix 429 re-try logic #4559
base: next
@@ -727,6 +727,110 @@ describe('Registration Tests', () => {
     });
   });

+  describe('restorePreviousRegistration 429 handling tests', () => {
+    beforeEach(() => {
+      jest.useFakeTimers();
+    });
+
+    afterEach(() => {
+      jest.clearAllTimers();
+      jest.clearAllMocks();
+      jest.useRealTimers();
+    });
+
+    it('should schedule retry when 429 with retry-after < 60 seconds during reconnect', async () => {
+      restartSpy = jest.spyOn(reg, 'restartRegistration');
+      reg.setActiveMobiusUrl(mobiusUris.primary[0]);
+      const failurePayload429Small = <WebexRequestPayload>(<unknown>{
+        statusCode: 429,
+        body: 'SOMETHING RANDOM',
+        headers: {
+          'retry-after': 30,
+        },
+      });
+      webex.request
+        .mockRejectedValueOnce(failurePayload429Small)
+        .mockResolvedValueOnce(successPayload)
+        .mockRejectedValueOnce(failurePayload429Small)
+        .mockResolvedValueOnce(successPayload);
+
+      await reg.reconnectOnFailure('SOMETHING RANDOM'); // This call is being used to set the retry-after value
Review comment: Why do we have to invoke `reconnectOnFailure` twice?
+      await reg.reconnectOnFailure('SOMETHING RANDOM'); // This call is being used to trigger the retry
+      jest.advanceTimersByTime(40 * SEC_TO_MSEC_MFACTOR);
+      await flushPromises();
+
+      expect(restartSpy).toHaveBeenCalledTimes(1);
Review comment: We should have more expect statements to check that restore is invoked first and that restart is then invoked with the retry-after value.
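A rough sketch of the assertions this comment asks for; `restoreSpy` is an assumed spy on `restorePreviousRegistration` (created the same way as `restartSpy` above, possibly via a cast since the method is private), not something this PR adds:

```ts
// Sketch only: verify the restore -> scheduled restart sequence for the 30s retry-after case.
expect(restoreSpy).toHaveBeenCalledWith('SOMETHING RANDOM'); // restore is attempted first
expect(restartSpy).toHaveBeenCalledTimes(1); // restart fires only after the retry-after delay elapses
expect(restartSpy).toHaveBeenCalledWith('SOMETHING RANDOM');
```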
+      expect(restartSpy).toHaveBeenCalledWith('SOMETHING RANDOM');
+    });
+
+    it('should try backup servers when 429 with retry-after >= 60 seconds on primary during reconnect', async () => {
+      // Setup: Register successfully with primary first
+      const attemptRegistrationWithServersSpy = jest.spyOn(reg, 'attemptRegistrationWithServers');
+      reg.setActiveMobiusUrl(mobiusUris.primary[0]);
+      const failurePayload429Small = <WebexRequestPayload>(<unknown>{
+        statusCode: 429,
+        body: 'SOMETHING RANDOM',
+        headers: {
+          'retry-after': 100,
+        },
+      });
+      webex.request
+        .mockRejectedValueOnce(failurePayload429Small)
+        .mockResolvedValueOnce(successPayload)
+        .mockRejectedValueOnce(failurePayload429Small)
+        .mockResolvedValueOnce(successPayload);
+
+      await reg.reconnectOnFailure('SOMETHING RANDOM'); // This call is being used to trigger the retry
+      jest.advanceTimersByTime(40 * SEC_TO_MSEC_MFACTOR);
+      await flushPromises();
+
+      expect(attemptRegistrationWithServersSpy).toHaveBeenCalledTimes(2);
+      expect(attemptRegistrationWithServersSpy).toHaveBeenNthCalledWith(1, 'SOMETHING RANDOM', [
+        mobiusUris.primary[0],
+      ]);
+      // Immediately try backup servers when retry-after >= 60 seconds on primary
+      expect(attemptRegistrationWithServersSpy).toHaveBeenNthCalledWith(
+        2,
+        'SOMETHING RANDOM',
+        mobiusUris.backup
+      );
+    });
+
+    it('should restart registration with primary if we get 429 while on backup', async () => {
+      // Setup: Register successfully with primary first
+      restartSpy = jest.spyOn(reg, 'restartRegistration');
+      const attemptRegistrationWithServersSpy = jest.spyOn(reg, 'attemptRegistrationWithServers');
+      reg.setActiveMobiusUrl(mobiusUris.backup[0]);
+      const failurePayload429Small = <WebexRequestPayload>(<unknown>{
+        statusCode: 429,
+        body: 'SOMETHING RANDOM',
+        headers: {
+          'retry-after': 100,
+        },
+      });
+      webex.request
+        .mockRejectedValueOnce(failurePayload429Small)
+        .mockResolvedValueOnce(successPayload)
+        .mockRejectedValueOnce(failurePayload429Small)
+        .mockResolvedValueOnce(successPayload);
+
+      await reg.reconnectOnFailure('SOMETHING RANDOM'); // This call is being used to trigger the retry
+      jest.advanceTimersByTime(40 * SEC_TO_MSEC_MFACTOR);
+      await flushPromises();
+
+      expect(attemptRegistrationWithServersSpy).toHaveBeenCalledTimes(2);
+      expect(attemptRegistrationWithServersSpy).toHaveBeenNthCalledWith(1, 'SOMETHING RANDOM', [
+        mobiusUris.backup[0],
+      ]);
+      // Immediately try primary servers when retry-after >= 60 seconds on backup
+      expect(restartSpy).toHaveBeenCalledTimes(1);
+      expect(restartSpy).toHaveBeenCalledWith('SOMETHING RANDOM');
+      expect(attemptRegistrationWithServersSpy).toHaveBeenNthCalledWith(2, 'SOMETHING RANDOM', [
+        mobiusUris.primary[0],
Review comment: We should add expect statements for all function invocations in sequence: restore being invoked, failing with 429, and then whether the flow goes to restart from there, directly to attemptRegistration, or to the failover timer function as well.
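One way to assert that sequence is Jest's `mock.invocationCallOrder`, sketched below; `restoreSpy` is an assumed spy on `restorePreviousRegistration` (not part of this PR) and must be created before `reg.reconnectOnFailure` is called so the order is captured:

```ts
// Sketch only: restore runs first, its registration attempt on backup fails with a 429
// whose retry-after is >= 60, and only then is restartRegistration invoked to go back to primary.
const [restoreOrder] = restoreSpy.mock.invocationCallOrder;
const [attemptOrder] = attemptRegistrationWithServersSpy.mock.invocationCallOrder;
const [restartOrder] = restartSpy.mock.invocationCallOrder;

expect(restoreOrder).toBeLessThan(attemptOrder);
expect(attemptOrder).toBeLessThan(restartOrder);
```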
+      ]);
+    });
+  });
+
   describe('Registration failover tests', () => {
     it('verify unreachable primary with reachable backup servers', async () => {
       jest.useFakeTimers();
@@ -872,7 +976,10 @@ describe('Registration Tests', () => {
       // delete should be successful
       global.fetch = jest.fn(() => Promise.resolve({json: () => mockDeleteResponse})) as jest.Mock;

-      postRegistrationSpy.mockRejectedValue(failurePayload429Two);
+      // Mock to fail twice with 429 (once for executeFailback, once for restorePreviousRegistration)
+      postRegistrationSpy
+        .mockRejectedValueOnce(failurePayload429Two)
+        .mockRejectedValueOnce(failurePayload429Two);

       /* Wait for failback to be triggered. */
       jest.advanceTimersByTime(
@@ -892,11 +999,14 @@ describe('Registration Tests', () => {
         failurePayload429Two.headers['retry-after'],
         'executeFailback'
       );
-      expect(reg.failback429RetryAttempts).toBe(0);
+      // After handling 429 during failback, the counter is incremented to 1
+      expect(reg.failback429RetryAttempts).toBe(1);
       expect(reg.getStatus()).toBe(RegistrationStatus.INACTIVE);
       expect(restoreSpy).toBeCalledOnceWith(REG_429_RETRY_UTIL);
-      expect(restartSpy).toBeCalledOnceWith(REG_429_RETRY_UTIL);
-      expect(reg.failbackTimer).toBe(undefined);
+      // restartRegistration is not called immediately because 429 with retry-after < 60
+      // schedules a delayed retry instead
+      expect(restartSpy).not.toBeCalled();
+      expect(reg.failbackTimer).not.toBe(undefined); // Timer is set in handle429Retry
       expect(reg.rehomingIntervalMin).toBe(DEFAULT_REHOMING_INTERVAL_MIN);
       expect(reg.rehomingIntervalMax).toBe(DEFAULT_REHOMING_INTERVAL_MAX);
     });
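Given the new comment about the delayed retry, a possible follow-up assertion for this test would advance the fake timers past the mocked retry-after and confirm the scheduled restart actually fires. This is only a sketch; it assumes the fake timers from this test are still active and that the scheduled restart reuses the same caller that restore was invoked with:

```ts
// Advance past the retry-after value from the mocked 429 so the setTimeout
// scheduled inside restorePreviousRegistration gets a chance to run.
jest.advanceTimersByTime(
  (failurePayload429Two.headers['retry-after'] + 1) * SEC_TO_MSEC_MFACTOR
);
await flushPromises();

// The delayed retry should now have restarted registration with the same caller (assumed behavior).
expect(restartSpy).toBeCalledOnceWith(REG_429_RETRY_UTIL);
```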
@@ -207,9 +207,32 @@ export class Registration implements IRegistration {
    */
   private async restorePreviousRegistration(caller: string): Promise<boolean> {
     let abort = false;

+    // Testing:
+    // 4XX
+    // 429 big n small value
+    // 404
+    // 500
+
Review comment (on lines +210 to +214): We can remove this comment or make it more verbose.
     if (this.activeMobiusUrl) {
       abort = await this.attemptRegistrationWithServers(caller, [this.activeMobiusUrl]);
+      if (this.retryAfter) {
+        if (this.retryAfter < RETRY_TIMER_UPPER_LIMIT) {
+          // If retry-after is less than threshold, honor it and schedule retry
+          setTimeout(async () => {
+            await this.restartRegistration(caller);
+          }, this.retryAfter * 1000);
+        } else if (
+          this.primaryMobiusUris.includes(this.activeMobiusUrl) &&
+          this.backupMobiusUris.length > 0
+        ) {
+          // If we are using primary and got 429, switch to backup
+          abort = await this.attemptRegistrationWithServers(caller, this.backupMobiusUris);
+        } else {
+          // If we are using backup and got 429, restart registration
+          this.restartRegistration(caller);
+        }
Review comment (on lines +217 to +232): Why are we adding this whole logic here? It seems a bit confusing; it could be moved to a separate function named something like "handling429RegResponse" and then called inside some catch block.

Author reply: @rarajes2 We already have a callback function created for 429 handling, and we update the retry-after value there. The reason we cannot have a separate function and handle retries separately is that we need to keep our retries consolidated. With this approach we are not scheduling any new retry as such; we only call the existing functions and ensure the retry-after value is honored while invoking them. This avoids cases where multiple retries are scheduled from different flows. Let's discuss this over a call if it's still confusing.
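For reference, a rough sketch of the extraction the first comment suggests. The helper name and its return-value semantics are assumptions, not part of this PR; it only rearranges the branches already shown above:

```ts
/**
 * Hypothetical extraction of the 429 handling above. It keeps the retries
 * consolidated (the same existing functions are invoked) but isolates the
 * branching so it can be unit tested on its own.
 */
private async handle429RegResponse(caller: string): Promise<boolean> {
  if (this.retryAfter < RETRY_TIMER_UPPER_LIMIT) {
    // Small retry-after: honor it and schedule a delayed restart.
    setTimeout(async () => {
      await this.restartRegistration(caller);
    }, this.retryAfter * 1000);

    return false;
  }

  if (
    this.activeMobiusUrl &&
    this.primaryMobiusUris.includes(this.activeMobiusUrl) &&
    this.backupMobiusUris.length > 0
  ) {
    // Large retry-after while on primary: fail over to the backup servers.
    return this.attemptRegistrationWithServers(caller, this.backupMobiusUris);
  }

  // Large retry-after while on backup: restart registration from primary.
  await this.restartRegistration(caller);

  return false;
}
```

restorePreviousRegistration would then reduce to calling this helper when `this.retryAfter` is set, which preserves the consolidation described in the reply while making each branch easier to assert in tests.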
+
+        return true;
Review comment: Are we supposed to return `true` here unconditionally?

Author reply: Wherever we call the restore function, we immediately invoke restart right after it, which we want to avoid here; hence we return true in a hardcoded manner. I am wondering if there is a better way to handle this, though.
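For context, the call-site pattern the reply describes looks roughly like the following; the surrounding condition is an assumption about how the caller (e.g. `reconnectOnFailure`) uses the returned flag, not code copied from this file:

```ts
// Hypothetical shape of the call site described in the reply.
const abort = await this.restorePreviousRegistration(caller);

if (!abort) {
  // Without the hardcoded `true` above, a 429 that already scheduled its own
  // delayed retry (or already failed over to backup) would also trigger this
  // immediate restart, producing duplicate retries.
  await this.restartRegistration(caller);
}
```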
+      }
     }

     return abort;
Review comment: We can use the proper caller name from wherever reconnectOnFailure is being invoked.