
Add delay logic between Rep's retries to download droplets #628

@vlast3k

Description


Summary

When Rep is downloading droplets from a blobstore, the hyperscaler may in some cases apply throttling. For example, Azure has a bandwidth limit of roughly 100-150 Gbps, and as soon as this threshold is reached some HTTP requests are terminated with "503 ServerBusy" so that the maximum bandwidth is not exceeded. For some reason Azure does not simply reduce the download speed of all connections, but instead terminates some of them. It also does not respond with 429 (the standard throttling status) but with 503.

We tried to work around this by decreasing diego.executor.max_concurrent_downloads to 2, but there was no improvement (we are updating ~45 cells in parallel). For now we will decrease the max_in_flight property, but this is only a temporary solution and will increase the update time.

This is why we think it would be good to change the code that handles retries in case of failure. It seems to be here:
https://github.com/cloudfoundry/cacheddownloader/blob/master/downloader.go#L213-L215
and add some delay in the case of 429 (possibly also honoring the "Retry-After" header) and 503 ServerBusy (specifically for Azure).

We should discuss to what extent this should be configurable:

  • a plain on/off switch to enable/disable the functionality
  • or a configurable delay, possibly with some randomness
  • or just a preset delay of e.g. 5 seconds when those errors appear

Diego repo

https://github.com/cloudfoundry/cacheddownloader

Describe alternatives you've considered (optional)

  • decrease diego.executor.max_concurrent_downloads from 5 to 2 - for some reason this did not help. Our assumption is that Azure sums the downloaded data over a time window, so the limit is reached regardless of the number of threads
  • decrease max_in_flight - this will be our current workaround, though it will increase the update time
  • use bigger VMs for the Diego cells, so that we update fewer of them in parallel - this is something we are currently working on, but it is also a temporary solution

