Skip to content

Conversation

jamiehannaford
Copy link
Contributor

When containers are deleted and true is passed in as an argument, the SDK will delete all the existing objects inside this container before issuing a final DELETE request for the container itself. The current method of doing this is not optimal because:

  • An individual DELETE request is sent per object. For large collections, this is a massive performance hit to the local client
  • When all requests have been sent, the remote API needs time to process all the delete requests. But we aren't waiting for this to happen.

To address these issues, I've implemented two fixes:

  1. Instead of populating an array of requests (one per object), we now delete objects through the service's batch delete method which supports up to 10,000 object deletions per request.
  2. After these batch delete requests are sent, we now wait until the X-Container-Object-Count metadata value equals 0 (by continuously polling via HEAD requests).

This should go some way to fixing #477.

@ycombinator
Copy link
Contributor

Thanks @jamiehannaford. I'll review this within a couple of hours.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: make the 10000 a class constant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

@jamiehannaford
Copy link
Contributor Author

@ycombinator Ready for review again

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this some more, I wonder if the chunking should be handled inside the bulkDelete method. This way every caller of it wouldn't have to worry about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that too, but the problem with moving the chunking to bulkDelete is that we'll also need to change the method's response type from a singular Response to an array of Responses - which is a non-BC change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point. What do you think about providing a convenience method alongside bulkDelete, then? This method would implement the chunking and return an array of Responses.

@ycombinator
Copy link
Contributor

I've been testing this change with the test script. So far I've run the script three times and it has failed all times with the same 409 Conflict error as before. When I check the container, it still contains one or two objects in it.

So either the bulk delete API is also intended to be asynchronous with the actual deletes or this is a bug upstream. @jamiehannaford: Could you try the test script a couple of times as well, please?

@jamiehannaford
Copy link
Contributor Author

@ycombinator This is ready for another round of review. I tested before I made the above changes on two very large containers (one had +1000 objects, the other +600) and both deleted all the objects successfully. The only bug was that the X-Container-Object-Count val was returned as a string, so the === 0 check kept failing (which meant the loop continued until timeout).

@ycombinator
Copy link
Contributor

@jamiehannaford:

  1. Our friend, the PSR-2 linter, failed the build
  2. I just tried the test script with this code and it failed again with the same 409 Conflict error. In your tests with the large containers how large were the files inside the containers and which region were you using? The test script container has 100 files, each up to 10kB in size, and is located in DFW.

@jamiehannaford
Copy link
Contributor Author

@ycombinator PSR stuff should be fixed. The other issue is weird. I just tested again with a container in IAD (800+ objects, most of which were over 40KB in size), and it deleted everything successfully (all the objects and the container) without a 409...

@ycombinator
Copy link
Contributor

@jamiehannaford Interesting. Let me try IAD instead of DFW. Would you mind trying the inverse on your end (DFW instead of IAD)?

@ycombinator
Copy link
Contributor

I just tried IAD instead of DFW three times and was unable to reproduce my issue even once! So it appears that this issue might be region-specific. I'm testing in ORD now but I think, regardless of my findings, this PR has some improvements and is good to be merged. Will merge shortly.

ycombinator added a commit that referenced this pull request Dec 1, 2014
Making batch object deletion more robust
@ycombinator ycombinator merged commit 0fe266a into rackspace:working Dec 1, 2014
@jamiehannaford jamiehannaford deleted the cf-batch-delete branch December 1, 2014 18:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants