Making batch object deletion more robust #478

jamiehannaford · 2014-12-01T10:53:14Z

When a container is deleted with true passed as an argument, the SDK deletes every object inside the container before issuing a final DELETE request for the container itself. The current approach is not optimal because:

1. An individual DELETE request is sent per object. For large collections, this is a massive performance hit to the local client.
2. Once all requests have been sent, the remote API needs time to process them, but we aren't waiting for that to happen.

To address these issues, I've implemented two fixes:

1. Instead of populating an array of requests (one per object), we now delete objects through the service's batch delete method, which supports up to 10,000 object deletions per request.
2. After these batch delete requests are sent, we now wait until the X-Container-Object-Count metadata value equals 0 (by repeatedly polling with HEAD requests).
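
The waiting step can be illustrated with a minimal sketch. It is not the code from this PR: `waitUntilEmpty` and the `$fetchCount` callable are hypothetical names, and the callable stands in for a HEAD request that returns the X-Container-Object-Count value. The timeout and interval defaults are arbitrary assumptions.

```php
<?php

/**
 * Polls until the reported object count reaches zero or the timeout elapses.
 *
 * $fetchCount is a hypothetical callable standing in for a HEAD request that
 * returns the X-Container-Object-Count header value.
 */
function waitUntilEmpty(callable $fetchCount, float $timeoutSeconds = 60.0, float $intervalSeconds = 1.0): bool
{
    $deadline = microtime(true) + $timeoutSeconds;

    do {
        // Header values arrive as strings, so normalise before comparing.
        if ((int) $fetchCount() === 0) {
            return true;
        }
        // Pause between polls so the remote API is not hammered.
        usleep((int) ($intervalSeconds * 1000000));
    } while (microtime(true) < $deadline);

    // The container still reported objects when the timeout expired.
    return false;
}

// Usage with a fake counter that drains over three polls.
$remaining = ['2', '1', '0'];
$isEmpty = waitUntilEmpty(function () use (&$remaining) {
    return array_shift($remaining);
}, 5.0, 0.01);
```

Returning a boolean rather than throwing leaves the caller free to decide whether a timeout should abort the container DELETE or merely log a warning.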

This should go some way to fixing #477.

ycombinator · 2014-12-01T12:48:45Z

Thanks @jamiehannaford. I'll review this within a couple of hours.

ycombinator · 2014-12-01T13:01:40Z

lib/OpenCloud/ObjectStore/Resource/Container.php

Nit: make the 10000 a class constant.

jamiehannaford · 2014-12-01T14:57:45Z

@ycombinator Ready for review again

ycombinator · 2014-12-01T15:01:04Z

lib/OpenCloud/ObjectStore/Resource/Container.php

Thinking about this some more, I wonder if the chunking should be handled inside the bulkDelete method. This way every caller of it wouldn't have to worry about it.

jamiehannaford · Dec 1, 2014

I thought about that too, but the problem with moving the chunking to bulkDelete is that we'd also need to change the method's return type from a single Response to an array of Responses, which is a non-BC (backwards-incompatible) change.

ycombinator · Dec 1, 2014

Oh, good point. What do you think about providing a convenience method alongside bulkDelete, then? This method would implement the chunking and return an array of Responses.
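
A rough sketch of what such a convenience method could look like follows. It is an illustration only, not the SDK's implementation: the `BatchDeleter` class, the `MAX_BATCH_SIZE` constant name, and the `$sendBatch` callable (a stand-in for a single-batch call such as bulkDelete) are all hypothetical. Only the 10,000 limit comes from this PR.

```php
<?php

final class BatchDeleter
{
    // Hypothetical constant name; 10,000 is the per-request limit quoted in this PR.
    const MAX_BATCH_SIZE = 10000;

    /** @var callable Sends one batch of paths and returns one response. */
    private $sendBatch;

    public function __construct(callable $sendBatch)
    {
        $this->sendBatch = $sendBatch;
    }

    /**
     * Splits the paths into batches and sends each batch separately.
     *
     * @param string[] $paths
     * @return array One response per batch, in the order the batches were sent.
     */
    public function deleteAll(array $paths): array
    {
        $responses = [];
        foreach (array_chunk($paths, self::MAX_BATCH_SIZE) as $batch) {
            $responses[] = call_user_func($this->sendBatch, $batch);
        }

        return $responses;
    }
}

// Usage: the stand-in sender returns the batch size in place of a real response.
$paths = [];
for ($i = 0; $i < 25000; $i++) {
    $paths[] = "/container/object-$i";
}

$deleter = new BatchDeleter(function (array $batch) {
    return count($batch);
});
$responses = $deleter->deleteAll($paths);
// $responses holds three entries, for batches of 10000, 10000 and 5000 paths.
```

Because the chunking lives in a new method, the existing single-batch method keeps returning a single response and current callers are unaffected.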

ycombinator · 2014-12-01T15:32:50Z

I've been testing this change with the test script. So far I've run the script three times and it has failed every time with the same 409 Conflict error as before. When I check the container, it still contains one or two objects.

So either the bulk delete API is also intended to be asynchronous with the actual deletes or this is a bug upstream. @jamiehannaford: Could you try the test script a couple of times as well, please?

jamiehannaford · 2014-12-01T16:22:39Z

@ycombinator This is ready for another round of review. Before making the above changes, I tested on two very large containers (one had 1000+ objects, the other 600+) and both deleted all their objects successfully. The only bug was that the X-Container-Object-Count value was returned as a string, so the === 0 check kept failing (which meant the loop continued until the timeout).
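
The string-versus-integer pitfall described above can be reproduced in isolation. This is a minimal sketch, with `$headerValue` as a hypothetical stand-in for the value read from the X-Container-Object-Count header:

```php
<?php

// Header values are strings, so a strict comparison with integer 0 never matches.
$headerValue = '0';

// Strict comparison checks type as well as value: string '0' is not integer 0.
$strictAgainstInt = ($headerValue === 0);

// Casting to int first makes the strict comparison behave as intended.
$castThenStrict = ((int) $headerValue === 0);
```

Here `$strictAgainstInt` is false and `$castThenStrict` is true, which is why an uncast check keeps a polling loop running until its timeout.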

ycombinator · 2014-12-01T16:42:44Z

@jamiehannaford:

1. Our friend, the PSR-2 linter, failed the build.
2. I just tried the test script with this code and it failed again with the same 409 Conflict error. In your tests with the large containers, how large were the files inside the containers and which region were you using? The test script container has 100 files, each up to 10kB in size, and is located in DFW.

jamiehannaford · 2014-12-01T17:21:51Z

@ycombinator PSR stuff should be fixed. The other issue is weird. I just tested again with a container in IAD (800+ objects, most of which were over 40KB in size), and it deleted everything successfully (all the objects and the container) without a 409...

ycombinator · 2014-12-01T17:31:01Z

@jamiehannaford Interesting. Let me try IAD instead of DFW. Would you mind trying the inverse on your end (DFW instead of IAD)?

ycombinator · 2014-12-01T17:44:03Z

I just tried IAD instead of DFW three times and was unable to reproduce my issue even once! So it appears that this issue might be region-specific. I'm testing in ORD now but I think, regardless of my findings, this PR has some improvements and is good to be merged. Will merge shortly.

Jamie Hannaford added 2 commits December 1, 2014 10:46

Making batch object deletion more robust

627ea06

No need for counter 💭

26d980e

ycombinator reviewed Dec 1, 2014

ycombinator mentioned this pull request Dec 1, 2014

Deleting container and objects within causes 409 Conflict error #477

Closed

Use proper timeout and class constant

87797b0

ycombinator reviewed Dec 1, 2014

Moving chunking functionality to service client; deprecate bulkDelete; add container waiter

cf46cbf

PSR-2 fixes

4f095fc

ycombinator added a commit that referenced this pull request Dec 1, 2014

Merge pull request #478 from jamiehannaford/cf-batch-delete

0fe266a

Making batch object deletion more robust

ycombinator merged commit 0fe266a into rackspace:working Dec 1, 2014

jamiehannaford deleted the cf-batch-delete branch December 1, 2014 18:43

2 participants