fix: correctly handle method parameter counting in function_extra_arg #136

benieric · 2025-03-26T19:03:17Z

Issue #, if available:

Description of changes:

This change should address errors like: model_fn() takes x positional argument but y were given
The error occurs under a race condition in which validate_and_initialize_user_module() is called by 2 workers and the extra argument is calculated incorrectly.
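To illustrate the failure mode, here is a minimal sketch of how a repeated initialization could miscount parameters. It is not the toolkit's actual code: SketchHandler, init_buggy, init_fixed, and the "/opt/ml/model" path are hypothetical names chosen for the example, and the "fixed" variant is only one possible way to make the count stable, assuming the default loader is a method that takes a model directory plus an optional context.

```python
from inspect import signature


class SketchHandler:
    """Hypothetical stand-in for the handler service, not the real class."""

    def __init__(self):
        self.context = object()  # placeholder for the serving context
        self.load_extra_arg = []

    def load(self, model_dir, context=None):
        """Default loader: model_dir plus an optional context."""
        return "default-model"

    def init_buggy(self, user_fn):
        # Compares against whatever is currently bound to self.load.
        # On a second call self.load is already user_fn, so the counts
        # match and the context is wrongly appended.
        n_default = len(signature(self.load).parameters)
        n_user = len(signature(user_fn).parameters)
        self.load_extra_arg = [self.context] if n_default == n_user else []
        self.load = user_fn

    def init_fixed(self, user_fn):
        # Always compare against the class-level default, which never
        # changes, and drop "self" from the count explicitly.
        params = signature(SketchHandler.load).parameters
        n_default = len(params) - 1 if "self" in params else len(params)
        n_user = len(signature(user_fn).parameters)
        self.load_extra_arg = [self.context] if n_default == n_user else []
        self.load = user_fn


def model_fn(model_dir):
    """User override that does not accept a context."""
    return "user-model"


if __name__ == "__main__":
    buggy = SketchHandler()
    buggy.init_buggy(model_fn)
    buggy.init_buggy(model_fn)  # second initialization
    try:
        buggy.load("/opt/ml/model", *buggy.load_extra_arg)
    except TypeError as err:
        print(err)  # the "takes 1 positional argument but 2 were given" error

    fixed = SketchHandler()
    fixed.init_fixed(model_fn)
    fixed.init_fixed(model_fn)
    print(fixed.load("/opt/ml/model", *fixed.load_extra_arg))
```

In the buggy variant the first initialization computes an empty extra-argument list, and only the second one appends the context, which matches the symptom of a model that loads once and fails later.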

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

arjkesh · 2025-03-28T00:37:39Z

Can you add a test to check for regression against the race condition you described? Besides this, LGTM

davidthomas426 · 2025-03-28T01:54:51Z

Can you add a test to check for regression against the race condition you described? Besides this, LGTM

+1.

benieric · 2025-03-28T02:17:55Z

Thanks for taking a look. I'll work on getting a test in for this.

davidthomas426 · 2025-03-28T17:38:15Z

tests/unit/test_handler_service_with_context.py

+    inference_handler.initialize(CONTEXT)
+    inference_handler.initialize(CONTEXT)


Can you clarify why there are two threads calling initialize twice in a single Python process?

Yup, it is in the handle() method:

sagemaker-huggingface-inference-toolkit/src/sagemaker_huggingface_inference_toolkit/handler_service.py

Line 234 in 92b57dd

def handle(self, data, context):

specifically this block:

    if not self.initialized:
        if self.attempted_init:
            logger.warn(
                "Model is not initialized, will try to load model again.\n"
                "Please consider increase wait time for model loading.\n"
            )
        self.initialize(context)

The test just assumes the failure condition has already occurred.

From some personal testing I was able to see that the model gets loaded and later fails when attempting to load again:

2025-03-25T18:41:48,682 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=7a3c6bfffe7cf36a-0000007c-00000000-9116165d8c9c5504-f2c269a2
...
2025-03-25T18:42:34,468 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - mms.service.PredictionException: model_fn() takes 1 positional argument but 2 were given : 400

After testing with the fix, I did not see this error.

davidthomas426

LGTM.

Though I'm still slightly confused about how a race condition will happen in this code. But I see that if the initialization function runs twice, it causes this problem, and I can see how this fixes that.
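One way the double initialization could arise is sketched below. This is an illustration only, assuming that a slow initialize() and a handle() call can overlap in one process; whether the model server actually schedules them this way is not confirmed in this thread. LazyHandler, run_two_callers, and the synchronization helpers are hypothetical names that exist only to make the interleaving deterministic.

```python
import threading


class LazyHandler:
    """Hypothetical handler with the check-then-initialize shape of handle()."""

    def __init__(self):
        self.initialized = False
        self.init_calls = 0
        self.entered = threading.Semaphore(0)  # signalled on each initialize entry
        self.release = threading.Event()       # lets the simulated load finish
        self._count_lock = threading.Lock()

    def initialize(self, context):
        with self._count_lock:
            self.init_calls += 1
        self.entered.release()
        self.release.wait(timeout=5)  # simulate a slow model load
        self.initialized = True

    def handle(self, data, context):
        if not self.initialized:      # check ...
            self.initialize(context)  # ... then act, with no lock in between
        return data


def run_two_callers():
    """Return how many times initialize() ran for one handler instance."""
    handler = LazyHandler()
    first = threading.Thread(target=handler.initialize, args=(None,))
    second = threading.Thread(target=handler.handle, args=("payload", None))

    first.start()
    handler.entered.acquire(timeout=5)  # first load is now in progress
    second.start()                      # sees initialized == False
    handler.entered.acquire(timeout=5)  # second caller re-entered initialize
    handler.release.set()               # let both loads finish

    first.join()
    second.join()
    return handler.init_calls


if __name__ == "__main__":
    print(run_two_callers())
```

Because the initialized flag is only set after the load finishes, any request that arrives while the first load is still running takes the re-initialization branch, so the parameter counting has to give the same answer on every call.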

benieric added 2 commits March 26, 2025 12:02

fix: correctly handle method parameter counting in function_extra_arg

6efa0b0

format

89be785

benieric marked this pull request as ready for review March 27, 2025 21:44

benieric added 2 commits March 28, 2025 10:20

add tests

d1db51f

fix test name

a204e71

arjkesh approved these changes Mar 28, 2025

davidthomas426 reviewed Mar 28, 2025

davidthomas426 approved these changes Mar 28, 2025

davidthomas426 merged commit 92b57dd into aws:main Mar 28, 2025
2 checks passed
