dmchoiboi
Collaborator

Pull Request Summary

DeepSeek loads its model config and tokenizer through Python files defined in the checkpoint, so this PR passes a `trust_remote_code` option through model weight loading.
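For readers unfamiliar with why checkpoint-defined Python files need an explicit opt-in, here is a minimal, standard-library-only sketch of the idea. It is not the loader used by this repository or by any model library; `load_checkpoint_class`, `make_demo_checkpoint`, `configuration_demo.py`, and `DemoConfig` are hypothetical names made up for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_checkpoint_class(checkpoint_dir, module_file, class_name, trust_remote_code=False):
    """Import `class_name` from a Python file stored inside a checkpoint directory.

    Hypothetical helper: it mimics the opt-in gate, not any library's real loader.
    """
    if not trust_remote_code:
        # Importing the file executes arbitrary code, so refuse unless opted in.
        raise ValueError(
            f"{module_file} is custom code from the checkpoint; "
            "pass trust_remote_code=True to execute it."
        )
    path = Path(checkpoint_dir) / module_file
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the checkpoint's Python file
    return getattr(module, class_name)


def make_demo_checkpoint():
    """Create a throwaway checkpoint directory holding a tiny config class."""
    checkpoint_dir = tempfile.mkdtemp()
    source = "class DemoConfig:\n    model_type = 'demo'\n"
    (Path(checkpoint_dir) / "configuration_demo.py").write_text(source)
    return checkpoint_dir
```

Calling `load_checkpoint_class(make_demo_checkpoint(), "configuration_demo.py", "DemoConfig")` raises `ValueError`, while the same call with `trust_remote_code=True` returns the class. The default of `False` is a deliberate choice in this sketch: executing code shipped with a checkpoint should be something the caller asks for.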


Test Plan and Usage Guide


@dmchoiboi dmchoiboi requested review from seanshi-scale and yunfeng-scale and removed request for seanshi-scale October 14, 2024 17:38
@dmchoiboi dmchoiboi force-pushed the dmchoi/trust_remote_code branch from 924bd8f to eb5f813 Compare October 14, 2024 18:00
        trust_remote_code,
    )
elif checkpoint_path.startswith("azure://") or "blob.core.windows.net" in checkpoint_path:
    return self.load_model_weights_sub_commands_abs(
Contributor


should we update this too to keep things in sync?

Collaborator Author


kk done
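To make the "keep things in sync" point concrete, here is a rough sketch of a dispatcher that forwards `trust_remote_code` to every storage backend rather than only one. Only the Azure condition comes from the diff above. The `s3://` prefix, the `plan_*` helpers, the file patterns, and the idea that the flag adds `*.py` files to the download are all assumptions made for illustration, not the repository's actual code.

```python
# Assumed set of checkpoint file types; not taken from the PR.
BASE_PATTERNS = ("*.model", "*.json", "*.safetensors")


def include_patterns(trust_remote_code):
    """Return the file patterns to fetch; add Python files only when opted in."""
    patterns = list(BASE_PATTERNS)
    if trust_remote_code:
        patterns.append("*.py")  # assumption: the flag pulls in checkpoint code
    return patterns


def plan_s3(checkpoint_path, final_weights_folder, trust_remote_code):
    """Hypothetical S3 branch: describe the download instead of running it."""
    return {
        "backend": "s3",
        "source": checkpoint_path,
        "dest": final_weights_folder,
        "include": include_patterns(trust_remote_code),
    }


def plan_abs(checkpoint_path, final_weights_folder, trust_remote_code):
    """Hypothetical Azure Blob Storage branch with the same signature."""
    return {
        "backend": "abs",
        "source": checkpoint_path,
        "dest": final_weights_folder,
        "include": include_patterns(trust_remote_code),
    }


def plan_download(checkpoint_path, final_weights_folder, trust_remote_code=False):
    """Pick a backend from the path and pass the flag to whichever one is chosen."""
    if checkpoint_path.startswith("s3://"):  # assumed prefix for the first branch
        return plan_s3(checkpoint_path, final_weights_folder, trust_remote_code)
    elif checkpoint_path.startswith("azure://") or "blob.core.windows.net" in checkpoint_path:
        return plan_abs(checkpoint_path, final_weights_folder, trust_remote_code)
    raise ValueError(f"Unsupported checkpoint path: {checkpoint_path}")
```

Sharing `include_patterns` between the two branches is one way to keep them from drifting apart: a change to what the flag means only has to be made once.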

@dmchoiboi dmchoiboi merged commit 89b9ddd into main Oct 14, 2024
7 checks passed
@dmchoiboi dmchoiboi deleted the dmchoi/trust_remote_code branch October 14, 2024 19:56
