@ngxson (Collaborator) commented Mar 13, 2025

Supersedes #12347 and #12323.

Closes #12264.

I checked the code base, and it turns out that n_predict == -2 is only supported in main.cpp and infill.cpp.

For the server, --no-context-shift does the same thing, so it doesn't make sense to add n_predict == -2 support there (which turns out to be quite messy).
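
For readers unfamiliar with the special values, the sketch below illustrates the n_predict semantics being discussed: -1 means no limit, and -2 means "generate until the context is filled". It is not code from llama.cpp. The function name `remaining_tokens` and its parameters are hypothetical, chosen only for illustration, and it assumes `n_past` counts every token currently held in the context.

```cpp
#include <algorithm>
#include <limits>

// Hypothetical helper, NOT part of llama.cpp: how many more tokens may be
// generated under each n_predict mode.
//   n_predict >= 0  : stop after n_predict generated tokens
//   n_predict == -1 : no limit on generated tokens
//   n_predict == -2 : stop once the context is full
// Assumption: n_past is the number of tokens currently in the context.
int remaining_tokens(int n_predict, int n_ctx, int n_past, int n_generated) {
    // Free slots left in the context window, never negative.
    const int ctx_left = std::max(0, n_ctx - n_past);

    if (n_predict == -2) {
        return ctx_left;                          // bounded by the context only
    }
    if (n_predict == -1) {
        return std::numeric_limits<int>::max();   // effectively unbounded
    }
    return std::max(0, n_predict - n_generated);  // bounded by the explicit budget
}
```

With a 4096-token context that already holds 4000 tokens, `remaining_tokens(-2, 4096, 4000, 10)` yields 96, whereas an explicit budget such as `remaining_tokens(16, 4096, 100, 6)` yields 10 regardless of how much context is free.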

@ngxson requested a review from ggerganov March 13, 2025 10:29

@ggerganov (Member) left a comment

We should remove -2 from the main and infill examples and use --no-context-shift there too, but we can do it later.

@ngxson merged commit be7c303 into ggml-org:master Mar 13, 2025
47 checks passed
jpohhhh pushed a commit to Telosnex/llama.cpp that referenced this pull request Mar 14, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Mar 19, 2025
@Martin-Laclaustra

Please reconsider: there is a real need for n_predict = -2 in the server example, and --no-context-shift is not equivalent to stopping at the end of the context:

#12264 (comment)

Development

Successfully merging this pull request may close these issues.

Eval bug: server API endpoint not respecting n_predict with -2 (until context filled)