-
Notifications
You must be signed in to change notification settings - Fork 470
feat: support custom generation control of executors. #364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Overall looks good! Over in the other PR I was confused about how the |
/// <param name="inferenceParams">The inference params used in the current generation.</param> | ||
/// <param name="lastOutputText">The last output text generated by the model.</param> | ||
/// <returns></returns> | ||
bool ShouldStopGeneration(LLamaContext context, IInferenceParams inferenceParams, string lastOutputText); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought - is it possible to decouple this interface from the IInferenceParams
somehow? It feels a bit odd that GenerationControl
is overriding parts of IInferenceParams
and is also configured by it.
The obvious way to do that would be to remove the inferenceParams
parameter here, that would introduce problems in DefaultGenerationControl
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I also thought of that and I could come up with the following alternatives:
- Change the API of
ILLamaExecutor
toInferAsync(string text, IInferenceParams? inferenceParams = null, IGenerationControl? control = null, CancellationToken token = default);
or similar formats. It will introduce some break changes though. - Use generic type. Change the
ILLamaExecutor
toILLamaExecutor<TControl>
, or change the methodInferAsync
. - Use chained-call. For example, in the sight of users, they could do
var executor = new StatelessExecutor().WithControl()
.
Each of them has advantages and disadvantages. The most important one I think is that whether we should make it class level or method level. Actually I can't think up a case that the user must use different control strategies in different calls of an executor, though it does have such flexibility. If to get a compromise of it, I'd like to suggest the following proposal:
- Use the option 3 above, the chained-call for
ILLamaExecutor
, takingWithGenerationControl
as an extension method. - Define a static method to execute the generation, just like what I did in [WIP] refactor: init some experimental refactoring. #362. In this static method, we could introduce any break change, like option 1 above. Since dotnet 7, we could define a static method in the interface but in older version it's not supported. Therefore if taking this approach, some parts of the design may look a little awkward.
BTW there's one thing related with it, the sampling pipeline. Currently InferenceParams
contains many params of sampling, while specifying SamplingPipeline
will totally override them. It's completely okay in the master branch because we don't want to introduce substantial break changes now. However I'm wondering if we should refactor it before v1.0.0.
For example, we could add many kinds of sampling pipelines as we discussed here, along with a class named CustomSamplingPipeline
, which accepts all the parameters related with samping in InferenceParams
now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking more along the lines of removing the IInferenceParams inferenceParams
from the ShouldStopGeneration
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's completely okay in the master branch because we don't want to introduce substantial break changes now. However I'm wondering if we should refactor it before v1.0.0.
Yeah this is my thinking. It feels like we're moving in a direction where we can completely get rid of the inference params in the future (which is part of why I want to avoid depending on it in this new interface, if possible).
I've fixed the merge conflict. Waiting for tests now. |
Sorry for leaving this PR incomplete... Since already two months past, the design of this PR may need to be re-considered. Therefore I convert it to draft PR now |
Related discussion: #362