Skip to content

Conversation

SanftMonster
Copy link
Collaborator

Related discussion: #362

@SanftMonster SanftMonster added the enhancement New feature or request label Dec 14, 2023
@martindevans
Copy link
Member

Overall looks good!

Over in the other PR I was confused about how the ShouldStopGeneration(token) method works and what I was talking about there was wrong. Desregard all that :)

/// <param name="inferenceParams">The inference params used in the current generation.</param>
/// <param name="lastOutputText">The last output text generated by the model.</param>
/// <returns></returns>
bool ShouldStopGeneration(LLamaContext context, IInferenceParams inferenceParams, string lastOutputText);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought - is it possible to decouple this interface from the IInferenceParams somehow? It feels a bit odd that GenerationControl is overriding parts of IInferenceParams and is also configured by it.

The obvious way to do that would be to remove the inferenceParams parameter here, that would introduce problems in DefaultGenerationControl though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I also thought of that and I could come up with the following alternatives:

  1. Change the API of ILLamaExecutor to InferAsync(string text, IInferenceParams? inferenceParams = null, IGenerationControl? control = null, CancellationToken token = default); or similar formats. It will introduce some break changes though.
  2. Use generic type. Change the ILLamaExecutor to ILLamaExecutor<TControl>, or change the method InferAsync.
  3. Use chained-call. For example, in the sight of users, they could do var executor = new StatelessExecutor().WithControl().

Each of them has advantages and disadvantages. The most important one I think is that whether we should make it class level or method level. Actually I can't think up a case that the user must use different control strategies in different calls of an executor, though it does have such flexibility. If to get a compromise of it, I'd like to suggest the following proposal:

  1. Use the option 3 above, the chained-call for ILLamaExecutor, taking WithGenerationControl as an extension method.
  2. Define a static method to execute the generation, just like what I did in [WIP] refactor: init some experimental refactoring. #362. In this static method, we could introduce any break change, like option 1 above. Since dotnet 7, we could define a static method in the interface but in older version it's not supported. Therefore if taking this approach, some parts of the design may look a little awkward.

BTW there's one thing related with it, the sampling pipeline. Currently InferenceParams contains many params of sampling, while specifying SamplingPipeline will totally override them. It's completely okay in the master branch because we don't want to introduce substantial break changes now. However I'm wondering if we should refactor it before v1.0.0.
For example, we could add many kinds of sampling pipelines as we discussed here, along with a class named CustomSamplingPipeline, which accepts all the parameters related with samping in InferenceParams now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more along the lines of removing the IInferenceParams inferenceParams from the ShouldStopGeneration methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's completely okay in the master branch because we don't want to introduce substantial break changes now. However I'm wondering if we should refactor it before v1.0.0.

Yeah this is my thinking. It feels like we're moving in a direction where we can completely get rid of the inference params in the future (which is part of why I want to avoid depending on it in this new interface, if possible).

@martindevans
Copy link
Member

I've fixed the merge conflict. Waiting for tests now.

@SanftMonster
Copy link
Collaborator Author

Sorry for leaving this PR incomplete... Since already two months past, the design of this PR may need to be re-considered. Therefore I convert it to draft PR now

@SanftMonster SanftMonster marked this pull request as draft March 3, 2024 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants