HTTP server should vary endpoints by available models #1313

@grahamking

Description

Feature request

In a distributed setup we have an ingress node like this: dynamo-run in=http out=dyn.

That currently exposes endpoints to list models (which should always stay available), do completions and chat completions, soon embeddings, and more in the future.

However, we might only have a chat model, in which case completions and embeddings should not be advertised, and likewise for any other model type that is not being served.

We know what kind of model we are serving because each model registers itself with a ModelType.
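
To make the request concrete, here is a rough sketch of the mapping I have in mind. It is only an illustration: the `ModelType` enum below is a stand-in with assumed variants (not the real definition), `advertised_endpoints` is a hypothetical helper, and the paths are assumed OpenAI-style routes rather than anything taken from the current HTTP server code.

```rust
use std::collections::BTreeSet;

/// Stand-in for the real `ModelType`; these variants are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModelType {
    Chat,
    Completions,
    Embeddings,
}

/// Hypothetical helper: given the model types that have registered,
/// return the set of paths the ingress node should advertise.
fn advertised_endpoints(registered: &[ModelType]) -> BTreeSet<&'static str> {
    let mut endpoints = BTreeSet::new();

    // Listing models should always be available, even with nothing registered.
    endpoints.insert("/v1/models");

    // Every other endpoint appears only if a matching model type registered.
    for model_type in registered {
        let path = match model_type {
            ModelType::Chat => "/v1/chat/completions",
            ModelType::Completions => "/v1/completions",
            ModelType::Embeddings => "/v1/embeddings",
        };
        endpoints.insert(path);
    }

    endpoints
}

fn main() {
    // Only a chat model is registered, so completions and embeddings
    // are left out of the advertised set.
    for path in advertised_endpoints(&[ModelType::Chat]) {
        println!("{path}");
    }
}
```

With only a chat model registered, the set holds the model listing and chat completions routes and nothing else. A real implementation would presumably also need to add and remove routes as models register and go away, which this sketch does not attempt.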

Describe the problem you're encountering

/

Describe alternatives you've tried

No response

Metadata

    Labels

    backlog: Issues or bugs that will be tracked for long term fixes
    dynamo-llm: Relates to dynamo-llm component
    enhancement: New feature or request
    language::rust: Issues/PRs that reference Rust code
