It looks like recent updates to llama.cpp (e.g. ggml-org/llama.cpp#1797) have changed the API significantly with regard to how "state" is handled. The llama_model is now loaded with a single API call (llama_load_model_from_file), which loads all of the static data (weights, vocabulary, etc.), and you can then create one or more independent states over it (llama_new_context_with_model).
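For illustration, here is a minimal sketch of that split. The two function names are taken from the issue itself; the parameter struct, the model path, and the free/cleanup calls are assumptions based on the llama.cpp API of that period and may differ in the current headers:

```c
#include "llama.h"

int main(void) {
    // Default parameters (assumed helper from llama.h of that era).
    struct llama_context_params params = llama_context_default_params();

    // Load the static data (weights, vocabulary, etc.) exactly once.
    // "model.bin" is a placeholder path.
    struct llama_model * model = llama_load_model_from_file("model.bin", params);
    if (model == NULL) {
        return 1;
    }

    // Create one or more states ("contexts") sharing the same model.
    struct llama_context * ctx_a = llama_new_context_with_model(model, params);
    struct llama_context * ctx_b = llama_new_context_with_model(model, params);

    // ... evaluate tokens independently in each context ...

    // Contexts must be freed before the model they reference.
    llama_free(ctx_a);
    llama_free(ctx_b);
    llama_free_model(model);
    return 0;
}
```

The point of the new split is that the expensive, read-only data is loaded once, while per-session state (KV cache, etc.) lives in each context.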
Is anyone else working on this? If not I'm happy to have a go at it.