It looks like recent updates to llama.cpp (e.g. ggml-org/llama.cpp#1797) have changed the API significantly with regard to how "state" is handled. The llama_model is now loaded with a single API call (llama_load_model_from_file), which loads all of the static data (weights, vocabulary, etc.), and you can then create one or more independent states over it (llama_new_context_with_model).
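For illustration, here is a minimal sketch of that split. The two function names are taken from the issue itself; the parameter struct, the model path, and the free/cleanup calls are assumptions based on the llama.cpp API of that period and may differ in the current headers:

```c
#include "llama.h"

int main(void) {
    // Default parameters (assumed helper from llama.h of that era).
    struct llama_context_params params = llama_context_default_params();

    // Load the static data (weights, vocabulary, etc.) exactly once.
    // "model.bin" is a placeholder path.
    struct llama_model * model = llama_load_model_from_file("model.bin", params);
    if (model == NULL) {
        return 1;
    }

    // Create one or more states ("contexts") sharing the same model.
    struct llama_context * ctx_a = llama_new_context_with_model(model, params);
    struct llama_context * ctx_b = llama_new_context_with_model(model, params);

    // ... evaluate tokens independently in each context ...

    // Contexts must be freed before the model they reference.
    llama_free(ctx_a);
    llama_free(ctx_b);
    llama_free_model(model);
    return 0;
}
```

The point of the new split is that the expensive, read-only data is loaded once, while per-session state (KV cache, etc.) lives in each context.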
Is anyone else working on this? If not I'm happy to have a go at it.