Commit 2002bc9
server : refactor (#5882)
* server : refactoring (wip)
* server : remove llava/clip objects from build
* server : fix empty prompt handling + all slots idle logic
* server : normalize id vars
* server : code style
* server : simplify model chat template validation
* server : code style
* server : minor
* llama : llama_chat_apply_template support null buf
* server : do not process embedding requests when disabled
* server : reorganize structs and enums + naming fixes
* server : merge oai.hpp in utils.hpp
* server : refactor system prompt update at start
* server : disable cached prompts with self-extend
* server : do not process more than n_batch tokens per iter
* server: tests: embeddings use a real embeddings model (#5908)
* server, tests : bump batch to fit 1 embedding prompt
* server: tests: embeddings fix build type Debug is randomly failing (#5911)
* server: tests: embeddings, use different KV Cache size
* server: tests: embeddings, fixed prompt do not exceed n_batch, increase embedding timeout, reduce number of concurrent embeddings
* server: tests: embeddings, no need to wait for server idle as it can timout
* server: refactor: clean up http code (#5912)
* server : avoid n_available var
ggml-ci
* server: refactor: better http codes
* server : simplify json parsing + add comment about t_last
* server : rename server structs
* server : allow to override FQDN in tests
ggml-ci
* server : add comments
---------
Co-authored-by: Pierrick Hymbert <[email protected]>1 parent ceca1ae commit 2002bc9
File tree
14 files changed
+2265
-2711
lines changed- .github/workflows
- examples
- server
- tests
- features
- steps
14 files changed
+2265
-2711
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
58 | 58 | | |
59 | 59 | | |
60 | 60 | | |
61 | | - | |
| 61 | + | |
| 62 | + | |
62 | 63 | | |
63 | 64 | | |
64 | 65 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
724 | 724 | | |
725 | 725 | | |
726 | 726 | | |
727 | | - | |
| 727 | + | |
728 | 728 | | |
729 | | - | |
730 | | - | |
| 729 | + | |
731 | 730 | | |
732 | 731 | | |
733 | 732 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
16 | | - | |
| 16 | + | |
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
| 4 | + | |
5 | 5 | | |
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
436 | 436 | | |
437 | 437 | | |
438 | 438 | | |
439 | | - | |
| 439 | + | |
440 | 440 | | |
441 | 441 | | |
442 | 442 | | |
| |||
This file was deleted.
0 commit comments