
Conversation

@Piezoid (Contributor) commented Mar 17, 2023

I do not expect this to be merged, but I figured it might help others, though I'm not sure this is the right place for it.

This logs information to a hard-coded ./out.log file. I wrote this throwaway code before the switch to stderr, which is why it uses a global file handle.
The refactoring of the sampler code should produce the same results as the master branch.

For each predicted token, it logs:

```
in:' because' n_past=14, remaining_tokens=62, embd.size()=1, embd_inp.size()=13
soft_max: top_sact=25.503617 mean_sact=19.826111 top_p=0.357196 entropy=1.664120
top_p: n: 15 sum: 0.990421
->0: ' they' p=0.357196 act=17.853 temp=0.70
  1: ' I' p=0.231013 act=20.643 temp=0.82
  2: ' of' p=0.228527 act=17.540 temp=0.70
[...]
 15: ' the' p=0.000876 act=13.645 temp=0.70
```
• The soft_max: line reports statistics of the top-k tokens' logits (divided by temp) and the distribution's entropy (in nats, not bits).
• The top_p: line gives the number of tokens retained after top-p filtering and the sum of their probabilities.
• Last, a list of the top 16 tokens, with their respective probabilities, original logits, and effective temperature (the base temperature multiplied by any repetition penalty). The drawn token is marked with ->. A sketch of these computations follows below.
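For reference, here is a minimal sketch of how these quantities can be derived from the raw logits. This is not the PR's actual code: `log_sampler_stats` and its signature are hypothetical, and it assumes the top-k logits have already been gathered (descending sort is done here for clarity).

```cpp
// Hypothetical sketch, not the PR's implementation.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <functional>
#include <vector>

// Compute the statistics shown in the soft_max: and top_p: log lines
// from the top-k raw logits of one prediction step.
void log_sampler_stats(std::vector<float> logits, float temp, float top_p) {
    // Scale the logits by the temperature; top_sact in the log is the
    // top token's original logit divided by temp (17.853 / 0.70 = 25.504).
    for (float & l : logits) l /= temp;
    std::sort(logits.begin(), logits.end(), std::greater<float>());

    // Numerically stable softmax over the scaled logits.
    const float max_l = logits[0];
    float sum = 0.0f;
    std::vector<float> probs(logits.size());
    for (size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_l);
        sum += probs[i];
    }
    for (float & p : probs) p /= sum;

    // Entropy in nats: H = -sum_i p_i * ln(p_i), plus the mean scaled logit.
    float entropy = 0.0f;
    float mean_sact = 0.0f;
    for (size_t i = 0; i < probs.size(); ++i) {
        if (probs[i] > 0.0f) entropy -= probs[i] * std::log(probs[i]);
        mean_sact += logits[i];
    }
    mean_sact /= (float) logits.size();

    printf("soft_max: top_sact=%f mean_sact=%f top_p=%f entropy=%f\n",
           logits[0], mean_sact, probs[0], entropy);

    // Top-p filtering: keep the smallest prefix of tokens whose
    // cumulative probability reaches the top_p threshold.
    float cum = 0.0f;
    size_t n = 0;
    while (n < probs.size() && cum < top_p) cum += probs[n++];
    printf("top_p: n: %zu sum: %f\n", n, cum);
}
```

Note that the top_p field on the soft_max: line is simply the top token's post-softmax probability, while the top_p: line reports the result of the cutoff against the sampler's top-p parameter.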

I will close this either when it becomes obsolete or when it can no longer be rebased.

@gjmulder added the enhancement label Mar 20, 2023
@Piezoid closed this Mar 24, 2023
AAbushady pushed a commit to AAbushady/llama.cpp that referenced this pull request Jan 27, 2024
* add tokens per second output

* Update gpttype_adapter.cpp

simplify

---------

Co-authored-by: LostRuins <[email protected]>