Conversation

martindevans
Member

  • Added a test for tokenizing just a new line (reproduces the `LLama.Native.SafeLLamaContextHandle.Tokenize` bug? #430)
  • Properly displaying `LLamaToken`
  • Removed all tokenisation code in `SafeLLamaContextHandle` - it now passes everything through to the `SafeLlamaModelHandle`
  • Improved `SafeLlamaModelHandle` tokenisation:
    • Renting an array from the pool, saving one allocation
    • No longer using `&tokens[0]` to take a pointer to an array; this is redundant and throws on empty arrays

@martindevans merged commit 5b41c8e into SciSharp:master Jan 12, 2024
@martindevans deleted the tokenizer_fixes_newline branch January 12, 2024 16:22