LLama.Native.SafeLLamaContextHandle.Tokenize bug? #430

@vvdb-architecture

I've been using Microsoft's Kernel Memory with LLamaSharp and have encountered an [issue](https://github.com/microsoft/kernel-memory/issues/252).

Basically, a call to

 _context.Tokenize(text).Length;

with text equal to "\n" throws an exception:

LLama.Exceptions.RuntimeError
  HResult=0x80131500
  Message=Error happened during tokenization. It's possibly caused by wrong encoding. Please try to specify the encoding.
  Source=LLamaSharp
  StackTrace:
   at LLama.Native.SafeLLamaContextHandle.Tokenize(String text, Boolean add_bos, Boolean special, Encoding encoding)

According to the folks there, it seems to be a bug in LLamaSharp, so I've been asked to post the issue here.

I've tested with various models, e.g. kai-7b-instruct.Q5_K_M.gguf, and it seems to occur in most of them.

Labels: bug