
Conversation

@SilasMarvin (Collaborator)

llama.cpp added support for chat templates a few weeks ago: ggml-org/llama.cpp#5538

This adds a method on the model struct to apply a chat template.
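For illustration, a rough sketch of what calling such a method could look like; the method and type names here (`apply_chat_template`, `LlamaChatMessage`) are assumptions based on the description above, not necessarily the exact API merged in this PR:

```rust
// Hypothetical usage sketch; names are illustrative, not the merged API.
let messages = vec![
    LlamaChatMessage::new("system".to_string(), "You are a helpful assistant.".to_string())?,
    LlamaChatMessage::new("user".to_string(), "What is the capital of France?".to_string())?,
];

// `None` falls back to the chat template stored in the model's GGUF metadata;
// `true` appends the assistant prefix so generation continues as the assistant.
let prompt: String = model.apply_chat_template(None, messages, true)?;
```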

@MarcusDunn (Contributor) left a comment

Great addition. I'd like to see the user be able to allocate a bigger buffer if required, plus a couple of nitpicks.
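For context, llama.cpp's `llama_chat_apply_template` reports the number of bytes the formatted prompt needs even when the supplied buffer is too small, so a grow-and-retry wrapper is the natural shape for this. A minimal sketch, where the Rust-side function and its shape are assumptions rather than the code in this PR:

```rust
use std::os::raw::c_char;

/// Grow-and-retry sketch: llama.cpp's llama_chat_apply_template returns the
/// required length, so an undersized buffer can be resized and the call retried.
unsafe fn apply_template_growing(
    model: *const llama_cpp_sys_2::llama_model,
    chat: &[llama_cpp_sys_2::llama_chat_message],
) -> Result<Vec<u8>, i32> {
    let mut buf = vec![0u8; 2048]; // initial guess; grown below if too small
    loop {
        let res = llama_cpp_sys_2::llama_chat_apply_template(
            model,
            std::ptr::null(),                // null = use the model's built-in template
            chat.as_ptr(),
            chat.len(),
            true,                            // append the assistant prefix
            buf.as_mut_ptr() as *mut c_char, // c_char is i8 or u8 depending on target
            buf.len() as i32,
        );
        if res < 0 {
            return Err(res);                 // llama.cpp signals failure with < 0
        }
        let needed = res as usize;
        if needed <= buf.len() {
            buf.truncate(needed);
            return Ok(buf);                  // UTF-8 bytes of the formatted prompt
        }
        buf.resize(needed, 0);               // too small: grow to the reported size, retry
    }
}
```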

@MarcusDunn MarcusDunn mentioned this pull request Mar 4, 2024
@jiabochao (Contributor)

Very useful feature, can’t wait!

@SilasMarvin (Collaborator, Author)

Checking in on this. Has anything with this functionality been added yet? If not, I'll make the requested changes.

@MarcusDunn (Contributor)

There's some overlap with #194; otherwise nothing comes to mind.

@MarcusDunn (Contributor)

The Windows build failing is fine, but could you look into why the Linux CUDA one is failing?

Looks like an i8 vs u8 pointer difference.

@SilasMarvin (Collaborator, Author)

> The Windows build failing is fine, but could you look into why the Linux CUDA one is failing?
>
> Looks like an i8 vs u8 pointer difference.

6f9fa32 should fix it. That is actually a really interesting error. Whether c_char is an i8 or a u8 depends on the architecture, which is why I was not getting an error locally: https://doc.rust-lang.org/std/os/raw/type.c_char.html
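The portable fix for this class of error is to cast through the `c_char` alias instead of naming i8 or u8 directly. A minimal illustration of the pattern (not the actual diff in 6f9fa32):

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

fn main() {
    let buf: Vec<u8> = b"hi\0".to_vec();

    // Non-portable: `buf.as_ptr() as *const i8` only matches an FFI parameter
    // of *const c_char where c_char == i8 (e.g. x86_64 Linux); it fails to
    // compile where c_char == u8 (e.g. aarch64 Linux).

    // Portable: cast through the alias and let each target pick i8 or u8.
    let ptr = buf.as_ptr() as *const c_char;
    let s = unsafe { CStr::from_ptr(ptr) };
    assert_eq!(s.to_str().unwrap(), "hi");
}
```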

@MarcusDunn merged commit 636da79 into utilityai:main on Apr 6, 2024
@MarcusDunn (Contributor)

awesome, thanks!

@SilasMarvin deleted the silas-apply-chat-template branch on April 6, 2024 at 21:30