Hi,
I was able to build a version of llama.cpp with CLBlast on Android. I am using the models ggml-model-q4_0.gguf and ggml-model-f32.gguf.
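For reference, this is roughly how I configured the build (a sketch only; the NDK paths, ABI, and platform level are from my setup, and I am assuming the `LLAMA_CLBLAST` CMake option is the right switch for the CLBlast backend):

```shell
# Cross-compile llama.cpp for Android with the CLBlast (OpenCL) backend.
# $ANDROID_NDK, the ABI, and the platform level depend on your environment.
cmake -B build \
  -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-28 \
  -DLLAMA_CLBLAST=ON
cmake --build build --config Release
```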
When running, it seems to work: even though the output looks weird and does not match the question, at least I get some response. There may well be an error in my default params ;)
My issue is that even though I can see OpenCL being initialized, inference always runs on the CPU. How can I force it to run on the GPU / OpenCL backend?
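For example, I would have expected something like the following to offload layers to the GPU (assuming the `-ngl` / `--n-gpu-layers` option also applies to the OpenCL backend; the prompt and token count are just placeholders):

```shell
# Run the main example and request that up to 99 layers be offloaded
# to the GPU; with an OpenCL build this should go through CLBlast.
./build/bin/main -m ggml-model-q4_0.gguf -p "Hello" -n 32 -ngl 99
```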
Also, I see in the code that FP16 support is commented out for OpenCL. Is there any reason for that?
Thanks