-mu without -m is... tricky

TL;DR: I propose defaulting `-m` to `models/` + the filename from `-mu` (or `-hff`) when one of them is set.

It's easy to misuse these flags, for instance:

```bash
./main -mu https://huggingface.co/NousResearch/Meta-Llama-3-70B-Instruct-GGUF/resolve/main/Meta-Llama-3-70B-Instruct-Q5_K_M.gguf -p "Test"
# Wait patiently for 50GB to download
# ...

# Wanna test something else?
./main -mu https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q2_K.gguf -p "Test"
# Oh well, your 50GB model is gone forever now
```

In a nutshell:


- The workaround (always specifying `-mu` and `-m` together) is cumbersome:

  ```bash
  ./main -mu https://huggingface.co/NousResearch/Meta-Llama-3-70B-Instruct-GGUF/resolve/main/Meta-Llama-3-70B-Instruct-Q5_K_M.gguf \
         -m  models/Meta-Llama-3-70B-Instruct-Q5_K_M.gguf \
         -p "Test"
  ```
- It feels wrong that, without an explicit `-m`, these __quantized__ models are downloaded to `models/7B/ggml-model-f16.gguf`, a path that suggests an unquantized f16 7B model.
- By default the `models/7B` folder doesn't exist, so these commands, which are meant to simplify the experience, might puzzle first-time users (compare with [ollama](https://github.com/ollama/ollama/blob/main/README.md#quickstart)).

(The only benefit I see in the current behaviour is for people who have plenty of bandwidth and a very small hard drive.)

I propose changing the default of `-m` in `main` and `server` to `models/$(basename "$model_url")` if `-mu` (or `-hff`) is set, and keeping the legacy `models/7B/ggml-model-f16.gguf` otherwise.
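To make the proposed rule concrete, here is a minimal shell sketch of the resolution logic. It is only an illustration, not the actual argument-parsing code: the function name `default_model_path` and its two positional parameters are hypothetical, and stripping a trailing `?query` or `#fragment` from the URL is an extra assumption that is not part of the proposal above.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of llama.cpp): pick the local model path.
#   $1 = value passed to -m, or "" if the flag was not given
#   $2 = value passed to -mu (URL) or -hff (file name), or "" if not given
default_model_path() {
    local model_path="$1"
    local model_source="$2"
    local clean

    if [ -n "$model_path" ]; then
        # An explicit -m always wins.
        printf '%s\n' "$model_path"
    elif [ -n "$model_source" ]; then
        # Assumption: drop any "?query" or "#fragment" before taking the file name.
        clean="${model_source%%[?#]*}"
        printf 'models/%s\n' "$(basename "$clean")"
    else
        # Neither flag set: keep the legacy default.
        printf '%s\n' "models/7B/ggml-model-f16.gguf"
    fi
}
```

With this rule, the two downloads from the example at the top would no longer overwrite each other:

```bash
default_model_path "" "https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q2_K.gguf"
# prints models/phi-2.Q2_K.gguf
default_model_path "" ""
# prints models/7B/ggml-model-f16.gguf
```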

Happy to send a PR if there's a consensus.

-mu without -m is... tricky #6887