Llama.cpp 30B runs with only 6GB of RAM now: https://github.com/ggerganov/llama.cpp/pull/613