GPU inference on Apple Silicon via the Metal backend was recently added to `llama.cpp`: https://github.com/ggerganov/llama.cpp/pull/1642

We should port these changes to `whisper.cpp` and allow the Decoder to run on the GPU in a similar way.