Currently, to run multiple transformations in parallel, you need to instantiate multiple whisper_context instances, because the state of a transformation (e.g. mel spectrogram, previous prompts, partial results) is stored in the same context as the model, vocabulary, etc.
Is there a reason I'm missing why these cannot be separated, so that the context and the state are different entities?
The context would then be immutable and could be shared by multiple transformations without allocating the model again for each one.
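
To make the idea concrete, here is a minimal sketch of the split I have in mind. All names (`demo_context`, `demo_state`, `demo_full`, and so on) are hypothetical and are not part of the whisper.cpp API, and the "model" is reduced to a single float so that the example stays self-contained. The point is only that the function takes the context as `const` and writes exclusively to the per-run state.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is NOT the whisper.cpp API. */

/* Read-only after loading: can be shared by every transformation. */
typedef struct {
    int   n_mel;  /* number of slots in the per-run mel buffer */
    float scale;  /* stand-in for the model weights / vocabulary */
} demo_context;

/* Mutable data: one instance per running transformation. */
typedef struct {
    float *mel;        /* per-run mel buffer (n_mel floats) */
    int    n_segments; /* number of partial results produced so far */
    float  last_value; /* stand-in for the latest partial result */
} demo_state;

/* Allocates only the per-run buffers; the context is not copied. */
demo_state *demo_state_init(const demo_context *ctx) {
    if (ctx == NULL || ctx->n_mel <= 0) return NULL;
    demo_state *st = calloc(1, sizeof *st);
    if (st == NULL) return NULL;
    st->mel = calloc((size_t)ctx->n_mel, sizeof *st->mel);
    if (st->mel == NULL) { free(st); return NULL; }
    return st;
}

void demo_state_free(demo_state *st) {
    if (st == NULL) return;
    free(st->mel);
    free(st);
}

/* Reads from ctx, writes only to st. Returns 0 on success, -1 on bad input. */
int demo_full(const demo_context *ctx, demo_state *st,
              const float *samples, size_t n_samples) {
    if (ctx == NULL || st == NULL || samples == NULL || ctx->n_mel <= 0) return -1;
    float acc = 0.0f;
    for (size_t i = 0; i < n_samples; ++i) {
        st->mel[i % (size_t)ctx->n_mel] = samples[i]; /* fake "spectrogram" */
        acc += samples[i] * ctx->scale;               /* fake "inference" */
    }
    st->last_value = acc;
    st->n_segments += 1;
    return 0;
}
```

With a layout like this, each thread would own its own `demo_state` and call `demo_full` with a pointer to the same `demo_context`. Since nothing writes to the context after loading, no locking around it should be needed, assuming the real model data can indeed be treated as read-only during inference.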
What do you think, @ggerganov?