16 kHz required instead of 16 bit (main example) #909

@rubendax

In the main example, audio files must have a sampling rate of 16 kHz to be accepted:
read_wav: WAV file 'xyz.wav' must be 16 kHz.

However, in the readme it says:
"Note that the main example currently runs only with 16-bit WAV files".
The readme also suggests using ffmpeg to convert the file to 16-bit with the -ar 16000 option. However, -ar sets the audio sampling frequency in hertz, not the audio bit depth.

Requiring a 16-bit depth makes sense, as it is the standard for recorded audio. A 16 kHz sampling rate, however, is far from standard: the vast majority of recorded audio uses a sampling rate of 44.1 kHz (44,100 Hz) or higher.
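
To make the distinction concrete, here is a minimal sketch using Python's standard `wave` module that reports a WAV file's sampling rate and bit depth separately. The helper names `describe_wav` and `make_silent_wav` are made up for this illustration and are not part of whisper.cpp. The test files are generated in memory, so nothing here depends on a real recording.

```python
import io
import wave


def describe_wav(source):
    """Return (sample_rate_hz, bit_depth, channels) for a PCM WAV path or file-like object."""
    with wave.open(source, "rb") as wf:
        sample_rate_hz = wf.getframerate()  # samples per second; this is what ffmpeg's -ar sets
        bit_depth = wf.getsampwidth() * 8   # bytes per sample converted to bits per sample
        channels = wf.getnchannels()
    return sample_rate_hz, bit_depth, channels


def make_silent_wav(sample_rate_hz, bit_depth, n_frames=100):
    """Hypothetical helper: build a tiny mono WAV in memory for demonstration."""
    bytes_per_sample = bit_depth // 8
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(bytes_per_sample)
        wf.setframerate(sample_rate_hz)
        wf.writeframes(b"\x00" * (n_frames * bytes_per_sample))
    buf.seek(0)  # rewind so the buffer can be read back
    return buf


if __name__ == "__main__":
    # Both files are 16-bit; only the sampling rate differs.
    for rate in (44100, 16000):
        sr, bits, ch = describe_wav(make_silent_wav(rate, 16))
        print(f"{sr} Hz, {bits}-bit, {ch} channel(s)")
```

Both generated files are 16-bit, yet only the second one has the 16 kHz sampling rate that the `read_wav` check asks for, which suggests the two properties are independent and that the readme wording mixes them up.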

I could be misunderstanding the intention here. It is entirely possible that converting to a lower sampling rate is needed to reduce the overhead for whisper.cpp. However, based on the readme, there seems to be some confusion between audio sampling rate and audio bit depth, so I thought I should mention it, just in case.


As a side note, I would like to give a huge shoutout to ggerganov. Your work has been invaluable for the open source ML/AI community and has completely changed the landscape. Thank you.
