@sreekanthputta

Add flush_flag to listener to flush the recorded audio immediately without waiting for the phrase to complete.

I am working on a real-time speech-to-text application and have run into an issue.
When the user is done talking, the speech_recognizer waits until the pause_threshold has elapsed before returning the audio. This gets even worse in noisy environments with dynamic_energy_threshold turned off.

My users don't want to wait, as they know they are done talking. They want to be able to hit Enter and cut the time it takes to show them the transcription.

This is just one example of where this could be helpful. I'm sure this feature can be useful in many ways.

I have tried the stopper function returned by listen_in_background, but it can take up to a second to stop and it won't flush the audio.
Also, the stopper won't stop the recorder while audio is actively being recorded, i.e. while energy > energy_threshold.

Hence this change.

How to use?

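self.flush_flag = [False]  # one-element list, so the listener thread sees in-place updates
self.recorder.listen_in_background(self.source, self.record_callback, phrase_time_limit=self.record_timeout, flush_flag=self.flush_flag)

def onEnter():
    # Mutate the shared list in place; rebinding self.flush_flag to a new list
    # would leave the listener holding the old one.
    self.flush_flag[0] = True  # this flag will be reset to False once the audio is flushed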
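
To illustrate the idea outside the library, here is a minimal sketch of a recording loop that ends a phrase either after a pause or as soon as a shared flag is set. It is not the code in this PR: `record_phrase`, `pause_frames`, and the `(chunk, is_speech)` frame format are hypothetical stand-ins for the microphone input and the pause_threshold logic.

```python
def record_phrase(frames, flush_flag, pause_frames=3):
    """Collect chunks until a pause is detected or flush_flag[0] is set.

    frames       -- iterable of (chunk, is_speech) pairs (hypothetical stand-in
                    for microphone buffers and the energy-threshold check)
    flush_flag   -- one-element list shared with the caller
    pause_frames -- consecutive silent frames that end a phrase (hypothetical
                    stand-in for pause_threshold)
    """
    recorded = []
    silent = 0
    for chunk, is_speech in frames:
        recorded.append(chunk)
        if flush_flag[0]:
            # The caller asked for an immediate flush: reset the flag so the
            # next phrase records normally, then return what we have.
            flush_flag[0] = False
            break
        silent = 0 if is_speech else silent + 1
        if silent >= pause_frames:
            # Normal end of phrase: enough consecutive silence.
            break
    return recorded


def frames_with_flush(flush_flag):
    """Simulated input in which the user presses Enter after the second chunk."""
    yield ("a", True)
    yield ("b", True)
    flush_flag[0] = True  # simulate the Enter key handler
    yield ("c", True)
    yield ("d", True)
```

The flag is a one-element list rather than a plain bool so that the caller and the recording loop share the same object; the caller has to assign to `flush_flag[0]` for the loop to see the change.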

Please feel free to modify the logic to make it cleaner and more robust.
TIA.

@ftnext
Collaborator

ftnext commented Jul 30, 2024

Thanks.
Is this the same feature request as #757?

@sreekanthputta
Author

Not really.
#757 is about streaming buffers as they are recorded.
My change is about letting the speaker stop the recording immediately once they are done speaking, either by clicking the transcribe button in my UI or by releasing the mic button that they have held since they started speaking.
