A DBSCAN-Based Automation for Speech Onset Detection
Praditor is a speech onset detector that automatically finds the boundaries between silence and sound.
Praditor works for both single-onset and multi-onset audio files without any language limitation. It generates output as PointTiers in .TextGrid format.
- Onset/Offset Detection
- Silence Detection
Praditor also lets users adjust parameters in the Dashboard to get better performance.
You can try test_audio.wav and test_audio_mp3.mp3 on Praditor.
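To make the output format concrete, here is a minimal sketch of what a point tier looks like in Praat's long TextGrid text format (Praat stores point tiers under the class name "TextTier"). This is not Praditor's own export code: the function name `point_tier_textgrid`, the tier name "onset", and the empty marks are assumptions made for illustration.

```python
def point_tier_textgrid(times, duration, tier_name="onset"):
    """Return a long-format Praat TextGrid string holding one point tier.

    `times` are time points in seconds; `tier_name` is a hypothetical
    label, not necessarily the one Praditor writes.
    """
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {duration}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "TextTier"',  # Praat's class name for a point tier
        f'        name = "{tier_name}"',
        "        xmin = 0",
        f"        xmax = {duration}",
        f"        points: size = {len(times)}",
    ]
    # Points are written in ascending time order, numbered from 1.
    for i, t in enumerate(sorted(times), start=1):
        lines += [
            f"        points [{i}]:",
            f"            number = {t}",
            '            mark = ""',
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(point_tier_textgrid([0.5, 1.25], duration=2.0))
```

A file in this shape can be opened in Praat next to the audio to inspect the detected points.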
A basic understanding is enough to get started, but understanding the algorithm is better.
- Basic knowledge: Go to the first section of Quick Fix.
- Advanced knowledge: Go to the second section of Quick Fix (i.e., Detailed Introduction).
- Expert knowledge: Go to Parameter.
Although this GUI provides various buttons, you do not have to use them all.
The simplest procedure is (1) import audio files, (2) hit the extract button, and (3) [optional] if you are not happy with the results, fine-tune the parameters and hit the extract button again.
Repeat steps (2) and (3) until you are happy with the results.
File -> Read files... -> Import your target audio file (Recommended: >= 44.1 kHz; Accepted: >= 8 kHz)
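If you want to check your files against these sample-rate limits before importing them, a short script along the following lines could help. It is a sketch that is independent of Praditor: the helper names `wav_sample_rate` and `classify_sample_rate` are hypothetical, and Python's standard `wave` module reads uncompressed .wav files only, so .mp3 files would need a different reader.

```python
import wave

RECOMMENDED_HZ = 44_100  # recommended: >= 44.1 kHz
MINIMUM_HZ = 8_000       # accepted: >= 8 kHz


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a .wav file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def classify_sample_rate(rate_hz):
    """Sort a sample rate into the tiers stated above."""
    if rate_hz >= RECOMMENDED_HZ:
        return "recommended"
    if rate_hz >= MINIMUM_HZ:
        return "accepted"
    return "too low"


if __name__ == "__main__":
    # test_audio.wav is the sample file mentioned earlier in this README;
    # adjust the path to wherever your copy is stored.
    rate = wav_sample_rate("test_audio.wav")
    print(rate, classify_sample_rate(rate))
```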
Run
Run the algorithm and extract onsets. The results may take a while to appear. Onsets are shown in blue, offsets in green.
Test
Test how many onsets/offsets would be found with the displayed parameters. This function does not affect the .TextGrid file.
If the number meets your expectation, hit Run to get the final annotation.
Press F5 to play the audio signal currently shown in the window, and press any key to stop playback.
Next/Prev
Move to the next/previous audio file
Clear
Temporarily clear the annotations from the screen. This does not delete or change the .TextGrid file, so it is safe.
Show
Bring the cleared annotations back. Praditor goes back to the .TextGrid file and presents whatever is in it.
Onset/Offset
Hide/show onset or offset annotations on the screen (this also does not change the .TextGrid file).
Note: Onsets and offsets are controlled by two DIFFERENT sets of parameters, so a strict 1-to-1 correspondence between them is not guaranteed. Offsets are annotated by running onset detection on the reversed audio.
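The reversed-audio idea in the note can be sketched as follows. The detector here is a deliberately simple amplitude threshold used as a stand-in; it is NOT Praditor's DBSCAN-based algorithm, and the names `first_onset`, `last_offset`, and `threshold` are hypothetical. The point is only the time mapping: an onset found at time t in the reversed signal corresponds to an offset at (duration - t) in the original.

```python
def first_onset(samples, sample_rate, threshold=0.1):
    """Stand-in detector: time (s) of the first sample louder than threshold."""
    for index, value in enumerate(samples):
        if abs(value) > threshold:
            return index / sample_rate
    return None  # nothing exceeded the threshold


def last_offset(samples, sample_rate, threshold=0.1):
    """Find the offset by running the onset detector on the reversed audio."""
    reversed_onset = first_onset(samples[::-1], sample_rate, threshold)
    if reversed_onset is None:
        return None
    duration = len(samples) / sample_rate
    # Map the time on the reversed axis back to the original axis.
    return duration - reversed_onset


if __name__ == "__main__":
    sample_rate = 1000
    # 1 s of audio: silence, sound between 0.2 s and 0.5 s, silence.
    samples = [0.0] * 200 + [0.5] * 300 + [0.0] * 500
    print(first_onset(samples, sample_rate))
    # A separate (here stricter) threshold for offsets mirrors the two
    # independent parameter sets described in the note.
    print(last_offset(samples, sample_rate, threshold=0.2))
```

Because the two calls take their own parameters, one pass can find a boundary that the other misses, which is why the counts of onsets and offsets may differ.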
Current/Default
Display default parameters or parameters for the current file
Save
Save the displayed parameters as Current/Default
Reset
Reset the displayed parameters to the last saved values
Last
Go back to the parameter set used in the last run
Wheel ↑/Wheel ↓ to zoom in/out in amplitude
Ctrl/Command+Wheel ↑/Wheel ↓ to zoom in/out along the timeline (Ctrl/Command+I/O also works)
Shift+Wheel ↓/Wheel ↑ to move forward/backward along the timeline
↑✌↑/↓✌↓ to zoom in/out in amplitude
←✌→/→✌← to zoom in/out along the timeline
Timeline zoom might not work on macOS. Use Command + I/O instead.
←←✌/✌→→ to move forward/backward along the timeline
If you would like to download the datasets that were used in developing Praditor, please refer to our OSF storage.
Shout out to these remarkable contributors!!
- Thanks to YU Xinqi, Dr. MA Yunxiao, and ZHANG Sifan for their work in validating the effectiveness of Praditor's algorithm.
- Thanks to HU Wing Chung for her work in packaging Praditor for macOS (arm64 and universal2).
- Thanks to Prof. ZHANG Haoyun (University of Macau) and Prof. WANG Ruiming (South China Normal University) for their guidance and support for this project.
Also, the funding:
- This project was funded by the National Natural Science Foundation of China (32200845), the Science and Technology Development Fund, Macao S.A.R (FDCT, 0153/2022/A), and the Multi-Year Research Grant (MYRG2022-00148-ICI) from the University of Macau to Haoyun Zhang.
Praditor is written and maintained by Tony, Liu Zhengyuan from the Centre for Cognitive and Brain Sciences, University of Macau.
If you have any questions about how to use Praditor or about its algorithm, or would like help writing additional scripts (e.g., to export audio files or Excel tables), feel free to contact me at [email protected] or [email protected].