
Paradeluxe/Praditor






Praditor

A DBSCAN-Based Automation for Speech Onset Detection

Download Praditor | English · 中文


Features

Praditor is a speech onset detector that automatically finds the boundaries between silence and sound.

[Image: audio2textgrid.png]

Praditor works on both single-onset and multi-onset audio files, regardless of language. It writes its output as PointTiers in .TextGrid format.
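
For readers unfamiliar with the format, here is a minimal sketch of what a .TextGrid file with one point tier looks like, written in Praat's long text format (where a point tier has class "TextTier"). The function name, tier name, and mark text are hypothetical and chosen for illustration only; the files Praditor actually writes may differ in tier names and layout.

```python
def format_point_tier_textgrid(times, duration, tier_name="onset"):
    """Return Praat long-format TextGrid text containing one point tier.

    `times` are point times in seconds; `duration` is the audio length in seconds.
    The tier name and mark text are illustrative assumptions, not Praditor's own.
    """
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        '',
        'xmin = 0',
        f'xmax = {duration}',
        'tiers? <exists>',
        'size = 1',
        'item []:',
        '    item [1]:',
        '        class = "TextTier"',  # Praat's class name for a point tier
        f'        name = "{tier_name}"',
        '        xmin = 0',
        f'        xmax = {duration}',
        f'        points: size = {len(times)}',
    ]
    # Points are numbered from 1 and listed in ascending time order.
    for index, time in enumerate(sorted(times), start=1):
        lines += [
            f'        points [{index}]:',
            f'            number = {time}',
            f'            mark = "{tier_name}"',
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_point_tier_textgrid([0.52, 1.87], duration=3.0))
```

Each detected onset becomes one `points [i]` entry whose `number` is the time in seconds.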

  • Onset/Offset Detection
  • Silence Detection

Praditor also lets users adjust parameters in the Dashboard for better performance.

You can try test_audio.wav and test_audio_mp3.mp3 on Praditor.

Video instruction

📚 Fine-tuning guidance

Basic understanding is enough. Understanding the algorithm is better.

🙌 Play with GUI

Although I have prepared various buttons in this GUI, you do not have to use them all.

The simplest procedure is: (1) import audio files, (2) hit the extract button, and (3) [optional] if you are not happy with the results, fine-tune the parameters and hit the extract button again. Repeat steps (2) and (3) until you are happy with the results.

General

File -> Read files... -> Import your target audio file (Recommended: >= 44.1 kHz; Accepted: >= 8 kHz)
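
If you want to check a file's sample rate against these thresholds before importing it, a small sketch using Python's standard `wave` module is shown below. The function names and category labels are hypothetical, and this check is not part of Praditor itself. Note that `wave` only reads PCM .wav files, so other formats such as .mp3 need a different reader.

```python
import wave

RECOMMENDED_HZ = 44_100  # "Recommended" threshold from the line above
MINIMUM_HZ = 8_000       # "Accepted" threshold from the line above


def classify_sample_rate(rate_hz):
    """Map a sample rate in Hz to 'recommended', 'accepted', or 'too low'."""
    if rate_hz >= RECOMMENDED_HZ:
        return "recommended"
    if rate_hz >= MINIMUM_HZ:
        return "accepted"
    return "too low"


def wav_sample_rate(path):
    """Read the sample rate (Hz) from the header of a PCM .wav file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


if __name__ == "__main__":
    for rate in (48_000, 16_000, 4_000):
        print(rate, classify_sample_rate(rate))
```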

Run: Run the algorithm and extract onsets. Results may take a moment to appear. Onsets are shown in blue, offsets in green.

Test: Check how many onsets/offsets would be found with the displayed parameters. This function does not affect the .TextGrid file.

If the number meets your expectation, hit Run to get the final annotation.

F5: Play the audio signal currently shown in the window. Press any key to stop playback.

Next/Prev: Move to the next/previous audio file.

.TextGrid related

Clear: Temporarily clear the annotations from the screen. This does not delete or change the .TextGrid file, so it is safe.

Show: Bring the cleared annotations back. Praditor goes back to the .TextGrid file and displays whatever is in it.

Onset/Offset: Hide/show the onset or offset annotations on the screen (this also does not change the .TextGrid file).

Note: Onsets and offsets are controlled by two DIFFERENT sets of parameters, so a strict 1-to-1 correspondence between them is not guaranteed. Offsets are annotated by running onset annotation on the reversed audio.
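
The reversed-audio idea in the note above can be illustrated with a short sketch. The detector here is a deliberately simple amplitude threshold standing in for Praditor's DBSCAN-based algorithm, and all function and parameter names are hypothetical. The point is only the time mapping: an onset found at time t in the reversed signal corresponds to an offset at (duration - t) in the original.

```python
def first_onset(samples, sample_rate, threshold=0.1):
    """Stand-in detector: time (s) of the first sample whose magnitude exceeds threshold.

    This is NOT Praditor's algorithm; it only serves to show the reversal trick.
    """
    for index, value in enumerate(samples):
        if abs(value) > threshold:
            return index / sample_rate
    return None


def last_offset(samples, sample_rate, threshold=0.1):
    """Find an offset by running the onset detector on the time-reversed signal."""
    onset_in_reversed = first_onset(samples[::-1], sample_rate, threshold)
    if onset_in_reversed is None:
        return None
    duration = len(samples) / sample_rate
    # Map the time in the reversed signal back onto the original timeline.
    return duration - onset_in_reversed


if __name__ == "__main__":
    signal = [0.0, 0.0, 0.5, 0.6, 0.4, 0.0, 0.0, 0.0]  # 1 second at 8 Hz
    print(first_onset(signal, 8))  # 0.25
    print(last_offset(signal, 8))  # 0.625
```

Because the two functions take their own `threshold` arguments, the onset and offset passes can be tuned separately, which is why the two sets of annotations need not pair up one-to-one.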

Parameters

Current/Default: Display the default parameters or the parameters for the current file.

Save: Save the displayed parameters as Current/Default.

Reset: Reset the displayed parameters to the values you last saved.

Last: Go back to the last set of parameters you ran.

Audio signal

Mouse & Keyboard 🖱️⌨️

Wheel ↑/Wheel ↓ to zoom in/out on the amplitude axis

Ctrl/Command+Wheel ↑/Wheel ↓ to zoom in/out on the timeline (Ctrl/Command+I/O also works)

Shift+Wheel ↓/Wheel ↑ to move forward/backward along the timeline

Touchpad 💻

↑✌↑/↓✌↓ to zoom in/out on the amplitude axis

←✌→/→✌← to zoom in/out on the timeline

Timeline zoom might not work on macOS. Use Command+I/O instead.

←←✌/✌→→ to move forward/backward along the timeline

🗃️ Data and Materials

If you would like to download the datasets that were used in developing Praditor, please refer to our OSF storage.

🙌 Acknowledgments

Shout out to these remarkable contributors!!

  • Thanks to YU Xinqi, Dr. MA Yunxiao, and ZHANG Sifan for their work in validating the effectiveness of Praditor's algorithm.
  • Thanks to HU Wing Chung for her work in packaging Praditor for macOS (arm64 and universal2).
  • Thanks to Prof. ZHANG Haoyun (University of Macau) and Prof. WANG Ruiming (South China Normal University) for their guidance and support for this project.

Also, the funding:

  • This project was funded by the National Natural Science Foundation of China (32200845), the Science and Technology Development Fund, Macao S.A.R (FDCT, 0153/2022/A), and the Multi-Year Research Grant (MYRG2022-00148-ICI) from the University of Macau to Haoyun Zhang.

📨 Contact us

Praditor is written and maintained by Tony, Liu Zhengyuan from the Centre for Cognitive and Brain Sciences, University of Macau.

If you have any questions about how to use Praditor or about the details of its algorithm, or if you would like help writing additional scripts (for example, to export audio files or Excel tables), feel free to contact me at [email protected] or [email protected].