🧠 Speech-to-Text Analysis & Semantic Chunking

This Jupyter Notebook explores advanced speech-to-text processing through two major tasks:

🎬 Part 1: Semantic Chunking of YouTube Audio

Downloaded and extracted audio from a YouTube video using yt-dlp.
Transcribed speech to text using OpenAI Whisper.
Performed time-aligned transcription and speaker diarization using PyAnnote.
Applied semantic chunking based on sentence structure, speaker turns, and conjunctions for meaningful text segmentation.
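
As an illustration of the chunking step, here is a minimal sketch in plain Python. It is not the notebook's actual code. The `Segment` shape, the `CONJUNCTIONS` list, and the `max_words` threshold are all assumptions made for this example. It starts a new chunk on a speaker change, after a segment that closes a sentence, or before a segment that opens with a conjunction once the running chunk is already long.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical shape of one diarized, time-stamped transcript segment."""
    speaker: str
    start: float
    end: float
    text: str


# Illustrative list only; the notebook's real conjunction set is not known.
CONJUNCTIONS = {"and", "but", "so", "because", "however", "although"}
SENTENCE_END = re.compile(r"[.!?][\"')\]]*$")


def chunk_segments(segments, max_words=40):
    """Group segments into chunks, breaking on speaker turns, sentence ends,
    and (once a chunk is long) before a segment that opens with a conjunction."""
    chunks, current = [], []

    def flush():
        # Close the running chunk, if any, and start a fresh one.
        if current:
            chunks.append({
                "speaker": current[0].speaker,
                "start": current[0].start,
                "end": current[-1].end,
                "text": " ".join(s.text.strip() for s in current),
            })
            current.clear()

    for seg in segments:
        words = seg.text.split()
        if not words:
            continue  # skip empty segments
        if current:
            n_words = sum(len(s.text.split()) for s in current)
            speaker_changed = seg.speaker != current[-1].speaker
            sentence_closed = bool(SENTENCE_END.search(current[-1].text.strip()))
            starts_with_conj = words[0].lower().strip(",") in CONJUNCTIONS
            if speaker_changed or sentence_closed or (
                starts_with_conj and n_words >= max_words
            ):
                flush()
        current.append(seg)
    flush()
    return chunks
```

Each returned chunk keeps the first segment's start time and the last segment's end time, so chunks stay aligned with the audio. Lowering `max_words` makes the conjunction rule fire sooner and yields shorter chunks.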

Aligned audio and transcript at the word and phoneme level.
Conducted detailed word-level speech analysis, including misalignment, pauses, and anomalies.
Detected audio anomalies (such as silence and distortion) using waveform and spectral analysis.
Performed text bias and linguistic analysis to examine potential content skew.
Analyzed audio quality, duration trends, and phoneme usage.
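
Two of the checks above, pauses between words and silent stretches in the waveform, can be sketched in plain Python. This is a hedged illustration, not the notebook's implementation. The word dictionaries with `word`, `start`, and `end` keys, the function names, and the `min_gap`, `frame_ms`, and `threshold` defaults are assumptions. Samples are assumed to be floats normalised to the range -1 to 1. Distortion and spectral checks are not covered here.

```python
import math


def find_pauses(words, min_gap=0.5):
    """Return (prev_word, next_word, gap_seconds) for every gap between
    consecutive words that lasts at least min_gap seconds."""
    pauses = []
    for prev, nxt in zip(words, words[1:]):
        gap = nxt["start"] - prev["end"]
        if gap >= min_gap:
            pauses.append((prev["word"], nxt["word"], round(gap, 3)))
    return pauses


def find_silence(samples, sample_rate, frame_ms=20, threshold=0.01):
    """Return (start_s, end_s) spans whose per-frame RMS energy stays
    below threshold. Frames are non-overlapping and frame_ms long."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    spans, span_start = [], None
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        t = i / sample_rate
        if rms < threshold and span_start is None:
            span_start = t  # a silent span begins at this frame
        elif rms >= threshold and span_start is not None:
            spans.append((span_start, t))  # silent span ends here
            span_start = None
    if span_start is not None:
        # Audio ended while still silent; close the span at the final sample.
        spans.append((span_start, len(samples) / sample_rate))
    return spans
```

For real recordings the same frame-wise idea is usually applied with NumPy arrays for speed, and the threshold is tuned to the noise floor of the audio.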
