A high-performance Linux terminal application that provides real-time speech transcription with AI-powered text optimization, built for CLI developers and anyone who wants to copy and paste speech-to-text. It uses an optimized Whisper.cpp with Vulkan GPU acceleration for near-instant transcription, and LLaMA.cpp with a local AI model to clean up and optimize the transcribed text. In short: you speak your prompt, and the AI cleans it up for you. Coded with Factory Droid using the GLM 4.6 LLM as my first open-source project.
- Real-time Streaming: 2-second chunk processing with 1-second overlap for immediate feedback
- Vulkan GPU Acceleration: Utilizes AMD/NVIDIA GPUs for 10x faster transcription and AI processing
- High Accuracy: Large V3 Turbo model with optimized parameters
- AI Text Optimization: Local LLaMA.cpp integration with Magistral Small model for intelligent text cleanup
- Smart Audio Capture: PipeWire/PulseAudio support with automatic detection
- Continuous Output: Clean paragraph formatting without timestamps
- Simple Controls: Enter to start/stop, Ctrl+C to quit
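As a rough sketch of how the Vulkan-accelerated Whisper setup above maps onto whisper.cpp's C API (assuming whisper.cpp is built with `GGML_VULKAN=1`; the project's actual initialization may differ):

```cpp
// Hedged sketch: loading the Large V3 Turbo model with GPU offload enabled.
// Function names come from whisper.cpp's public header (whisper.h); the
// surrounding code is illustrative, not copied from SpeakPrompt.
#include "whisper.h"
#include <cstdio>

int main() {
    whisper_context_params cparams = whisper_context_default_params();
    cparams.use_gpu = true;  // routed to the Vulkan backend when ggml is built with GGML_VULKAN=1

    whisper_context *ctx = whisper_init_from_file_with_params(
        "models/ggml-large-v3-turbo.bin", cparams);
    if (ctx == nullptr) {
        std::fprintf(stderr, "failed to load Whisper model\n");
        return 1;
    }

    // ... feed 16 kHz mono float samples to whisper_full() here ...

    whisper_free(ctx);
    return 0;
}
```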
# Fedora/RHEL
sudo dnf install cmake gcc-c++ pkg-config pulseaudio-libs-devel vulkan-devel SDL2 SDL2-devel
# Ubuntu/Debian
sudo apt install cmake g++ pkg-config libpulse-dev libvulkan-dev libsdl2-dev
# Arch Linux
sudo pacman -S cmake gcc pkgconf pulseaudio libpulse vulkan-devel sdl2
# openSUSE
sudo zypper install cmake gcc-c++ pkg-config pulseaudio-devel vulkan-devel libSDL2-devel
# Solus
sudo eopkg install cmake gcc pkgconfig pulseaudio-devel vulkan-devel sdl2-devel

# AMD GPUs (most distributions)
# Usually included with mesa drivers
# NVIDIA GPUs
# Ubuntu/Debian: sudo apt install nvidia-driver-535
# Fedora: sudo dnf install akmod-nvidia
# Arch: sudo pacman -S nvidia
# Intel GPUs (integrated graphics usually work out of the box)
# Ubuntu: sudo apt install intel-media-driver
# Fedora: sudo dnf install intel-media-driver

# Clone the repository
git clone https://github.com/ajaxdude/speakprompt.git
cd speakprompt
# Configure with Vulkan and SDL2 support
mkdir build && cd build
cmake .. -DGGML_VULKAN=1 -DWHISPER_SDL2=ON
# Build
make -j$(nproc)
# Download the optimized Whisper model (optional - auto-downloads if missing)
wget -O ../models/ggml-large-v3-turbo.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
# Download the LLM model for text optimization (optional but recommended)
mkdir -p ../models/llm
wget -O ../models/llm/Magistral-Small-2509-Q4_K_M.gguf https://huggingface.co/MaziyarPanahi/Magistral-Small-2509-GGUF/resolve/main/Magistral-Small-2509-Q4_K_M.gguf
# Run
./speakprompt

- Start the application: ./speakprompt
- Press Enter to begin transcription → shows:
[STATUS] ON AIR - Speak clearly into your microphone
- Watch real-time transcription appear as continuous text
- Press Enter again to stop → shows:
[STATUS] OFF AIR
- AI Optimization: The app will automatically optimize your transcribed text using the local LLM
- Copy the optimized text for use in CLI tools
- Enter - Toggle recording ON/OFF
- Ctrl+C - Quit application
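A minimal sketch of this Enter-to-toggle control loop (hypothetical code, not the project's actual main_simple.cpp):

```cpp
// Hedged sketch of the keyboard controls: Enter toggles recording, Ctrl+C quits.
#include <atomic>
#include <csignal>
#include <iostream>
#include <string>

static std::atomic<bool> g_running{true};

int main() {
    std::signal(SIGINT, [](int) { g_running = false; });  // Ctrl+C -> quit

    bool on_air = false;
    std::string line;
    while (g_running && std::getline(std::cin, line)) {    // Enter toggles state
        on_air = !on_air;
        std::cout << (on_air ? "[STATUS] ON AIR - Speak clearly into your microphone"
                             : "[STATUS] OFF AIR - AI Optimization") << "\n";
        // start_capture()/stop_capture() would be called here in the real app
    }
    return 0;
}
```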
When you stop recording, the application automatically:
- Removes filler words (um, uh, like, you know)
- Fixes grammar and sentence structure
- Makes text more concise and coherent
- Preserves original meaning and key points
- Organizes rambling thoughts into clear sentences
Note: The LLM model is optional. If not found, the app will provide raw transcriptions.
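The exact prompt SpeakPrompt sends to the local model lives in llm_processor.cpp; purely as an illustration, a cleanup instruction covering the points above might be assembled roughly like this (hypothetical wording):

```cpp
// Hedged sketch: building a text-cleanup instruction for the local LLM.
// The real prompt used by llm_processor.cpp may be worded differently.
#include <string>

std::string build_cleanup_prompt(const std::string &raw_transcript) {
    return "Clean up the following speech transcript: remove filler words "
           "(um, uh, like, you know), fix grammar, keep it concise, and "
           "preserve the original meaning.\n\nTranscript:\n" + raw_transcript +
           "\n\nCleaned text:";
}
```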
- Response Time: 2-3 seconds (vs 30+ seconds in basic implementations)
- Transcription Model: Large V3 Turbo (1.6GB) with GPU acceleration
- AI Model: Magistral Small (13GB) with Vulkan GPU acceleration
- Processing: 8-thread parallel processing with Vulkan
- Audio: Real-time 16kHz, 16-bit mono capture
- Output: Clean, AI-optimized text without timestamps or silence markers
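whisper.cpp consumes 32-bit float samples, so the 16 kHz, 16-bit mono capture noted above has to be normalized before transcription. A minimal sketch of that conversion (assumed, not copied from audio_capture.cpp):

```cpp
// Hedged sketch: converting captured 16-bit mono PCM to the float samples
// whisper_full() consumes. SpeakPrompt's own conversion may differ.
#include <cstdint>
#include <vector>

std::vector<float> pcm16_to_float(const std::vector<int16_t> &pcm) {
    std::vector<float> out(pcm.size());
    for (size_t i = 0; i < pcm.size(); ++i) {
        out[i] = static_cast<float>(pcm[i]) / 32768.0f;  // scale to [-1, 1)
    }
    return out;
}
```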
If you find SpeakPrompt useful and want to support my open-source development work, consider buying me a coffee! Your support helps me continue developing and maintaining this project.
- Whisper models: ggml-large-v3-turbo.bin (fastest & most accurate), ggml-base.en.bin (fallback option)
- LLM model: Magistral-Small-2509-Q4_K_M.gguf (13GB, 4-bit quantized)
- Location: models/llm/ directory
- GPU: Uses Vulkan for acceleration (CPU fallback if GPU unavailable)
- Optional: Application works without LLM model (raw transcription only)
- PipeWire (Modern Linux - Fedora, Arch, Ubuntu 22.04+)
- PulseAudio (Traditional systems)
- Automatic detection with fallback support
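One possible way to implement the automatic detection (a hypothetical helper, not necessarily how audio_capture.cpp does it) is to inspect the server name reported by `pactl info`, which mentions PipeWire when its PulseAudio-compatible server is in use:

```cpp
// Hedged sketch: distinguishing PipeWire from plain PulseAudio by checking
// the server reported by `pactl info`. The project may detect this differently.
#include <cstdio>
#include <string>

std::string detect_audio_backend() {
    std::string output;
    if (FILE *pipe = popen("pactl info", "r")) {
        char buf[256];
        while (fgets(buf, sizeof(buf), pipe) != nullptr) output += buf;
        pclose(pipe);
    }
    if (output.find("PipeWire") != std::string::npos) return "PipeWire";
    if (!output.empty())                              return "PulseAudio";
    return "unknown";
}
```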
speakprompt/
├── src/
│   ├── main_simple.cpp            # Entry point & user interaction
│   ├── audio_capture.h/cpp        # Real-time audio capture (mic/WAV)
│   ├── transcription_engine.h/cpp # Whisper.cpp integration
│   ├── terminal_output.h/cpp      # Clean output formatting
│   └── llm_processor.h/cpp        # LLaMA.cpp integration for AI text optimization
├── models/
│   ├── ggml-large-v3-turbo.bin    # Whisper transcription model
│   └── llm/                       # LLM models directory
│       └── Magistral-Small-2509-Q4_K_M.gguf # AI text optimization model
├── whisper.cpp/                   # Whisper.cpp submodule
├── llama.cpp/                     # LLaMA.cpp submodule
├── CMakeLists.txt                 # Build configuration
└── README.md                      # This file
- Chunk Size: 2 seconds with 1-second overlap
- Thread Pool: 8 parallel processing threads
- GPU Backend: Vulkan with matrix acceleration
- Audio Buffer: Continuous streaming with smart overlap
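The 2-second window / 1-second hop scheme above can be pictured as a sliding window over the continuously growing 16 kHz sample buffer; a simplified sketch (illustrative only, with hypothetical names):

```cpp
// Hedged sketch: carving 2 s windows with a 1 s hop (so consecutive windows
// overlap by 1 s) out of a growing 16 kHz sample buffer.
#include <cstddef>
#include <vector>

constexpr std::size_t kSampleRate = 16000;
constexpr std::size_t kWindow     = 2 * kSampleRate;  // 2-second chunk
constexpr std::size_t kHop        = 1 * kSampleRate;  // advance 1 s -> 1 s overlap

// Returns the next window starting at *cursor, or an empty vector if not
// enough audio has accumulated yet. Each call advances the cursor by the hop.
std::vector<float> next_window(const std::vector<float> &buffer, std::size_t *cursor) {
    if (*cursor + kWindow > buffer.size()) return {};
    std::vector<float> window(buffer.begin() + *cursor, buffer.begin() + *cursor + kWindow);
    *cursor += kHop;
    return window;
}
```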
- No timestamps - Clean, readable text
- Continuous paragraphs - No line breaks
- Silence filtering - No dots or blank markers
- Status indicators - ON AIR / OFF AIR
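Dropping timestamps and silence markers amounts to concatenating only the useful segment texts whisper.cpp returns after a whisper_full() call. A rough sketch of that post-processing (assumed; the project's terminal_output.cpp may differ):

```cpp
// Hedged sketch: building one continuous paragraph from whisper.cpp segments,
// skipping blank-audio markers and empty segments.
#include "whisper.h"
#include <string>

std::string collect_clean_text(whisper_context *ctx) {
    std::string text;
    const int n = whisper_full_n_segments(ctx);
    for (int i = 0; i < n; ++i) {
        const char *seg_c = whisper_full_get_segment_text(ctx, i);
        if (seg_c == nullptr) continue;
        std::string seg(seg_c);
        if (seg.empty() || seg.find("[BLANK_AUDIO]") != std::string::npos) continue;
        if (!text.empty()) text += ' ';
        text += seg;  // no timestamps, just the recognized words
    }
    return text;
}
```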
# Check audio system
pactl info
# Verify PipeWire is running
systemctl --user status pipewire pipewire-pulse
# Test microphone
arecord -d 5 test.wav && aplay test.wav
# Check available audio devices
pactl list sources
# Restart audio service if needed
systemctl --user restart pipewire pipewire-pulse

# Check if SDL2 is installed
pkg-config --modversion sdl2
# Install SDL2 if missing
# Ubuntu/Debian: sudo apt install libsdl2-dev
# Fedora/RHEL: sudo dnf install SDL2 SDL2-devel
# Arch Linux: sudo pacman -S sdl2
# Verify SDL2 detection
cmake .. -DWHISPER_SDL2=ON | grep SDL2

# Check Vulkan support
vulkaninfo --summary
# Install GPU drivers if needed
# AMD: sudo dnf install mesa-vulkan-drivers
# NVIDIA: Install proprietary drivers
# Test Vulkan with a simple program
vkcube # Should show a rotating cube if Vulkan works

# Clean build
rm -rf build && mkdir build && cd build
cmake .. -DGGML_VULKAN=1 -DWHISPER_SDL2=ON
make -j$(nproc)
# If build fails due to missing dependencies:
# Ubuntu/Debian:
sudo apt install cmake g++ pkg-config libpulse-dev libvulkan-dev libsdl2-dev
# Fedora/RHEL:
sudo dnf install cmake gcc-c++ pkg-config pulseaudio-libs-devel vulkan-devel SDL2 SDL2-devel
# Arch Linux:
sudo pacman -S cmake gcc pkgconf pulseaudio libpulse vulkan-devel sdl2

# Check if model exists
ls -la models/ggml-large-v3-turbo.bin
# Download model if missing
mkdir -p models
wget -O models/ggml-large-v3-turbo.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
# Alternative model (smaller, faster download)
wget -O models/ggml-base.en.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin

# Check if LLM model exists
ls -la models/llm/Magistral-Small-2509-Q4_K_M.gguf
# Download LLM model if missing (13GB - large download)
mkdir -p models/llm
wget -O models/llm/Magistral-Small-2509-Q4_K_M.gguf https://huggingface.co/MaziyarPanahi/Magistral-Small-2509-GGUF/resolve/main/Magistral-Small-2509-Q4_K_M.gguf
# If LLM model loading takes too long, try reducing GPU layers or using CPU only
# Edit src/llm_processor.cpp and change model_params.n_gpu_layers = 0;

# Check if Vulkan is working for LLM acceleration
vulkaninfo --summary
# If LLM is slow, ensure you have enough VRAM (at least 16GB recommended)
# For lower VRAM systems, the model may fall back to CPU processing
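The CPU-only change mentioned above would look roughly like this inside llm_processor.cpp (a sketch based on llama.cpp's C API; the surrounding code is assumed, not copied from the project):

```cpp
// Hedged sketch: forcing CPU-only LLM inference by offloading zero layers.
// llama_model_default_params() and n_gpu_layers are part of llama.cpp's C API;
// the rest of this helper is illustrative.
#include "llama.h"

llama_model *load_llm_cpu_only(const char *model_path) {
    llama_model_params model_params = llama_model_default_params();
    model_params.n_gpu_layers = 0;  // 0 = keep every layer on the CPU (no Vulkan offload)
    return llama_load_model_from_file(model_path, model_params);
}
```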
# Application will still work without LLM model - it will provide raw transcriptions

MIT License - see LICENSE file for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Test performance improvements
- Submit a pull request
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Built with ❤️ for CLI developers who want fast, accurate speech-to-text.

