
Real-time VRM avatar animation with intelligent TTS synchronization

- Auto-loading: Default VRM model and animations on startup
- Real-time Animation: Seamless idle ↔ talking state switching
- Mouth Animation: Volume-based blend shapes with fine-tuned controls
- Interactive Controls: Full camera and avatar manipulation
- Multi-format Support: VRM, FBX, VRMA, GLB, GLTF files
- WebSocket Communication: Animation triggers with no audio-detection delay
- Precise Timing: Server-side synchronization of animation with TTS audio
- Smart Detection: Deterministic triggers, not audio guessing
- Optional Mode: Works standalone without TTS server
- Movement: WASD (avatar), Arrow Keys (camera), Mouse (orbit)
- Manipulation: Ctrl+WASD (rotation), Shift+Drag (positioning)
- UI: H (toggle interface), organized accordion panels
- Animation: Manual play/stop/reset controls
You cannot open room.html directly in the browser. Serve it from a web server instead:
1. Install the "Live Server" extension (VS Code)
2. Right-click room.html → "Open with Live Server"
3. The page opens at http://127.0.0.1:5500/room.html
# Python (built-in)
python -m http.server 8000
# → http://localhost:8000/room.html
# Node.js
npx serve .
# → http://localhost:3000/room.html
# PHP
php -S localhost:8000
# → http://localhost:8000/room.html
Why? ES6 modules, CORS restrictions, and the WebSocket connection all require the page to be served over HTTP rather than opened from a file:// URL.
- Start web server
- Open room.html
- Auto-loads:
  - Default VRM avatar (AvatarSample_H.vrm)
  - Idle animation (Happy Idle.fbx)
  - Talking animation (Talking.fbx)
- Use manual controls to trigger animations
- Run TTS server: run_gpt_sovits.bat
- Start web server and open room.html
- Check "Enable TTS WebSocket Connection"
- Animations trigger automatically with TTS audio
VRMViewer/
├── room.html                 # Main application
├── api_v3.py                 # Modified TTS server
├── run_gpt_sovits.bat        # Server launcher
├── assets/
│   ├── models/               # VRM avatar files
│   │   ├── AvatarSample_H.vrm
│   │   └── *.vrm
│   └── animations/           # Animation files
│       ├── Happy Idle.fbx
│       ├── Talking.fbx
│       └── *.fbx, *.vrma
├── js/                       # JavaScript modules
│   ├── three-vrm-core.module.js
│   ├── three-vrm-animation.module.js
│   └── loadMixamoAnimation.js
└── css/                      # Styling
    └── styles.css
Our modified api_v3.py extends GPT-SoVITS with WebSocket animation signals:
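import json

# Real-time VRM communication: the connected viewer socket, or None
vrm_websocket = None

async def notify_vrm(message_type, text=None):
    """Send an animation signal to the VRM viewer, if one is connected."""
    # `text` is accepted but not included in the message in this excerpt.
    if vrm_websocket:
        message = {"type": message_type}
        await vrm_websocket.send(json.dumps(message))

# Timing integration (these calls run inside the async TTS handler):
await notify_vrm("tts_start")  # Animation begins
await notify_vrm("tts_end")    # Return to idle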
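One way to keep the two signals paired is to wrap synthesis in try/finally, so that tts_end is still sent when synthesis raises. The sketch below is an assumption about how the calls could be arranged, not the code in api_v3.py; `FakeSocket`, `notify`, `speak`, and `synthesize` are hypothetical names used only for illustration.

```python
import asyncio
import json


class FakeSocket:
    """Hypothetical stand-in for the viewer connection; records sent frames."""

    def __init__(self):
        self.sent = []

    async def send(self, payload):
        self.sent.append(payload)


async def notify(socket, message_type):
    """Send {"type": ...} to the viewer if a socket is connected."""
    if socket is not None:
        await socket.send(json.dumps({"type": message_type}))


async def speak(socket, synthesize):
    """Bracket one synthesis call with tts_start / tts_end.

    `synthesize` is a hypothetical coroutine function standing in for the
    real TTS call. The finally clause sends tts_end even when it raises,
    so the avatar is not left stuck in the talking state.
    """
    await notify(socket, "tts_start")
    try:
        return await synthesize()
    finally:
        await notify(socket, "tts_end")


async def demo():
    socket = FakeSocket()

    async def ok():
        return b"audio-bytes"

    async def broken():
        raise RuntimeError("synthesis failed")

    await speak(socket, ok)
    try:
        await speak(socket, broken)
    except RuntimeError:
        pass  # the error still propagates; tts_end was sent first
    return [json.loads(frame)["type"] for frame in socket.sent]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With this arrangement a failed synthesis still produces a matched start/end pair, which is what lets the viewer rely on the signals alone.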
{"type": "tts_start"}  // Start talking animation
{"type": "tts_end"}    // Return to idle animation
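The receiving side of this protocol amounts to a two-state machine. The actual viewer logic lives in room.html and is written in JavaScript; the following is only a Python sketch of the same idea, kept in Python for consistency with api_v3.py. The names `IDLE`, `TALKING`, `TRANSITIONS`, and `next_state` are hypothetical, and ignoring malformed frames is an assumed policy.

```python
import json

IDLE = "idle"
TALKING = "talking"

# Message type -> animation state, following the two messages listed above.
TRANSITIONS = {"tts_start": TALKING, "tts_end": IDLE}


def next_state(current, raw_message):
    """Return the animation state after one raw WebSocket text frame.

    Malformed JSON and unknown message types leave the state unchanged.
    """
    try:
        message = json.loads(raw_message)
    except json.JSONDecodeError:
        return current
    if not isinstance(message, dict):
        return current
    message_type = message.get("type")
    if not isinstance(message_type, str):
        return current
    return TRANSITIONS.get(message_type, current)
```

Because the state depends only on the last valid message, a duplicated tts_start or tts_end is harmless.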
- Browser: Chrome/Firefox/Edge with WebGL support
- TTS Server: GPT-SoVITS v2 Pro (optional)
- Dependencies: websockets, asyncio (for TTS integration)
// Browser console commands:
startIdleAnimation();            // Start idle
startTalkingAnimationFromTTS();  // Start talking
stopAnimation();                 // Stop current
resetAnimation();                // Reset to idle
- VRM Models: Drop into assets/models/ folder
- Animations: Support FBX (Mixamo) and VRMA formats
- Environments: GLB/GLTF room files supported
- Auto-retargeting: Mixamo animations automatically fit VRM skeleton
- Mouth Gain: Adjust lip-sync intensity (0.1 - 2.0)
- Body Threshold: Set talking animation trigger sensitivity
- Blend Shapes: Utilizes VRM visemes (aa, ih, ou, ee, oh)
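To make the relationship between volume, Mouth Gain, and Body Threshold concrete, here is a minimal sketch. Only the 0.1 to 2.0 gain range comes from the settings above; measuring volume as RMS, the linear mapping, the default threshold of 0.05, and every function name are assumptions for illustration, and the viewer's real implementation may differ.

```python
import math

MIN_GAIN, MAX_GAIN = 0.1, 2.0  # Mouth Gain range quoted above


def rms(samples):
    """Root-mean-square level of audio samples in the range [-1.0, 1.0]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mouth_weight(samples, gain=1.0):
    """Map volume to a blend-shape weight in [0.0, 1.0] (assumed linear)."""
    gain = min(max(gain, MIN_GAIN), MAX_GAIN)  # keep gain inside the range
    return min(rms(samples) * gain, 1.0)


def is_talking(samples, body_threshold=0.05):
    """True when volume exceeds the (hypothetical default) body threshold."""
    return rms(samples) > body_threshold
```

A higher gain opens the mouth wider for the same volume, while the threshold only decides whether the body switches to the talking animation.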
- Port: 8765 (configurable in api_v3.py)
- Auto-reconnect: 5-second intervals on connection loss
- Status Indicators: Real-time connection status display
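The auto-reconnect behaviour can be pictured as a retry loop with a fixed delay. The sketch below is an illustration in Python, not the viewer's code: `run_with_reconnect` and its `connect` argument are hypothetical, and the attempt limit exists only so the example terminates (the viewer presumably keeps retrying).

```python
import asyncio

RECONNECT_DELAY = 5.0  # seconds, matching the interval quoted above


async def run_with_reconnect(connect, max_attempts,
                             delay=RECONNECT_DELAY, sleep=asyncio.sleep):
    """Call `connect()` until it succeeds or `max_attempts` is reached.

    `connect` is a hypothetical coroutine function that raises OSError
    (for example ConnectionRefusedError) while the server is down.
    Returns the number of attempts made; re-raises the last error if
    every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            await connect()
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise
            await sleep(delay)  # wait a fixed interval before retrying
```

Passing `sleep` as a parameter keeps the loop testable without waiting for real time to pass.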
- OBS Compatible: Optimized for broadcast software
- Performance: Hardware acceleration recommended
- Audio Routing: Support for virtual audio cables
Problem | Solution |
---|---|
Modules not loading | Use HTTP server, not file:// protocol |
Audio not working | Check browser permissions & device selection |
VRM not visible | Verify valid VRM file, check console errors |
Animations not playing | Confirm FBX/VRMA format, check VRM compatibility |
TTS connection failed | Verify api_v3.py server running on port 8765 |
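For the last row, a quick way to check whether anything is listening on the TTS WebSocket port is a plain TCP probe. This is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and 127.0.0.1 is an assumed host. It only confirms that the port accepts connections, not that the WebSocket handshake succeeds.

```python
import socket


def port_is_open(host="127.0.0.1", port=8765, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"TTS WebSocket port 8765 is {state}")
```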
- Enable hardware acceleration in browser settings
- Use Chrome/Edge for best WebGL performance
- Disable unused features (spring bones, etc.) if lag occurs
- Local server recommended over network drives
- Close unused browser tabs for optimal performance
- Three.js r169: 3D rendering engine
- VRM 3.0: Avatar standard support
- WebSocket: Real-time communication
- ES6 Modules: Clean import system
- Web Audio API: Advanced audio processing
- Deterministic Timing: TTS server knows exact audio timing
- Zero Latency: No audio detection delays
- Universal Compatibility: Works regardless of user audio setup
- Reliable Synchronization: No false positives or missed triggers
Built with:
- Three.js + VRM Libraries
- VRM Consortium sample assets
- Mixamo animation library
- GPT-SoVITS TTS framework