README.md (+2 -2: 2 additions & 2 deletions)
@@ -46,7 +46,7 @@ RubyLLM fixes all that. One beautiful API for everything. One consistent format.
 chat = RubyLLM.chat
 chat.ask "What's the best way to learn Ruby?"
 
-# Analyze images, audio, documents, and text files
+# Analyze images, videos, audio, documents, and text files
 chat.ask "What's in this image?", with: "ruby_conf.jpg"
 chat.ask "Describe this meeting", with: "meeting.wav"
 chat.ask "Summarize this document", with: "contract.pdf"
@@ -88,7 +88,7 @@ chat.with_tool(Weather).ask "What's the weather in Berlin? (52.5200, 13.4050)"
 ## Core Capabilities
 
 * 💬 **Unified Chat:** Converse with models from OpenAI, Anthropic, Gemini, Bedrock, OpenRouter, DeepSeek, Ollama, or any OpenAI-compatible API using `RubyLLM.chat`.
-* 👁️ **Vision:** Analyze images within chats.
+* 👁️ **Vision:** Analyze images and videos within chats.
 * 🔊 **Audio:** Transcribe and understand audio content.
 * 📄 **Document Analysis:** Extract information from PDFs, text files, and other documents.
 * 🖼️ **Image Generation:** Create images with `RubyLLM.paint`.
docs/guides/chat.md (+26 -1: 26 additions & 1 deletion)
@@ -119,7 +119,7 @@ RubyLLM manages a registry of known models and their capabilities. For detailed
 
 ## Multi-modal Conversations
 
-Modern AI models can often process more than just text. RubyLLM provides a unified way to include images, audio, text files, and PDFs in your chat messages using the `with:` option in the `ask` method.
+Modern AI models can often process more than just text. RubyLLM provides a unified way to include images, videos, audio, text files, and PDFs in your chat messages using the `with:` option in the `ask` method.
 
 ### Working with Images
 
@@ -144,6 +144,30 @@ puts response.content
 
 RubyLLM handles converting the image source into the format required by the specific provider API.
 
+### Working with Videos
+
+You can also analyze video files or URLs with vision-capable models. RubyLLM will automatically detect video files and handle them appropriately.
+
+```ruby
+# Ask about a local video file
+chat = RubyLLM.chat(model: 'gpt-4o')
+response = chat.ask "What happens in this video?", with: "path/to/demo.mp4"
+puts response.content
+
+# Ask about a video from a URL
+response = chat.ask "Summarize the main events in this video.", with: "https://example.com/demo_video.mp4"
+puts response.content
+
+# Combine videos with other file types
+response = chat.ask "Analyze these files for visual content.", with: ["diagram.png", "demo.mp4", "notes.txt"]
+puts response.content
+```
+
+**Notes:**
+- Supported video formats include .mp4, .mov, .avi, .webm, and others (provider-dependent).
+- Not all models or providers support video input; check the [Available Models Guide]({% link guides/available-models.md %}) for details.
+- Large video files may be subject to size or duration limits imposed by the provider.
+
 ### Working with Audio
 
 Provide audio file paths to audio-capable models (like `gpt-4o-audio-preview`).
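
The notes added in this hunk say that supported formats and size limits are provider-dependent. A caller who wants to fail early could run a small pre-flight check before passing a local file to `with:`. The following is a minimal sketch, not part of RubyLLM: the helper names `video_path?` and `video_within_limit?` are hypothetical, the extension list is copied from the note above, and the 20 MB ceiling is an assumed placeholder that should be replaced with the limit your provider documents.

```ruby
# Hypothetical pre-flight helpers; RubyLLM does not ship these.
# Extensions taken from the "Notes" list in the docs change above.
VIDEO_EXTENSIONS = %w[.mp4 .mov .avi .webm].freeze

# Assumed placeholder limit; real limits are provider-specific.
MAX_VIDEO_BYTES = 20 * 1024 * 1024

# True when the path or URL ends in one of the listed video extensions.
def video_path?(path)
  VIDEO_EXTENSIONS.include?(File.extname(path.to_s).downcase)
end

# True when the path is an existing local video file no larger than max_bytes.
def video_within_limit?(path, max_bytes: MAX_VIDEO_BYTES)
  video_path?(path) && File.file?(path) && File.size(path) <= max_bytes
end
```

With these in place, a call such as `chat.ask "What happens in this video?", with: path` can be guarded by `video_within_limit?(path)`. The size check only makes sense for local files; URLs would pass `video_path?` but not `video_within_limit?`.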
@@ -224,6 +248,7 @@ response = chat.ask "What's in this image?", with: { image: "photo.jpg" }
docs/guides/rails.md (+2 -2: 2 additions & 2 deletions)
@@ -117,7 +117,7 @@ Run the migrations: `rails db:migrate`
 
 ### ActiveStorage Setup for Attachments (Optional)
 
-If you want to use attachments (images, audio, PDFs) with your AI chats, you need to set up ActiveStorage:
+If you want to use attachments (images, videos, audio, PDFs) with your AI chats, you need to set up ActiveStorage:
 
 ```bash
 # Only needed if you plan to use attachments
@@ -291,7 +291,7 @@ chat_record.ask("Analyze this file", with: params[:uploaded_file])
 chat_record.ask("What's in this document?", with: user.profile_document)
 ```
 
-The attachment API automatically detects file types based on file extension or content type, so you don't need to specify whether something is an image, audio file, PDF, or text document - RubyLLM figures it out for you!
+The attachment API automatically detects file types based on file extension or content type, so you don't need to specify whether something is an image, video, audio file, PDF, or text document - RubyLLM figures it out for you!
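
To make the detection claim in this hunk concrete, extension-based detection can be pictured as a lookup from file extension to attachment kind. The sketch below is illustrative only and is not RubyLLM's implementation: the `ATTACHMENT_TYPES` table and the `guess_attachment_type` helper are hypothetical names, the extension lists are assumptions, and detection by content type is not covered.

```ruby
# Illustrative only; RubyLLM's real detection logic may differ.
# Hypothetical mapping from attachment kind to file extensions.
ATTACHMENT_TYPES = {
  image: %w[.jpg .jpeg .png .gif .webp],
  video: %w[.mp4 .mov .avi .webm],
  audio: %w[.wav .mp3 .m4a],
  pdf:   %w[.pdf],
  text:  %w[.txt .md]
}.freeze

# Returns the attachment kind as a symbol, or :unknown when no extension matches.
def guess_attachment_type(filename)
  ext = File.extname(filename.to_s).downcase
  type, _exts = ATTACHMENT_TYPES.find { |_type, exts| exts.include?(ext) }
  type || :unknown
end

%w[photo.jpg demo.mp4 meeting.wav contract.pdf notes.txt].each do |name|
  puts "#{name}: #{guess_attachment_type(name)}"
end
```

Under this model, adding video support amounts to adding one more entry to the table, which matches how small the documentation change is.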