🧠 Local Vision (WIP)

Status: Experimental
Tag: vision

📘 Overview

Local Vision is a simple experimental n8n workflow designed to test and confirm the use of local vision-capable LLMs (such as qwen/qwen2.5-vl-7b) for analyzing images, specifically for tasks like understanding data fields on a receipt.

This proof-of-concept confirms the ability to download an image from the web and process it using a locally-hosted multimodal model, producing structured data that could later be stored or further processed.


📌 Use Case

Analyze the content of an image (e.g., a store receipt) using a local vision LLM: the image is passed to the model as base64-encoded data, and the model interprets it to extract the relevant information.
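
For reference, below is a minimal sketch of the same request outside n8n, in TypeScript (Node 18+). The LM Studio base URL and port, the prompt text, and the use of an OpenAI-style image_url content part are assumptions; the model id is the one this workflow uses.

```typescript
// Minimal sketch: download an image, base64-encode it, and send it to a
// local OpenAI-compatible vision endpoint. The base URL below assumes
// LM Studio's default port; adjust it for your setup.
const IMAGE_URL =
  "https://connect.garmin.com/images/login/desktop/signin-hero-3.jpg";
const BASE_URL = "http://localhost:1234/v1";

async function analyzeImage(): Promise<string> {
  // 1. Download the image (what the HTTP Request node does).
  const imageResponse = await fetch(IMAGE_URL);
  const base64 = Buffer.from(await imageResponse.arrayBuffer()).toString("base64");

  // 2. Pass it to the local vision model as a base64 data URI.
  const completion = await fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "qwen/qwen2.5-vl-7b",
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Describe the data fields visible in this image." },
            { type: "image_url", image_url: { url: `data:image/jpeg;base64,${base64}` } },
          ],
        },
      ],
    }),
  });

  const json = await completion.json();
  return json.choices[0].message.content;
}

analyzeImage().then(console.log).catch(console.error);
```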


πŸ” Workflow Structure

| Step | Node Name | Description |
| --- | --- | --- |
| 1 | Manual Trigger | Used for testing the workflow manually from the n8n UI. |
| 2 | HTTP Request | Downloads an image file from the given URL (https://connect.garmin.com/images/login/desktop/signin-hero-3.jpg). |
| 3 | Qwen2.5 Local Vision | Uses a local multimodal LLM (qwen/qwen2.5-vl-7b) to analyze the downloaded image. |
| 4 | Sticky Note | Comment node describing the experiment: testing local vision-based LLM processing. |

🧠 Technology Stack

  • n8n for workflow automation
  • Qwen 2.5 VL 7B model (via @n8n/n8n-nodes-langchain.openAi) for local vision AI processing
  • HTTP Node to download the image
  • Manual Trigger to start the flow

βš™οΈ Credentials Used

  • LM Studio (Credential ID: nJxj1tJeQ2RgUtHC)
    Required for the OpenAI-compatible local endpoint to run the Vision LLM.
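
In practice this credential just needs to point at an OpenAI-compatible base URL; the API key can be a placeholder, since local servers such as LM Studio typically ignore it. A minimal sketch with the official openai npm client (not part of the workflow itself; LM Studio's default port is assumed):

```typescript
import OpenAI from "openai";

// OpenAI-compatible client aimed at the local LM Studio server.
// Port 1234 is LM Studio's default; adjust if yours differs.
const client = new OpenAI({
  baseURL: "http://localhost:1234/v1",
  apiKey: "lm-studio", // placeholder; a local server typically does not check it
});
```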

πŸ“ Notes

  • The image must be processed as base64 for the model to analyze it (see the sketch after this list).
  • Currently, the workflow stops after analysis; no further data extraction or storage is implemented. You can add a Set, Function, or Spreadsheet node to post-process or export the results.
  • The workflow uses typeVersion: 4.2 for HTTP and typeVersion: 1.8 for the Langchain/OpenAI node, ensuring compatibility with recent n8n versions.
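
As a rough illustration of the base64 note above, an n8n Code node placed after the HTTP Request node could expose the downloaded image as a data URI. This sketch assumes the default binary property name (data) and in-memory binary storage; with filesystem binary storage the data field holds an internal ID rather than the base64 payload.

```typescript
// n8n Code node sketch (run once for all items), placed after HTTP Request.
// Assumes the binary property is named "data" and holds a base64 string.
const binary = items[0].binary?.data;
if (!binary) {
  throw new Error("No binary data found on the incoming item");
}

// Build a data URI that a vision model can consume.
const dataUri = `data:${binary.mimeType};base64,${binary.data}`;

return [{ json: { dataUri, mimeType: binary.mimeType } }];
```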

🚀 Next Steps

  • Add a Set or Function node to parse the output into structured fields (see the sketch after this list).
  • Save the result to Google Sheets, PostgreSQL, or Airtable.
  • Enhance the model prompt or context for more accurate receipt parsing.
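
One possible shape for that parsing step: a Code node that expects the model to have been prompted to answer with a JSON object. The json.text field name below is a placeholder; adapt it to the actual output field of the vision node.

```typescript
// n8n Code node sketch: turn the model's text reply into structured fields.
const reply = String(items[0].json.text ?? "");

// Strip a Markdown code fence if the model wrapped its answer in one.
const cleaned = reply.replace(/```(?:json)?/g, "").trim();

let fields = { raw: reply }; // fallback if the reply is not valid JSON
try {
  fields = JSON.parse(cleaned);
} catch {
  // keep the raw-text fallback
}

return [{ json: fields }];
```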

✅ How to Run

  1. Make sure your LM Studio or other OpenAI-compatible local inference server is running (a quick check is sketched after this list).
  2. Replace the image URL with a picture of a receipt if needed.
  3. Open n8n → Import this workflow → Click "Test Workflow".
  4. Inspect the output from the Qwen2.5 Vision node.
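
A quick way to verify step 1, assuming LM Studio's default port; /v1/models is the standard OpenAI-compatible endpoint for listing the models the server is serving:

```typescript
// Check that the local OpenAI-compatible server is up and the vision model is loaded.
const res = await fetch("http://localhost:1234/v1/models");
const { data } = await res.json();
console.log(data.map((m: { id: string }) => m.id));
// Expect an id like "qwen/qwen2.5-vl-7b" in the list before testing the workflow.
```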

📷 Example Image

Default image:

https://connect.garmin.com/images/login/desktop/signin-hero-3.jpg

Replace it with your own image if needed.


πŸ“ Tags

vision, local-llm, image-processing, qwen, multimodal


🧩 Requirements

  • n8n v1.8+ with latest HTTP & Langchain/OpenAI nodes
  • Local LLM setup via LM Studio or compatible API
  • Base64 image input format
