VisionTagger
A desktop app that uses local AI to automatically tag and rename images based on their content.
Note: This project is not yet available on GitHub.
Overview
VisionTagger is a desktop application that uses Ollama's vision models to analyze images and generate descriptive tags and filenames. Point it at a folder of images and it will examine each one, suggest tags, and offer to rename files based on their content—all running locally without cloud dependencies.
Key Features
- Vision AI Analysis: Uses Ollama vision models to understand image content.
- Automatic Tagging: Generates descriptive tags for each image.
- Smart Renaming: Suggests filenames based on image content.
- Batch Processing: Handles entire folders of images at once.
- Local Processing: All AI runs locally via Ollama—no data leaves your machine.
- Flet UI: Cross-platform desktop interface.
Technical Architecture
VisionTagger connects to a local Ollama instance running vision-capable models. It processes images through the vision API, extracts descriptions, and applies naming conventions based on the results.
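As a rough illustration of the request step described above, here is a minimal sketch that sends one image to a local Ollama instance using only the Python standard library. It assumes Ollama's default address (`http://localhost:11434`) and its `/api/generate` endpoint, which accepts base64-encoded images for vision models. The model name `llava`, the prompt wording, and the function names `build_payload` and `describe_image` are illustrative assumptions, not taken from VisionTagger's code, which may well use a different client.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes, model: str = "llava") -> dict:
    """Build the JSON body for a single non-streaming vision request."""
    return {
        "model": model,
        # Hypothetical prompt; the wording VisionTagger uses is not documented.
        "prompt": "List 5 to 10 short descriptive tags for this image, comma-separated.",
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path: str, model: str = "llava", timeout: float = 120.0) -> str:
    """Send one image file to the local model and return its text reply."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # With "stream": False the reply is one JSON object with a "response" field.
        return json.loads(response.read())["response"]
```

Calling `describe_image("photo.jpg")` requires a running Ollama instance with the chosen model already pulled. `build_payload` has no network dependency and can be checked on its own.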
Core components:
- Ollama Client: Connects to local Ollama for vision model access.
- Image Processor: Handles image loading and format conversion.
- Tag Generator: Parses vision model output into structured tags.
- Rename Engine: Applies naming rules based on generated tags.
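The Tag Generator and Rename Engine steps could look something like the sketch below. It assumes the model replies with a comma- or newline-separated list, possibly with bullets or numbering. The function names `parse_tags` and `build_filename`, the tag limit, and the hyphenated-slug naming rule are all hypothetical choices for illustration; the project's actual rules are not documented here.

```python
import re
from pathlib import Path


def parse_tags(model_output: str, max_tags: int = 8) -> list[str]:
    """Split free-form model output into lowercase, de-duplicated tags."""
    tags: list[str] = []
    for piece in re.split(r"[,\n;]+", model_output):
        # Drop a leading bullet or list number such as "- " or "2. ".
        cleaned = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", piece)
        # Keep only lowercase letters, digits, spaces, and hyphens.
        cleaned = re.sub(r"[^a-z0-9 -]+", "", cleaned.lower()).strip()
        cleaned = re.sub(r"\s+", " ", cleaned)
        if cleaned and cleaned not in tags:
            tags.append(cleaned)
    return tags[:max_tags]


def build_filename(tags: list[str], original: str, max_words: int = 4) -> str:
    """Join the first few tags into a slug and keep the original extension."""
    slug = "-".join(tag.replace(" ", "-") for tag in tags[:max_words])
    suffix = Path(original).suffix.lower()
    # Fall back to the original stem if the model produced no usable tags.
    return f"{slug or Path(original).stem}{suffix}"
```

For example, the reply `"Dog, beach, Sunset"` for `IMG_0042.JPG` would yield the tags `dog`, `beach`, `sunset` and the suggested name `dog-beach-sunset.jpg`.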
Technology Stack
- UI Framework: Flet for cross-platform desktop
- AI Backend: Ollama with a vision-capable model such as LLaVA
- Language: Python
- Image Handling: Pillow for format support
Current Status
Prototype with core tagging working. Note: renaming currently changes files in place and cannot be undone, so run it only on copies of your images. Rename preview and undo support are planned.
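One possible shape for the planned preview and undo features is sketched below. This is not the project's implementation: the functions `plan_renames`, `apply_renames`, and `undo_renames`, and the JSON undo-log format, are assumptions made for illustration. The idea is to compute the full rename plan without touching the disk, write a log before renaming, and replay that log in reverse to undo.

```python
import json
from pathlib import Path


def plan_renames(folder: Path, suggestions: dict[str, str]) -> list[tuple[Path, Path]]:
    """Return (source, target) pairs without touching the disk (the preview)."""
    plan: list[tuple[Path, Path]] = []
    taken: set[Path] = set()
    for old_name, new_name in sorted(suggestions.items()):
        source = folder / old_name
        target = folder / new_name
        counter = 1
        # Never overwrite an existing file or collide with an earlier target.
        while target in taken or (target.exists() and target != source):
            target = folder / f"{Path(new_name).stem}-{counter}{Path(new_name).suffix}"
            counter += 1
        taken.add(target)
        if target != source:
            plan.append((source, target))
    return plan


def apply_renames(plan: list[tuple[Path, Path]], log_path: Path) -> None:
    """Write the undo log first, then perform the renames."""
    log_path.write_text(json.dumps([[str(s), str(t)] for s, t in plan]))
    for source, target in plan:
        source.rename(target)


def undo_renames(log_path: Path) -> None:
    """Reverse the logged renames, last one first."""
    for source, target in reversed(json.loads(log_path.read_text())):
        Path(target).rename(source)
```

A UI could show the output of `plan_renames` for confirmation before calling `apply_renames`. Writing the log before renaming means a crash partway through still leaves a record of what was planned.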
Related Projects
Gloss
A local-first, privacy-preserving alternative to Google's NotebookLM with RAG-powered chat using local LLM inference via Ollama.
VisionForge
A Tauri 2 desktop app bridging local LLMs with Stable Diffusion through a multi-agent prompt engineering pipeline.
Agent Forge
A multi-agent orchestrator for Claude Code that decomposes projects into parallel agent workstreams.