prototype · Feb 05, 2026

VisionTagger

A desktop app that uses local AI to automatically tag and rename images based on their content.

python · flet · ai · ollama · desktop

Note: This project is not yet available on GitHub.

Overview

VisionTagger is a desktop application that uses Ollama's vision models to analyze images and generate descriptive tags and filenames. Point it at a folder of images and it will examine each one, suggest tags, and offer to rename files based on their content—all running locally without cloud dependencies.

Key Features

  • Vision AI Analysis: Uses Ollama vision models to understand image content.
  • Automatic Tagging: Generates descriptive tags for each image.
  • Smart Renaming: Suggests filenames based on image content.
  • Batch Processing: Handles entire folders of images at once.
  • Local Processing: All AI runs locally via Ollama—no data leaves your machine.
  • Flet UI: Cross-platform desktop interface.
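
As a rough illustration of the batch-processing step, the sketch below collects the image files in a folder before they are sent for analysis. The function name `find_images` and the extension list are assumptions made for this example, not VisionTagger's actual code.

```python
from pathlib import Path

# Assumed set of extensions; the formats VisionTagger really accepts are not documented.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif", ".bmp"}


def find_images(folder: Path) -> list[Path]:
    """Return the image files directly inside `folder`, sorted by name."""
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

The suffix comparison is case-insensitive, so `IMG_001.JPG` and `img_001.jpg` are both picked up; subfolders are deliberately not scanned in this sketch.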

Technical Architecture

VisionTagger connects to a local Ollama instance running vision-capable models. It processes images through the vision API, extracts descriptions, and applies naming conventions based on the results.
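
A minimal sketch of what a call to the vision API could look like, using only the Python standard library. It targets Ollama's REST endpoint `/api/generate` on the default local port and passes the image as a base64 string in the `images` field. The model name `llava`, the prompt text, and the helper names are assumptions for illustration; VisionTagger may well use the `ollama` Python package or a different prompt.

```python
import base64
import json
import urllib.request

# Default address of a local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical prompt; the wording VisionTagger uses is not documented.
DEFAULT_PROMPT = "List five short descriptive tags for this image, separated by commas."


def build_vision_request(image_bytes: bytes, model: str = "llava",
                         prompt: str = DEFAULT_PROMPT) -> dict:
    """Build the JSON body for a single, non-streaming vision request."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for one complete JSON reply instead of a token stream.
        "stream": False,
    }


def describe_image(image_bytes: bytes, url: str = OLLAMA_URL,
                   timeout: float = 120.0) -> str:
    """Send the image to the local Ollama server and return the model's text."""
    body = json.dumps(build_vision_request(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_image(Path("photo.jpg").read_bytes())` requires a running Ollama instance with a vision model already pulled; `build_vision_request` can be inspected without one.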

Core components:

  • Ollama Client: Connects to local Ollama for vision model access.
  • Image Processor: Handles image loading and format conversion.
  • Tag Generator: Parses vision model output into structured tags.
  • Rename Engine: Applies naming rules based on generated tags.
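
The Tag Generator and Rename Engine steps can be sketched as follows, assuming the model replies with a comma- or newline-separated list of tags. Both function names and the naming rule (first three tags joined with underscores) are hypothetical; the project's real parsing and naming conventions are not described.

```python
import re


def parse_tags(model_output: str, max_tags: int = 5) -> list[str]:
    """Turn free-form model output into a short list of clean, unique tags."""
    tags: list[str] = []
    for part in re.split(r"[,;\n]+", model_output):
        # Drop leading bullets ("- ", "* ", "• ") or numbering ("1. ", "2) ").
        tag = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", part).strip().lower()
        # Keep only ASCII letters, digits and spaces (a simplification).
        tag = re.sub(r"[^a-z0-9 ]+", "", tag)
        tag = re.sub(r"\s+", " ", tag).strip()
        if tag and tag not in tags:
            tags.append(tag)
    return tags[:max_tags]


def suggest_filename(tags: list[str], extension: str, max_tags: int = 3) -> str:
    """Build a filename from the first few tags, keeping the file extension."""
    slug = "_".join(tag.replace(" ", "-") for tag in tags[:max_tags])
    return f"{slug or 'untagged'}{extension.lower()}"
```

For example, `suggest_filename(parse_tags("Golden Retriever, beach"), ".JPG")` yields `golden-retriever_beach.jpg`. Restricting tags to ASCII keeps filenames portable but discards non-English tags, so a real implementation might choose differently.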

Technology Stack

  • UI Framework: Flet for cross-platform desktop
  • AI Backend: Ollama with vision models (LLaVA, etc.)
  • Language: Python
  • Image Handling: Pillow for format support

Current Status

Prototype with core tagging working. Note: renaming is currently destructive and cannot be undone, so run it on copies of your images. Rename preview and undo support are planned.
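
One possible shape for the planned preview and undo support is sketched below; it is not the project's implementation. A plan is computed first without touching any file, then applied while writing a JSON log that can be replayed in reverse. The function names, the `_1`, `_2` collision suffix, and the log format are all assumptions.

```python
import json
from pathlib import Path


def plan_renames(folder: Path, suggestions: dict[str, str]) -> list[tuple[Path, Path]]:
    """Preview step: map each existing file to a collision-free target path."""
    taken = {path.name for path in folder.iterdir()}
    plan: list[tuple[Path, Path]] = []
    for old_name, new_name in suggestions.items():
        source = folder / old_name
        if not source.is_file() or old_name == new_name:
            continue
        stem, suffix = Path(new_name).stem, Path(new_name).suffix
        candidate, counter = new_name, 1
        # Append _1, _2, ... until the name is unused in this folder.
        while candidate in taken:
            candidate = f"{stem}_{counter}{suffix}"
            counter += 1
        taken.add(candidate)
        plan.append((source, folder / candidate))
    return plan


def apply_renames(plan: list[tuple[Path, Path]], log_path: Path) -> None:
    """Rename files and record every completed step in a JSON undo log."""
    done: list[list[str]] = []
    for source, target in plan:
        source.rename(target)
        done.append([str(source), str(target)])
        # Rewrite the log after each step so a crash leaves it up to date.
        log_path.write_text(json.dumps(done))


def undo_renames(log_path: Path) -> None:
    """Reverse the logged renames, last one first, then delete the log."""
    for source, target in reversed(json.loads(log_path.read_text())):
        Path(target).rename(source)
    log_path.unlink()
```

The list returned by `plan_renames` is what a preview screen could show before the user confirms; nothing on disk changes until `apply_renames` runs.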
