Fork of the original Obsidian Transcriber plugin by Sébastien Dubois, with ongoing changes by Lorenzo Strambi.
An Obsidian plugin that transcribes images to Markdown using either local Ollama vision models or an OpenAI-compatible endpoint.
Point it at any image in your vault and get structured Markdown back — headings, lists, tables, code blocks. You can run fully local with Ollama, or use a hosted provider when desired.
- Transcribe a single image via the command palette or right-click context menu
- Batch-transcribe an entire folder of images (with optional subfolder inclusion)
- Auto-file transcribed notes into the right vault folder using LLM-based semantic classification
- Creates a `.md` file alongside each image with the transcribed content
- Install, select, and remove AI models directly from the command palette, no terminal needed
- Choose your provider: Ollama or OpenAI-compatible API
- Tune model parameters for OpenAI: temperature, top-p, and max tokens
- Progress tracking for batch operations with per-file status
- Configurable prompt so you can tailor the transcription instructions
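As a rough illustration of how the tunable parameters could reach an OpenAI-compatible endpoint, here is a minimal sketch of a chat completions request body. The `temperature`, `top_p`, and `max_tokens` fields and the `image_url` content part follow the OpenAI chat completions format; the names `SamplingParams` and `buildChatRequest` and the model name are hypothetical and not taken from the plugin's source.

```typescript
// Hypothetical helper types: names are illustrative, not the plugin's own.
interface SamplingParams {
  temperature: number; // lower values give more deterministic output
  topP: number;        // nucleus sampling cutoff
  maxTokens: number;   // upper bound on generated tokens
}

// Builds the JSON body for POST <base-url>/chat/completions.
function buildChatRequest(
  model: string,
  prompt: string,
  imageBase64: string,
  mimeType: string,
  params: SamplingParams,
) {
  return {
    model,
    temperature: params.temperature,
    top_p: params.topP,
    max_tokens: params.maxTokens,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          // The image travels inline as a base64 data URL.
          { type: "image_url", image_url: { url: `data:${mimeType};base64,${imageBase64}` } },
        ],
      },
    ],
  };
}

// Example with a placeholder model name and a tiny fake payload.
const chatRequest = buildChatRequest("my-vision-model", "Transcribe this image to Markdown.", "QUJD", "image/png", {
  temperature: 0.2,
  topP: 0.9,
  maxTokens: 1024,
});
```

The body would be sent with an `Authorization: Bearer <api key>` header; whether the plugin builds its requests this way is an assumption.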
When auto-filing is enabled, the plugin reads filing tags from the first lines of the transcription output and uses an LLM to determine the best destination folder in your vault.
Tag format — place these at the very top of the transcribed note (or instruct your transcription prompt to emit them):
#folder: EPAM/BH
#project: Sprint 2
The LLM receives all tags together with your vault's existing folder list and semantically resolves the target path. For example, if your vault already contains an EPAM/BH folder, the above tags produce EPAM/BH/Sprint 2. Any subfolder that doesn't exist yet is created automatically.
Fallback: if no tags are found or the LLM call fails, the file is moved to a configurable Inbox folder (default: Inbox).
Tag stripping: filing tag lines are removed from the note content after a successful classification.
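The tag scan and stripping described above could look roughly like the sketch below. It assumes tags are lines of the form `#key: value` and that only the first few lines are inspected (5 matches the default of the "Max lines to scan" setting). `parseFilingTags`, `ParsedNote`, and the exact tag grammar are hypothetical, not the plugin's actual implementation.

```typescript
interface ParsedNote {
  tags: Record<string, string>; // e.g. { folder: "EPAM/BH", project: "Sprint 2" }
  body: string;                 // note content with the tag lines removed
}

// Matches lines such as "#folder: EPAM/BH". A Markdown heading ("# Title")
// does not match because a letter must follow the "#" directly.
// The accepted grammar is an assumption.
const TAG_LINE = /^#([A-Za-z][\w-]*):\s*(.+?)\s*$/;

function parseFilingTags(content: string, maxLines = 5): ParsedNote {
  const lines = content.split(/\r?\n/);
  const tags: Record<string, string> = {};
  const kept: string[] = [];

  lines.forEach((line, index) => {
    // Only the first `maxLines` lines are candidates for filing tags.
    const match = index < maxLines ? TAG_LINE.exec(line) : null;
    if (match) {
      tags[match[1].toLowerCase()] = match[2];
    } else {
      kept.push(line);
    }
  });

  return { tags, body: kept.join("\n") };
}
```

Because the stripped body is returned separately, a caller can keep the original content untouched and write `body` back only once classification has succeeded.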
Filing log: every filing action is appended to a configurable log note (default: Auto-Filing Log) in markdown table format.
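To make the log format concrete, here is a minimal sketch of appending one filing action as a Markdown table row. The column set (time, note, destination, result) and the names `FilingLogEntry`, `formatLogRow`, and `appendLogRow` are assumptions for illustration; the plugin's actual log columns may differ.

```typescript
// Hypothetical shape of one log entry.
interface FilingLogEntry {
  timestamp: string;     // e.g. an ISO 8601 string
  note: string;          // name of the filed note
  destination: string;   // folder the note was moved to
  usedFallback: boolean; // true when the Inbox fallback was used
}

// A literal "|" inside a cell would split the row, so escape it.
const escapeCell = (value: string): string => value.replace(/\|/g, "\\|");

function formatLogRow(entry: FilingLogEntry): string {
  const cells = [
    entry.timestamp,
    entry.note,
    entry.destination,
    entry.usedFallback ? "fallback" : "classified",
  ];
  return `| ${cells.map(escapeCell).join(" | ")} |`;
}

// Returns the new log content; writes the table header first if the log is empty.
function appendLogRow(existingLog: string, entry: FilingLogEntry): string {
  const header = "| Time | Note | Destination | Result |\n|---|---|---|---|";
  const base = existingLog.trim() === "" ? header : existingLog.trimEnd();
  return `${base}\n${formatLogRow(entry)}\n`;
}
```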
| Setting | Default | Description |
|---|---|---|
| Enable auto-filing | off | Master toggle |
| Inbox folder | Inbox | Fallback destination when no tags are found or the LLM fails |
| Filing model | (same as transcription model) | Model used for folder classification. Can be set independently. |
| Filing log note | Auto-Filing Log | Path of the note where filing actions are logged |
| Max lines to scan | 5 | How many lines from the top of the note are checked for filing tags |
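The settings above can be pictured as a small typed object with defaults, plus the fallback rule for choosing a destination. This is a sketch only: the defaults come from the table, while the field names and the helpers `withDefaults` and `chooseDestination` are hypothetical.

```typescript
// Field names are assumptions; the default values come from the settings table.
interface AutoFilingSettings {
  enabled: boolean;
  inboxFolder: string;
  filingModel: string | null; // null means "same as transcription model"
  logNote: string;
  maxLinesToScan: number;
}

const DEFAULT_AUTO_FILING: AutoFilingSettings = {
  enabled: false,
  inboxFolder: "Inbox",
  filingModel: null,
  logNote: "Auto-Filing Log",
  maxLinesToScan: 5,
};

// Merge whatever was saved on disk over the defaults.
function withDefaults(saved: Partial<AutoFilingSettings> | null): AutoFilingSettings {
  return { ...DEFAULT_AUTO_FILING, ...(saved ?? {}) };
}

// `classified` is the folder path returned by the LLM, or null if the call failed.
function chooseDestination(
  tagCount: number,
  classified: string | null,
  settings: AutoFilingSettings,
): string {
  const path = classified?.trim() ?? "";
  // No tags, a failed call, or an empty answer all fall back to the Inbox folder.
  if (tagCount === 0 || path === "") return settings.inboxFolder;
  return path;
}
```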
The plugin recommends these vision models for transcription:
maternion/LightOnOCR-2:1b, qwen3.5:2b, qwen3.5:4b, qwen3.5:9b, qwen3.5:27b, qwen3.5:35b
Any other Ollama vision model can be installed directly from the settings or via the Ollama CLI.
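For orientation, a request to a local Ollama vision model might be assembled as in the sketch below. It uses Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` array, and Ollama's default address `http://localhost:11434`. Whether the plugin uses this endpoint or `/api/chat` is an assumption, and `buildOllamaRequest` is a hypothetical helper.

```typescript
const OLLAMA_DEFAULT_URL = "http://localhost:11434"; // Ollama's default local address

// Builds the URL and fetch options for a single non-streaming transcription call.
function buildOllamaRequest(
  model: string,
  prompt: string,
  imageBase64: string,
  baseUrl: string = OLLAMA_DEFAULT_URL,
) {
  return {
    url: `${baseUrl}/api/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        prompt,
        images: [imageBase64], // raw base64, without a data URL prefix
        stream: false,         // ask for one JSON object instead of a stream
      }),
    },
  };
}

// Example with one of the recommended models and a tiny fake payload.
const ollamaRequest = buildOllamaRequest("qwen3.5:2b", "Transcribe this image to Markdown.", "QUJD");
```

Passing `ollamaRequest.url` and `ollamaRequest.init` to `fetch` returns a JSON object whose `response` field holds the generated text.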
- Either:
  - Ollama installed and running locally, or
  - an OpenAI-compatible API endpoint and API key
- Desktop Obsidian (this plugin is desktop-only)
This plugin is not yet approved by Obsidian. To install it manually:
- Download `main.js`, `manifest.json`, and `styles.css` from the latest release
- Create a folder at `<vault>/.obsidian/plugins/obsidian-transcriber/` and place the three files inside
- Open Obsidian, go to Settings → Community plugins, and enable Transcriber
- Open Settings → Transcriber and choose your provider
- Configure provider settings and click Test
- Select a model (for Ollama, you can also install/remove models from the settings)
- Right-click any image in your vault and select Transcribe image
See the user guide for detailed usage, configuration, and troubleshooting.
This fork is maintained by Lorenzo Strambi.
Original plugin created by Sébastien Dubois.
MIT
