Releases: PurpleDoubleD/locally-uncensored
Locally Uncensored v2.3.4 — Reliability Hotfix
TL;DR
Hotfix release on top of v2.3.3. Recommended for everyone — fixes the big "lost my chat history after the update" bug plus the Ollama 0.21 compatibility break.
Fixed
Chat history now survives updates
`isTauri()` was checking the Tauri v1 global (`window.__TAURI__`), but Tauri 2 renamed it to `window.__TAURI_INTERNALS__`. Inside the packaged `.exe`, every Tauri-only backend command (`backup_stores`, `restore_stores`, `set_onboarding_done`, ComfyUI manager, whisper, process control) silently fell through to the dev-mode `fetch` path and no-op'd.
- Dual-global detection for v1 + v2 compat
- 100 ms × 50-tick polling for async `withGlobalTauri` init
- Backup cadence tightened: 30 s → 5 s interval + 1 s event-driven debounce + `beforeunload` sync flush
- `__ts` marker so the snapshot is never empty
- Full destructive wipe+restore roundtrip live-verified on the release binary
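The dual-global check described above can be sketched as follows. This is an illustrative reconstruction, not the app's actual source; the function names and polling parameters mirror the release notes (100 ms × 50 ticks), and only the two Tauri globals are real.

```typescript
// Sketch of dual-global Tauri detection. Tauri v1 injects window.__TAURI__,
// Tauri v2 injects window.__TAURI_INTERNALS__; checking both keeps the
// packaged build on the native-command path instead of the dev fetch path.
type GlobalLike = Record<string, unknown>;

function isTauri(g: GlobalLike): boolean {
  return "__TAURI_INTERNALS__" in g || "__TAURI__" in g;
}

// The globals arrive asynchronously via `withGlobalTauri`, so poll briefly
// (100 ms x 50 ticks by default) before falling back to the dev-mode path.
async function waitForTauri(
  g: GlobalLike,
  ticks = 50,
  intervalMs = 100,
): Promise<boolean> {
  for (let i = 0; i < ticks; i++) {
    if (isTauri(g)) return true;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  return false;
}
```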
Ollama 0.21 / 0.20.7 compatibility
The auto-upgraded Ollama now returns HTTP 404 `model not found` on `/api/show` for pre-existing models whose on-disk manifest lacks the new `capabilities` field.
- New top-of-app `StaleModelsBanner` + header light-switch chip
- One-click Refresh All re-pulls each stale model and verifies via a second probe before clearing the warning
- Error parser tolerates 400 / 404 / Rust-proxy-wrapped-500 forms
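A tolerant classifier in the spirit of that error parser could look like the sketch below. This is illustrative, not the shipped code; the assumed error bodies follow Ollama's `{"error": "model '<name>' not found"}` shape.

```typescript
// Illustrative sketch: treat a plain 404, a 400 carrying a "not found"
// body, or a proxy-wrapped 500 with the same message as one condition,
// "this model's manifest is stale and needs a re-pull".
function isStaleModelError(status: number, body: string): boolean {
  if (status === 404) return true;
  if (status === 400 || status === 500) {
    return /model\s+'[^']*'\s+not found|model not found/i.test(body);
  }
  return false;
}
```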
Codex infinite-loop guard
Small 3 B coder models (`qwen2.5-coder:3b`, `llama3.2:1b`) could loop forever repeating the same `file_write` + `shell_execute` batch when a test failed. Codex now halts with a clear "same tool sequence repeated — try a larger model" message after two consecutive identical batches.
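The guard boils down to fingerprinting each tool batch and halting when the same fingerprint appears twice in a row. A minimal sketch, with illustrative names rather than the app's actual source:

```typescript
// Fingerprint a batch of tool calls so identical batches compare equal.
type ToolCall = { name: string; args: unknown };

function batchFingerprint(batch: ToolCall[]): string {
  return JSON.stringify(batch.map((t) => [t.name, t.args]));
}

// Halt when the incoming batch matches the previous one exactly
// (two consecutive identical batches = stuck loop).
function isRepeatedBatch(previousFingerprint: string | null, batch: ToolCall[]): boolean {
  return previousFingerprint !== null && previousFingerprint === batchFingerprint(batch);
}
```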
Stop button is now actually instant
`abort.signal.aborted` is checked at the top of the `for await` chat stream and the NDJSON reader loop; `reader.cancel()` is called on abort. No more 30–60 s of thinking tokens leaking after you click Stop on a Gemma-4 response.
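The pattern is simple but easy to miss: check the signal on every iteration so a Stop click takes effect on the next chunk instead of after the stream drains. A simplified sketch (not the app's code):

```typescript
// Yield chunks until the AbortSignal fires; the caller is expected to also
// cancel the underlying reader so the network stream actually closes.
async function* streamUntilAborted(
  chunks: AsyncIterable<string>,
  signal: AbortSignal,
): AsyncGenerator<string> {
  for await (const chunk of chunks) {
    if (signal.aborted) return; // bail immediately, mid-stream
    yield chunk;
  }
}
```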
Other
- Stale-chip state leak on model switch fixed
- `isHtmlSnippet` export restored (19 failing CodeBlock tests → green)
- Create view `getKnownFileSizes`: CommonJS `require()` → dynamic `import()` (was silently broken in the Vite/browser bundle)
- flux2 CFG scale test regression corrected
Test & build
- 2161/2161 vitest green (+56 regression tests over 2.3.3)
- `tsc --noEmit` clean
- `cargo check` clean
- Auto-update via signed NSIS/MSI channel — existing users get prompted automatically
Upgrading
Existing users: the in-app updater will pick this up automatically on next launch. Click Restart Now when prompted. Your chat history, settings, and model list all survive the upgrade — that's literally what this release fixes.
New users: Download the Windows .exe installer below.
Download
Windows: .exe NSIS installer or .msi. Portable-friendly — no admin rights required.
Linux: .AppImage — chmod +x and run.
v2.3.3 — Remote Access, Codex Streaming, Qwen 3.6, ERNIE-Image, 2105 Tests
What's New in v2.3.3
The biggest release yet — Remote Access, Codex overhaul, 6 new image/video models, and Qwen 3.6 day-0 support.
Remote Access + Mobile Web App
- Access your AI from your phone — Dispatch via LAN or Cloudflare Tunnel (Internet)
- 6-digit passcodes with rate limiting, JWT auth, and auto-regenerating tokens
- Full mobile web app with hamburger drawer, chat list, Codex mode, file attach, thinking toggle, plugins (Caveman + Personas)
- Mobile Agent Mode with 13 tools — Thought/Action/Observation cards, collapsible steps
- Mobile-Desktop sync — messages mirror in real-time, memory extraction works across both
- Security hardened — permissions enforced on proxy, CSP headers, content validation, no static file leaks
Codex Coding Agent — Major Upgrade
- Live streaming between tool calls — see tokens as they generate (was: blank screen for 2+ minutes)
- Continue capability — tool-call history persisted as hidden messages, model remembers what it did
- AUTONOMY CONTRACT — explicit prompt prevents "Now I will..." premature stopping
- Fallback answer — never shows empty bubble after tool calls
- Streaming arg repair — fixes Ollama JSON-string argument issue
Agent Mode — 13-Phase Rewrite
- Parallel tool execution with side-effect grouping (file-write serial, reads parallel)
- Budget system — max 50 tool calls / 25 iterations per task
- Sub-agent delegation — `delegate_task` spawns isolated sub-agents (depth 2)
- In-turn cache — deduplicates identical tool calls within one turn
- MCP integration — external tools via `ToolRegistry.registerExternal()`
- Embedding-based routing — reduces tool definitions by ~80% for large registries
- Filesystem awareness — agent now uses file_list/system_info before acting
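The in-turn cache listed above amounts to memoizing tool execution on a `(name, args)` key for the duration of one turn. A hedged sketch with illustrative names:

```typescript
// Wrap a tool executor so identical (name, args) calls within one turn
// return the cached promise instead of re-running the tool.
type Exec = (name: string, args: unknown) => Promise<string>;

function withTurnCache(exec: Exec): Exec {
  const cache = new Map<string, Promise<string>>();
  return (name, args) => {
    const key = JSON.stringify([name, args]);
    if (!cache.has(key)) cache.set(key, exec(name, args));
    return cache.get(key)!;
  };
}
```

A fresh wrapper would be created per turn, so the cache never outlives the turn it belongs to.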
New Models
- Qwen 3.6 (day-0) — 35B MoE, 3B active, vision + agentic coding + thinking preservation, 256K context. One-click Ollama pull
- ERNIE-Image (Baidu) — Turbo (8 steps) + Base (50 steps), 28.9 GB each. ConditioningZeroOut workflow, no custom nodes needed
- Z-Image — Own ModelType with correct CLIP matching (was misclassified as flux2)
- 75+ downloadable models — all URLs verified, file sizes corrected
Image + Video
- Image-to-Image — upload source, adjust denoise, transform with any model
- FramePack fixes — correct node names, DualCLIPLoader, CLIPVision
- 6 ComfyUI E2E fixes — real error messages, direct fetch fallback, stale model reset
UI/UX
- AE-style text header — clean typography replaces icon pills for better discoverability
- Plugins dropdown — Caveman Mode (Off/Lite/Full/Ultra) + Personas in one menu
- Thinking mode — tri-state (true/false/undefined), auto-retry on 400, universal tag stripper
- Gemma 3/4 planner bypass — no more "Plan: / Constraint Checklist:" preamble
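A "universal tag stripper" along the lines described above might look like this. It is a minimal illustrative sketch; the shipped version presumably also handles model-specific tags beyond `<think>`.

```typescript
// Remove paired <think>...</think> blocks, plus any dangling open tag a
// model emits when it is cut off mid-thought.
function stripThinkingTags(text: string): string {
  return text
    .replace(/<think>[\s\S]*?<\/think>/g, "")
    .replace(/<think>[\s\S]*$/, "")
    .trim();
}
```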
Developer
- 2105 tests (83 files) — comprehensive smoke tests covering entire app surface
- Auto-update — signed NSIS installers, in-app download with progress bar
- NSIS persistence — localStorage backup/restore survives updates
- Process cleanup — Windows Job Object kills ComfyUI on app close
Bug Fixes
- Thinking tags leaked past toggle (QwQ, DeepSeek-R1, Gemma)
- I2V image upload FormData corruption
- Chat homepage null crash on fresh install
- Light theme contrast issues
- Caveman mode missing in Codex/Claude Code
- Download polling race condition
- 13 file sizes corrected (up to 95% off)
- Terminal window popup on Windows (cloudflared)
Full changelog: See CLAUDE.md entries 1-95
v2.3.2 — GLM-4.7-Flash, Model Loading Fix, Agent Badge Audit
What's New
GLM-4.7-Flash (11 variants)
ZhipuAI's strongest 30B class model with native tool calling and 198K context window.
- 4 Uncensored (Heretic): IQ2_M (10 GB), Q4_K_M (19 GB), Q6_K (25 GB), Q8_0 (32 GB)
- 7 Mainstream: IQ2_M through Q8_0 — fits 12 GB VRAM at IQ2_M
- All variants marked as AGENT (tool calling compatible)
GLM 5.1 754B MoE
Listed as cloud-available via Ollama. 754B MoE (40B active), frontier agentic engineering model.
Model Loading Fix (Discussion #22)
Fixed 3 bugs causing "0 models loaded" in ComfyUI Create View:
- Race condition — ComfyUI responds to health check before scanning model directories. Now calls /api/refresh before querying models.
- Broken auto-retry — 0-models case set both modelsLoaded and modelLoadError, preventing retries. Fixed: retries every 3s, max 12x (~36s).
- Stale cache — After downloading models, ComfyUI's directory cache was never refreshed. Now calls /api/refresh after every download.
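The retry policy in the second fix (every 3 s, max 12 attempts, ~36 s total) can be sketched as a small helper. Names and signature are illustrative, not the app's source:

```typescript
// Retry an async probe until its result passes the predicate, waiting
// delayMs between attempts; return the last result if all attempts fail.
async function retryUntil<T>(
  fn: () => Promise<T>,
  ok: (value: T) => boolean,
  attempts = 12,
  delayMs = 3000,
): Promise<T> {
  let last!: T;
  for (let i = 0; i < attempts; i++) {
    last = await fn();
    if (ok(last)) return last;
    if (i < attempts - 1) await new Promise((r) => setTimeout(r, delayMs));
  }
  return last; // still failing after ~36 s; caller surfaces the error state
}
```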
Agent Badge Audit
Audited all 75+ models for correct tool calling flags:
- Added agent flag to: Qwen 3.5 (all sizes), Qwen3 8B/14B, GLM-4 9B, GPT-OSS 20B, Llama 3.1 8B, Llama 3.3 70B, Phi-4 14B, Qwen 2.5 7B
- Removed all HOT badges — cleaner UI, only AGENT badges shown
Other Fixes
- Think-mode guard for non-thinking models (amber hint instead of crash)
- Chat homepage null crash fix
- Light theme contrast improvements
Downloads
CI builds the installers automatically. Check back in ~5 minutes for:
- Locally.Uncensored_2.3.2_x64-setup.exe — NSIS installer (recommended)
- Locally.Uncensored_2.3.2_x64_en-US.msi — Windows Installer
Test Results
- 607 tests passing
- E2E tested: GLM downloads (start/cancel/re-download), model loading fix, all existing models
v2.3.1 — In-App Ollama Install, Configurable ComfyUI Port
What's New
In-App Ollama Download & Install
No more hunting for external links — click Install Ollama in the onboarding wizard and watch it download with a real-time progress bar (speed, bytes, elapsed timer). Silent install, auto-start, auto-detect. Zero manual steps.
Configurable ComfyUI Port & Path
The ComfyUI port was hardcoded to 8188 in 20+ places — now fully configurable in Settings > ComfyUI (Image & Video). Users with the new ComfyUI Desktop App (which uses a different port) can now connect by simply changing the port.
Path is also editable in Settings with a Connect button — no need to go through onboarding again.
ComfyUI Install Progress
The one-click ComfyUI install now shows step-by-step progress (Step 1/3: Clone, Step 2/3: PyTorch, Step 3/3: Dependencies) with an elapsed timer. Previously the install got stuck at "Starting..." forever because the Rust thread never reported completion.
Provider Status Fix
Provider connection dots in Settings now show actual status (green = connected, red = failed, gray = unknown) instead of always showing green. Auto-checks connection on page load.
Full Changelog: v2.3.0...v2.3.1
v2.3.0 — ComfyUI Plug & Play, 20 Model Bundles, Image-to-Image, Image-to-Video
Highlights
- ComfyUI Plug & Play — Auto-detect, one-click install, auto-start. Zero config image and video generation.
- 20 Model Bundles — 8 image + 12 video bundles with one-click download. Verified models marked, untested show "Coming Soon".
- Z-Image Turbo/Base — Uncensored image model. 8-15 seconds per image. No safety filters.
- FLUX 2 Klein — Next-gen FLUX architecture with Qwen 3 text encoder.
- Image-to-Image (I2I) — Upload a source image, adjust denoise, transform with any image model.
- Image-to-Video (I2V) — FramePack F1 (6 GB VRAM!), CogVideoX, SVD with drag & drop.
- Dynamic Workflow Builder — 14 strategies auto-detect installed nodes and build correct pipelines.
New Features
- VRAM-aware model filtering (Lightweight / Mid-Range / High-End tabs)
- Unified download manager with progress, speed, retry for failed files
- Think Mode moved to chat input (always accessible)
- Hardware-aware onboarding recommends Gemma 4, Qwen 3.5 based on GPU VRAM
- Verified/Coming Soon badges on model bundles
- ComfyUI process auto-cleanup on app close (Windows Job Object)
- GLM 5.1, Qwen 3.5, Gemma 4 added to Discover models
Bug Fixes
- SSRF protection added to proxy_localhost (localhost-only validation)
- npm vulnerabilities fixed (vite updated)
- FramePack workflow: DualCLIPLoader fix, VAEEncode fix, preflight custom node check
- Z-Image: own ModelType + strategy (was misclassified as flux2)
- Think-Mode guard for non-thinking models
- All 105 download URLs verified HTTP 200/302
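In the spirit of the SSRF fix above, a localhost-only target check can be sketched in a few lines. This is illustrative; the release notes indicate the real validation lives in the Rust `proxy_localhost` command.

```typescript
// Accept only URLs whose host is a loopback address; anything else
// (LAN hosts, cloud metadata endpoints, malformed strings) is rejected.
function isLocalTarget(raw: string): boolean {
  try {
    const u = new URL(raw);
    return (
      u.hostname === "localhost" ||
      u.hostname === "127.0.0.1" ||
      u.hostname === "[::1]" // WHATWG URL keeps brackets on IPv6 hosts
    );
  } catch {
    return false; // unparseable input is never a valid proxy target
  }
}
```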
Downloads
- Windows (.exe) — Recommended. NSIS installer.
- Windows (.msi) — Windows Installer alternative.
- Other platforms: build from source.
v2.2.3 — Plug & Play Setup, True Multi-Provider
Plug & Play — Install, Launch, Chat.
The setup wizard now auto-detects 12 local backends on first launch. Nothing installed? One-click install links for every backend. Re-Scan after install. Zero config needed.
What's New in v2.2.3
- Plug & Play Setup Wizard — First-launch wizard scans all 12 supported local backends automatically:
- Ollama, LM Studio, vLLM, KoboldCpp, Jan, GPT4All, llama.cpp, LocalAI, text-generation-webui, TabbyAPI, Aphrodite, SGLang
- Detected? Auto-connected. Nothing running? One-click install links with descriptions for every backend
- Re-Scan button — install a backend, hit Re-Scan, done
- 25+ Provider Presets — Every local and cloud backend pre-configured. Just pick and go.
- True Multi-Provider — All providers treated equally throughout the app, landing page, and docs. No more Ollama-centric messaging.
- Backend Selector Overhaul — Removed "Ollama — always active" bias. All detected backends shown as equal options.
Also includes everything from v2.2.2
- Codex Coding Agent (LU | Codex | OpenClaw tabs)
- 13 MCP Tools with dynamic registry
- Granular Permissions (7 categories)
- File Upload + Vision
- Thinking Mode (provider-agnostic)
- Model Load/Unload from VRAM
- Smart Tool Selection + JSON Repair
- 15% larger UI + light mode
Downloads
| File | Description |
|---|---|
| `Locally.Uncensored_2.2.3_x64-setup.exe` | Windows installer (recommended) |
| `Locally.Uncensored_2.2.3_x64_en-US.msi` | Windows Installer (MSI) |
Requirements
- Windows 10/11 (64-bit)
- Any local AI backend — the setup wizard helps you pick and install one
- Or use cloud APIs (OpenAI, Anthropic, OpenRouter, Groq) from Settings
v2.2.2 — Codex Agent, MCP Tools, Permissions, File Upload, Thinking
What's New in v2.2.2
The biggest update yet. A dedicated coding agent, 13 built-in tools, granular permissions, file upload with vision, thinking mode, and a completely overhauled UI.
Codex Coding Agent
- Three-tab system: LU | Codex | OpenClaw in sidebar
- Reads your codebase, writes files, runs shell commands autonomously
- File tree browser with native Windows folder picker
- Up to 20 tool iterations per task
- Working directory injection for all file/shell operations
13 MCP Tools
- Dynamic tool registry replacing the old hardcoded 7 tools
- `web_search`, `web_fetch`, `file_read`, `file_write`, `file_list`, `file_search`, `shell_execute`, `code_execute`, `system_info`, `process_list`, `screenshot`, `image_generate`, `run_workflow`
- Smart tool selection — keyword-based filtering saves ~80% of tool-definition tokens
- JSON repair — fixes broken JSON from local LLMs (trailing commas, single quotes, missing braces)
- Native + Hermes XML fallback for any model
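The JSON-repair step can be illustrated with a sketch far simpler than a production repairer. It handles exactly the three defects named above; everything about the implementation is an assumption, not the app's source:

```typescript
// Try strict parsing first; on failure, fix single-quoted keys/strings,
// append missing closing braces, strip trailing commas, then re-parse.
function repairJson(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    let fixed = raw
      .replace(/'([^']*)'(\s*:)/g, '"$1"$2') // single-quoted keys
      .replace(/:\s*'([^']*)'/g, ': "$1"');  // single-quoted string values
    const missing =
      (fixed.match(/{/g) ?? []).length - (fixed.match(/}/g) ?? []).length;
    if (missing > 0) fixed += "}".repeat(missing); // unbalanced braces
    fixed = fixed.replace(/,\s*([}\]])/g, "$1");   // trailing commas
    return JSON.parse(fixed);
  }
}
```

A real repairer would also need to respect braces and quotes inside string literals, which this sketch deliberately ignores.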
Granular Permissions
- 7 categories: web, filesystem, terminal, system, desktop, image, workflow
- 3 levels: blocked, confirm, auto-approve
- Per-conversation overrides via Tools dropdown
- Image generation locked as "Coming Soon"
File Upload + Vision
- Attach images via 📎 clip button, drag & drop, or Ctrl+V paste
- Up to 5 images per message
- Vision models describe what they see
- Works across all providers (Ollama, OpenAI, Anthropic)
- Provider-specific image formats handled automatically
Thinking Mode
- Provider-agnostic thinking toggle
- Ollama: native `think: true` API
- OpenAI / Anthropic: via system prompt + `<think>` tag parser
- Collapsible thinking blocks in chat
- Gemma 4 `<|channel>thought` tag stripping
Model Load/Unload
- Power icons (⏻ / ⏼) next to model selector in header
- Green = loaded in VRAM, gray = not loaded
- Load model into VRAM before chatting (faster first response)
- Unload to free VRAM
- Auto-detection of running models via `/api/ps`
- Only shown for Ollama models (cloud providers don't need it)
Native PC Control (Rust)
- `shell_execute` — async via `tokio::spawn_blocking`, no more UI freeze
- `fs_read`, `fs_write`, `fs_list`, `fs_search`, `fs_info` — unsandboxed filesystem access
- `system_info`, `process_list`, `screenshot` — system monitoring
- `pick_folder` — native Windows folder dialog via the `rfd` crate
- All commands bypass the Tauri sandbox for full PC control
UI Overhaul
- 15% larger UI across all text, padding, and layout (root font-size 18.4px)
- Compact message bubbles with light mode support
- Monochrome tool blocks (collapsed by default, click to expand)
- Collapsible code blocks (>4 lines collapsed, "Show all X lines" button)
- Realtime elapsed counter during generation
- Windows DWM border removed (`set_shadow(false)`)
- Custom dark titlebar, no native chrome
Other Improvements
- Stop button now works (AbortSignal passed through all fetch calls)
- Context window detection fixed (architecture-specific keys + string num_ctx parsing)
- Settings store v3 migration (merges new defaults into existing settings)
- Agent Mode: removed Beta badge, production-ready
- Conversations separated by mode (LU/Codex/OpenClaw)
- New Chat button at bottom of sidebar
Downloads
| Platform | File |
|---|---|
| Windows | Locally.Uncensored_2.2.2_x64-setup.exe (NSIS installer) |
| Windows | Locally.Uncensored_2.2.2_x64_en-US.msi (MSI installer) |
Requires Ollama for AI chat.
v2.2.1 — Hotfix: GPU Offloading & Model Unload
What's Fixed
Model Unload Broken
The "Unload all models" button and automatic unload on model switch silently failed — the Ollama /generate call was missing the required prompt field. Models stayed in RAM indefinitely, accumulating memory usage.
No GPU Offloading
All Ollama chat calls were missing num_gpu — models could fall back to CPU-only inference, causing 100% CPU/RAM usage while the GPU sat idle. Now sends num_gpu: 99 by default, which tells Ollama to offload as many layers as possible to GPU. Ollama automatically splits between GPU and CPU if VRAM is insufficient, so this is safe for all hardware configurations.
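Both fixes come down to request shape. The sketch below is illustrative, not the app's source; it assumes Ollama's documented behavior that `/api/generate` with `keep_alive: 0` unloads a model, and that `options.num_gpu` controls GPU layer offload.

```typescript
// Unload request: the original bug was an omitted `prompt` field, which
// made Ollama reject the call silently, so models stayed resident.
function unloadRequest(model: string) {
  return { model, prompt: "", keep_alive: 0 };
}

// Chat options: default num_gpu to 99 ("offload as many layers as fit")
// while still letting an explicit user override win.
function chatOptions(userOptions: Record<string, unknown> = {}) {
  return { num_gpu: 99, ...userOptions };
}
```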
Silent Error Swallowing
Unload errors were caught with .catch(() => {}) and discarded. Failures are now logged to console (console.warn) for debugging.
Impact
Users with dedicated GPUs (especially those with 16 GB RAM or less) should see dramatically lower CPU and RAM usage during inference. The unload button now actually frees memory.
Files Changed
- `src/api/ollama.ts` — fixed unload, added `num_gpu: 99` to all chat endpoints
- `src/api/providers/ollama-provider.ts` — added `num_gpu: 99` to provider chat calls
- `src/stores/modelStore.ts` — error logging on model-switch unload
Full Changelog: v2.2.0...v2.2.1
Locally Uncensored v2.2.0
What's New in v2.2.0
Custom Dark Titlebar
- Frameless Window — Native Windows titlebar removed. Custom dark titlebar with app icon, drag region, and window controls (minimize, maximize, close).
- Premium Look — Matches the app's luxury dark aesthetic. Seamless integration with the existing header.
- Close = Minimize to Tray — Same behavior as before, the close button minimizes to system tray.
Branded NSIS Installer
- Dark Theme — Installer now features dark header and sidebar images with the LU monogram. Premium from first click.
Qwen3-Coder Integration
- Qwen3-Coder 30B featured in Model Manager — 256K context, native tool calling, best coding agent
- Qwen3-Coder-Next and abliterated variants available
Download Manager
- Multi-Pull — Download multiple models simultaneously with per-model progress bars
- Pause / Resume / Dismiss — Full control over active downloads via header badge
- Rust CancellationToken — Clean cancellation using Tauri event system
Model Management
- Auto-Unload on Switch — When you select a new model, the previous one is automatically unloaded from RAM/VRAM. No more 100% CPU from multiple models loaded simultaneously.
- Loading Indicator — ModelSelector button shows a spinning indicator + blue glow while a model is loading.
- Sticky "Unload All" — The "Unload all models" button is now always visible at the bottom of the dropdown.
Redesigned Model Selector
- Completely rewritten dropdown — compact, dark, matches the app's design language
- Color-coded type dots (blue=text, purple=image, green=video)
- Provider sections (Ollama, ComfyUI) with clean grouping
Update Checker
- App now checks GitHub Releases for new versions automatically
- Green pulsing badge appears in the header when an update is available
- One-click "Download Update" redirects to locallyuncensored.com
- Settings > Updates tab shows version info, release notes, and manual check
Bug Fixes
- Fixed CPU 100% issue when switching between Ollama models
- Improved keep_alive management for Ollama models
v2.1.0 — True Multi-Provider Support
What's New
True Multi-Provider Support
- Simultaneous providers: Run Ollama + OpenAI-compatible + Anthropic at the same time
- Independent toggles: Each provider has its own enable/disable switch — no more one-at-a-time limitation
- Improved onboarding: Selecting a non-Ollama backend no longer disables Ollama
Provider Fixes
- OllamaProvider baseUrl: Now respects custom URL from settings (was hardcoded to localhost:11434)
- ComfyUI models: No longer incorrectly tagged as Ollama provider
- Cleaned up dead code: Removed legacy ApiConfig component
Testing
- Added dedicated `vitest.config.ts` (fixes test-runner crash)
- Added comprehensive OllamaProvider test suite (31 tests)
- 447 total tests, all passing
Full Changelog: v2.0.0...v2.1.0