fix: call ripgrep with explicit utf-8 encoding. by Sola-ris · Pull Request #1199 · TagStudioDev/TagStudio

Sola-ris · 2025-11-11T21:50:31Z

Summary

Call ripgrep with explicit UTF-8 encoding to prevent issues with multi-byte characters.

When subprocess.Popen (which is where silent_subprocess.silent_run eventually ends up) is called with text=True and no explicit encoding, Python decodes the output with the encoding returned by locale.getencoding.

On Linux this typically returns UTF-8, while on Windows it returns the active ANSI code page, commonly cp1252. Decoding ripgrep's UTF-8 output as cp1252 causes issues with multi-byte characters like em dashes or Japanese characters.
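To make the failure mode concrete, here is a minimal sketch of what happens when UTF-8 bytes are decoded as cp1252. The helper decode_as and the sample file names are illustrative assumptions, not code from this PR. Results are printed with ascii() so the demo does not depend on the console's own encoding.

```python
# Illustrative only: decode_as and the sample names are hypothetical,
# not part of TagStudio.
def decode_as(raw: bytes, encoding: str) -> str:
    """Decode bytes, returning a marker string instead of raising on failure."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"<UnicodeDecodeError: {exc.reason}>"


# UTF-8 bytes as a tool like ripgrep would emit them for these file names.
em_dash_name = "a\u2014b.txt".encode("utf-8")  # contains an em dash
japanese_name = "\u3053\u3093\u306b\u3061\u306f.txt".encode("utf-8")

for raw in (em_dash_name, japanese_name):
    # ascii() keeps the output printable on any console encoding.
    print("utf-8 :", ascii(decode_as(raw, "utf-8")))
    print("cp1252:", ascii(decode_as(raw, "cp1252")))
```

The em dash decodes without an error but comes back as three unrelated characters, while the Japanese name contains the byte 0x81, which cp1252 leaves undefined, so decoding raises UnicodeDecodeError.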

Fixes #1195.
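The approach described in this summary can be sketched as follows. This is a hedged illustration, not the actual change: run_utf8 is a hypothetical helper rather than TagStudio's silent_run, and a small Python child process stands in for ripgrep so the example runs without it installed.

```python
import subprocess
import sys


def run_utf8(args: list[str]) -> str:
    """Run a command and decode its stdout as UTF-8, ignoring the locale default."""
    result = subprocess.run(
        args,
        capture_output=True,
        text=True,
        encoding="utf-8",  # explicit, so Windows does not fall back to cp1252
        check=True,
    )
    return result.stdout


# Stand-in for ripgrep (assumption): a child process that writes a UTF-8
# encoded file name to stdout. The raw string keeps the escapes ASCII-only
# on the command line; the child interprets them.
CHILD = (
    r"import sys; "
    r"sys.stdout.buffer.write('\u3053\u3093\u306b\u3061\u306f.txt\n'.encode('utf-8'))"
)

output = run_utf8([sys.executable, "-c", CHILD])
print(ascii(output))
```

With encoding="utf-8" passed explicitly, the decoded text is the same on every platform, whatever the locale's preferred encoding is.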

Before

I've omitted こんにちは.txt from the screenshot since it hangs the program with the error reported in #1195.

After

Tasks Completed

Platforms Tested:
- Windows x86
- Windows ARM
- macOS x86
- macOS ARM
- Linux x86
- Linux ARM
Tested For:
- Basic functionality
- PyInstaller executable

Computerdores

Code looks good. Also very nice that you wrote a regression test.

CyanVoxel · 2026-01-23T06:04:04Z

Sorry for the delay on pulling this, and thank you so much for your work on this fix!

fix: call ripgrep with explicit utf-8 encoding.

fa64c27

Computerdores approved these changes Nov 12, 2025

Sola-ris mentioned this pull request Nov 12, 2025

feat: add windows runner for pytest #1201

Merged

8 tasks

CyanVoxel added the Type: Installation, Type: File System, and Type: Tests labels Nov 14, 2025

CyanVoxel added this to TagStudio Development Nov 14, 2025

CyanVoxel moved this to 👀 In review in TagStudio Development Nov 14, 2025

CyanVoxel added this to the Alpha v9.5.x milestone Nov 14, 2025

TrigamDev added the Type: Fix label Jan 18, 2026

CyanVoxel merged commit 4c484bc into TagStudioDev:main Jan 23, 2026
5 checks passed

github-project-automation bot moved this from 👀 In review to ✅ Done in TagStudio Development Jan 23, 2026

CyanVoxel merged 1 commit into TagStudioDev:main from Sola-ris:ripgrep-utf-8