[Bug] UnicodeDecodeError on Windows causes superclaude mcp --list to crash: subprocess output encoding not specified in install_mcp.py #492

@SuSuSoo

Environment:

  • OS: Windows 10 LTSC (Simplified Chinese locale)
  • Python: 3.12.10 (via pyenv-win)
  • SuperClaude version: 4.1.9
  • Installation Method: pipx
  • Shell: PowerShell

Description:

Running superclaude mcp --list crashes with a UnicodeDecodeError followed by an AttributeError on Windows systems with non-UTF-8 default encoding (e.g., GBK for Chinese systems).

Steps to Reproduce:

  1. Install SuperClaude via pipx on a Windows 10 system whose default encoding is not UTF-8 (for example, a Simplified Chinese locale using GBK)
  2. Run: superclaude mcp --list
  3. Observe the crash

Expected Behavior:

The command should list all available MCP servers regardless of system locale/encoding settings.

Actual Behavior:

The program crashes with two cascading errors:

Exception in thread Thread-1 (_readerthread):
Traceback (most recent call last):
  File "C:\Users\SuSuSoo\.pyenv\pyenv-win\versions\3.12.10\Lib\threading.py", line 1599, in _readerthread
    buffer.append(fh.read())
                  ^^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0x93 in position 74: illegal multibyte sequence

Traceback (most recent call last):
  ...
  File "...\superclaude\cli\install_mcp.py", line 168, in check_mcp_server_installed
    output = result.stdout.lower()
             ^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'lower'

Root Cause Analysis:

The issue originates in src/superclaude/cli/install_mcp.py:

  1. Missing encoding parameter (L84-107): The _run_command() function doesn't explicitly set encoding for Windows subprocess calls
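  2. Encoding mismatch: the output captured through cmd /c contains bytes that are not valid GBK (most likely UTF-8 text emitted by the child tool), but text=True without an explicit encoding decodes with the locale's default encoding (GBK on Chinese Windows), as the 'gbk' codec error in the traceback shows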
  3. Missing null check (L168): When decoding fails, result.stdout becomes None, but the code immediately calls .lower() without validation
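The mismatch can be reproduced without Windows or SuperClaude. The following is a minimal sketch, assuming the child tool prints UTF-8 text; the sample bytes are hypothetical and are not taken from the actual command output.

```python
# Hypothetical sample: UTF-8 bytes such as a CLI tool might print (a check mark plus " ok").
# The check mark encodes to E2 9C 93; the byte 0x93 followed by a space is not valid GBK.
SAMPLE = b"\xe2\x9c\x93 ok\n"


def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if data decodes under the given encoding with strict error handling."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def decode_lossy(data: bytes, encoding: str) -> str:
    """Decode without raising; undecodable bytes become U+FFFD replacement characters."""
    return data.decode(encoding, errors="replace")


if __name__ == "__main__":
    print("gbk strict ok:", decodes_cleanly(SAMPLE, "gbk"))
    print("utf-8 strict ok:", decodes_cleanly(SAMPLE, "utf-8"))
    print("gbk lossy:", repr(decode_lossy(SAMPLE, "gbk")))
```

Strict GBK decoding of UTF-8 bytes is the same failure mode as the reader-thread error in the traceback; errors="replace" trades the crash for possibly garbled characters.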

Proposed Solution:

Fix 1: Add explicit encoding to _run_command() (L84-107):

def _run_command(cmd: List[str], **kwargs) -> subprocess.CompletedProcess:
    """
    Run a command with proper cross-platform shell handling.
    
    Args:
        cmd: Command as list of strings
        **kwargs: Additional subprocess.run arguments
    
    Returns:
        CompletedProcess result
    """
    if platform.system() == "Windows":
        # On Windows, wrap command in 'cmd /c' to properly handle commands like npx
        cmd = ["cmd", "/c"] + cmd
        
        # Force UTF-8 encoding to avoid GBK/locale issues on non-English systems
        if 'encoding' not in kwargs:
            kwargs['encoding'] = 'utf-8'
        if 'errors' not in kwargs:
            kwargs['errors'] = 'replace'  # Replace undecodable bytes instead of crashing
            
        return subprocess.run(cmd, **kwargs)
    else:
        # macOS/Linux: Use string format with proper shell to support aliases
        cmd_str = " ".join(shlex.quote(str(arg)) for arg in cmd)
        
        # Use the user's shell to execute the command, supporting aliases
        user_shell = os.environ.get("SHELL", "/bin/bash")
        return subprocess.run(
            cmd_str, shell=True, env=os.environ, executable=user_shell, **kwargs
        )
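The effect of passing encoding and errors explicitly can be checked in isolation. This is a rough sketch, not the project's code: the child process is a hypothetical stand-in (a Python one-liner that writes raw UTF-8 bytes), and run_decoding_utf8 is an illustrative helper name.

```python
import subprocess
import sys

# Hypothetical stand-in for a tool that writes UTF-8 regardless of the console code page.
CHILD_CODE = "import sys; sys.stdout.buffer.write(b'\\xe2\\x9c\\x93 ok')"


def run_decoding_utf8(cmd):
    """Capture output and decode it as UTF-8, replacing undecodable bytes instead of raising."""
    return subprocess.run(cmd, capture_output=True, encoding="utf-8", errors="replace")


result = run_decoding_utf8([sys.executable, "-c", CHILD_CODE])
print(repr(result.stdout))
```

Because the encoding is given explicitly, the result no longer depends on the locale of the machine running the parent process.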

Fix 2: Add null safety to check_mcp_server_installed() (L168):

# Parse output to check if server is installed
if result.stdout is None:
    return False
    
output = result.stdout.lower()
return server_name.lower() in output
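The same check can be written as a small standalone helper, which makes the None case easy to test. This is only a sketch; server_listed and the server names below are hypothetical and do not exist in install_mcp.py.

```python
from typing import Optional


def server_listed(stdout: Optional[str], server_name: str) -> bool:
    """Case-insensitive membership test that treats missing or empty output as 'not installed'."""
    if not stdout:
        return False
    return server_name.lower() in stdout.lower()


# Hypothetical inputs: a failed capture (None) and a successful one.
print(server_listed(None, "example-server"))
print(server_listed("Example-Server: connected\n", "example-server"))
```

Returning False on missing output hides the decoding failure from the caller, so it works best together with the explicit encoding in Fix 1 rather than as a replacement for it.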

Impact:

  • Affected systems: Windows with non-English locales (Chinese, Japanese, Korean, etc.)
  • Affected commands: Any command using _run_command(), particularly MCP-related operations
  • Scope: All Windows users in non-UTF-8 regions

Root Cause:

On Windows systems with non-UTF-8 locale (e.g., Chinese systems using GBK), the subprocess.run() call in _run_command() fails to decode subprocess output when text=True is specified. This causes two cascading failures:

  1. Line 98: subprocess.run(cmd, **kwargs) - The subprocess output contains bytes that cannot be decoded by the default encoding, resulting in a UnicodeDecodeError in the background thread reading stdout
  2. Line 168: result.stdout.lower() - Because the stdout capture failed, result.stdout is None, causing an AttributeError

Why PYTHONIOENCODING doesn't work:

The PYTHONIOENCODING environment variable only affects Python's own stdin/stdout/stderr encoding, not the encoding used by subprocess.run() to decode child process output. The subprocess module uses the system's default encoding (GBK on Chinese Windows) unless explicitly overridden.
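The distinction can be observed directly. The sketch below, with a hypothetical helper name, starts a child interpreter with PYTHONIOENCODING set and reports both its stdio encoding and its locale encoding; it assumes the locale encoding is what text-mode decoding falls back to when no encoding argument is given.

```python
import os
import subprocess
import sys

# The child prints its own stdout encoding, then the locale's preferred encoding.
PROBE = (
    "import locale, sys; "
    "print(sys.stdout.encoding); "
    "print(locale.getpreferredencoding(False))"
)


def child_encodings(value: str = "utf-8"):
    """Run a child interpreter with PYTHONIOENCODING set and return (stdio, locale) encodings."""
    env = dict(os.environ, PYTHONIOENCODING=value)
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        capture_output=True,
        encoding="utf-8",
        errors="replace",
    )
    stdio_enc, locale_enc = result.stdout.splitlines()[:2]
    return stdio_enc, locale_enc


stdio_enc, locale_enc = child_encodings()
print("stdio:", stdio_enc, "| locale:", locale_enc)
```

On a system like the one in this report, the first value would be expected to follow PYTHONIOENCODING while the second stays at the system code page.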

Temporary Workaround:

Setting the PYTHONIOENCODING environment variable before running the command does not help:

# This does NOT work - PYTHONIOENCODING only affects Python's own stdio
$env:PYTHONIOENCODING="utf-8"  
superclaude mcp --list

Tested workaround: Manually patch the installed file at <pipx_venv>\Lib\site-packages\superclaude\cli\install_mcp.py:

  1. Locate line 98
  2. Replace return subprocess.run(cmd, **kwargs) with:
    return subprocess.run(cmd, encoding='utf-8', errors='replace', **kwargs)
  3. Locate line 168
  4. Add null check before accessing stdout:
    if result.stdout is None:
        return False
    output = result.stdout.lower()

Additional Context:

  • This issue does not appear to affect macOS/Linux, where the locale encoding typically defaults to UTF-8 (that branch of _run_command() uses shell=True)
  • Similar encoding issues may exist in other subprocess calls throughout the codebase
  • Related to Windows internationalization and Python's text mode subprocess handling

Labels: bug windows encoding good first issue help wanted