feat: default to local data.json and introduce --update flag, and fix Windows test compatibility (#2896) #2952

Open
ZeYuNie wants to merge 1 commit into sherlock-project:master from ZeYuNie:fix-local-data-default

ZeYuNie commented May 13, 2026

Description

This PR primarily addresses issue #2896 by changing the default loading behavior of data.json to prioritize the local file bundled with the package, preventing implicit network access on execution.

To preserve the ability to fetch the latest site data from the upstream repository, this PR introduces a new --update flag. The existing --local flag is kept for backward compatibility but now emits a DeprecationWarning.

Furthermore, while running the test suite locally on a Windows environment, I encountered and fixed two Windows-specific compatibility bugs:

  1. Encoding issue in test_manifest.py: Fixed a UnicodeDecodeError raised because open() fell back to the Windows locale encoding (GBK in my environment) instead of UTF-8. I added an explicit encoding='utf-8' to the file-reading call.
  2. Subprocess Unicode path issue: Fixed a failure in the interactive CLI tests (test_ux.py via sherlock_interactives.py). Previously, subprocess.check_output with shell=True and the py launcher failed to resolve Windows user paths containing non-ASCII (Unicode) characters.
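
The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names read_text_utf8 and run_python are hypothetical, and replacing shell=True and the py launcher with an argument list plus sys.executable is an assumption about one common way to avoid the path problem.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def read_text_utf8(path):
    """Read a text file as UTF-8 regardless of the platform locale (hypothetical helper)."""
    # Without encoding=..., open() uses the locale's preferred encoding,
    # which on some Windows setups is not UTF-8.
    with open(path, encoding="utf-8") as fh:
        return fh.read()


def run_python(args):
    """Run the current interpreter with the given arguments (hypothetical helper)."""
    # An argument list with sys.executable involves no shell parsing and no
    # launcher lookup, so path arguments are handed to the child unchanged.
    result = subprocess.run(
        [sys.executable, *args],
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Directory with a non-ASCII name, standing in for a Windows user path.
        script_dir = Path(tmp) / "用户"
        script_dir.mkdir()
        script = script_dir / "hello.py"
        script.write_text("print('ok')\n", encoding="utf-8")
        print(run_python([str(script)]).strip())
```

Passing the command as a list also sidesteps shell quoting rules entirely, which is usually the simpler option when the test only needs to start a Python process.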

Motivation and Context

As highlighted in #2896, implicitly fetching data.json from GitHub causes unexpected network access, which is strictly avoided in environments like Debian. This change makes offline execution work out of the box. The test fixes also let contributors on Windows machines with non-ASCII usernames run the test suite without spurious failures.

Changes Made

  • Added --update argument in sherlock.py.
  • Refactored the loading logic: data is fetched from api.github.com only when args.update is set.
  • Added a DeprecationWarning when args.local is used.
  • Added encoding='utf-8' to the open() function in tests/test_manifest.py.
  • Patched the subprocess execution logic in the test suite to safely handle non-ASCII Windows paths.
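
As a rough sketch of the flag handling described in this list, assuming an argparse-based CLI: build_parser, load_site_data, and the fetch_remote callback are hypothetical names used for illustration, and the actual code in sherlock.py may be structured differently.

```python
import argparse
import json
import warnings
from pathlib import Path


def build_parser():
    """Build a reduced parser with only the flags discussed here (hypothetical)."""
    parser = argparse.ArgumentParser(prog="sherlock-sketch")
    parser.add_argument("username")
    parser.add_argument(
        "--update",
        action="store_true",
        help="Fetch the latest site data from upstream and refresh the local file.",
    )
    parser.add_argument(
        "--local",
        action="store_true",
        help="Deprecated: the local data file is now used by default.",
    )
    return parser


def load_site_data(args, local_path, fetch_remote):
    """Return site data, touching the network only when --update is given."""
    if args.local:
        warnings.warn(
            "--local is deprecated; local data is now the default.",
            DeprecationWarning,
            stacklevel=2,
        )
    if args.update:
        # fetch_remote is a caller-supplied callable standing in for the
        # real download; the result replaces the local file.
        data = fetch_remote()
        Path(local_path).write_text(json.dumps(data), encoding="utf-8")
        return data
    # Default path: read the bundled file, no network access.
    return json.loads(Path(local_path).read_text(encoding="utf-8"))
```

Note that Python's default warning filters hide DeprecationWarning unless it is triggered by code running in __main__, so whether users actually see the message can depend on where the warning is raised and on the active filters.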

How Has This Been Tested?

  • Offline Test: Disconnected from the network and verified that running python sherlock.py user123 executes immediately using the local data.json without timing out.
  • Update Test: Ran with the --update flag and confirmed it successfully fetches the remote data and updates the local JSON file.
  • Backward Compatibility Test: Ran with the --local flag and confirmed the deprecation warning is printed.
  • Windows Compatibility & Automated Tests: Ran the pytest suite locally on a Windows machine with a non-ASCII user path. Verified that the previously failing tests (test_manifest.py and test_ux.py) now pass alongside the rest of the test suite.
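
The offline check above was done manually. An automated equivalent might look like the following sketch, which is not part of this PR: load_local is a hypothetical stand-in for the local loading path, and patching socket.socket.connect is just one simple way to make accidental network access fail loudly.

```python
import json
import socket
import tempfile
from pathlib import Path
from unittest import mock


def load_local(path):
    """Hypothetical stand-in for the loader's local-file branch."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def test_local_load_needs_no_network():
    sample = {"Example": {"url": "https://example.invalid/{}"}}
    with tempfile.TemporaryDirectory() as tmp:
        data_file = Path(tmp) / "data.json"
        data_file.write_text(json.dumps(sample), encoding="utf-8")
        # Any attempt to connect a socket inside this block raises,
        # so the assertion only passes if loading stays offline.
        with mock.patch.object(
            socket.socket,
            "connect",
            side_effect=RuntimeError("unexpected network access"),
        ):
            assert load_local(data_file) == sample
```

Under pytest the function is collected by its test_ prefix; it can also be called directly.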

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • I have read the CONTRIBUTING document.

@ZeYuNie ZeYuNie requested a review from ppfeister as a code owner May 13, 2026 15:10