Refactor notebooks into reusable src helpers while preserving the existing spatial analysis workflow #1
Open
panghuanzhi62 wants to merge 43 commits into
Summary
This PR consolidates the current engineering upgrade for the Tokyo foreign population spatial analysis repository while preserving the existing research workflow.
The source of truth for this branch is:
- notebooks `00–08` remain the reference orchestration and interpretation layer
- notebook `08`
- `src/tokyo_foreigners/` is the reusable helper layer
- `data_raw/` remains the canonical raw-data directory
- no migration to `data/raw/`

In addition to the extraction/refactor work, this branch now includes the minimum safe engineering hardening needed to make the repository more reproducible and reviewable.
What changed
1. Reusable helper layer under `src/tokyo_foreigners/`
Logic previously repeated across notebooks has been centralized into reusable helper modules, including:
- `paths.py`
- `boundaries.py`
- `station_accessibility.py`
- `land_price.py`
- `ols.py`
- `spatial_diagnostics.py`
- `mgwr.py`

This keeps the notebook workflow intact while reducing duplication and making future testing and maintenance easier.
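As a rough illustration of what centralizing path handling in a module such as `paths.py` could look like, here is a minimal sketch. It is not the code in this branch: the function names `find_repo_root` and `repo_paths`, and the choice of `pyproject.toml` as the root marker, are assumptions made for the example. Only the directory names come from this PR.

```python
from pathlib import Path

# Assumption: the repository root is identified by this marker file.
ROOT_MARKER = "pyproject.toml"

# Top-level directories named in this PR description.
CANONICAL_DIRS = ("data_raw", "data_processed", "outputs", "notebooks", "docs")


def find_repo_root(start: Path) -> Path:
    """Walk upward from `start` until a directory containing ROOT_MARKER is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ROOT_MARKER).is_file():
            return candidate
    raise FileNotFoundError(f"No {ROOT_MARKER} found at or above {start}")


def repo_paths(start: Path) -> dict[str, Path]:
    """Map each canonical directory name to its absolute path under the repo root."""
    root = find_repo_root(start)
    return {name: root / name for name in CANONICAL_DIRS}
```

A notebook could then call something like `repo_paths(Path.cwd())["data_raw"]` instead of hard-coding a machine-specific path, so the same cell works regardless of which subdirectory the kernel was started from.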
2. Path handling aligned with current repository policy
Path handling has been centralized around the repository’s current canonical structure:
- `data_raw/`
- `data_processed/`
- `outputs/`
- `notebooks/`
- `docs/`

This branch intentionally keeps `data_raw/` as canonical and does not introduce a `data/raw/` migration.
3. Environment baseline moved to `uv`
This branch adds a minimal, repository-level environment baseline:
- `pyproject.toml`
- `uv.lock`

`uv` is now the primary environment and dependency entry point for the project. This change is meant to standardize environment setup while preserving the current notebook-centered workflow.
4. Minimal linting for reusable helpers
A small Ruff baseline has been added and applied to the reusable helper layer under `src/tokyo_foreigners/`.
Scope is intentionally limited:
- `src/` and related testable helper code

5. Minimal pytest coverage for stable helper modules
This branch adds a lightweight `tests/` directory with focused tests for stable helper behavior.
The goal is not full coverage but to protect the most stable reusable logic with small, reviewable tests.
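To give a sense of the scale intended by "small, reviewable tests", the sketch below shows what one such test file might look like. The helper `raw_data_dir` is hypothetical and is defined inline only to keep the example self-contained; in the repository the function under test would be imported from the package instead. `tmp_path` is pytest's built-in temporary-directory fixture.

```python
from pathlib import Path


def raw_data_dir(repo_root: Path) -> Path:
    """Hypothetical helper: return the canonical raw-data directory."""
    return repo_root / "data_raw"


def test_raw_data_dir_uses_canonical_name(tmp_path: Path) -> None:
    # The helper should resolve to <root>/data_raw.
    assert raw_data_dir(tmp_path) == tmp_path / "data_raw"


def test_raw_data_dir_is_not_nested_under_data(tmp_path: Path) -> None:
    # Guards against drift toward a data/raw/ layout:
    # the directory must sit directly under the repository root.
    assert raw_data_dir(tmp_path).parent == tmp_path
```

Tests of this shape need no data files and no notebook execution, which is what keeps them cheap to run with `uv run pytest -q`.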
6. Lightweight GitHub Actions CI
A minimal CI workflow has been added to run on push / pull request. It currently performs:
- `uv sync --locked`
- `uv run ruff check ...`
- `uv run pytest -q`

This is intentionally lightweight and does not attempt notebook execution in CI.
7. Documentation and repository narrative updates
Repository documentation has been updated to reflect the current state more accurately, including:
- `src/tokyo_foreigners/` as the reusable helper layer
- `uv` as the environment entry point
- `docs/refactor_status.md`

Why this PR matters
This PR moves the repository from a mostly notebook-driven research project with environment assumptions tied to a local machine toward a more reproducible and maintainable research codebase, without changing the project’s core analytical style.
In practical terms, the repository now has:
What this PR does not do
This PR intentionally does not:
- migrate `data_raw/` to `data/raw/`
- `08`

Validation
The branch has been validated at a minimal engineering level through:
- `uv sync`
- `uv run jupyter lab`
- `src/tokyo_foreigners`

Reviewer guidance
The most important review questions for this PR are:
- Is `src/tokyo_foreigners/` coherent and non-disruptive?
- Does the branch keep `data_raw/` as canonical without introducing path-policy drift?

Net result
After this PR, the repository remains a notebook-centered spatial analysis portfolio, but with a stronger engineering baseline:
- reusable helper layer under `src/tokyo_foreigners/`
- `data_raw/` path policy preserved
- `uv`-managed environment