Automation: Direction #9

@github-actions

Last generated: 2026-01-22T18:42:46.396Z
Provider: openai
Model: gpt-5.2

Summary

Stabilize CI for this legacy Scrapy/Kafka exporter by (1) removing the committed .bish* artifact files (DB and index files), (2) running tests deterministically in GitHub Actions with a pinned Python matrix, and (3) adding lightweight quality gates (lint + packaging sanity). Goal: reduce flaky builds and toil, and prevent repo bloat and regressions, while keeping changes small.

Direction (what and why)

  1. Stop tracking .bish* files: The repo currently contains multiple .bish.sqlite / .bish-index files at the root and inside packages. These look like local tooling artifacts; tracking them creates noise, large diffs, and potential CI issues.
  2. Make GitHub Actions the source of truth (replace Travis): .travis.yml exists, but GitHub Actions is clearly in use. Add a single, simple CI workflow that runs tox (or pytest) on supported Pythons. The repo’s current automation set is huge (many “auto-*” workflows), but none appears to be a basic “run unit tests” gate.
  3. Add minimal, high-value gates:
    • python -m build to ensure packaging remains valid.
    • pip install -e . + run tests to catch dependency issues early.
    • Optional: ruff (or flake8) to prevent obvious mistakes; keep it non-blocking initially if risk-averse.

Plan (next 1-3 steps)

1) Remove .bish* artifacts and prevent reintroduction

  • Delete committed files:
    • /.bish-index, /.bish.sqlite
    • /.github/.bish.sqlite
    • /scrapy_kafka_export/.bish-index, /scrapy_kafka_export/.bish.sqlite
    • /tests/.bish-index, /tests/.bish.sqlite
  • Update .gitignore (root) to include:
    • *.bish-index
    • *.bish.sqlite
  • Add a small CI check that fails if any *.bish.sqlite or *.bish-index file is tracked (a simple git ls-files | grep).
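
The artifact check in Step 1 could also be written as a small Python script, which is easier to unit-test than a shell one-liner. The following is a minimal sketch, not part of the repository: the function names and the idea of a standalone script are assumptions. It matches the same suffixes as the .gitignore patterns above.

```python
import re
import subprocess

# Matches paths ending in ".bish-index" or ".bish.sqlite",
# mirroring the *.bish-index and *.bish.sqlite ignore patterns.
ARTIFACT_PATTERN = re.compile(r"\.bish(-index|\.sqlite)$")


def find_artifacts(paths):
    """Return the paths that look like .bish* artifacts, in input order."""
    return [path for path in paths if ARTIFACT_PATTERN.search(path)]


def list_tracked_files():
    """Return the files tracked by git in the current repository."""
    result = subprocess.run(
        ["git", "ls-files"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def main():
    """Print offending paths; return 1 if any artifact is tracked, else 0."""
    offenders = find_artifacts(list_tracked_files())
    for path in offenders:
        print(f"tracked artifact: {path}")
    return 1 if offenders else 0
```

A CI step would call `sys.exit(main())` from the script's entry point, so that a non-zero return value fails the job.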

2) Add a single canonical CI workflow that runs tests via tox

Create .github/workflows/ci.yml:

  • Triggers: push, pull_request
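  • Use actions/setup-python with a small matrix (recommend: 3.9, 3.10, 3.11 unless project constraints require older). Quote the versions in YAML; an unquoted 3.10 is parsed as the number 3.1.
  • Install tox and run tox -q.
  • Cache pip (e.g., with the cache: pip input of actions/setup-python).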
  • If tox.ini is already configured, keep it; otherwise add/update envlist.

Suggested tox.ini updates (if needed):

  • Ensure pytest is declared in deps.
  • Ensure tests run with pytest -q.
  • If Kafka is required for integration tests, mark them and keep unit tests independent (see Step 3).

3) Add packaging sanity + (optional) lint in CI

In the same workflow:

  • Run python -m pip install build and python -m build on one Python version (e.g., 3.11) to verify sdist/wheel.
  • Optional lint gate:
    • Add ruff config in pyproject.toml (or setup.cfg) and run ruff check .
    • Start as non-blocking (continue-on-error: true) for one iteration, then enforce.

Risks/unknowns

  • Python support range: setup.py/dependencies may target older Python. If tests currently only pass on e.g. 3.8/3.9, adjust the matrix to match reality. Confirm via classifiers in setup.py.
  • Kafka dependency in tests: If tests/test_extension.py relies on a running Kafka broker, CI may be flaky unless it uses mocks. Prefer mocking producer interactions; if integration coverage is required, run Kafka via docker compose or services: (but that’s a larger change).
  • Workflow sprawl: The repository already has many automation workflows. Adding one more is fine, but keep it clearly named (ci.yml) and mark it as required in branch protection to avoid relying on the noisy “auto-*” set.
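
To illustrate the mocking approach suggested for the Kafka dependency risk, here is a minimal sketch of a unit test that needs no broker. `send_item` is a hypothetical stand-in for the exporter's send path, not the project's real API, and the topic name and item fields are made up for the example.

```python
import json
from unittest.mock import Mock


def send_item(producer, topic, item):
    """Hypothetical stand-in: serialize an item and hand it to a producer."""
    payload = json.dumps(item, sort_keys=True).encode("utf-8")
    producer.send(topic, payload)
    return payload


def test_send_item_uses_producer_without_broker():
    # A Mock replaces the real Kafka producer, so no network access occurs.
    producer = Mock()
    item = {"url": "https://example.com", "title": "t"}

    payload = send_item(producer, "items", item)

    # The producer was called exactly once with the serialized item.
    producer.send.assert_called_once_with("items", payload)
    assert json.loads(payload) == item
```

Tests in this style are deterministic in CI; anything that needs a real broker can then sit behind a separate integration marker.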

Suggested tests

  1. Unit test run: tox (or pytest -q) on at least Python 3.9–3.11.
  2. Packaging: python -m build (ensures README/metadata/setup config is consistent).
  3. Artifact check: CI step:
    • ! git ls-files | grep -E '\.bish(-index|\.sqlite)$'
  4. (If Kafka interactions are non-mocked) Integration test (optional):
    • Spin up Kafka in CI using a container and run a marked test subset, e.g. pytest -m integration.
    • Keep integration non-blocking initially if it’s flaky.

Verification checklist (quick)

  • No *.bish* files tracked by git.
  • ci.yml runs on PRs and reports pass/fail deterministically.
  • tox passes locally and in CI.
  • python -m build succeeds and produces wheel/sdist artifacts.
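
The last checklist item could be checked mechanically after the build step. The sketch below assumes the default output directory (dist/) and the usual file suffixes (.tar.gz for the sdist, .whl for the wheel); the function names are illustrative.

```python
import os


def missing_build_outputs(filenames):
    """Return which artifact kinds ('sdist', 'wheel') are absent."""
    missing = []
    if not any(name.endswith(".tar.gz") for name in filenames):
        missing.append("sdist")
    if not any(name.endswith(".whl") for name in filenames):
        missing.append("wheel")
    return missing


def check_dist(dist_dir="dist"):
    """Return missing artifact kinds for a build output directory."""
    if not os.path.isdir(dist_dir):
        # Nothing was built at all, so both kinds are missing.
        return ["sdist", "wheel"]
    return missing_build_outputs(os.listdir(dist_dir))
```

An empty list from `check_dist()` means both artifacts were produced; a CI step can fail the job otherwise.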

Labels: automation (Automation-generated direction and planning)
