Commit bff645a

Complete update of tournament scraping pipeline and updated GH Action (#108)
## Summary

This PR adds a **SQLite wrapper** for offline scraper testing, improves the Longshanks and Rollbetter scrapers, fixes the missing `winner_id` in match records, and updates the GitHub Action to support a SQLite-only run mode.

## Changes

### New file

- **`backend/scripts/scrape_tournaments_sqlite.py`**: wrapper that sets `DATABASE_URL` to a local SQLite file before importing the main scraper. Enables offline testing without Postgres. Usage:

  ```bash
  python -m backend.scripts.scrape_tournaments_sqlite \
    --sqlite-path scraped.db \
    --platform longshanks+rollbetter \
    --time-range 7
  ```

### Improved scrapers

- **`scrape_tournaments.py`**: added a `--tournament-url` flag for scraping individual URLs directly (supports Rollbetter, Longshanks, and ListFortress). The flag also supports benchmark validation.
- **`rollbetter_scraper.py`**: scoped match table selection to the active tab panel and added a `wait_for_selector` before reading round data. This fixes several rounds that previously reported "No match table found".
- **`longshanks_scraper.py`**: replaced jQuery `trigger(change)` with the site's own `load_games()` helper (which is how the page actually loads match data). Scoped `.results` lookups to the `#games` container. Fixed date parsing to use the start date from Longshanks date ranges.

### Bug fix

- **`save_tournament_data`**: `winner_id` was never set when converting match dicts to `Match` objects. It now resolves `winner_name_temp` against the player name map to populate the foreign key. **All 20 matches in the validation scrape now have a non-null `winner_id`** (previously 0).

### GitHub Action

- **`scrape_tournaments.yml`**: added `none` as an environment option (SQLite-only, no Postgres). Exposed `upload_sqlite_artifact` as an input toggle. Postgres writes only occur for `prod`/`dev`. When `environment: none`, the workflow uses the SQLite wrapper.
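
The wrapper pattern described under "New file" can be sketched roughly as follows. The key point is ordering: `DATABASE_URL` has to be set before the main scraper module is imported, presumably because that module reads the variable at import time. The helper names `sqlite_url` and `configure_sqlite`, the SQLAlchemy-style URL, and the commented-out entry point are assumptions for illustration, not the contents of `scrape_tournaments_sqlite.py`.

```python
import argparse
import os
import sys
from pathlib import Path


def sqlite_url(path: str) -> str:
    """Build a SQLAlchemy-style SQLite URL for a local file (assumed URL format)."""
    return f"sqlite:///{Path(path).resolve()}"


def configure_sqlite(argv: list[str]) -> list[str]:
    """Set DATABASE_URL from --sqlite-path and return the untouched remaining args."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--sqlite-path", default="scraped.db")
    args, remaining = parser.parse_known_args(argv)
    os.environ["DATABASE_URL"] = sqlite_url(args.sqlite_path)
    return remaining


if __name__ == "__main__":
    rest = configure_sqlite(sys.argv[1:])
    # Import the main scraper only now, so it sees the SQLite URL.
    # Hypothetical entry point; the real module's API may differ:
    #   from backend.scripts import scrape_tournaments
    #   scrape_tournaments.main(rest)
    print(os.environ["DATABASE_URL"], rest)
```

Using `parse_known_args` lets the wrapper consume only its own flag and forward `--platform`, `--time-range`, and any other options to the main scraper unchanged.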
## Validation

- ✅ Local SQLite scrape of 10 benchmark tournaments matches expected counts (Rollbetter events match exactly; Longshanks now captures **more** matches thanks to the `load_games()` fix, which recovers previously missed rounds)
- ✅ GH Action ran successfully against **dev** Postgres (8 new tournaments from Apr 15–30, grew DB from 604 → 612)
- ✅ GH Action correctly uses `DEV_DATABASE_URL` secret when environment=dev
- ✅ SQLite artifact uploaded as expected
- ✅ `winner_id` now correctly populated in all match records
- ✅ Local scrape of Apr 1–7 produced 5 tournaments, 26 players, 36 matches across both platforms

Closes # (no issue linked)

---------
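
The `winner_id` fix can be illustrated with a small sketch. It assumes each scraped match dict carries a `winner_name_temp` key (the name used in the commit message) and that a player-name-to-id map is available when matches are saved. The `Match` dataclass, the `player1_name`/`player2_name` keys, and `to_match` are hypothetical stand-ins for the project's model and the logic inside `save_tournament_data`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    """Hypothetical stand-in for the project's Match model."""
    player1_id: int
    player2_id: int
    winner_id: Optional[int] = None


def to_match(match: dict, player_ids: dict[str, int]) -> Match:
    """Convert a scraped match dict, resolving the winner's name to a player id."""
    winner_name = match.get("winner_name_temp")
    return Match(
        player1_id=player_ids[match["player1_name"]],
        player2_id=player_ids[match["player2_name"]],
        # A missing or unknown winner name leaves winner_id as None.
        winner_id=player_ids.get(winner_name) if winner_name else None,
    )
```

Looking the winner up with `dict.get` keeps matches without a recorded winner from raising, at the cost of silently storing `None`; whether the real code logs such cases is not stated in the commit.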
1 parent a8f5d71 commit bff645a

62 files changed

Lines changed: 14265 additions & 1914 deletions

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
{
  "features": {
    "ghcr.io/devcontainers/features/copilot-cli:1": {
      "version": "1.1.2",
      "resolved": "ghcr.io/devcontainers/features/copilot-cli@sha256:757c6c2899dc902c44a9f6164c1f7392832ced13c6bb632d8d60a880f2e92456",
      "integrity": "sha256:757c6c2899dc902c44a9f6164c1f7392832ced13c6bb632d8d60a880f2e92456"
    },
    "ghcr.io/devcontainers/features/git-lfs:1": {
      "version": "1.2.5",
      "resolved": "ghcr.io/devcontainers/features/git-lfs@sha256:71c2b371cf12ab7fcec47cf17369c6f59156100dad9abf9e4c593049d789de72",
      "integrity": "sha256:71c2b371cf12ab7fcec47cf17369c6f59156100dad9abf9e4c593049d789de72"
    },
    "ghcr.io/devcontainers/features/github-cli:1": {
      "version": "1.1.0",
      "resolved": "ghcr.io/devcontainers/features/github-cli@sha256:d22f50b70ed75339b4eed1ba9ecde3a1791f90e88d37936517e3bace0bbad671",
      "integrity": "sha256:d22f50b70ed75339b4eed1ba9ecde3a1791f90e88d37936517e3bace0bbad671"
    },
    "ghcr.io/devcontainers/features/node:2.0.0": {
      "version": "2.0.0",
      "resolved": "ghcr.io/devcontainers/features/node@sha256:fedd4c11f7adfb64283b578dddc7da906728daa25fa293351c9d913231acf12f",
      "integrity": "sha256:fedd4c11f7adfb64283b578dddc7da906728daa25fa293351c9d913231acf12f"
    },
    "ghcr.io/siri404/devcontainer-ai-features/gemini-cli:1": {
      "version": "1.0.1",
      "resolved": "ghcr.io/siri404/devcontainer-ai-features/gemini-cli@sha256:99754255ee3d9596430bf730c0e55475bd9d7d741845d96e6ec3c81a89fec51b",
      "integrity": "sha256:99754255ee3d9596430bf730c0e55475bd9d7d741845d96e6ec3c81a89fec51b"
    }
  }
}

.devcontainer/devcontainer.json

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
{
  "name": "m3tacron-dev",
  "image": "mcr.microsoft.com/devcontainers/python:3-3.13-trixie",
  "features": {
    "ghcr.io/devcontainers/features/node:2.0.0": {},
    "ghcr.io/devcontainers/features/github-cli:1": {},
    "ghcr.io/devcontainers/features/copilot-cli:1": {},
    "ghcr.io/siri404/devcontainer-ai-features/gemini-cli:1": {},
    "ghcr.io/devcontainers/features/git-lfs:1": {}
  },
  "forwardPorts": [8000, 3000, 8100, 5173],
  "portsAttributes": {
    "8000": {
      "label": "FastAPI"
    },
    "8100": {
      "label": "FastAPI (README)"
    },
    "3000": {
      "label": "Svelte"
    },
    "5173": {
      "label": "Vite"
    }
  },
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance",
        "ms-playwright.playwright",
        "svelte.svelte-vscode",
        "esbenp.prettier-vscode",
        "ms-python.vscode-python-envs",
        "ms-python.debugpy",
        "ms-ossdata.vscode-pgsql",
        "GitHub.vscode-pull-request-github",
        "bretwardjames.gh-projects",
        "GitHub.copilot-chat",
        "github.vscode-github-actions"
      ],
      "settings": {
        "python.analysis.typeCheckingMode": "basic",
        "editor.formatOnSave": true
      }
    }
  }
}
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
---
name: Discovery Web Auditor
description: "Use when discovering UX/UI/responsiveness improvements, performance/refactor opportunities, and project-aligned feature ideas by combining browser auditing with codebase analysis."
argument-hint: "Base URL, routes/flows to inspect, project goals/audience, and focus areas (UX/UI, performance/refactor, feature ideas)"
tools:
  [
    read,
    search,
    execute,
    web,
    open_browser_page,
    navigate_page,
    read_page,
    click_element,
    type_in_page,
    hover_element,
    screenshot_page,
    run_playwright_code,
  ]
user-invocable: true
---

You are a discovery specialist for product, UX/UI, responsiveness, and engineering improvement opportunities.

## Scope

- Analyze the running website for UX, UI consistency, accessibility-adjacent usability, and responsiveness improvements.
- Analyze the codebase for performance and refactor opportunities with clear user or maintainability impact.
- Propose project-aligned feature ideas grounded in current capabilities, architecture, target audience, and repository scope.
- Produce issue-ready findings without writing implementation code.

## Constraints

- Do not edit files or propose code patches.
- Do not invent findings that are not reproducible.
- Do not prescribe detailed implementation unless explicitly requested.
- Do not spend effort validating basic site uptime; assume the site is running and focus on quality and opportunity discovery.

## Procedure

1. Build project context from README, key frontend/backend modules, and existing features.
2. Audit critical website journeys with integrated browser tools, emphasizing UX/UI quality and responsiveness behavior.
3. Inspect code hotspots for performance and refactor candidates using read/search and focused command-line analysis.
4. Derive feature opportunities that are realistic for this codebase and consistent with product goals.
5. Capture evidence for each finding: route/module, reproduction steps, observed impact, and expected outcome.
6. Group findings into cohesive issue candidates sized for independent delivery.

## Output Format

Return a concise report with:

1. `UX/UI/Responsiveness Findings` with severity and evidence.
2. `Performance/Refactor Findings` with code-level context and impact.
3. `Feature Opportunities` with rationale and user value.
4. `Grouped Issue Candidates` with objective and expected outcome.
5. `Clarification Questions` to resolve before issue creation.
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
---
name: github-issues
applyTo: "**/*"
description: "Use when creating or managing GitHub issues for this repository. Enforces the repo's issue creation workflow and required metadata."
---

## GitHub Issue Creation Guidelines

Follow the repository's issue workflow exactly when opening new GitHub issues.

- Check for existing relevant issues first using `gh issue list` to avoid duplicates.
- Clarify scope with the user before creating the issue.
- Use `gh issue create` to open the issue before starting implementation.

### Required issue body format

Each issue body must include:

- `Objective`: A short statement of what the issue is intended to accomplish.
- `Context & Symptoms`: A developer-focused description of the current problem or goal, including relevant files or behavior. Avoid prescribing a rigid technical solution.
- `Expected Outcome`: A concise description of the wanted end state and acceptance outcome, without technical implementation details unless explicitly discussed.

Do not store Priority and Size in the issue body. Manage them through GitHub Project fields.

### Markdown formatting and template

- Write issue bodies in GitHub Flavored Markdown.
- Keep section order fixed: `Objective`, `Context & Symptoms`, `Expected Outcome`.
- Use one blank line between sections and avoid trailing whitespace.
- Use concise bullet points when listing symptoms or acceptance outcomes.
- Keep tone human-written, specific, and repository-aware.

Use this template:

```markdown
## Objective

<1-3 lines describing the goal>

## Context & Symptoms

- <current behavior or problem>
- <where it appears: route/module/component>
- <impact on users/developers>

## Expected Outcome

- <observable end state>
- <acceptance-oriented result>
```

### Newlines and special formatting in CLI

- Prefer `gh issue create --body-file <file>` for multi-line content.
- If creating from shell inline, prefer a heredoc with a quoted delimiter to preserve formatting exactly:

```bash
cat > /tmp/issue-body.md <<'EOF'
## Objective

...
EOF

gh issue create --title "..." --body-file /tmp/issue-body.md
```

- Avoid packing multi-line Markdown into a single `--body` string when possible.
- When issue text includes special characters (for example `` ` ``, `$`, `*`, `_`, or `#`), use `--body-file` to avoid shell escaping errors.

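
The template and the `--body-file` recommendation above can also be followed from a script. The sketch below renders the three required sections in their fixed order, writes them to a temporary file, and builds the argument list for `gh issue create`. The function names `render_issue_body` and `issue_create_command` are illustrative assumptions and are not part of this repository.

```python
import tempfile


def render_issue_body(objective: str, symptoms: list[str], outcomes: list[str]) -> str:
    """Render Objective, Context & Symptoms, Expected Outcome in that order."""
    sections = [
        ("Objective", objective.strip()),
        ("Context & Symptoms", "\n".join(f"- {s}" for s in symptoms)),
        ("Expected Outcome", "\n".join(f"- {o}" for o in outcomes)),
    ]
    # One blank line between heading and text, and between sections.
    return "\n\n".join(f"## {title}\n\n{text}" for title, text in sections) + "\n"


def issue_create_command(title: str, body: str) -> list[str]:
    """Write the body to a temp file and return the `gh issue create` argv."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".md", delete=False, encoding="utf-8"
    ) as fh:
        fh.write(body)
    return ["gh", "issue", "create", "--title", title, "--body-file", fh.name]
```

Passing the returned list to `subprocess.run(cmd, check=True)` avoids the shell entirely, so characters such as `` ` `` or `$` in the title or body need no escaping.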
### Project field updates (Priority and Size)

- After issue creation, set Priority and Size in the linked GitHub Project item.
- Allowed `Size` values: `S`, `M`, `L`.
- Assign `Size` using these measurable criteria:
  - `S`: small scoped change, usually one flow/module, up to about 3 files touched, with localized edits or small additions.
  - `M`: medium scoped change, usually up to 2 related flows/modules, about 4-8 files touched, with moderate additions/refactors.
  - `L`: largest allowed single-issue scope, usually up to 3 related flows/modules, about 9-15 files touched, with substantial but cohesive changes.
- If scope exceeds `L` (for example more than 15 files touched, more than 3 flows/modules, or would require more than one PR), split it into cohesive sub-issues before creation.
- Allowed `Priority` values:
  - `Urgent & Important`
  - `Not Urgent & Important`
  - `Urgent & Not Important`
  - `Not Urgent & Not Important`
- Use these exact option names when mapping to project single-select options.
- Use CLI commands in this sequence:
  - `gh project field-list <project-number> --owner <owner> --format json`
  - `gh project item-list <project-number> --owner <owner> --format json`
  - `gh project item-edit --id <item-id> --project-id <project-id> --field-id <field-id> --single-select-option-id <option-id>`
- Use one `gh project item-edit` invocation per field update.

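
The Size criteria above are approximate ("usually", "about"), but they can be read as simple thresholds. The following sketch encodes one such reading and builds the `gh project item-edit` arguments listed above. `classify_size` and `item_edit_command` are hypothetical helpers, and treating the criteria as hard cut-offs is an assumption made for illustration.

```python
from typing import Optional

# Exact option names required for the Priority single-select field.
PRIORITIES = (
    "Urgent & Important",
    "Not Urgent & Important",
    "Urgent & Not Important",
    "Not Urgent & Not Important",
)


def classify_size(files_touched: int, flows: int) -> Optional[str]:
    """Return 'S', 'M' or 'L'; return None when the scope should be split."""
    if files_touched <= 3 and flows <= 1:
        return "S"
    if files_touched <= 8 and flows <= 2:
        return "M"
    if files_touched <= 15 and flows <= 3:
        return "L"
    return None  # exceeds L: split into cohesive sub-issues first


def item_edit_command(item_id: str, project_id: str,
                      field_id: str, option_id: str) -> list[str]:
    """Build one `gh project item-edit` argv; call it once per field update."""
    return [
        "gh", "project", "item-edit",
        "--id", item_id,
        "--project-id", project_id,
        "--field-id", field_id,
        "--single-select-option-id", option_id,
    ]
```

A change touching 5 files across 2 flows would be classified as `M` under this reading, while 16 files would return `None` to signal that the issue should be split.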
### Multi-Issue Discussion and Creation

- When multiple problems or improvement ideas are discussed in one conversation, group them into cohesive issues first.
- Grouping and drafting are a single workflow pass: create grouped issues in the same phase once scope is clear.
- Keep issue scope balanced: avoid very small issues and avoid giant umbrella issues.
- Ask clarifying questions before creating issues, not after.
- Before creating, request explicit confirmation of all of the following:
  - the proposed grouping
  - the proposed issue titles and bodies
  - the proposed Priority and Size for each issue (to be set in the project fields)

### Labels and milestone

- Assign type and domain labels on the Issue only.
- Use one type label: `bug`, `enhancement`, `refactor`, or `documentation`.
- Use one or more domain labels: `frontend`, `backend`, `performance`.
- Assign the appropriate milestone when creating the issue.

### Other workflow rules

- Keep issue text objective, professional, and developer-to-developer.
- Keep issue text concise and human-written; avoid generic AI-style phrasing.
- Issue title and issue body must always be written in English.
- Technical plan comments posted on GitHub must always be written in English.
- After creating issues, do not automatically transition to technical planning unless the user explicitly asks to start planning/implementation.
- Before coding begins for an issue, prepare a technical plan but post it as a comment only after explicit user approval.
- If the approved plan changes later, update the existing plan comment instead of adding a new one.
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
---
name: workflow-discovery
description: "Use during discovery/investigation conversations: clarify current vs expected state, group issues cohesively, and create issues only after explicit user confirmation."
---

## Discovery Phase Rules

- Focus on investigation, improvement discovery, and scope clarification.
- Ask clarifying questions early and complete clarification before drafting final issue text.
- Keep issue drafts implementation-agnostic unless the user explicitly requests technical details.
- When multiple findings exist, group them into cohesive issue bundles in the same pass.
- Always present grouped issue proposals and draft issue bodies for explicit user confirmation before creating issues.
- Do not defer open clarifications to post-creation; resolve them before creating the issue set.
- After issue creation, remain in discovery mode unless the user explicitly asks to enter planning or implementation.
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
---
name: workflow-implementation
description: "Use when executing approved work: implement changes, handle issue-linking edge cases, and keep GitHub plan-comment sync aligned with approved scope."
---

## Implementation Phase Rules

- Enter implementation phase only when the user explicitly asks to execute changes.
- If issue(s) are selected, proceed directly with implementation-oriented work.
- Do not force discovery or planning again unless the user explicitly requests it.
- Treat the root workspace repository as the repository of reference for implementation and GitHub CLI operations; do not use nested submodules as the primary repo unless the user explicitly requests it.
- If implementation is requested with no issue provided, search open issues for likely matches and ask the user to confirm the correct issue.
- If no matching issue is found, ask explicit confirmation before proceeding without an issue and state clearly that no issue is linked.
- Primary plan source is the approved plan in current chat/context.
- If no approved plan is available in current chat/context and issue(s) are selected, use approved plan comments from the linked GitHub issue(s).
- If no approved plan exists, ask whether to return to planning phase or continue with an explicitly approved quick-fix scope.
- At the end of implementation phase, open a pull request for the implemented changes.
- Pull request must target branch `dev` as base branch.
- Pull request body must include `Closes #<issue-id>` for each linked issue.
- If browser access is available, visually verify changes on the PR preview URL `https://<pr-id>.dev.m3tacron.com`.
- PR preview usually becomes available about 45-60 seconds after PR creation.
- Treat the preview as ready when a PR bot comment states that preview deployment is ready; from that moment, preview URL should be accessible.
- If preview is not yet ready, re-check PR comments and retry access before final handoff.
- Technical plan comments posted to GitHub must always be written in English.
- If the plan changes later, update the existing plan comment rather than adding a new one.
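
The preview rules above amount to a retry loop. The sketch below builds the preview URL from the pattern `https://<pr-id>.dev.m3tacron.com` and polls it a few times. It assumes `<pr-id>` is the numeric pull request number and that an HTTP 200 response indicates a usable preview; the instructions themselves treat the PR bot comment as the readiness signal, so this check is only a complement. The names `preview_url` and `wait_for_preview` are hypothetical.

```python
import time
import urllib.request


def preview_url(pr_id: int) -> str:
    """Build the PR preview URL from the documented pattern."""
    return f"https://{pr_id}.dev.m3tacron.com"


def wait_for_preview(pr_id: int, attempts: int = 6, delay: float = 15.0) -> bool:
    """Poll the preview URL; return True once it answers with HTTP 200."""
    url = preview_url(pr_id)
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers URLError, HTTPError and timeouts: preview not ready yet.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

With the defaults, the loop spends up to roughly 75 seconds sleeping between checks, which covers the 45-60 second window mentioned above with some margin.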
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
---
name: workflow-phases
description: "Use when routing a request to discovery, planning, or implementation phase in the issue-first workflow."
---

## Phase Routing

- First classify the user request as discovery phase, planning phase, or implementation phase.
- If the user is in discovery phase, follow the discovery instruction set and do not start coding.
- If the user asks to define or refine a technical plan, follow the planning instruction set and keep the phase implementation-free.
- If the user asks to execute changes, follow the implementation instruction set and do not force a new discovery/planning loop unless requested.
- If the request is ambiguous between planning and implementation, ask explicitly whether the user wants plan-only or code execution.
- Do not move from discovery/issue-creation into technical planning unless the user explicitly asks to start planning or implementation.
- For discovery requests that involve UX/UI/responsiveness, feature ideation, or performance/refactor scouting, proactively use or suggest `Discovery Web Auditor`.
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
---
name: workflow-planning
description: "Use when drafting/refining technical plans: keep planning clean, avoid code implementation actions, and require explicit approval before handoff to implementation."
---

## Planning Phase Rules

- Enter planning phase only when the user explicitly asks to start planning.
- Keep this phase focused on plan definition and refinement.
- Do not perform implementation actions in planning phase.
- Treat the root workspace repository as the repository of reference for planning context and GitHub CLI operations; do not anchor planning to nested submodules unless the user explicitly requests it.
- Draft plans with scope/non-goals, approach, validation strategy, and risks.
- Ask explicit user approval on the final plan text before any implementation phase starts.
- At the end of planning phase, post or update the approved plan as a GitHub issue comment when issue(s) are linked.
- Follow all GitHub comment rules already defined in [GitHub Issue Instructions](./github-issues.instructions.md), including English-only and update-instead-of-duplicate behavior.
- If no issue is linked, keep the approved plan in chat context and state clearly that no GitHub comment was posted.
- Keep planning output ready for handoff to implementation phase.
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
---
name: Discovery Phase Audit
description: "Use during discovery to investigate UX/UI/responsiveness improvements, performance/refactor opportunities, and project-aligned feature ideas before issue creation."
argument-hint: "Base URL, routes/flows, project goals/audience, and focus areas"
agent: "Discovery Web Auditor"
---

Run a discovery-phase investigation before issue creation.

## Startup Behavior

If prior chat context is missing or insufficient, ask the user to confirm discovery scope first:

1. Which areas to focus on: UX/UI/responsiveness, performance/refactor, feature opportunities.
2. Which routes, workflows, or modules to inspect first.
3. Any business constraints, audience context, or priority order.
4. Confirmation to start discovery with this scope.

## Discovery Execution

1. Build project understanding from repository context and existing capabilities.
2. Investigate browser UX/UI/responsiveness quality and evidence-based improvement opportunities.
3. Inspect code-level performance/refactor opportunities with practical impact.
4. Propose realistic, project-aligned feature opportunities.
5. Group findings into cohesive issue candidates.

## Output

Return:

1. Discovery findings with evidence.
2. Grouped issue candidates with objective and expected outcome.
3. Clarification questions required before issue drafting/creation.
