Build a site audit platform that provides a sitemap, user flow diagram, and broken link detection for a given website. The tool will handle dynamic websites, offer visual mapping, and support PDF export.
- Frontend: React (Vite) + Vanilla CSS
- Reasoning: React allows for complex state management (dashboard) and rich visualization libraries. Vanilla CSS ensures lightweight, fully custom styling as per user preference.
- Backend: Node.js (Express) + Playwright
- Reasoning: Node.js shares a language with the frontend. Playwright drives a real browser, so it can crawl dynamic (JavaScript-rendered) websites.
- Architecture:
- Client: Single Page Application (SPA).
- Server: REST API to handle crawl jobs and Project storage.
- Storage: projects.json (V1 simple persistence) or in-memory.
Important
Data Persistence: For V1, we will save projects to a local JSON file (server/data/projects.json) so they persist between server restarts. This avoids setting up a full database (SQL/Mongo) for now, keeping it "Basic" but functional.
Note
Color Theme:
- Primary (CTA): #1e7ddf (Blue)
- Ink: Black
- Background: White
/
├── client/ # React Frontend
├── server/ # Express Backend
├── server/data/ # JSON storage
└── package.json # Root scripts
- Entry point.
- Express app setup.
- Routes: /api/projects (List/Create), /api/scan, /api/sse, /api/result.
- getProjects(): Read from projects.json.
- createProject(domain): Create entry, trigger logic.
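The two storage functions could be sketched as follows. This is a minimal illustration, not the final implementation: the record fields (id, domain, status, createdAt), the createProjectStore wrapper, and the startScan callback are assumptions made for the example; only the function names getProjects and createProject and the JSON file come from the plan.

```javascript
// Minimal sketch of JSON-file persistence for projects (V1).
// Assumptions: the record fields and the startScan callback are
// hypothetical; only getProjects/createProject are named in the plan.
const fs = require("node:fs");
const path = require("node:path");
const crypto = require("node:crypto");

function createProjectStore(filePath) {
  // Read every project; a missing file is treated as an empty list.
  function getProjects() {
    if (!fs.existsSync(filePath)) return [];
    return JSON.parse(fs.readFileSync(filePath, "utf8"));
  }

  // Write to a temp file, then rename, so a crash mid-write does not
  // leave a truncated projects.json behind.
  function saveProjects(projects) {
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    const tmpPath = `${filePath}.tmp`;
    fs.writeFileSync(tmpPath, JSON.stringify(projects, null, 2));
    fs.renameSync(tmpPath, filePath);
  }

  // Create an entry, persist it, then hand it to the scan trigger.
  function createProject(domain, startScan = () => {}) {
    const project = {
      id: crypto.randomUUID(),
      domain,
      status: "pending",
      createdAt: new Date().toISOString(),
    };
    saveProjects([...getProjects(), project]);
    startScan(project);
    return project;
  }

  return { getProjects, createProject };
}
```

Passing the file path in, rather than hard-coding server/data/projects.json, keeps the store easy to point at a temporary file in tests.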
- Technology: Playwright.
- Logic:
- Launch browser.
- Navigate to root URL.
- Breadth-First Search (BFS) to discover links.
- Track: pages (list of visited URLs), links (source -> target), broken_links (404s).
- Respect limits for V1.
- Normalization logic.
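The crawl loop and the normalization step can be sketched together. This is an illustration under stated assumptions: fetchPage is a hypothetical async function returning { status, links } (in the real crawler it would wrap Playwright), maxPages stands in for the unspecified V1 limits, and the normalization rules shown (drop fragments, drop trailing slashes) are one plausible choice, not a confirmed spec.

```javascript
// Sketch of URL normalization plus a breadth-first crawl loop.
// Assumption: fetchPage(url) is a hypothetical async function that
// resolves to { status, links }; the plan's crawler would use Playwright.

function normalizeUrl(rawUrl, baseUrl) {
  const url = new URL(rawUrl, baseUrl); // resolves relative hrefs
  url.hash = ""; // fragments point at the same page
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1); // treat /about/ as /about
  }
  return url.href;
}

async function crawl(rootUrl, fetchPage, maxPages = 50) {
  const root = normalizeUrl(rootUrl);
  const origin = new URL(root).origin;
  const queue = [root]; // FIFO queue gives breadth-first order
  const seen = new Set([root]);
  const result = { pages: [], links: [], broken_links: [] };

  while (queue.length > 0 && result.pages.length < maxPages) {
    const current = queue.shift();
    const { status, links } = await fetchPage(current);
    result.pages.push(current);
    if (status === 404) {
      result.broken_links.push(current);
      continue;
    }
    for (const href of links) {
      let parsed;
      try {
        parsed = new URL(normalizeUrl(href, current));
      } catch {
        continue; // skip malformed hrefs
      }
      if (parsed.protocol !== "http:" && parsed.protocol !== "https:") continue;
      const target = parsed.href;
      if (target === current) continue; // ignore in-page anchors
      result.links.push({ source: current, target });
      // Only follow same-origin links that have not been queued yet.
      if (parsed.origin === origin && !seen.has(target)) {
        seen.add(target);
        queue.push(target);
      }
    }
  }
  return result;
}
```

In this sketch only same-origin pages are fetched, so external links are recorded in links but never checked for 404s; whether V1 should probe external targets is left open.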
- Routing setup (React Router).
- Routes:
- / (Projects List - Landing)
- /new (New Project Input)
- /project/:id/scanning (Loader)
- /project/:id/dashboard (Result)
- Global variables: --color-primary: #1e7ddf; --color-ink: #000000; --color-bg: #ffffff;
- Reset CSS.
- "Premium" aesthetic styles (clean, high contrast, smooth UI).
- Landing page.
- Lists existing projects (cards/list).
- "New Project" button (Primary CTA).
- (Formerly Onboarding)
- Centered input field for Domain.
- "Start Audit" button.
- Use html-to-image to generate a PNG of the React Flow graph.
- Use jspdf for high-quality PDF export (embedding the graph image).
- Use html-to-image (toSvg) for Figma Export. This creates a vector SVG file that can be dragged directly into Figma.
- [Modify] client/src/components/UserFlowGraph.jsx: Add "Export for Figma" button.
Transform the "User Flow" graph from a noisy all-encompassing sitemap into targeted, actionable user journeys based on specific personas (e.g., "Job Seeker", "Investor", "Customer").
We will use a local Ollama model (e.g., llama3, mistral) to analyze the crawled sitemap and deduce likely personas and their relevant paths. We will continue using React Flow for the visualization.
- No External Dependency: Use native fetch to call the local Ollama API.
- Database: Add personas column (JSON) to Project model.
- New Endpoint: POST /api/projects/:id/analyze-personas
- Input: Project ID.
- Process:
- Fetch all scanned pages (url, title).
- Call Ollama: POST to http://localhost:11434/api/generate.
- Prompt: "Analyze these URLs and return a JSON list of 3-5 key user personas (e.g., Investor, Customer) with their relevant page paths."
- Parsing: Robustly parse the imperfect JSON that small models tend to return.
- Save results to DB and return.
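The process above could look roughly like this. It is a sketch under assumptions: the helper names (buildPrompt, parsePersonas, analyzePersonas), the default model, and the { name, pages } persona shape are illustrative, and the extra "Respond with JSON only" instruction is an addition to the plan's prompt. The request body uses Ollama's stream: false option so the reply arrives as a single JSON object whose response field holds the model output.

```javascript
// Sketch of the persona analysis step.
// Assumptions: helper names, the default model, and the persona shape
// ({ name, pages }) are illustrative; the endpoint URL is from the plan.
const OLLAMA_URL = "http://localhost:11434/api/generate";

function buildPrompt(pages) {
  const listing = pages.map((p) => `${p.url} | ${p.title}`).join("\n");
  return (
    "Analyze these URLs and return a JSON list of 3-5 key user personas " +
    "(e.g., Investor, Customer) with their relevant page paths.\n" +
    'Respond with JSON only, shaped like [{"name": "...", "pages": ["..."]}].\n\n' +
    listing
  );
}

// Small models often wrap JSON in prose or code fences, so extract the
// outermost [...] span before parsing, then validate each entry.
function parsePersonas(text) {
  const start = text.indexOf("[");
  const end = text.lastIndexOf("]");
  if (start === -1 || end <= start) return [];
  let parsed;
  try {
    parsed = JSON.parse(text.slice(start, end + 1));
  } catch {
    return []; // unparseable output: the caller can retry or report it
  }
  if (!Array.isArray(parsed)) return [];
  return parsed
    .filter((p) => p && typeof p.name === "string" && Array.isArray(p.pages))
    .map((p) => ({
      name: p.name.trim(),
      pages: p.pages.filter((x) => typeof x === "string"),
    }));
}

async function analyzePersonas(pages, model = "llama3") {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt: buildPrompt(pages), stream: false }),
  });
  if (!res.ok) throw new Error(`Ollama request failed: ${res.status}`);
  const data = await res.json();
  return parsePersonas(data.response ?? "");
}
```

Returning an empty list on bad output, instead of throwing, lets the endpoint decide whether to retry the model or surface an error to the dashboard.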
- Dashboard UI:
- Add "Generate Personas (Ollama)" button.
- Add Persona Selector dropdown.
- Graph Logic (UserFlowGraph.jsx):
- Library: Continue using React Flow.
- Filtering:
- Receive activePersona prop.
- If active, filter nodes array to only show pages in the persona's list.
- Filter edges to ensure connectivity is maintained or just show direct links between visible nodes.
- Layout: Re-run Dagre layout on the filtered subset so the graph looks clean (not just a sparse version of the big graph).
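The filtering step can be sketched as a pure function. Assumptions for this example: nodes follow the React Flow shape { id, data } with the page URL stored at data.url (a hypothetical field), edges are { id, source, target }, and the persona's pages list holds the same URL strings. It implements the simpler of the two edge options, keeping only direct links between visible nodes.

```javascript
// Sketch of persona-based graph filtering.
// Assumptions: data.url on each node and the persona shape
// { name, pages } are hypothetical; edges use source/target node ids.
function filterGraphByPersona(nodes, edges, activePersona) {
  // No persona selected: show the full graph unchanged.
  if (!activePersona) return { nodes, edges };

  const allowedPages = new Set(activePersona.pages);
  const visibleNodes = nodes.filter((n) => allowedPages.has(n.data.url));
  const visibleIds = new Set(visibleNodes.map((n) => n.id));

  // Keep an edge only when both endpoints survived the node filter.
  const visibleEdges = edges.filter(
    (e) => visibleIds.has(e.source) && visibleIds.has(e.target)
  );
  return { nodes: visibleNodes, edges: visibleEdges };
}
```

The returned subset is what would be handed to the Dagre layout pass, so positions are computed for the filtered graph instead of being inherited from the full one.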
- User clicks "Generate Personas" in Dashboard.
- Backend sends Sitemap -> Ollama (Local).
- Ollama returns JSON (e.g., [{ name: "Investor", pages: [...] }]).
- User selects "Investor" from dropdown.
- Graph re-renders showing only investor-relevant pages and flows.
- Ollama running locally on port 11434.
- Model pulled (e.g., ollama pull llama3).
- Basic unit tests for URL normalization.
- Backend API tests using mock requests.
- Onboarding: Input https://example.com and verify it starts.
- Crawling: Use a controlled test site or a small public site. Verify it captures links.
- Broken Links: Test with a known broken link URL (if available) or mock the response.
- Visuals: Check if the User Flow diagram renders nodes and connections correctly.
- PDF: Click export and verify the PDF is generated and readable.