You are a research analyst studying similarity band formation.
You are given a CSV file named band-report.csv containing the results of a threshold sweep over a similarity graph.
band-report.csv has the following columns:
- threshold
- components
- largest_component
- component_sizes
Produce a Markdown document called band-report.md that explains:
- How the number of components changes as the threshold decreases.
- Where stable regimes (bands) appear.
- Which thresholds represent meaningful structural transitions.
- What each band corresponds to conceptually (e.g. near-duplicate, family, lineage, noise).
- Which threshold(s) are suitable defaults for:
  - deduplication
  - canonical representative selection
  - coarse lineage analysis
Follow these rules:
- Focus on stability and transitions, not individual rows.
- A band must persist across multiple thresholds to be considered real.
- Treat the largest component size as a primary signal.
- Avoid speculative explanations not grounded in the data.
- Output Markdown only.
- Do NOT restate the CSV verbatim.
- Do NOT include code.
- Use concise analytical language.
- Prefer tables and bullet points where helpful.
Structure the document with these sections:
- Overview
- Observed regimes
- Stability analysis
- Band definitions
- Recommended thresholds
- What to ignore
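
For reference only, the persistence rule above (a band must span multiple thresholds, with largest component size as the primary signal) could be sketched roughly as follows. This is one possible reading of "stable", not a prescribed method: `find_bands`, `min_length`, and `tolerance` are hypothetical names, and the sample rows are synthetic rather than taken from band-report.csv. The sketch is an aid to interpreting the rule; the output document itself must still contain no code.

```python
from typing import List, Tuple

# (threshold, components, largest_component); component_sizes is omitted here.
Row = Tuple[float, int, int]


def find_bands(
    rows: List[Row], min_length: int = 3, tolerance: float = 0.05
) -> List[Tuple[float, float, int]]:
    """Return (high_threshold, low_threshold, row_count) for each stable run.

    Assumption: a run is "stable" while largest_component stays within a
    relative `tolerance` of its value at the start of the run.
    """
    # Walk the sweep from the strictest threshold to the loosest.
    ordered = sorted(rows, key=lambda r: r[0], reverse=True)
    bands: List[Tuple[float, float, int]] = []
    start = 0
    for i in range(1, len(ordered) + 1):
        if i < len(ordered):
            ref = ordered[start][2]
            cur = ordered[i][2]
            stable = abs(cur - ref) <= tolerance * max(ref, 1)
        else:
            stable = False  # end of data closes the current run
        if not stable:
            # Keep the run only if it persisted across enough thresholds.
            if i - start >= min_length:
                bands.append((ordered[start][0], ordered[i - 1][0], i - start))
            start = i
    return bands


# Synthetic rows for illustration only.
sample_rows: List[Row] = [
    (0.9, 120, 10), (0.8, 118, 10), (0.7, 117, 10),
    (0.6, 60, 50), (0.5, 58, 50), (0.4, 57, 51),
    (0.3, 5, 200),
]
bands = find_bands(sample_rows)
```

With these synthetic rows, two runs of three thresholds each qualify as bands, and the single row at 0.3 is discarded because it does not persist.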
Begin analysis.