Commit 2026b0b

Add 2025 Gatineau budget
1 parent 3c701e0 commit 2026b0b

10 files changed: +1159 -1 lines changed

data/gatineau/README.md

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
# Gatineau Budget Data

Scripts and data for extracting and processing Gatineau's municipal budget data.

## Setup

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r data/gatineau/scripts/requirements.txt
```

## Usage

### Full Workflow

```bash
export OPENAI_API_KEY='your-key-here'
python3 data/gatineau/scripts/processor.py \
  --year 2025 \
  --pdf-url "https://www.gatineau.ca/..." \
  --extract
```

### Step-by-Step

**Download PDF:**

```bash
python3 data/gatineau/scripts/processor.py --year 2025 --pdf-url "https://..."
```

**Extract Data (requires API key):**

```bash
export OPENAI_API_KEY='your-key-here'
python3 data/gatineau/scripts/processor.py --year 2025 --extract
```

**Or Extract Manually:**

1. Open `raw/Gatineau Budget <year>.pdf`
2. Use prompt from `llm_prompt.txt` with your LLM
3. Save output to `extracted/<year>/llm_extracted.md`

**Convert to JSON:**

```bash
python3 data/gatineau/scripts/processor.py --year 2025
```

## Output

- `sankey.json` - Hierarchical budget data for visualization
- `summary.json` - Metadata, metrics, and ministry breakdowns

## Data Format

Extracted markdown must include:

- `## Key Metrics – <year>` section (population, employees, debt, property tax)
- `## Revenues – <year>` section with nested bullets
- `## Expenses – <year>` section with nested bullets

All amounts in thousands of dollars (`k$`). See `llm_prompt.txt` for full specification.
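
The three required sections listed above can be checked before running the JSON conversion. The following is a minimal sketch, not code from `processor.py`: the helper name `missing_sections` and the choice to accept either an en dash or a hyphen in the heading are assumptions made for illustration.

```python
import re
from typing import List

# Section names taken from the "Data Format" list above.
REQUIRED_SECTIONS = ("Key Metrics", "Revenues", "Expenses")


def missing_sections(markdown: str, year: int) -> List[str]:
    """Return the required section names that have no `## <name> – <year>` heading."""
    missing = []
    for name in REQUIRED_SECTIONS:
        # Accept an en dash or a plain hyphen between the name and the year.
        pattern = rf"^## {re.escape(name)} [–-] {year}\s*$"
        if not re.search(pattern, markdown, flags=re.MULTILINE):
            missing.append(name)
    return missing


sample = "## Key Metrics – 2025\n\n## Revenues – 2025\n"
print(missing_sections(sample, 2025))  # ['Expenses']
```

An empty list means all three headings are present for the requested year; it says nothing about whether the bullets under them are well formed.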
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
## Key Metrics – 2025

- Debt interest – **63 111 k$**
- Property tax revenue – **748 509 k$**

---

## Revenues – 2025

- **Taxes**
  - Sur la valeur foncière – **587 687 k$**
  - Compensations tenant lieu de taxes – **83 618 k$**
  - Sur une autre base – **77 205 k$**
  - **Total Taxes – 748 509 k$**
    (Check: 587 687 + 83 618 + 77 205 = 748 510 → rounding difference of 1 k$)

- **Subventions gouvernementales**
  - Subventions – **42 285 k$**
  - **Total Subventions – 42 285 k$**

- **Services rendus**
  - Tarification – **15 766 k$**
  - Revenus recouvrables – **871 k$**
  - Ententes – **806 k$**
  - Autres services rendus – **606 k$**
  - Activités commerciales – **31 k$**
  - **Total Services rendus – 18 080 k$**
    (Check: 15 766 + 871 + 806 + 606 + 31 = 18 080)

- **Imposition de droits**
  - Droits de mutation immobilière – **35 100 k$**
  - Licences et permis – **5 183 k$**
  - **Total Imposition de droits – 40 283 k$**
    (Check: 35 100 + 5 183 = 40 283)

- **Autres revenus**
  - Revenus d’intérêts – **20 685 k$**
  - Amendes et pénalités – **15 769 k$**
  - Affectations – **1 349 k$**
  - Autres revenus – **0 k$**
  - **Total Autres revenus – 37 803 k$**
    (Check: 20 685 + 15 769 + 1 349 + 0 = 37 803)

- **Total Revenus – 886 960 k$**
(Check: 748 509 + 42 285 + 18 080 + 40 283 + 37 803 = 886 960)

---

## Expenses – 2025

- **Rémunération**
  - Salaires – **300 259 k$**
  - Charges sociales – **82 633 k$**
  - **Total Rémunération – 382 892 k$**
    (Check: 300 259 + 82 633 = 382 892)

- **Biens et services**
  - Services professionnels, techniques et autres – **66 531 k$**
  - Fourniture de service et biens non durables – **51 266 k$**
  - Entretien et réparation – **28 567 k$**
  - Location – **19 221 k$**
  - Transport et communication – **4 625 k$**
  - **Total Biens et services – 170 209 k$**
    (Check: sum = 170 210 → rounding difference of 1 k$)

- **Service de la dette**
  - Paiements du service de la dette – **63 111 k$**
  - **Total Service de la dette – 63 111 k$**

- **Contribution à des organismes**
  - Transport en commun – **95 340 k$**
  - Contributions diverses à des organismes – **20 244 k$**
  - **Total Contribution à des organismes – 115 584 k$**
    (Check: 95 340 + 20 244 = 115 584)

- **Autres dépenses**
  - Immobilisations payées comptant – **117 701 k$**
  - Autres dépenses – **37 463 k$**
  - **Total Autres dépenses – 155 164 k$**
    (Check: 117 701 + 37 463 = 155 164)

- **Total Dépenses – 886 960 k$**
(Check: 382 892 + 170 209 + 63 111 + 115 584 + 155 164 = 886 960)
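
The `(Check: …)` lines above can be reproduced mechanically. Here is a minimal sketch, assuming amounts are written with space or comma thousands separators followed by `k$`, and that a difference of 1 k$ counts as rounding. `parse_amount` and `check_category` are hypothetical helpers, not functions from the repository, and negative amounts are not handled.

```python
import re
from typing import List, Optional, Tuple

# Digits with optional space, non-breaking-space or comma separators, then "k$".
AMOUNT_RE = re.compile(r"(\d[\d ,\u00a0]*)\s*k\$")


def parse_amount(line: str) -> Optional[int]:
    """Return the first k$ amount on the line as an integer, or None."""
    match = AMOUNT_RE.search(line)
    if match is None:
        return None
    return int(re.sub(r"[^\d]", "", match.group(1)))


def check_category(lines: List[str], tolerance: int = 1) -> Tuple[int, int, bool]:
    """Return (sum of items, stated total, within tolerance) for one category."""
    items, total = [], None
    for line in lines:
        text = line.strip()
        if text.startswith("(Check"):
            continue  # skip the human-readable check notes
        amount = parse_amount(text)
        if amount is None:
            continue  # category header without an amount
        if "Total" in text:
            total = amount
        else:
            items.append(amount)
    if total is None:
        raise ValueError("no total line found")
    computed = sum(items)
    return computed, total, abs(computed - total) <= tolerance


taxes = [
    "- **Taxes**",
    "  - Sur la valeur foncière – **587 687 k$**",
    "  - Compensations tenant lieu de taxes – **83 618 k$**",
    "  - Sur une autre base – **77 205 k$**",
    "  - **Total Taxes – 748 509 k$**",
]
print(check_category(taxes))  # (748510, 748509, True)
```

The result matches the note in the file: the three tax lines sum to 748 510 k$, 1 k$ above the stated total.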

data/gatineau/llm_prompt.txt

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
The attached document is the City of Gatineau budget. Extract the information for the target financial year only (ignore other years). Output MUST follow this structure:

## Key Metrics – <year>
* Population – **<number of residents>** (or omit if not available)
* Total employees – **<number of employees>** (or omit if not available)
* Net debt – **<amount in k$>** (or omit if not available)
* Total debt – **<amount in k$>** (or omit if not available)
* Debt interest – **<amount in k$>** (or omit if not available)
* Property tax revenue – **<amount in k$>** (or omit if not available)

- Use plain numbers for population/employees (no commas or units).
- Monetary values MUST be in thousands of dollars and suffixed with `k$`.
- If a figure is not available in the source document, omit that line entirely (do not write `0`).

## Revenues – <year>
* **Category Name** (top-level category, no amount on this line)
  * Sub-category – **123,456 k$** (sub-item with amount)
  * Sub-category with children (sub-item that has children):
    * Sub-sub-category – **50,000 k$**
    * Sub-sub-category – **30,000 k$**
    * **Total Sub-category – 80,000 k$** (total for this sub-category)
  * Another sub-category – **40,000 k$**
  * **Total Category Name – 243,456 k$** (total for the top-level category)
- Format rules:
  * Top-level categories: `* **Category Name**` (bold, no amount, no dash)
  * Sub-items with amounts: `  * Item Name – **amount k$**` (2 spaces indent, dash separator)
  * Sub-items with children: `  * Item Name` (2 spaces indent, no amount, no dash)
  * Totals: `  * **Total Item Name – amount k$**` (bold, includes dash and amount)
  * Use 2 spaces per indentation level
  * Subchildren can have further subchildren - go as deep as the document provides
  * Always include a total line for each category/subcategory that has children

## Expenses – <year>
* Follow the exact same format as Revenues above.
* Preserve all levels of hierarchy from the source document.
* Each category must have a total line if it has sub-items.

Rules:
1. Only include numbers for the requested year.
2. Sub-bullets must never sum to more than their parent (call out rounding differences if needed).
3. Be precise: double-check math, preserve signage, and format every monetary amount as `123,456 k$`.
4. Do not add commentary outside the structure above.
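
To illustrate how the bullet format above maps onto a tree, here is a minimal sketch of a parser, assuming 2 spaces per indentation level and comma-separated thousands as the prompt specifies. It is not the parser in `processor.py`; `parse_bullets` and both regular expressions are hypothetical, and `Total` lines are skipped because they repeat what the children already carry.

```python
import re
from typing import Dict, List, Tuple

# One bullet: leading spaces, "*" or "-", one space, then the rest of the line.
LINE_RE = re.compile(r"^(?P<indent> *)[*-] (?P<body>.+)$")
# Trailing "– **123,456 k$**" (bold markers optional, sign preserved).
AMOUNT_RE = re.compile(r"\s*[–-]\s*\**(?P<amount>-?[\d,]+) k\$\**\s*$")


def parse_bullets(text: str) -> List[Dict]:
    """Build a tree of {"name", "children", optional "amount"} nodes."""
    roots: List[Dict] = []
    stack: List[Tuple[int, Dict]] = []  # (depth, node) path to the current line
    for raw in text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # headings, blank lines, rules
        depth = len(match.group("indent")) // 2
        body = match.group("body")
        if body.lstrip("*").startswith("Total "):
            continue  # totals are derivable from the children
        node: Dict = {"name": body.strip("*").strip(), "children": []}
        amount = AMOUNT_RE.search(body)
        if amount:
            node["name"] = body[: amount.start()].strip("* ").strip()
            node["amount"] = int(amount.group("amount").replace(",", ""))
        # Pop back to the parent of this depth, then attach the node.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        (stack[-1][1]["children"] if stack else roots).append(node)
        stack.append((depth, node))
    return roots


sample = "\n".join([
    "* **Category Name**",
    "  * Sub-category – **123,456 k$**",
    "  * Sub-category with children",
    "    * Sub-sub-category – **50,000 k$**",
    "    * Sub-sub-category – **30,000 k$**",
    "    * **Total Sub-category – 80,000 k$**",
    "  * Another sub-category – **40,000 k$**",
])
tree = parse_bullets(sample)
print([child["name"] for child in tree[0]["children"]])
```

The stack holds the path from the root to the most recent bullet, so a line at depth `d` is attached to the nearest earlier line at depth `d - 1`.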
2.04 MB
Binary file not shown.

data/gatineau/sankey.json

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
{
  "city": "Gatineau",
  "total": 0.88696,
  "spending": 0.88696,
  "revenue": 0.88696,
  "budget_balance": 0.0,
  "spending_data": {
    "name": "Spending",
    "children": [
      {
        "name": "Rémunération",
        "children": [
          {
            "name": "Salaires",
            "amount": 0.300259
          },
          {
            "name": "Charges sociales",
            "amount": 0.082633
          }
        ]
      },
      {
        "name": "Biens et services",
        "children": [
          {
            "name": "Services professionnels, techniques et autres",
            "amount": 0.066531
          },
          {
            "name": "Fourniture de service et biens non durables",
            "amount": 0.051266
          },
          {
            "name": "Entretien et réparation",
            "amount": 0.028567
          },
          {
            "name": "Location",
            "amount": 0.019221
          },
          {
            "name": "Transport et communication",
            "amount": 0.004625
          }
        ]
      },
      {
        "name": "Service de la dette",
        "children": [
          {
            "name": "Paiements du service de la dette",
            "amount": 0.063111
          }
        ]
      },
      {
        "name": "Contribution à des organismes",
        "children": [
          {
            "name": "Transport en commun",
            "amount": 0.09534
          },
          {
            "name": "Contributions diverses à des organismes",
            "amount": 0.020244
          }
        ]
      },
      {
        "name": "Autres dépenses",
        "children": [
          {
            "name": "Immobilisations payées comptant",
            "amount": 0.117701
          },
          {
            "name": "Autres dépenses",
            "amount": 0.037463
          }
        ]
      }
    ]
  },
  "revenue_data": {
    "name": "Revenue",
    "children": [
      {
        "name": "Taxes",
        "children": [
          {
            "name": "Sur la valeur foncière",
            "amount": 0.587687
          },
          {
            "name": "Compensations tenant lieu de taxes",
            "amount": 0.083618
          },
          {
            "name": "Sur une autre base",
            "amount": 0.077205
          }
        ]
      },
      {
        "name": "Subventions gouvernementales",
        "children": [
          {
            "name": "Subventions",
            "amount": 0.042285
          }
        ]
      },
      {
        "name": "Services rendus",
        "children": [
          {
            "name": "Tarification",
            "amount": 0.015766
          },
          {
            "name": "Revenus recouvrables",
            "amount": 0.000871
          },
          {
            "name": "Ententes",
            "amount": 0.000806
          },
          {
            "name": "Autres services rendus",
            "amount": 0.000606
          },
          {
            "name": "Activités commerciales",
            "amount": 3.1e-5
          }
        ]
      },
      {
        "name": "Imposition de droits",
        "children": [
          {
            "name": "Droits de mutation immobilière",
            "amount": 0.0351
          },
          {
            "name": "Licences et permis",
            "amount": 0.005183
          }
        ]
      },
      {
        "name": "Autres revenus",
        "children": [
          {
            "name": "Revenus d’intérêts",
            "amount": 0.020685
          },
          {
            "name": "Amendes et pénalités",
            "amount": 0.015769
          },
          {
            "name": "Affectations",
            "amount": 0.001349
          },
          {
            "name": "Autres revenus",
            "amount": 0.0
          }
        ]
      }
    ]
  },
  "population": null,
  "per_capita_spending": null,
  "property_tax_per_capita": null,
  "property_tax_revenue": 0.748509
}
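
Comparing this file with the extracted markdown, the amounts appear to be the k$ figures divided by 1,000,000 (886 960 k$ becomes 0.88696), that is, billions of dollars. The sketch below relies on that inferred scale to sum the leaves of one node and compare the result with the stated total; `leaf_total` and `to_k_dollars` are hypothetical helpers, not part of the repository.

```python
from typing import Dict

K_DOLLARS_PER_UNIT = 1_000_000  # inferred: 886 960 k$ is stored as 0.88696


def leaf_total(node: Dict) -> float:
    """Sum the `amount` of every leaf below a sankey node."""
    children = node.get("children")
    if children:
        return sum(leaf_total(child) for child in children)
    return float(node.get("amount", 0.0))


def to_k_dollars(value: float) -> int:
    """Convert a JSON amount back to whole thousands of dollars."""
    return round(value * K_DOLLARS_PER_UNIT)


# The "Biens et services" node copied from the file above.
biens = {
    "name": "Biens et services",
    "children": [
        {"name": "Services professionnels, techniques et autres", "amount": 0.066531},
        {"name": "Fourniture de service et biens non durables", "amount": 0.051266},
        {"name": "Entretien et réparation", "amount": 0.028567},
        {"name": "Location", "amount": 0.019221},
        {"name": "Transport et communication", "amount": 0.004625},
    ],
}
leaves_k = to_k_dollars(leaf_total(biens))
print(leaves_k)                        # 170210
print(abs(leaves_k - 170_209) <= 1)    # True: within the 1 k$ rounding difference
```

Applied to the whole file, the leaves on each side sum to 886 961 k$, 1 k$ above the `spending` and `revenue` totals of 886 960 k$. That gap comes from the rounding differences already noted for Taxes and Biens et services, so a consumer that recomputes totals from the leaves should allow a small tolerance.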
