Summary
Meta-issue tracking 13 performance optimization opportunities identified in the code audit.
Performance Items
| ID | Location | Issue | Expected Gain |
|---|---|---|---|
| P1 | cansim.R:365-379 | fold_in_metadata repeated left_joins | 50-70% |
| P2 | cansim_metadata.R:98-111 | parse_metadata nested loops | 60-80% |
| P3 | cansim_parquet.R:675-715 | cached_tables repeated reads | 65-85% |
| P4 | cansim_metadata.R:127-145 | hierarchy O(n) cycle detection | 40-60% |
| P5 | cansim.R:156-162 | gsub loop in factor conversion | 30-50% |
| P6 | cansim_parquet.R:254-263 | field cache read miss | 70-90% |
| P7 | cansim_parquet.R:219-232 | csv2sqlite transform copies | 25-40% |
| P8 | cansim_vectors.R:20-24 | lapply to vapply | 30-45% |
| P9 | Multiple files | French string constants | 20-35% |
| P10 | cansim.R:64 | unnecessary as_tibble | 5-15% |
| P11 | cansim_metadata.R:123-124 | hash lookup for parents | 80-95% |
| P12 | cansim_vectors.R:244-251 | coordinate metadata loop | 35-50% |
| P13 | cansim.R:885,889,898 | lapply unlist chains | 20-30% |
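
For illustration, the kind of change P11 and P4 point at might look like the sketch below. It is not taken from cansim_metadata.R: the `members` table, its column names, and the `ancestors` helper are all hypothetical. The idea is to build the parent lookup once as a named vector and index it by name, rather than scanning the table for every member, with a simple guard against cycles.

```r
# Hypothetical member table: each row has an id and a parent id (NA for roots).
# These names are assumptions for illustration, not the package's real schema.
members <- data.frame(
  id = c("1", "2", "3", "4"),
  parent_id = c(NA, "1", "1", "3"),
  stringsAsFactors = FALSE
)

# Build the lookup once: names are member ids, values are parent ids.
parent_of <- setNames(members$parent_id, members$id)

# Walk from a member up to its root using the lookup.
# Stops with an error if the same ancestor is seen twice (a cycle).
ancestors <- function(id, lookup) {
  out <- character(0)
  current <- lookup[[id]]
  while (!is.na(current)) {
    if (current %in% out) stop("cycle detected at ", current)
    out <- c(out, current)
    current <- lookup[[current]]
  }
  out
}

ancestors("4", parent_of)
```

Each step up the hierarchy is then a single name-indexed access instead of a comparison against every row. Whether this matches the expected gain in the table would need to be confirmed by benchmarking against the real metadata.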
Proposed Implementation Plan
- PR 4: Hot Paths (P1, P2, P5, P13)
- PR 5: Caching & I/O (P3, P6, P7, P10)
- PR 6: Lookups & Vectorization (P4, P8, P9, P11, P12)
All performance PRs will include microbenchmark results.
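
A benchmark attached to one of these PRs could follow roughly this shape. This is a sketch under assumptions: the input vector and the `baseline`/`candidate` functions are made-up stand-ins for a P8/P13-style change (lapply plus unlist versus vapply), not code from the package. It checks that both versions return identical results before timing them, and falls back to `system.time` if the microbenchmark package is not installed.

```r
# Hypothetical input; real benchmarks should use representative table data.
x <- as.character(seq_len(1e4))

# Baseline: lapply followed by unlist.
baseline <- function(v) unlist(lapply(v, nchar))

# Candidate: vapply with a declared return type and no names.
candidate <- function(v) vapply(v, nchar, integer(1), USE.NAMES = FALSE)

# Confirm the optimization does not change results before timing it.
stopifnot(identical(baseline(x), candidate(x)))

if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(
    baseline = baseline(x),
    candidate = candidate(x),
    times = 50
  ))
} else {
  # Fallback timing with base R when microbenchmark is unavailable.
  print(system.time(for (i in seq_len(50)) baseline(x)))
  print(system.time(for (i in seq_len(50)) candidate(x)))
}
```

Including the equivalence check alongside the timings makes it easier for reviewers to trust that a reported speedup does not come from a behaviour change.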
The code audit estimates a potential overall throughput improvement of 35-45%.