Optimize the coordinates conversion and some internal functions performance by AdamDrewsTR · Pull Request #2320 · qax-os/excelize

AdamDrewsTR · 2026-05-12T20:14:00Z

Summary

This PR optimizes several frequently-called functions to reduce heap allocations and improve throughput for large spreadsheet operations. Each change targets a specific hot path identified through profiling.

Changes

`ColumnNumberToName` — O(1) lookup table (~16KB init cost)

Precomputes all 16,384 valid column names at package init into a flat []string slice. Subsequent calls become a bounds check + slice index — zero allocations.

Before: Each call allocated a []byte and computed the column name via division loop.
After: Single table lookup.
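
For readers who want to see the shape of this change, here is a minimal sketch of the lookup-table idea. It is not the code from this PR: `columnName`, `buildColumnNames`, and `maxColumns` are hypothetical names, and the table is filled with the same kind of division loop the old per-call path is described as using.

```go
package main

import (
	"errors"
	"fmt"
)

// maxColumns mirrors the 16,384-column limit mentioned above.
const maxColumns = 16384

// columnNames[i] holds the name of column i+1; it is filled once at package init.
var columnNames = buildColumnNames()

// buildColumnNames computes every name with a bijective base-26 division loop.
func buildColumnNames() []string {
	names := make([]string, maxColumns)
	var buf [3]byte // "XFD" is the longest valid name: three letters
	for n := 1; n <= maxColumns; n++ {
		i := len(buf)
		for v := n; v > 0; v = (v - 1) / 26 {
			i--
			buf[i] = byte('A' + (v-1)%26)
		}
		names[n-1] = string(buf[i:])
	}
	return names
}

// columnName is a hypothetical stand-in for ColumnNumberToName:
// a bounds check followed by a slice index.
func columnName(n int) (string, error) {
	if n < 1 || n > maxColumns {
		return "", errors.New("column number out of range")
	}
	return columnNames[n-1], nil
}

func main() {
	for _, n := range []int{1, 26, 27, 702, 703, 16384} {
		name, _ := columnName(n)
		fmt.Println(n, name)
	}
}
```

The trade-off is a one-time init cost and a table held for the life of the process in exchange for an allocation-free call path.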

`CoordinatesToCellName` — avoid empty-prefix concatenation

When abs is not set (the common case), the old code concatenated "" + colName + "" + rowStr, producing unnecessary string copies. The new code returns colName + strconv.Itoa(row) directly and early-returns for the absolute case.
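
A rough sketch of the two branches, assuming the column name has already been resolved. `cellName` is a hypothetical simplification and does not match the real `CoordinatesToCellName` signature, which takes a column number.

```go
package main

import (
	"fmt"
	"strconv"
)

// cellName is a hypothetical helper, not the library function.
// It builds a cell reference from a column name and a row number.
func cellName(colName string, row int, abs bool) string {
	rowStr := strconv.Itoa(row)
	if abs {
		// Absolute reference: here the "$" prefixes are really needed.
		return "$" + colName + "$" + rowStr
	}
	// Common case: a single concatenation with no empty-string operands.
	return colName + rowStr
}

func main() {
	fmt.Println(cellName("C", 5, false)) // relative reference
	fmt.Println(cellName("C", 5, true))  // absolute reference
}
```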

`namespaceStrictToTransitional` — fast path for Transitional files

The vast majority of XLSX files use the Transitional namespace. All Strict namespace URIs contain "purl.oclc.org", so a single bytes.Contains check can skip the entire replacement loop and avoid allocating a copy of the sheet XML. This applies to >99% of real-world files.
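
The guard can be sketched roughly as follows. This is an illustration rather than the PR's code: `toTransitional` and `replacements` are hypothetical names, and the table lists only two Strict/Transitional URI pairs where a real mapping would carry more.

```go
package main

import (
	"bytes"
	"fmt"
)

// strictMarker appears in every Strict namespace URI, per the description above.
var strictMarker = []byte("purl.oclc.org")

// replacements is an illustrative subset of Strict -> Transitional URI pairs.
var replacements = [][2]string{
	{"http://purl.oclc.org/ooxml/spreadsheetml/main", "http://schemas.openxmlformats.org/spreadsheetml/2006/main"},
	{"http://purl.oclc.org/ooxml/officeDocument/relationships", "http://schemas.openxmlformats.org/officeDocument/2006/relationships"},
}

// toTransitional returns content unchanged (same backing array) when no
// Strict URI can be present, and runs the replacement loop otherwise.
func toTransitional(content []byte) []byte {
	if !bytes.Contains(content, strictMarker) {
		return content // fast path: no scan per URI, no copy
	}
	for _, r := range replacements {
		content = bytes.ReplaceAll(content, []byte(r[0]), []byte(r[1]))
	}
	return content
}

func main() {
	strict := []byte(`<worksheet xmlns="http://purl.oclc.org/ooxml/spreadsheetml/main"/>`)
	fmt.Println(string(toTransitional(strict)))
}
```

Because `bytes.ReplaceAll` always returns a copy, skipping the loop is what saves the allocation for Transitional input.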

`isNumeric` — replace `math/big.Float` with `strconv.ParseFloat`

The big.Float parser allocates significantly more than strconv.ParseFloat for the same input. This also removes the math/big import entirely. The digit-counting step uses strings.Count instead of strings.ReplaceAll to avoid allocating a modified copy of the string.
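
A minimal sketch of the approach, under stated assumptions: `parseNumeric` is a hypothetical helper, its return values are chosen for illustration, and the plain-decimal pre-check is my own addition. The pre-check is there because `strconv.ParseFloat` on its own also accepts inputs such as "Inf", "NaN", and hexadecimal floats.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNumeric is a hypothetical helper, not the library's isNumeric.
// It reports whether s is a plain decimal number, how many digit
// characters it contains, and its float64 value.
func parseNumeric(s string) (ok bool, digits int, value float64) {
	// Assumption: only plain decimals count as numeric. strings.Trim
	// returns a substring, so this check does not allocate.
	if s == "" || strings.Trim(s, "0123456789.-") != "" {
		return false, 0, 0
	}
	v, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return false, 0, 0
	}
	// Count digits by subtracting separators instead of building a
	// stripped copy of the string with strings.ReplaceAll.
	digits = len(s) - strings.Count(s, ".") - strings.Count(s, "-")
	return true, digits, v
}

func main() {
	fmt.Println(parseNumeric("-12.50"))
	fmt.Println(parseNumeric("NaN"))
}
```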

`bstrUnmarshal` — skip regex when no escape sequences present

Over 99% of cell values contain no _x escape sequences. A simple strings.Contains(s, "_x") guard skips the regex entirely for these cells. In profiling, this eliminated ~54 MB of regex-related allocations per 100K rows.
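
The guard pattern looks roughly like this. The decoding here is deliberately simplified and is an assumption on my part: `unescape` and `escapeRe` are hypothetical names, and only the basic `_xHHHH_` form is handled, without the special cases a full implementation would need.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// escapeRe matches the basic _xHHHH_ escape form (simplified for illustration).
var escapeRe = regexp.MustCompile(`_x[0-9A-Fa-f]{4}_`)

// unescape is a hypothetical stand-in for bstrUnmarshal.
func unescape(s string) string {
	// Cheap substring check first: most strings never reach the regex.
	if !strings.Contains(s, "_x") {
		return s
	}
	return escapeRe.ReplaceAllStringFunc(s, func(m string) string {
		code, err := strconv.ParseUint(m[2:6], 16, 16)
		if err != nil {
			return m // leave anything unparseable untouched
		}
		return string(rune(code))
	})
}

func main() {
	fmt.Printf("%q\n", unescape("plain text"))
	fmt.Printf("%q\n", unescape("line1_x000A_line2"))
}
```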

`workSheetReader` — avoid double-reading sheet data

The original code called f.readBytes(name) twice — once for getRootElement and once for Decode. Caching the result in a local variable halves the I/O for worksheets read from temp files.
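
To illustrate the read-once pattern, here is a self-contained sketch with a counting in-memory store. All names (`store`, `loadSheet`, `rootElement`, `worksheet`) are hypothetical and stand in for the library's internals, which are not reproduced here.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// store is a hypothetical stand-in for the file's part storage;
// reads counts how often the (potentially expensive) read happens.
type store struct {
	data  map[string][]byte
	reads int
}

func (s *store) readBytes(name string) []byte {
	s.reads++
	return s.data[name]
}

// worksheet is a tiny illustrative subset of a sheet's XML.
type worksheet struct {
	XMLName   xml.Name
	Dimension struct {
		Ref string `xml:"ref,attr"`
	} `xml:"dimension"`
}

// rootElement returns the local name of the first start element.
func rootElement(d *xml.Decoder) (string, error) {
	for {
		tok, err := d.Token()
		if err != nil {
			return "", err
		}
		if se, ok := tok.(xml.StartElement); ok {
			return se.Name.Local, nil
		}
	}
}

// loadSheet reads the part once and feeds the same bytes to both decoders.
func (s *store) loadSheet(name string) (string, *worksheet, error) {
	content := s.readBytes(name) // single read, reused below
	root, err := rootElement(xml.NewDecoder(bytes.NewReader(content)))
	if err != nil {
		return "", nil, err
	}
	ws := new(worksheet)
	if err := xml.NewDecoder(bytes.NewReader(content)).Decode(ws); err != nil {
		return "", nil, err
	}
	return root, ws, nil
}

func main() {
	s := &store{data: map[string][]byte{
		"sheet1.xml": []byte(`<worksheet><dimension ref="A1:B2"/></worksheet>`),
	}}
	root, ws, err := s.loadSheet("sheet1.xml")
	fmt.Println(root, ws.Dimension.Ref, s.reads, err)
}
```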

Benchmark Impact

These are foundational optimizations — the impact compounds with sheet size since ColumnNumberToName, CoordinatesToCellName, bstrUnmarshal, and namespaceStrictToTransitional are called per-cell or per-sheet. Individually each saves microseconds; collectively they reduce GC pressure significantly for large workbooks.

- ColumnNumberToName: precompute all 16384 column names at init for O(1) lookup, eliminating per-call byte slice allocation
- CoordinatesToCellName: avoid unnecessary string concatenation with empty "$" prefix in non-absolute case
- namespaceStrictToTransitional: skip replacement loop entirely when no Strict namespace URIs are present (fast path for >99% of XLSX files)
- isNumeric: replace math/big.Float with strconv.ParseFloat, removing the math/big dependency and reducing allocations
- bstrUnmarshal: skip regex matching when "_x" is not present in the string, avoiding ~54 MB of regex allocations per 100K rows
- workSheetReader: cache readBytes result in a local variable to avoid reading the same sheet data twice during XML parsing

codecov · 2026-05-12T20:36:16Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.60%. Comparing base (4bebb61) to head (d4cbd14).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #2320   +/-   ##
=======================================
  Coverage   99.60%   99.60%           
=======================================
  Files          32       32           
  Lines       26791    26803   +12     
=======================================
+ Hits        26685    26697   +12     
  Misses         55       55           
  Partials       51       51

Flag	Coverage Δ
unittests	`99.60% <100.00%> (+<0.01%)`	⬆️


xuri

Thanks for your contribution. I've made some updates based on your branch.

AdamDrewsTR changed the title ~~Optimize hot-path functions to reduce allocations and improve throughput~~ Optimize hot-path functions to reduce allocations and improve throughput (1) May 12, 2026

AdamDrewsTR changed the title ~~Optimize hot-path functions to reduce allocations and improve throughput (1)~~ Optimize hot-path functions to reduce allocations and improve throughput (1A) May 12, 2026

AdamDrewsTR changed the title ~~Optimize hot-path functions to reduce allocations and improve throughput (1A)~~ Optimize hot-path functions to reduce allocations and improve throughput (A1) May 12, 2026

xuri added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 13, 2026

Remove unnecessary comments and fix documentation type for AddChart f…

d4cbd14

…unction

xuri added this to Excelize v2.11.0 May 21, 2026

xuri moved this to Performance in Excelize v2.11.0 May 21, 2026

xuri changed the title ~~Optimize hot-path functions to reduce allocations and improve throughput (A1)~~ Optimize the coordinates conversion and some internal functions performance May 21, 2026

xuri approved these changes May 22, 2026

xuri merged commit 7240c79 into qax-os:master May 22, 2026
21 checks passed

xuri merged 2 commits into qax-os:master from AdamDrewsTR:perf/lib-micro-optimizations

2 participants
