Fix scaling problems by frankharkins · Pull Request #197 · wooorm/markdown-rs

frankharkins · 2026-02-07T22:16:10Z

I believe this fixes #113 by switching EditMap from a Vec to a BTreeMap.

I created a version of @robsimmons's benchmark and added it to benches. Here's the result of running the large_jsx_expressions benchmark against the main branch (Vec) and my BTreeMap implementation (this PR), varying the num_jsx_lines_per_component variable.

frankharkins · 2026-02-07T22:18:56Z

benches/bench.rs

+    }
+    fn uuids(len: usize) -> String {
+        (0..len)
+            .map(|_| "770f93e8-b4ee-4ce8-ab0f-4ece7d8c1090")


Hardcoding the same UUID seemed to reproduce the scaling behaviour, so I left it like this to avoid adding another dependency.
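As a rough illustration of the idea in this comment, here is a std-only sketch that builds a large input from one fixed UUID. The function name `repeated_uuids` and the `sep` parameter are hypothetical; the diff above is truncated, so the separator and the rest of the PR's `uuids` helper are not known and this is not the PR's code.

```rust
/// Builds a string containing `len` copies of one fixed UUID, joined by `sep`.
/// Both the name and the separator parameter are assumptions for illustration.
fn repeated_uuids(len: usize, sep: &str) -> String {
    // The same UUID that is hardcoded in the benchmark diff.
    const UUID: &str = "770f93e8-b4ee-4ce8-ab0f-4ece7d8c1090";
    vec![UUID; len].join(sep)
}

fn main() {
    let input = repeated_uuids(3, ", ");
    println!("{} bytes: {}", input.len(), input);
}
```

Because the values are identical, the output is deterministic and needs no UUID-generating crate, which matches the stated goal of not adding a dependency for this.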

frankharkins · 2026-02-07T22:21:43Z

benches/bench.rs

+fn tiny_markdown_string(c: &mut Criterion) {
+    let doc = "A *single* [markdown](/path) string!".to_owned();


I noticed a very small slowdown in this tiny test case (~8.9us -> ~9.1us; ~3%). This seems to even out very quickly as size grows. I tried creating an adaptive data structure that switched from Vec to BTreeMap at a certain size, but the complexity was quite high and the performance improvement was basically negligible.

frankharkins · 2026-02-07T22:22:04Z

benches/bench.rs

 fn readme(c: &mut Criterion) {
    let doc = fs::read_to_string("readme.md").unwrap();


Criterion reported no difference for this benchmark.

frankharkins · 2026-02-07T22:22:54Z

src/util/edit_map.rs

-        self.map
-            .sort_unstable_by(|a, b| a.0.partial_cmp(&b.0).unwrap());


BTreeMap.iter() is sorted already
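A small sketch of why the explicit sort can be dropped: a `BTreeMap` iterates in ascending key order regardless of insertion order, and `.rev()` gives descending order. The keys and values below are hypothetical placeholders, not the crate's actual `EditMap` entries.

```rust
use std::collections::BTreeMap;

/// Collects keys in iteration order; for a BTreeMap this is ascending.
fn keys_in_order(map: &BTreeMap<usize, &'static str>) -> Vec<usize> {
    map.keys().copied().collect()
}

/// Collects keys in descending order, e.g. for applying edits back to front.
fn keys_in_reverse(map: &BTreeMap<usize, &'static str>) -> Vec<usize> {
    map.keys().rev().copied().collect()
}

fn main() {
    let mut map = BTreeMap::new();
    // Inserted out of order on purpose.
    map.insert(30, "c");
    map.insert(10, "a");
    map.insert(20, "b");
    println!("{:?}", keys_in_order(&map));
    println!("{:?}", keys_in_reverse(&map));
}
```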

frankharkins · 2026-02-07T22:25:33Z

src/util/edit_map.rs

-    while index < edit_map.map.len() {
-        if edit_map.map[index].0 == at {
-            edit_map.map[index].1 += remove;
-
+    match edit_map.map.get_mut(&at) {


I think this was the source of the quadratic behaviour: beforehand we were iterating through the Vec to find elements, which is $O(n)$. With BTreeMap, we can get an element in $O(\log n)$ time.
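The difference can be sketched with a simplified entry shape. Here each entry is assumed to be just a position `at` mapped to a `remove` count; the real `EditMap` entries may carry more data, and the function names `add_vec` and `add_btree` are hypothetical. With the Vec, every call scans up to $n$ entries, so $n$ calls cost $O(n^2)$ in total; with the BTreeMap the same $n$ calls cost $O(n \log n)$.

```rust
use std::collections::BTreeMap;

/// Vec-backed sketch: finding the entry for `at` is a linear scan, O(n) per call.
fn add_vec(map: &mut Vec<(usize, usize)>, at: usize, remove: usize) {
    for entry in map.iter_mut() {
        if entry.0 == at {
            entry.1 += remove;
            return;
        }
    }
    map.push((at, remove));
}

/// BTreeMap-backed sketch: lookup and insertion are both O(log n) per call.
fn add_btree(map: &mut BTreeMap<usize, usize>, at: usize, remove: usize) {
    match map.get_mut(&at) {
        Some(existing) => *existing += remove,
        None => {
            map.insert(at, remove);
        }
    }
}

fn main() {
    let mut vec_map = Vec::new();
    let mut tree_map = BTreeMap::new();
    for (at, remove) in [(5, 1), (2, 3), (5, 2)] {
        add_vec(&mut vec_map, at, remove);
        add_btree(&mut tree_map, at, remove);
    }
    println!("{:?}", vec_map);
    println!("{:?}", tree_map);
}
```

Both versions merge repeated positions the same way; only the cost of finding the existing entry differs, and the BTreeMap additionally keeps entries ordered by position.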

Murderlon · 2026-02-08T09:30:29Z

I'm not knowledgeable enough in this codebase to have opinions, but I just wanted to say this looks like a great optimization 🙏

CI is still failing btw

Copilot

Pull request overview

This PR addresses the scaling/performance issues reported in #113 by changing EditMap’s internal storage from a linear Vec to an ordered BTreeMap, and adds a benchmark that stresses large MDX/JSX expressions to measure the improvement.

Changes:

Switch EditMap from Vec-backed storage to BTreeMap to avoid O(n²) behavior when accumulating edits.
Update EditMap::consume to iterate edits in key order (and reverse key order for application) without sorting.
Add Criterion benchmarks (including a large JSX-expression case) and add itertools as a dev dependency for string generation.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/util/edit_map.rs` | Replaces the edit record with a `BTreeMap` and updates edit accumulation/application logic accordingly. |
| `benches/bench.rs` | Adds new benchmarks, including a large JSX/MDX-style stress case for parser scaling. |
| `Cargo.toml` | Adds `itertools` as a dev-dependency to support benchmark string generation. |

frankharkins · 2026-02-08T11:31:01Z

Other than the Clippy lint, I believe CI is failing because our version of swc_common uses serde::__private, which shouldn't have been relied on and has been renamed in more recent patch versions.

I tried upgrading swc_common but it required quite a few changes to the codebase, including changing user-facing error messages. Instead, I've manually edited the lockfile to use the version of serde that was available when this version of swc_common was released.

Phaqui · 2026-02-16T17:19:12Z

What's the holdup on this? Is there anything I can do to help? For my use case, a tool that checks links in markdown files, the runtime went from around 2 minutes to 2 seconds with this PR, with exactly the same behavior.

frankharkins · 2026-02-17T11:38:29Z

Glad to hear it worked well! I imagine the maintainers are busy and this isn't the highest priority thing on their plate.

To maintainers: If you have a working lockfile, feel free to push it over mine to make review easier.

Murderlon · 2026-02-23T09:27:34Z

friendly ping @wooorm

frankharkins added 2 commits February 7, 2026 13:08

Add benchmarks

c78243b

Switch edit_map to BTreeMap

ae429b5

frankharkins marked this pull request as ready for review February 7, 2026 22:27

Murderlon requested a review from Copilot February 8, 2026 09:28

Copilot started reviewing on behalf of Murderlon February 8, 2026 09:29

Copilot AI reviewed Feb 8, 2026

frankharkins added 2 commits February 8, 2026 11:15

Commit lockfile

1336ac5

Clippy

a92def7

Murderlon requested a review from wooorm February 16, 2026 17:20


frankharkins wants to merge 4 commits into wooorm:main from frankharkins:FH/edit-map-btree


4 participants
