sum / reduce(+, …) over symbolic vectors is O(n²); route to add_worker for O(n)

Building a big symbolic sum by reducing a vector of terms is quadratic right now. `sum(v)` and `reduce(+, v)` go through Base's pairwise binary `+`, and every binary `+` rebuilds the whole `Add` dictionary, so folding `n` terms copies `1 + 2 + … + n` entries.

The interesting part is that the O(n) fix already exists: `add_worker` (with the task-local `AddWorkerBuffer`) threads one dictionary through the entire reduction in a single pass. But today it's only reachable through the variadic `+`, i.e. a literal `a + b + c + …` chain. A `sum`/`reduce` over a vector never gets there. `add_worker` already takes a collection (it's called with a `Vector` internally in `code.jl` and `polyform.jl`), so the only thing missing is the dispatch.

Here's the gap, summing `n` distinct terms `k * xₖ` at the `Num` layer (fresh variables per row, min of many runs):

| n | `foldl(+, v)` | `sum(v)` | `add_worker(v)` | speedup |
|:----:|:----:|:----:|:----:|:----:|
| 8 | 26 µs | 25 µs | 10 µs | ~2.6× |
| 50 | 342 µs | 348 µs | 48 µs | ~7× |
| 100 | 1.16 ms | 1.14 ms | 82 µs | ~14× |
| 250 | 6.87 ms | 7.86 ms | 202 µs | ~34× |
| 500 | 21.8 ms | 16.6 ms | 270 µs | ~70× |
| 1000 | 70.9 ms | 71.6 ms | 598 µs | ~119× |

`foldl`/`sum` grow quadratically (~61× for 10× the terms), `add_worker` grows linearly. The raw `BasicSymbolic` layer is the same story.

The straightforward fix is to route the collection reductions to `add_worker`, e.g. in `addsub.jl`:

```julia
function Base.sum(a::AbstractArray{S}) where {S <: NonTreeSym}
    isempty(a) && return zero_of_vartype(vartype(S))
    return add_worker(vartype(S), a)
end
function Base.sum(f, a::AbstractArray{S}) where {S <: NonTreeSym}
    isempty(a) && return zero_of_vartype(vartype(S))
    return add_worker(vartype(S), map(f, a))
end
# Vector (not AbstractArray) avoids an ambiguity with StaticArrays.reduce.
function Base.reduce(::typeof(+), a::Vector{S}) where {S <: NonTreeSym}
    isempty(a) && return zero_of_vartype(vartype(S))
    return add_worker(vartype(S), a)
end
```

plus the `Num` mirror in Symbolics (unwrap, `add_worker`, rewrap; vartype taken from the first symbolic element so a vector mixing constants and symbolics still works). This is byte-identical to `foldl(+, v)` since `add_worker` is what `a + b + c` already calls, so it's purely a complexity change. Checking locally on SymbolicUtils 4.35.5 / Symbolics 7.28.1: results match `foldl` for distinct, collecting, mixed-constant and all-constant vectors up to n=1000; `detect_ambiguities` stays at 485 with the methods added; and `sum(M; dims=1)` still goes through Base's array path, so matrix reductions are untouched.

Since this changes `Base.sum`/`reduce` behavior I wanted to check here before sending a PR. If you'd rather not touch `Base`, the lower-risk alternative is to just document/export `add_worker` (or a thin `fast_sum`) as public API and let callers opt in. Happy to open the PR either way, whichever you prefer. One scoping note: this covers `SymReal`/`SafeReal` only, `TreeReal` falls through to the default.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sum / reduce(+, …) over symbolic vectors is O(n²); route to add_worker for O(n) #983

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

n	`foldl(+, v)`	`sum(v)`	`add_worker(v)`	speedup
8	26 µs	25 µs	10 µs	~2.6×
50	342 µs	348 µs	48 µs	~7×
100	1.16 ms	1.14 ms	82 µs	~14×
250	6.87 ms	7.86 ms	202 µs	~34×
500	21.8 ms	16.6 ms	270 µs	~70×
1000	70.9 ms	71.6 ms	598 µs	~119×

sum / reduce(+, …) over symbolic vectors is O(n²); route to add_worker for O(n) #983

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions