Use an append-only buffer for the additions in the piecetable.#422
Conversation
There was a problem hiding this comment.
Pull request overview
This draft PR introduces an append-only buffer optimization for the PieceTable's addition buffer, replacing the standard UnitBuffer with a new AppendOnlyUnitBuffer class that uses a cumulative length array for O(log n) binary search lookups instead of O(n) linear iteration.
Key changes:
- Implements
AppendOnlyUnitBufferwith a fixed 4K element capacity using binary search for improved lookup performance - Changes visibility of
UnitBuffer.unitListandUnitBuffer.dirtyfrom private to package-private to enable subclass access - Replaces the mutable
UnitBufferwith the append-only variant inPieceTable.additionBuffer
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| AppendOnlyUnitBuffer.java | New class extending UnitBuffer with optimized binary search-based lookups using a cumulative length array |
| UnitBuffer.java | Relaxed visibility of unitList and dirty fields to package-private for subclass access |
| PieceTable.java | Changed additionBuffer type from UnitBuffer to AppendOnlyUnitBuffer and made it final |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| public class AppendOnlyUnitBuffer extends UnitBuffer { | ||
|
|
||
| private int[] unitLengths = new int[4096]; | ||
|
|
||
| @Override | ||
| public void append(List<Unit> units) { | ||
| ensureCapacity(unitLengths.length + units.size()); | ||
| for (Unit u: units) { | ||
| doAppend(u); | ||
| } | ||
| } | ||
|
|
||
| @Override | ||
| public void append(Unit unit) { | ||
| ensureCapacity(unitLengths.length + 1); | ||
| doAppend(unit); | ||
| } | ||
|
|
||
| private void doAppend(Unit unit) { | ||
| int idx = unitList.size(); | ||
| int oldLength = idx == 0 ? 0 : unitLengths[idx-1]; | ||
| unitLengths[idx]= oldLength + unit.length(); | ||
| unitList.add(unit); | ||
| dirty = true; | ||
| } | ||
| @Override | ||
| public Unit getUnitWithRange(int start, int end) { | ||
| if (start < 0 || unitList.isEmpty()) return new TextUnit(""); | ||
| int index = Arrays.binarySearch(unitLengths, 0, unitList.size(), start); | ||
| if (index >=0) { // exact start at index | ||
| index = index+1; | ||
| } else { // no exact item found | ||
| index = -(index) -1; | ||
| } | ||
| return unitList.get(index); | ||
| } | ||
|
|
||
| private void ensureCapacity(int required) { | ||
| if (required < unitLengths.length) { | ||
| int newCapacity = Math.max(unitLengths.length * 2, required); | ||
| unitLengths = Arrays.copyOf(unitLengths, newCapacity); | ||
| } | ||
| } | ||
|
|
||
|
|
||
| } |
There was a problem hiding this comment.
The new AppendOnlyUnitBuffer class lacks test coverage. Given that there are comprehensive tests for PieceTable and UnitBuffer in the test directory, this new class should also have tests to verify the correctness of its append methods, getUnitWithRange binary search logic, and ensureCapacity growth behavior.
| if (start < 0 || unitList.isEmpty()) return new TextUnit(""); | ||
| int index = Arrays.binarySearch(unitLengths, 0, unitList.size(), start); | ||
| if (index >=0) { // exact start at index | ||
| index = index+1; | ||
| } else { // no exact item found | ||
| index = -(index) -1; | ||
| } | ||
| return unitList.get(index); |
There was a problem hiding this comment.
The getUnitWithRange method does not use the 'end' parameter at all and only performs a lookup based on 'start'. This differs from the parent class implementation which checks if the range [start, end] fits within a unit. The current implementation will return the unit at the start position regardless of whether the entire range fits within that unit.
| if (start < 0 || unitList.isEmpty()) return new TextUnit(""); | |
| int index = Arrays.binarySearch(unitLengths, 0, unitList.size(), start); | |
| if (index >=0) { // exact start at index | |
| index = index+1; | |
| } else { // no exact item found | |
| index = -(index) -1; | |
| } | |
| return unitList.get(index); | |
| if (start < 0 || end <= start || unitList.isEmpty()) return new TextUnit(""); | |
| int index = Arrays.binarySearch(unitLengths, 0, unitList.size(), start); | |
| if (index >= 0) { // exact start at index | |
| index = index + 1; | |
| } else { // no exact item found | |
| index = -(index) - 1; | |
| } | |
| // Now, index is the first unit whose end offset is > start | |
| if (index >= unitList.size()) return new TextUnit(""); | |
| int unitStart = (index == 0) ? 0 : unitLengths[index - 1]; | |
| int unitEnd = unitLengths[index]; | |
| if (start >= unitStart && end <= unitEnd) { | |
| return unitList.get(index); | |
| } else { | |
| return new TextUnit(""); | |
| } |
| * | ||
| * Append-only buffer used as the PieceTable Addition Buffer |
There was a problem hiding this comment.
Missing class-level documentation. The comment is empty and should describe the purpose of this class, particularly explaining that it uses a cumulative length array for optimized lookups and any limitations (such as the hard limit on 4K elements mentioned in the PR description).
| * | |
| * Append-only buffer used as the PieceTable Addition Buffer | |
| * An append-only buffer used as the PieceTable Addition Buffer. | |
| * <p> | |
| * This class maintains a cumulative length array ({@code unitLengths}) to enable | |
| * efficient lookup of units by character offset. The cumulative array allows | |
| * for optimized binary search operations when retrieving units that span a given range. | |
| * <p> | |
| * <b>Limitations:</b> This buffer has a hard limit of 4096 elements (4K units), | |
| * as defined by the initial size of the {@code unitLengths} array. Exceeding this | |
| * limit will require resizing, but performance and memory usage are optimized for | |
| * use cases within this bound. |
| } | ||
| } | ||
|
|
||
|
|
There was a problem hiding this comment.
Trailing whitespace on this otherwise empty line. This can cause issues with some linters and version control systems.
|
This PR improves FPS (locally) from 29FPS to 34FPS, and reduces average processing+rendering time from 24ms to 20ms. |
|
The issues from the first glance I see in
@Override
public int length() {
return unitList.isEmpty() ? 0 : unitLengths[unitList.size() - 1];
}
} |
|
I also think that it could be a good idea to replace linear search in |
True, will do that
It's not a bottleneck, as this is invoked O(n) while the getUnitWithRange (which is the real bottleneck) is invoked O(n^2). Nevertheless, since we do have the length handy in the appendOnlyBuffer, it is best to use that indeed.
That is another option indeed, but I'd like to defer that to some architectural discussion. I'm not sure we really need both a UnitBuffer and an AppendOnlyUnitBuffer. |
Override length, and override with exception remove
This PR introduces an append-only buffer to be used in the piecetable. This buffer caches the lengths of units which are never going to be removed.
This fixes #421