All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- `namespace-uri()` and `lang()` XPath functions
- Namespace-aware node test matching for prefixed element queries
- `id()` and `$variable` now return explicit errors instead of silent empty results
- XPath string functions (`contains`, `starts-with`, etc.) now work correctly on node-sets
- XPath comparisons (`<`, `>`, `<=`, `>=`) on node-sets no longer return NaN
- Malformed XPath expressions no longer crash the BEAM VM
- Reduced streaming buffer reallocation churn
- Faster node-set equality comparison
- Removed dead code
- Resolved all `cargo clippy` warnings
- Alpine/musl compatibility: added the `local_dynamic_tls` feature to `mimalloc` to ensure correct thread-local storage handling when loaded as a dynamic library (NIF) on Alpine Linux and other musl-based distributions.
- Saxy drop-in replacement — full 1:1 API parity with Saxy
  - `parse_string/4`: SAX parsing with handler callbacks
  - `parse_stream/4`: streaming SAX with binary-encoded events
  - `stream_events/2`: lazy stream of SAX events
  - `encode!/2`, `encode_to_iodata/2`: XML encoding
  - `RustyXML.Handler`: behaviour (= `Saxy.Handler`)
  - `RustyXML.Partial`: incremental parsing (= `Saxy.Partial`)
  - `RustyXML.SimpleForm`: tuple tree output (= `Saxy.SimpleForm`)
  - `RustyXML.XML`: builder DSL (= `Saxy.XML`)
  - `RustyXML.Builder`: struct→XML protocol (= `Saxy.Builder`)
  - `RustyXML.Encoder`: XML string encoding
- Binary-encoded SAX events — NIF packs all events from a chunk into a single binary instead of ~1,700 BEAM tuples per 64 KB chunk; Elixir decodes one event at a time via pattern matching
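The idea behind binary-encoded events can be illustrated with a minimal Rust sketch. The wire format here (a one-byte tag, a 4-byte big-endian length, then the payload) and the names `SaxEvent`, `encode_events`, and `decode_one` are assumptions made for illustration; they are not RustyXML's actual encoding or API. The sketch only shows the general shape: many events packed into one contiguous buffer, then decoded one at a time from the front.

```rust
// Hypothetical wire format (an assumption, not RustyXML's real encoding):
// [tag: u8][len: u32 big-endian][payload: len bytes], repeated per event.

#[derive(Debug, PartialEq)]
pub enum SaxEvent {
    StartElement(String),
    EndElement(String),
    Characters(String),
}

const TAG_START: u8 = 1;
const TAG_END: u8 = 2;
const TAG_CHARS: u8 = 3;

/// Pack every event into one contiguous buffer instead of one value per event.
pub fn encode_events(events: &[SaxEvent]) -> Vec<u8> {
    let mut buf = Vec::new();
    for event in events {
        let (tag, text) = match event {
            SaxEvent::StartElement(name) => (TAG_START, name),
            SaxEvent::EndElement(name) => (TAG_END, name),
            SaxEvent::Characters(text) => (TAG_CHARS, text),
        };
        buf.push(tag);
        buf.extend_from_slice(&(text.len() as u32).to_be_bytes());
        buf.extend_from_slice(text.as_bytes());
    }
    buf
}

/// Decode the first event and return it together with the remaining bytes.
/// Returns `None` on empty, truncated, or malformed input.
pub fn decode_one(buf: &[u8]) -> Option<(SaxEvent, &[u8])> {
    let (&tag, rest) = buf.split_first()?;
    if rest.len() < 4 {
        return None;
    }
    let (len_bytes, rest) = rest.split_at(4);
    let len =
        u32::from_be_bytes([len_bytes[0], len_bytes[1], len_bytes[2], len_bytes[3]]) as usize;
    if rest.len() < len {
        return None;
    }
    let (payload, rest) = rest.split_at(len);
    let text = String::from_utf8(payload.to_vec()).ok()?;
    let event = match tag {
        TAG_START => SaxEvent::StartElement(text),
        TAG_END => SaxEvent::EndElement(text),
        TAG_CHARS => SaxEvent::Characters(text),
        _ => return None,
    };
    Some((event, rest))
}
```

Calling `decode_one` in a loop until it returns `None` walks the buffer front to back, which is roughly analogous to matching one event off the head of a binary on the Elixir side.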
Measured on Apple Silicon M1 Pro. Numbers vary between runs; treat as representative.
| Operation | Speedup | Memory |
|---|---|---|
| `parse_string/4` (14.6 KB) | ~1.3x faster | ~2.4x less |
| `parse_string/4` (2.93 MB) | ~1.6x faster | ~2.3x less |
| `SimpleForm` (290 KB) | ~1.5x faster | ~7x less |
| `parse_stream/4` (2.93 MB) | ~1.8x faster | ~1x (comparable) |
Streaming memory is bounded on both sides (~128–162 KB vs ~124–133 KB).

    # From SweetXml: just change the import
    sed -i 's/SweetXml/RustyXML/g' lib/**/*.ex
    # From Saxy: just change the module name
    sed -i 's/Saxy/RustyXML/g' lib/**/*.ex

- Unified structural index: single zero-copy parse path replaces the old DOM
- `UnifiedScanner` with `ScanHandler` trait for extensible tokenization
- `sax_parse/1` NIF for SAX event parsing
- `xpath_text_list` NIF for fast text extraction without building element tuples
- Lightweight `validate_strict`: checks well-formedness without allocating a DOM
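To make the "zero-copy structural index" idea concrete, here is a rough Rust sketch of an index that stores byte spans into the original input rather than copied strings. The `Span` type and `index_element_names` function are hypothetical names chosen for this illustration, and the scanner is deliberately naive; it is not the library's `UnifiedScanner`.

```rust
/// A byte range into the original input; no text is copied.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start: usize,
    pub len: usize,
}

impl Span {
    /// Borrow the spanned text directly from the original input.
    pub fn slice<'a>(&self, input: &'a str) -> &'a str {
        &input[self.start..self.start + self.len]
    }
}

/// Record the span of every opening or self-closing element name.
/// Naive on purpose: skips `</...>`, `<?...>` and `<!...>`, and does not
/// handle CDATA sections or comments containing `<`.
pub fn index_element_names(input: &str) -> Vec<Span> {
    let bytes = input.as_bytes();
    let mut spans = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let opens_element = bytes[i] == b'<'
            && i + 1 < bytes.len()
            && !matches!(bytes[i + 1], b'/' | b'?' | b'!');
        if opens_element {
            let start = i + 1;
            let mut end = start;
            // The name runs until whitespace, `/`, or `>`.
            while end < bytes.len()
                && !matches!(bytes[end], b'>' | b'/' | b' ' | b'\t' | b'\n' | b'\r')
            {
                end += 1;
            }
            if end > start {
                spans.push(Span { start, len: end - start });
            }
            i = end;
        } else {
            i += 1;
        }
    }
    spans
}
```

Each `Span` is two machine words regardless of how long the name is, which is the general reason an index of this kind tends to use far less memory than a tree of owned strings.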
| Metric | v0.1.1 | v0.1.2 | Improvement |
|---|---|---|---|
| Parse throughput (2.93 MB) | 30.7 ips | 54.0 ips | 1.76x faster |
| Parse vs SweetXml (2.93 MB) | 41x | 72x | |
| XPath `//item` vs SweetXml | 0.83x (slower) | 1.48x faster | fixed |
| XPath `//item` throughput | 336 ips | 589 ips | 1.75x faster |
| Streaming vs SweetXml | 8.7x | 16.2x | 1.87x faster |
| Parse memory (2.93 MB) | 30.2 MB | 12.8 MB | 58% less |
| XPath memory (290 KB doc) | 28.3 MB | 475 KB | 60x less |
| Streaming memory | 52.8 MB | 319 KB | 165x less |
- All parse NIFs now use the structural index (compact spans into original input)
- `parse_strict/1` no longer builds then discards a full DOM
- `xpath/3` subspecs on document refs: was silently ignoring subspecs
- `xmap/3` third argument: now accepts `true` for keyword list output
- `parse/2` charlist support: accepts charlists in addition to binaries
- Lenient mode infinite loop on malformed markup like `<1invalid/>`
- `parse_events/1`: redundant with structural index (was 7x slower, 4x more memory)
- Parse NIFs (`parse/1`, `parse_strict/1`, `parse_and_xpath/2`, `xpath_with_subspecs/3`, `xpath_string_value/2`, `sax_parse/1`) moved to dirty CPU schedulers
- Internal failures now return `{:error, :mutex_poisoned}` instead of silent nil/empty values
- Batch accessors (`result_texts`, `result_attrs`, `result_extract`) clamp ranges to actual result count
- Purpose-built Rust NIF for XML parsing in Elixir
- SIMD-accelerated scanning via `memchr`
- Full XPath 1.0: all 13 axes, 27+ functions, LRU expression cache
- SweetXml-compatible API: `xpath/2,3`, `xmap/2,3`, `~x` sigil, `stream_tags/3`
- Lazy XPath (`xpath_lazy/2`): results stay in Rust memory, 3x faster for partial access
- Parallel XPath (`xpath_parallel/2`): multi-threaded query evaluation via Rayon
- Streaming parser with bounded memory for large files
- 100% W3C/OASIS XML Conformance — 1089/1089 applicable tests pass
- XXE immune, Billion Laughs immune, panic-safe NIFs, atom table safe
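
Two of the safety changes listed above, batch accessors clamping their ranges and internal failures surfacing as `{:error, :mutex_poisoned}`, can be sketched together in Rust. `ResultStore` and `AccessError` are hypothetical names invented for this example, and the method below only borrows the name `result_texts`; the library's real signature and internals may differ.

```rust
use std::sync::Mutex;

#[derive(Debug, PartialEq)]
pub enum AccessError {
    /// Stands in for the `{:error, :mutex_poisoned}` tuple seen from Elixir.
    MutexPoisoned,
}

/// Hypothetical store for query results kept on the Rust side.
pub struct ResultStore {
    texts: Mutex<Vec<String>>,
}

impl ResultStore {
    pub fn new(texts: Vec<String>) -> Self {
        ResultStore { texts: Mutex::new(texts) }
    }

    /// Return up to `count` texts starting at `start`.
    /// Out-of-range requests are clamped to what exists instead of panicking,
    /// and a poisoned lock becomes an explicit error instead of an empty result.
    pub fn result_texts(&self, start: usize, count: usize) -> Result<Vec<String>, AccessError> {
        let guard = self.texts.lock().map_err(|_| AccessError::MutexPoisoned)?;
        let begin = start.min(guard.len());
        // saturating_add avoids overflow when callers pass a huge count.
        let end = start.saturating_add(count).min(guard.len());
        Ok(guard[begin..end].to_vec())
    }
}
```

With three stored texts, a request for ten items starting at index 1 yields the last two, and a request starting past the end yields an empty list rather than an error.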