Commit d8a84e9

committed
update for version 0.2.0: perf improvements, api changes
1 parent 3904911 commit d8a84e9

18 files changed

Lines changed: 501 additions & 199 deletions

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
@@ -3,6 +3,30 @@
 Format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/);
 the project follows [Semantic Versioning](https://semver.org/).
 
+## [0.2.0]
+
+### Added
+- `Parser.then p q` — runs `p` then `q`, returning `q`'s value.
+Avoids the continuation-closure allocation of
+`(Parser.bind p (fn [_] q))`.
+- `Parser.before p q` — runs `p` then `q`, returning `p`'s value.
+Mirror of `Parser.then`.
+- `Parser.bind-result p f` — variant of `bind` where `f` takes the
+parsed value and a `&Cursor` and returns `(Result b ParseErr)`.
+Skips the per-parse `Parser.pure` / `Parser.fail` rebuild the
+equivalent `bind` formulation does.
+- `Expected` sumtype (`(One String)` / `(Many (Array String))`)
+representing the expected-set on a `ParseErr`, plus
+`Expected.empty`, `Expected.to-array`, `Expected.length`, and
+`Expected.append-into`.
+
+### Changed
+- **Breaking**: `ParseErr.expected` is now `Expected`, not
+`(Array String)`. Code that read the field directly should call
+`(Expected.to-array (ParseErr.expected &e))` for the old
+behaviour, or use `Expected.length` /
+`Expected.append-into` where appropriate.
+
 ## [0.1.0]
 
 - Initial release.
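
Note on `Parser.then` / `Parser.before`: the changelog entries above can be read against a small toy model. The Python sketch below is an illustration only, not the Carp library's implementation; the parser representation (a function from text and position to a value and new position, or `None` on failure) and every name in it are assumptions made for the example. It shows that `then` gives the same result as `bind` with an ignoring continuation, without calling a continuation on each parse, and that `before` keeps the first value instead.

```python
# Toy model (hypothetical): a parser is a function
# (text, pos) -> (value, new_pos), or None on failure.

def char(c):
    def run(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return run

def bind(p, f):
    def run(text, pos):
        r = p(text, pos)
        if r is None:
            return None
        value, pos2 = r
        # f is called on every parse to obtain the next parser.
        return f(value)(text, pos2)
    return run

def then(p, q):
    def run(text, pos):
        r = p(text, pos)
        if r is None:
            return None
        # q was supplied up front, so no continuation is invoked here.
        return q(text, r[1])
    return run

def before(p, q):
    def run(text, pos):
        r = p(text, pos)
        if r is None:
            return None
        value, pos2 = r
        r2 = q(text, pos2)
        if r2 is None:
            return None
        # Keep p's value, but advance past q's input.
        return value, r2[1]
    return run

a, b = char("a"), char("b")
via_bind = bind(a, lambda _: b)("ab", 0)
via_then = then(a, b)("ab", 0)
via_before = before(a, b)("ab", 0)
print(via_bind, via_then, via_before)
```

In this model `via_bind` and `via_then` agree, which is the equivalence the changelog states; only the cost of getting there differs.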

README.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ A Parsec-style parser combinator library for Carp.
 ## Install
 
 ```clojure
-(load "git@github.com:carpentry-org/parsec@0.1.0")
+(load "git@github.com:carpentry-org/parsec@0.2.0")
 ```
 
 ## Example

bench/compare/README.md

Lines changed: 19 additions & 6 deletions
@@ -42,12 +42,25 @@ best-case timings:
 
 | Grammar | Haskell Parsec | Carp parsec |
 |---|---|---|
-| kv 100 pairs | 58 µs | 33 µs |
-| kv 1000 pairs | 655 µs | 352 µs |
-| int-list 100 | 15 µs | 3.5 µs |
-| int-list 1000 | 167 µs | 35 µs |
-| sexp deep 100 | 46 µs | 52 µs |
-| sexp flat 1000 | 219 µs | 68 µs |
+| kv 100 pairs | 57 µs | 30 µs |
+| kv 1000 pairs | 660 µs | 303 µs |
+| int-list 100 | 15 µs | 3.3 µs |
+| int-list 1000 | 168 µs | 34 µs |
+| sexp deep 100 | 46 µs | 36 µs |
+| sexp flat 1000 | 219 µs | 66 µs |
 
 On a 40k-iteration combined workload, peak RSS is ~3.2 MB (Carp) vs
 ~13.6 MB (Haskell GHC).
+
+## Stdlib parse
+
+`examples/carp.carp` — a Carp source-form reader written on
+`parsec.carp` — parses the full Carp standard library (425 KB,
+692 top-level forms) and round-trips correctly. Compared against the
+compiler's own `Parsing.hs` (also Haskell parsec, GHC `-O2`) on the
+same input:
+
+| | Time | Throughput |
+|---|---|---|
+| Carp parsec (`examples/carp.carp`) | 55 ms | ~7.5 MB/s |
+| Compiler reader (`Parsing.hs`) | 258 ms | ~1.6 MB/s |
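
Note on the throughput column: the figures can be cross-checked from the input size and the timings in the table. The short Python sketch below redoes that arithmetic. It assumes KB and MB are binary units (1 MB = 1024 KB), which is an assumption of this note rather than something the README states; the variable and function names are made up for the example.

```python
# Input size and timings as reported in the "Stdlib parse" table.
SIZE_KB = 425
timings_ms = {"Carp parsec": 55, "Compiler reader": 258}

def throughput_mb_per_s(size_kb, time_ms):
    # Assumption: binary units, 1 MB = 1024 KB.
    return (size_kb / 1024) / (time_ms / 1000)

rates = {name: throughput_mb_per_s(SIZE_KB, ms) for name, ms in timings_ms.items()}
speedup = timings_ms["Compiler reader"] / timings_ms["Carp parsec"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} MB/s")
print(f"speedup: {speedup:.1f}x")
```

Under that assumption the computed rates round to the ~7.5 MB/s and ~1.6 MB/s shown in the table, and the timings correspond to a ratio of roughly 4.7x.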

bench/compare/carp/all_bench.carp

Lines changed: 22 additions & 4 deletions
@@ -59,15 +59,32 @@
 ; ============================================================
 
 ; --- kv pair ---
+; Leaf parsers are built once into top-level cells. The bind chain
+; references them via `Parser.recurse` so the grammar is well-built
+; (no per-parse parser construction inside continuations), mirroring
+; how GHC -O2 hoists invariant subexpressions in the Haskell `pair`.
+(def -kv-id (the (Parser String) (Parser.placeholder)))
+(def -kv-eq (the (Parser String) (Parser.placeholder)))
+(def -kv-int (the (Parser Int) (Parser.placeholder)))
+(def -kv-semi (the (Parser String) (Parser.placeholder)))
+
 (defn make-kv-pair []
-  (Parser.bind (Parser.Lexer.lexeme (Parser.Lexer.identifier))
+  (Parser.bind (Parser.recurse &-kv-id)
     (fn [name]
-      (Parser.bind (Parser.Lexer.symbol @"=")
+      (Parser.bind (Parser.recurse &-kv-eq)
         (fn [_]
-          (Parser.map (Parser.Lexer.integer)
+          (Parser.map (Parser.recurse &-kv-int)
             (fn [n] (Pair.init @&name n))))))))
 
-(def p-kv (Parser.sep-by1 (make-kv-pair) (Parser.Lexer.symbol @";")))
+(def p-kv (the (Parser (Array (Pair String Int))) (Parser.placeholder)))
+
+(defn init-kv []
+  (do
+    (set! -kv-id (Parser.Lexer.lexeme (Parser.Lexer.identifier)))
+    (set! -kv-eq (Parser.Lexer.symbol @"="))
+    (set! -kv-int (Parser.Lexer.integer))
+    (set! -kv-semi (Parser.Lexer.symbol @";"))
+    (set! p-kv (Parser.sep-by1 (make-kv-pair) (Parser.recurse &-kv-semi)))))
 
 ; --- comma-separated integers ---
 (def p-int-list
@@ -126,6 +143,7 @@
 (defn main []
   (do
     (setup)
+    (init-kv)
     (init-sexp)
     (Bench.set-min-runs! 100)
     (println "=== Carp parsec ===")
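
Note on the placeholder/recurse pattern in this file: the idea of building leaf parsers once into top-level cells and referring to them indirectly can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how `Parser.placeholder` or `Parser.recurse` are implemented; `Cell`, `recurse`, `make_digit` and the `builds` counter are hypothetical names introduced for the illustration.

```python
class Cell:
    """A mutable slot that is filled in after the grammar is wired up."""
    def __init__(self):
        self.parser = None  # placeholder until initialisation runs

def recurse(cell):
    # The cell is dereferenced at parse time, so the grammar can be
    # assembled before the cell holds a real parser.
    def run(text, pos):
        return cell.parser(text, pos)
    return run

def digit(text, pos):
    if pos < len(text) and text[pos].isdigit():
        return text[pos], pos + 1
    return None

builds = 0

def make_digit():
    # Counts how often the leaf parser is constructed.
    global builds
    builds += 1
    return digit

kv_int = Cell()
p = recurse(kv_int)            # grammar references the cell
kv_int.parser = make_digit()   # one-time initialisation, in the spirit of init-kv

results = [p("7", 0), p("8", 0)]
print(results, builds)
```

The leaf parser is constructed once no matter how many inputs are parsed, which is the property the comment in the diff is after.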

docs/Cursor.html

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,11 @@
 ParseErr
 </a>
 </li>
+<li>
+<a href="Expected.html">
+Expected
+</a>
+</li>
 <li>
 <a href="Reply.html">
 Reply

docs/ParseErr.html

Lines changed: 10 additions & 5 deletions
@@ -48,6 +48,11 @@
 ParseErr
 </a>
 </li>
+<li>
+<a href="Expected.html">
+Expected
+</a>
+</li>
 <li>
 <a href="Reply.html">
 Reply
@@ -167,7 +172,7 @@ <h3 id="expected">
 instantiate
 </div>
 <p class="sig">
-(Fn [(Ref ParseErr a)] (Ref (Array String) a))
+(Fn [(Ref ParseErr a)] (Ref Expected a))
 </p>
 <span>
 
@@ -207,7 +212,7 @@ <h3 id="init">
 instantiate
 </div>
 <p class="sig">
-(Fn [Int, Int, Int, (Maybe String), (Array String)] ParseErr)
+(Fn [Int, Int, Int, (Maybe String), Expected] ParseErr)
 </p>
 <span>
 
@@ -327,7 +332,7 @@ <h3 id="set-expected">
 instantiate
 </div>
 <p class="sig">
-(Fn [ParseErr, (Array String)] ParseErr)
+(Fn [ParseErr, Expected] ParseErr)
 </p>
 <span>
 
@@ -347,7 +352,7 @@ <h3 id="set-expected!">
 instantiate
 </div>
 <p class="sig">
-(Fn [(Ref ParseErr a), (Array String)] ())
+(Fn [(Ref ParseErr a), Expected] ())
 </p>
 <span>
 
@@ -547,7 +552,7 @@ <h3 id="update-expected">
 instantiate
 </div>
 <p class="sig">
-(Fn [ParseErr, (Ref (Fn [(Array String)] (Array String) a) b)] ParseErr)
+(Fn [ParseErr, (Ref (Fn [Expected] Expected a) b)] ParseErr)
 </p>
 <span>
553558

docs/Parser.Lexer.html

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,11 @@
 ParseErr
 </a>
 </li>
+<li>
+<a href="Expected.html">
+Expected
+</a>
+</li>
 <li>
 <a href="Reply.html">
 Reply

docs/Parser.UTF8.html

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,11 @@
 ParseErr
 </a>
 </li>
+<li>
+<a href="Expected.html">
+Expected
+</a>
+</li>
 <li>
 <a href="Reply.html">
 Reply

docs/Parser.html

Lines changed: 74 additions & 1 deletion
@@ -48,6 +48,11 @@
 ParseErr
 </a>
 </li>
+<li>
+<a href="Expected.html">
+Expected
+</a>
+</li>
 <li>
 <a href="Reply.html">
 Reply
@@ -73,7 +78,7 @@ <h1>
 <div class="module-description">
 <p>is a Parsec-style parser combinator library for Carp.</p>
 <h2>Installation</h2>
-<pre><code>(load &quot;git@github.com:carpentry-org/parsec@0.1.0&quot;)
+<pre><code>(load &quot;git@github.com:carpentry-org/parsec@0.2.0&quot;)
 </code></pre>
 <h2>Usage</h2>
 <pre><code>(let [p (Parser.seq (Parser.byte \h) (Parser.byte \i))]
@@ -172,6 +177,27 @@ <h3 id="any-byte">
 
 </p>
 </div>
+<div class="binder">
+<a class="anchor" href="#before">
+<h3 id="before">
+before
+</h3>
+</a>
+<div class="description">
+defn
+</div>
+<p class="sig">
+(Fn [(Parser a), (Parser b)] (Parser a))
+</p>
+<pre class="args">
+(before p q)
+</pre>
+<p class="doc">
+<p>runs <code>p</code> then <code>q</code>, returning <code>p</code>'s value and discarding
+<code>q</code>'s. The mirror image of <code>then</code>.</p>
+
+</p>
+</div>
 <div class="binder">
 <a class="anchor" href="#between">
 <h3 id="between">
@@ -216,6 +242,31 @@ <h3 id="bind">
 
 </p>
 </div>
+<div class="binder">
+<a class="anchor" href="#bind-result">
+<h3 id="bind-result">
+bind-result
+</h3>
+</a>
+<div class="description">
+defn
+</div>
+<p class="sig">
+(Fn [(Parser a), (Fn [a, (Ref Cursor b)] (Result c ParseErr))] (Parser c))
+</p>
+<pre class="args">
+(bind-result p f)
+</pre>
+<p class="doc">
+<p>runs <code>p</code>, then applies <code>f</code> to its value and the
+cursor <em>after</em> <code>p</code>. <code>f</code> returns a <code>(Result b ParseErr)</code> directly,
+avoiding a per-parse <code>Parser.pure</code>/<code>Parser.fail</code> rebuild that the
+equivalent <code>bind</code> formulation would do. The cursor argument lets <code>f</code>
+build a <code>ParseErr</code> at the right position when it returns
+<code>Result.Error</code>.</p>
+
+</p>
+</div>
 <div class="binder">
 <a class="anchor" href="#byte">
 <h3 id="byte">
@@ -974,6 +1025,28 @@ <h3 id="take-while1">
 
 </p>
 </div>
+<div class="binder">
+<a class="anchor" href="#then">
+<h3 id="then">
+then
+</h3>
+</a>
+<div class="description">
+defn
+</div>
+<p class="sig">
+(Fn [(Parser a), (Parser b)] (Parser b))
+</p>
+<pre class="args">
+(then p q)
+</pre>
+<p class="doc">
+<p>runs <code>p</code> then <code>q</code>, discarding <code>p</code>'s value and returning
+<code>q</code>'s. Equivalent to <code>(bind p (fn [_] q))</code> but does not allocate the
+continuation closure per parse.</p>
+
+</p>
+</div>
 <div class="binder">
 <a class="anchor" href="#try">
 <h3 id="try">
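
Note on `bind-result`: the doc string added above says `f` receives the parsed value plus the cursor after `p` and returns a result directly. The Python sketch below models that contract with tagged tuples. It is a toy under stated assumptions, not the Carp implementation: replies are `("ok", value, pos)` or `("err", message, pos)`, the cursor is reduced to an integer offset, and `digits`, `to_byte` and `byte` are hypothetical names chosen for the example.

```python
def digits(text, pos):
    # Consumes one or more ASCII digits.
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == pos:
        return ("err", "expected digit", pos)
    return ("ok", text[pos:end], end)

def bind_result(p, f):
    def run(text, pos):
        reply = p(text, pos)
        if reply[0] == "err":
            return reply
        _, value, after = reply
        # f returns ("ok", value) or ("err", message) directly;
        # no intermediate success/failure parser is built.
        tag, payload = f(value, after)
        return (tag, payload, after)
    return run

def to_byte(value, cursor):
    n = int(value)
    if n > 255:
        # The cursor lets f report where the problem was detected.
        return ("err", f"value {n} out of range at offset {cursor}")
    return ("ok", n)

byte = bind_result(digits, to_byte)
print(byte("42", 0))
print(byte("999", 0))
print(byte("x", 0))
```

In a `bind`-style formulation of the same check, `f` would have to return a fresh always-succeed or always-fail parser on each call; here it returns the outcome itself.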

docs/Pitfalls.html

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,11 @@
 ParseErr
 </a>
 </li>
+<li>
+<a href="Expected.html">
+Expected
+</a>
+</li>
 <li>
 <a href="Reply.html">
 Reply
