1.16.0-dev.91 | 2026-03-19 20:29:36 +0100
* Rework CHANGES slightly. (Benjamin Bannier, Corelight)
The previous formatting confused the spellchecker.
* Fix windows build. (Maor Hamami)
This adds a first iteration of Windows support for Spicy. With
this patch the code builds, and unit tests pass. We also tested that
this version of Spicy is usable in a fixed-up version of Zeek.
1.16.0-dev.88 | 2026-03-19 13:37:35 +0100
* Introduce `hilti::rt::String` as a proper runtime class for HILTI strings. (Robin Sommer, Corelight)
This is cleaning up technical debt. Strings were the only major HILTI
type that wasn't yet represented by a dedicated runtime class: we were
mapping them directly to `std::string` instead. That's inconsistent,
and prevents future optimizations that might want to change how strings
are stored and managed. This commit introduces a dedicated class
`rt::String` that runtime and codegen now use for storing HILTI
strings. Behind the scenes, the class still derives from
`std::string`, just like other runtime types derive from corresponding
`std` types, so performance-wise nothing should change. For
convenience, there's also a new literal to create HILTI string
instances: `"foo"_hs` ("HILTI string"), similar to how we already
have `"foo"_b` for `bytes`.
Note that `rt::String` deliberately does not convert from/to
`std::string` to avoid accidentally mixing up the two. Instead,
`std::string_view` can act as a go-between for the two types; `rt::String`
does convert from/to views. Overall, it's a bit of a puzzle game at
times to get the various conversions/overloads correct between
String/string/string_view/const char*/Bytes; trying to keep changes
consistent and minimal.
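As a rough illustration of the "view as go-between" idea, here is a minimal C++17 sketch. It is not the real `hilti::rt::String`: the `sketch::String` class, its `_hs` literal, and the helpers `to_std()`/`from_std()` are hypothetical, and for brevity the sketch stores a `std::string` member instead of deriving from `std::string` as the runtime class does.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for a dedicated runtime string type (assumption:
// composition instead of the inheritance used by the real class).
class String {
public:
    String() = default;
    String(std::string_view v) : data_(v) {}             // converts from a view
    operator std::string_view() const { return data_; }  // converts to a view
    std::size_t size() const { return data_.size(); }

private:
    std::string data_;
};

// Hypothetical literal mirroring the `"foo"_hs` form mentioned above.
inline String operator""_hs(const char* s, std::size_t n) {
    return String(std::string_view(s, n));
}

// No implicit conversion between String and std::string in either direction.
static_assert(!std::is_convertible_v<std::string, String>, "no implicit std::string -> String");
static_assert(!std::is_convertible_v<String, std::string>, "no implicit String -> std::string");

// Crossing between the two types goes through an explicit view.
inline std::string to_std(const String& s) { return std::string(std::string_view(s)); }
inline String from_std(const std::string& s) { return String(std::string_view(s)); }

} // namespace sketch
```

With this shape, `sketch::to_std("foo"_hs)` yields a `std::string`, while accidentally passing a `std::string` where a `String` is expected fails to compile.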
1.16.0-dev.86 | 2026-03-18 07:31:30 -0400
* Use an intermediate file rather than pipe in tests (Evan Typanski, Corelight)
This means you can look at the output more easily for FileCheck tests.
* Loosen `FileCheck` tests a bit (Evan Typanski, Corelight)
I had two primary rules:
1) Only test what is necessary. If we're looking for a variable
declaration, but it wouldn't really matter what's in it, just {{.*}}
it. If all that matters is that `self` is used between two statements,
only test for that.
2) Anything that's not *exactly* as it appears in source should have
regular expressions to catch the exact minimal case. For example, if
a hook is named `_t_on_x`, the only part we need is that it is `on_x`
so use `{{.*}}on_x{{.*}}`. This makes it more resilient to unrelated
compiler name changes.
There are other simple rules, like not checking for semicolons. The goal
should just be to make the simplest-looking test that catches the case you
want.
1.16.0-dev.83 | 2026-03-12 16:38:10 +0100
* Fix signed/unsigned comparison warning (Tim Wojtulewicz, Corelight)
1.16.0-dev.81 | 2026-03-03 15:43:56 +0100
* Fix change tracking in `remove-unused-fields` pass. (Robin Sommer, Corelight)
This is necessary with the recent `replaceNode()` changes.
* Let the `dead-code-static` pass honor `--no-strict-public-api`. (Robin Sommer, Corelight)
There were also a couple of places either not quite working as
intended or not (any longer) needed, so tweaking things a bit. Also
adding one extension for removing unneeded member functions.
* Let the `propagate-function-returns` pass honor `--no-strict-public-api`. (Robin Sommer, Corelight)
* Let the `remove-unused-params` pass honor `--no-strict-public-api`. (Robin Sommer, Corelight)
* Add optimizer helpers to determine if AST entities can be modified. (Robin Sommer, Corelight)
These honor `--no-strict-public-api` as well as various attributes
and other AST properties.
Also port the `remove-unused-fields` pass over to use one of them,
more to come.
* Codegen fix for wrong qualification of non-extern struct members. (Robin Sommer, Corelight)
* Extend printer to always output functions' calling conventions. (Robin Sommer, Corelight)
This was missing for some function types, so that one wasn't able to
see what their calling convention was.
1.16.0-dev.73 | 2026-03-03 10:39:20 +0100
* Optimize grouping with side-effect free local variables. (Robin Sommer, Corelight)
Our resolver can sometimes leave new groupings behind that are more
complex than necessary because they introduce local temporary
variables even though their value is constant and side-effect free.
The problem is that the resolver can't tell if something is
side-effect free (that needs data flow analysis) and hence it needs to
remain conservative.
This commit adds a peephole optimization that removes local variables
from groupings if they aren't necessary. We generally prefer repeating
the side-effect free expressions for readability and further optimizations
down the line.
* Extend analysis of side effects. (Robin Sommer, Corelight)
Hardcode a couple of additional cases. Just like the existing code,
this is temporary until we gain expression-level data flow analysis.
This had no effect on any existing tests.
* Make the visitors' `replaceNode()` safe. (Robin Sommer, Corelight)
The existing `replaceNode()` method was impossible to use safely: if
it was passed a replacement node that was already part of the AST
somewhere, the method would disconnect that node from its original
position first. That, however, is generally not safe because AST nodes
do not give any guarantees about their internal child layout, so that
simply removing one can lead to trouble. Plus, one always had to keep
in mind that the replacement node would now disappear from its original
place, which is error-prone even in safe cases.
This changes `replaceNode()` to instead use our standard semantics
when making AST modifications: If a node is being inserted that
already has a parent, we deep-copy it first. That way, the caller
doesn't need to worry about safe memory management. In addition, we
add a new method `replaceNodeWithChild()` that optimizes the operation
for a special case: If an existing child is taking the position of a
parent of itself, then we can always safely move it over into its new
place without copying. We now use that for cases across the code base
that match this special case.
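The copy-on-insert rule described above can be sketched on a toy tree in C++. This is not HILTI's AST API: `Node`, `deep_copy()`, and `replace_child()` are hypothetical names, and the sketch does not cover the move-only special case handled by `replaceNodeWithChild()`.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy tree node; the real AST node class is more involved.
struct Node {
    std::string label;
    Node* parent = nullptr;
    std::vector<std::shared_ptr<Node>> children;
};

// Recursively copies a subtree; the copy's root has no parent yet.
inline std::shared_ptr<Node> deep_copy(const Node& n) {
    auto copy = std::make_shared<Node>();
    copy->label = n.label;
    for ( const auto& child : n.children ) {
        auto c = deep_copy(*child);
        c->parent = copy.get();
        copy->children.push_back(std::move(c));
    }
    return copy;
}

// Replaces child `index` of `parent` with `repl`. If `repl` is already part
// of a tree, a deep copy is inserted and the original stays where it is.
inline std::shared_ptr<Node> replace_child(Node& parent, std::size_t index, std::shared_ptr<Node> repl) {
    if ( repl->parent )
        repl = deep_copy(*repl);

    parent.children.at(index)->parent = nullptr; // detach the old child
    repl->parent = &parent;
    parent.children.at(index) = repl;
    return repl;
}
```

The caller never has to reason about whether the replacement disappears from its previous position; the price is a copy whenever the node was already attached.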
1.16.0-dev.68 | 2026-03-03 10:27:41 +0100
* Fix printer for functions. (Benjamin Bannier, Corelight)
We would previously inject a redundant `function` before dispatching to
the printer of implementations which already emits the kind.
1.16.0-dev.66 | 2026-02-25 12:55:28 +0100
* Fix a GCC false positive on CI. (Robin Sommer, Corelight)
The `validate_release_tarball` task was failing.
1.16.0-dev.64 | 2026-02-20 09:41:54 +0100
* Updating documentation for optimizer changes. (Robin Sommer, Corelight)
* Compute AST dependencies always right after resolving. (Robin Sommer, Corelight)
Before, it was pretty inconsistent when exactly the information would
be available. In particular it wasn't available to post-compilation
hooks, which was unfortunate. With this change, we can now leverage
dependencies on the Zeek side.
Implementation note: We need to make the dependency computation a bit
more robust here now because it can potentially run on a not validated
(and hence invalid) AST. That can happen if the resolver flags any
errors. On the other hand, we can't easily run the validator just
before the dependency computation because that would change some
assumptions on when exactly validation runs, potentially leading to
different results.
* Use new helper for side effects elsewhere. (Robin Sommer, Corelight)
* Provide two flow-based helpers through the CFG context. (Robin Sommer, Corelight)
The first provides flow information for a given expression, the second
determines if a given expression might contain side effects. Right
now, both are approximate, as we don't have expression-level dataflow
tracking yet. In the future, these should be able to become more
precise.
* Parse fields into stack variables first. (Robin Sommer, Corelight)
So far, when parsing a standard unit field, we'd immediately store the
retrieved value into the corresponding struct member. Now, if we parse
a non-mutable value, we instead store it into a local temporary first,
finish the field's processing, and only then move the value over from
the temporary to the struct field. The advantage is better
optimization potential: we have more opportunities to identify the
struct field as not (productively) used, and the C++ compiler can
likewise work on the stack temporary for most of the time. However, we
do this only for non-mutable values, because fields of mutable types
need to reflect incremental updates correctly at all times.
* Add new optimizer pass removing unused struct fields. (Robin Sommer, Corelight)
Add optimizer pass that removes struct fields with no productive
accesses. The pass marks unused fields with `&no-emit="optimized"`,
allowing them to remain in the AST through codegen, which then emits
code to render them as `(optimized out)`. Fields with `&always-emit`,
or inside types with `&always-emit`, are never removed.
The new optimization is enabled only in `--no-strict-public-api` mode,
which means by default it's on in release builds, but not in debug
builds. (That aligns pretty well with the fact that in debug mode,
Spicy generates accesses to *all* parsed fields through the debug
messages it emits, which means they wouldn't be optimized out
anyways.)
* Improve tuple assignment. (Robin Sommer, Corelight)
We would in some cases introduce a temporary where we didn't need to.
Also adding a check to catch coercion errors, which could previously
end up not being reported.
* Switch tests to debug builds. (Robin Sommer, Corelight)
Most, but not all, of our tests are compiling HILTI/Spicy code with
`--debug`. This changes some of the remaining tests over, both for
consistency and to prepare for upcoming optimizations that would hide
the anticipated output in release builds. None of this changes any
baselines.
(This does not touch any tests for optimizations or CFGs in case
they explicitly want the release builds.)
* Fix CFG logic for adding function parameters. (Robin Sommer, Corelight)
I was getting assertion errors and noticed the logic wasn't making use
of the AST information that the resolver computes for types linked to
methods. So this switches that over.
It's a slightly larger change because we need to get the CFG access to
the AST context, which is a bit of a hassle to put in place. On the
plus side, having the context available seems generally useful for the
CFG class, so that seems ok.
* Extend groupings to support multiple expressions. (Robin Sommer, Corelight)
The HILTI AST so far did not have a way to express a sequence of
expressions (think: `,` operator in C). This adds support for that by
extending grouping expressions to contain more than one expression.
They will be evaluated in sequence, with all having access to the
local tmp if defined; and the grouping will return the value of the
final expression as its value. Not surprisingly, these are codegened
to C++ through the `,` operator.
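For readers less familiar with the C++ `,` operator, a small sketch of the evaluation order it provides. The functions `record()` and `grouped()` are hypothetical and only illustrate the operator; they are not what the code generator actually emits.

```cpp
#include <vector>

// Hypothetical helper with a visible side effect, standing in for one of the
// earlier expressions inside a grouping.
inline void record(std::vector<int>& log, int v) { log.push_back(v); }

// Evaluates three expressions left to right and yields the last one's value.
inline int grouped(std::vector<int>& log) {
    int tmp = 10; // plays the role of the grouping's local temporary
    return (record(log, tmp), record(log, tmp + 1), tmp * 2);
}
```

All three expressions see `tmp`, the first two run only for their side effects, and the result of the whole parenthesized expression is that of `tmp * 2`.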
* Add a couple of AST node editing methods. (Robin Sommer, Corelight)
Most of these allow detaching an existing child expression from a
node, to then re-use it elsewhere without requiring a deep copy.
* Expose a couple of helper runtime functions to throw exceptions. (Robin Sommer, Corelight)
This adds `hilti::throw_unset_optional()` and
`hilti::throw_attribute_not_set()` that throw the corresponding
exceptions. Having functions will allow us to use throw from inside
expressions, which C++' `throw` cannot do. The C++ implementation of
the former already existed but wasn't accessible to HILTI code yet.
* Add null checks to AST printers. (Robin Sommer, Corelight)
For convenience, to avoid caller-side checks.
* Add new linkage `export`. (Robin Sommer, Corelight)
This applies to structs and units, and generally works like `public`, but in addition declares that the
HILTI optimizer/codegen may not change the fields of the type. It's
like `--strict-public-api` but on a per type basis.
To export a type, it first needs to be declared normally, and can then
be exported. In Spicy:
```
type Foo = unit { ... }
export Foo;
```
(`export type Foo` doesn't work for parsing reasons; plus we want to
be able to export already existing types.)
* Add `&always-emit` attribute to Spicy. (Robin Sommer, Corelight)
This attribute already existed at the HILTI level to express that
struct fields are not to be skipped from the emitted code. We allow
the same in Spicy now for unit fields, and carry the attribute through
to the corresponding Spicy struct.
* Extend `&no-emit` attribute with string argument support. (Robin Sommer, Corelight)
The existing (purely internal) `&no-emit` attribute now requires a
string argument to distinguish between 'private' fields (never visible
to user) and 'optimized' fields (removed by optimizer, but still
included in type information and type rendering; in the latter case,
they will be printed as `(optimized out)`).
`optimized` isn't in actual use yet. The corresponding optimizer pass
adding it will come in a subsequent change. However, we do already
include a test that adds `&no-emit` manually to exercise the code
generation and rendering. We already adapt `spicy-dump` as well, but
can't test it yet.
This comes with two slightly backwards-incompatible changes:
- Before, we used to include private struct fields into the type
information, even though we weren't using them anywhere (our tools
using type information were just skipping over them already). For
simplicity, we now directly skip them when emitting type
information. That means that host applications can assume that any
fields included into the type information but marked as
not-emitted, are fields that would normally be visible to the user
but have been optimized out.
- spicy-dump output now includes struct fields that don't have a value
set. In text output, they are rendered as `(unset)`, in JSON output
with `null` values. The reason is that, because we want to have
optimized fields rendered as "(optimized out)" in text output, it
would seem surprising to have any non-optimized fields not appear at
all. Similarly for JSON, where we render optimized fields as `null`
(although one could debate whether they should be included in JSON
at all).
* Add new options `--[no-]strict-public-api` to toolchain commands. (Robin Sommer, Corelight)
If active, this disallows any optimizations that affect the external
C++ API of the generated code. By default, this is on when generating
debug code and off when generating release code.
* Add environment variable `HILTI_DISABLE_OPTIMIZER_PASSES`. (Robin Sommer, Corelight)
We used to have an environment variable to specify which optimizer
passes to run. This brings it back in inverse: a colon-separated list
of passes to disable. Leaving out individual passes seems more useful
than having to list all passes that one wants to run.
Not documenting the names of passes as this is primarily for
development purposes.
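A minimal C++ sketch of how a colon-separated list like this could be read. Only the variable name `HILTI_DISABLE_OPTIMIZER_PASSES` comes from the entry above; the functions `parse_disabled_passes()` and `disabled_passes_from_env()` are hypothetical, and the actual parsing inside the toolchain may differ (for example in how it treats empty entries).

```cpp
#include <cstddef>
#include <cstdlib>
#include <set>
#include <string>

// Splits a colon-separated list into a set of names, skipping empty entries.
inline std::set<std::string> parse_disabled_passes(const std::string& value) {
    std::set<std::string> out;
    std::size_t start = 0;

    while ( start <= value.size() ) {
        std::size_t end = value.find(':', start);
        if ( end == std::string::npos )
            end = value.size();

        if ( end > start )
            out.insert(value.substr(start, end - start));

        start = end + 1;
    }

    return out;
}

// Reads the environment variable; an unset variable disables nothing.
inline std::set<std::string> disabled_passes_from_env() {
    const char* v = std::getenv("HILTI_DISABLE_OPTIMIZER_PASSES");
    return v ? parse_disabled_passes(v) : std::set<std::string>{};
}
```

A pass runner could then skip every pass whose name is in the returned set. The pass names themselves are intentionally undocumented, so any names used with this sketch are placeholders.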
* Catch up on docs. (Robin Sommer, Corelight)
We had some changes that weren't reflected in the auto-generated docs
yet.
1.16.0-dev.42 | 2026-02-09 13:36:59 +0100
* Implicitly declare `Undef` for C++ enum like we do already in HILTI. (Benjamin Bannier, Corelight)
* Remove macro to create enums with non-`Undef` default value. (Benjamin Bannier, Corelight)
* GH-2259: Require that all runtime enums have an explicit `Undef` variant. (Benjamin Bannier, Corelight)
We previously did not require that C++ runtime enums had an `Undef`
variant. Instead we overrode the "default" entry, which changed the
default variant such an enum value would take. This worked in C++ code,
but never in HILTI.
This patch adds an `Undef` variant for all C++ runtime enums.
Closes #2259
* Fix runtime enums implemented in C++. (Benjamin Bannier, Corelight)
Enums which we declare in C++ but expose to HILTI code we currently
declare twice: in C++, and again in HILTI (with a `&cxxname`). The way
we did that was inconsistent in that HILTI would implicitly add an
`Undef = -1` variant while many of our C++ declarations with
`HILTI_RT_ENUM` would declare `Undef` first and without value, leading
to `Undef = 0`; this likely also led to all other enum variants being
inconsistently numbered. The most visible effect of this was that `Undef`
values of such enums were often truthy.
This patch cleans this up:
- C++ enums now declare `Undef` as last entry to ensure consistent
numbering
- C++ enums now explicitly set `Undef = -1`
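The numbering problem can be reproduced with plain C++ enums. The sketch below does not use the `HILTI_RT_ENUM` macro; `OldStyle` and `NewStyle` are hypothetical enums, and it assumes that labels without explicit values are numbered from 0 on the HILTI side.

```cpp
// Old layout: `Undef` listed first without a value, so it becomes 0 and
// every following label shifts up by one.
enum class OldStyle : long { Undef, A, B };

// Layout described above: regular labels first, `Undef = -1` last.
enum class NewStyle : long { A, B, Undef = -1 };

static_assert(static_cast<long>(OldStyle::Undef) == 0, "Undef takes the first slot");
static_assert(static_cast<long>(OldStyle::A) == 1, "labels are shifted by one");
static_assert(static_cast<long>(NewStyle::A) == 0, "labels start at 0");
static_assert(static_cast<long>(NewStyle::B) == 1, "labels keep counting");
static_assert(static_cast<long>(NewStyle::Undef) == -1, "Undef is pinned to -1");
```

With the old layout, an `Undef` value shares its integer with what the other side considers the first regular label, which matches the truthiness surprise described above.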
1.16.0-dev.37 | 2026-02-09 13:36:44 +0100
* Bump CodSpeedHQ/action from 4.8.2 to 4.10.4 (dependabot[bot])
1.16.0-dev.35 | 2026-02-05 14:42:05 -0500
* GH-2260: Fix regression with propagating tuple returns. (Evan Typanski, Corelight)
Fixes #2260
With groupings, you can assign to a tuple without assigning all values.
Since this would break parts of the placements mechanism when
propagating returns, just disallow this for safety.
1.16.0-dev.33 | 2026-02-02 13:25:18 +0100
* GH-2251: Actually implement `enum` coercion to `bool`. (Benjamin Bannier, Corelight)
Previously we implemented `enum` coercion to `bool` as a comparison
against the `Undef` label. While `Undef` values coerce to `false`,
different `Undef` values only compare equal if they map to the exact
same underlying integer value, so our implementation was not correct.
This patch implements coercion in terms of `has_label`. While this is
unlikely to be the most performant implementation, a better one
would likely require changing how we declare runtime enums so we could
have the macro generate code which explicitly checks against all known
labels. Given that `HILTI_RT_ENUM..` allows declaring labels both with
and without values, that is currently hard to implement.
Closes #2251.
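A hedged C++ sketch of the difference between the two approaches. The enum `Color` and the functions `has_label()` and `to_bool()` are hypothetical and only mimic the idea; the runtime's real `has_label` works on its own enum representation. The sketch assumes that `Undef` does not count as a label.

```cpp
#include <array>

// Hypothetical enum with two real labels and the conventional `Undef`.
enum class Color : long { Red = 1, Green = 2, Undef = -1 };

// True only if the value matches one of the declared, non-`Undef` labels.
inline bool has_label(Color c) {
    constexpr std::array<Color, 2> known = {Color::Red, Color::Green};
    for ( Color k : known ) {
        if ( c == k )
            return true;
    }
    return false;
}

// Coercion to bool in terms of `has_label`, not a comparison with `Undef`.
inline bool to_bool(Color c) { return has_label(c); }
```

A value such as `static_cast<Color>(42)` has no label and therefore coerces to `false` here, whereas a plain comparison against `Color::Undef` would have treated it as set.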
1.16.0-dev.31 | 2026-02-02 13:24:50 +0100
* GH-2238: Correctly compute argument constness in CFG. (Benjamin Bannier, Corelight)
We previously tried to fix this in #2238, but did not take into account
that one cannot simply iterate operands since often they are present as
tuples. This patch reimplements the helper function we introduced then,
and removes a helper which is not used anymore.
* Add explicit CFG handling for `assert` arguments. (Benjamin Bannier, Corelight)
By default arguments to an `assert` are just read. Add explicit handling
for that so we do not mark them as writes as well.
* GH-2254: Add explicit CFG handling for member accesses. (Benjamin Bannier, Corelight)
We previously didn't explicitly handle access of struct and union
members, and instead used the default treatment of marking them as both
read and write operations. Since both these accesses are just reads this
patch adds explicit treatment for that.
Closes #2254.
* Bump CodSpeedHQ/action from 4.5.2 to 4.8.2 (dependabot[bot])
1.16.0-dev.25 | 2026-01-20 17:41:16 -0500
* Include optimizer header when used. (Evan Typanski, Corelight)
This header had a bit of messiness, and that's an important header.
* Do not propagate returns if same name and test. (Evan Typanski, Corelight)
* GH-2031: Add function return propagation. (Evan Typanski, Corelight)
Closes #2031
This optimization aims to reduce the number of tuples propagated through
Spicy parsers. Many of the parse functions get a parameter, then return
the unchanged parameter. This adds unnecessary work, as we have to
construct a tuple at the end with the parameter, even though it does
nothing.
The general form is this transformation:
    function tuple<uint<64>, uint<64>> f(uint<64> x, uint<64> y) {
        return (x, y);
    }
    global x = 0, y = 1;
    (x, y) = f(x, y);
Turns into:
    function void f() {
        return;
    }
    global x = 0, y = 1;
    f();
In order for this to actually apply to real parsers, there is an extra
consideration: sometimes there is a second function in the middle, like:
    function tuple<uint<64>, uint<64>> f(uint<64> x, uint<64> y) {
        return (x, y);
    }
    function tuple<uint<64>, uint<64>> passthrough(uint<64> x, uint<64> y) {
        return f(x, y);
    }
Then, something will call `passthrough`, not `f`. For this case, we need a
concept of passthroughs, where we check the uses of the *passthrough* instead
of those of the particular function we are optimizing.
This can often make the parsing functions only pass around `_t_error`,
which means that the tuple is no longer constructed at all. Many others
will never use lookahead, so they get to only a 2-element tuple, which
may make a significant impact. Any sufficiently complex Spicy parser
would likely not see much change.
This also aligns more with what a user would expect when reading the
code, giving a nice simplification.
* Remove unnecessary quote include. (Evan Typanski, Corelight)
* Move callers optimizer collector to header. (Evan Typanski, Corelight)
This way, we can reuse it, namely in the function return propagation.
* Allow void functions to return void expressions. (Evan Typanski, Corelight)
This is needed if we remove an expression in a return.
* Move `enclosingFunction` to the optimizer. (Evan Typanski, Corelight)
* Propagate variable that is immediately returned. (Evan Typanski, Corelight)
This will propagate the `result` variable in cases like:
```
result = my_method_call();
return result;
```
into:
```
return my_method_call();
```
1.16.0-dev.16 | 2026-01-20 09:47:15 +0100
* Apply `&on-heap` selectively only where needed. (Robin Sommer, Corelight)
Previously, we would attach `&on-heap` to any `struct` compiled from a
Spicy `unit`. Now we attach it only in specific cases that need it,
moving state to the stack otherwise.
* Prepare for moving parser temporaries to the stack. (Robin Sommer, Corelight)
The parser generator is creating temporaries at a few places that will
then later receive parsed values. Currently, those temporaries are
always simply of the type being parsed. Turns out that works for
units/structs only because we're always wrapping them into
`value_ref`, meaning we can create those temporaries as null values.
However, in a subsequent commit, we will get rid of those wrappings in
some cases, meaning the structs would need to be default initialized
at the time they are created on the stack. That doesn't work for some
struct types though: if a struct receives parameters, we would need to
pass them at instantiation time, but for these temporaries we don't
know parameters at that point. So to prepare for the upcoming change,
this commit wraps such struct types into `optional<>`.
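The same constraint shows up in plain C++, which makes for a compact illustration. The struct `Message` and the function `parse()` below are hypothetical; they only show why a type that needs constructor parameters cannot be declared on the stack ahead of time, and how `std::optional` works around that.

```cpp
#include <optional>

// Hypothetical struct that needs a parameter at construction time and
// therefore has no default constructor.
struct Message {
    explicit Message(int version) : version(version) {}
    int version;
    int length = 0;
};

inline int parse(int version, int length) {
    // `Message tmp;` would not compile here: the parameter is not known yet.
    std::optional<Message> tmp; // starts out empty, lives on the stack

    // Later, once the parameter is available, construct the value in place.
    tmp.emplace(version);
    tmp->length = length;

    return tmp->version * 100 + tmp->length;
}
```

The optional keeps the storage on the stack while deferring construction to the point where the parameters exist.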
* Skip optimizer consistency checks with `--skip-validation`. (Robin Sommer, Corelight)
1.16.0-dev.12 | 2026-01-14 18:23:48 +0100
* Bump 3rdparty/utfcpp from `7079562` to `cfc9112` (dependabot[bot])
* Bump 3rdparty/utf8proc from `a36778d` to `e5e7992` (dependabot[bot])
* Bump CodSpeedHQ/action from 4.4.1 to 4.5.2 (dependabot[bot])
1.16.0-dev.8 | 2026-01-12 10:03:48 -0500
* Avoid lookahead in size benchmarks. (Evan Typanski, Corelight)
* Separate lookahead and size benchmarks. (Evan Typanski, Corelight)
Since the two benchmarks used the same `Inner`, they would get the same
interprocedural optimizations. This doesn't give the full picture about
how something changes, so separate them so that we can get a better
picture.
This definitely makes the benchmarks more "micro," but I believe that we
see more people using *either* lookahead *or* size, not both on the same
unit.
1.16.0-dev.5 | 2026-01-12 14:51:21 +0100
* Fix a typo. (Benjamin Bannier, Corelight)
* Take operand constness into account in CFG construction. (Benjamin Bannier, Corelight)
When computing the CFG data access information we would previously
assume that any operator use would both read and write its operands,
e.g., we would assume that an equality check could write its operands.
Due to how we resolve operators of non-const operands this does reflect
what we see in many ASTs, but does not take into account when we
actually use an operand as const.
This patch adds handling of operators so that we can avoid recording
writes when the operand is used as const.
* Fix method signatures. (Benjamin Bannier, Corelight)
The methods claimed to not modify `self` when they did in fact do that.
* Add side effects to test. (Benjamin Bannier, Corelight)
This prevents removal of these operations should we ever (i.e., in a
following patch) detect them as having no effect of their own.
* Start of v1.16.0 development. (Benjamin Bannier, Corelight)
1.15.0 | 2026-01-12 11:43:12 +0100
* Add spicy-1.15 release notes. [skip CI] (Benjamin Bannier, Corelight)
1.15.0-dev.189 | 2026-01-07 13:41:24 +0100
* Extend our notion of reserved C++ identifiers. (Robin Sommer, Corelight)
Now including anything starting with `__` or `_[A-Z]`.
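The two added patterns can be expressed as a small C++17 predicate. The function name `starts_like_reserved_cxx_id()` is hypothetical, and the sketch covers only the two patterns named above, not whatever else the compiler already treats as reserved.

```cpp
#include <string_view>

// True for names starting with `__`, or with `_` followed by an uppercase
// ASCII letter.
inline bool starts_like_reserved_cxx_id(std::string_view id) {
    if ( id.size() < 2 || id[0] != '_' )
        return false;

    return id[1] == '_' || (id[1] >= 'A' && id[1] <= 'Z');
}
```

For example, `__foo` and `_Foo` match, while `_foo` and `foo__bar` do not.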
* Revert change to disallow leading underscores in user identifiers. (Robin Sommer, Corelight)
In #2205, we started disallowing leading underscores in user
identifiers as we were using them internally in generated code for
private identifiers. That's a pretty drastic change, however,
affecting existing Spicy code that's now no longer valid. It's also
generally surprising to not have access to identifiers with leading
underscores.
This reverts the change and allows leading underscores again. Instead,
we internally now use a `_t_` prefix for generated IDs. We
now disallow *that* for users but don't specially mention the
restriction in the docs as that feels quite specific.
1.15.0-dev.186 | 2025-12-17 17:38:15 -0500
* Remove expression statements without side effects. (Evan Typanski, Corelight)
This needs to be separate from the statement/declaration tracking
since there are more cases of just "floating" expression statements,
like `(a, b, c, d)` after some constant propagation and dead code
elimination.
* Replace assignments with their RHS, not remove. (Evan Typanski, Corelight)
After 5197e79, we replace some declarations with the RHS of the
assignment so that we don't remove method calls. The same problem applies
to assignments generally, so add cases for those.
This also makes some code stick around (particularly when paired with
constant propagation), so we need more aggressive removal of expressions
whose result is unused and who don't have side effects.
1.15.0-dev.183 | 2025-12-17 12:43:42 +0100
* GH-2236: Include `Flow` nodes in CFG construction pruning step. (Benjamin Bannier, Corelight)
* Bump 3rdparty/utf8proc from `90daf9f` to `a36778d` (dependabot[bot])
1.15.0-dev.179 | 2025-12-17 11:49:52 +0100
* Add new validator pass that checks CFG-based properties. (Robin Sommer, Corelight)
We run this *after* optimization because then we can just reuse the
CFGs the optimizer is leaving behind, meaning no performance
impact. The downside is that we won't catch stuff in code that was
optimized out, which however seems like a reasonable trade-off.
* Pass externally instantiated CFG cache into optimizer. (Robin Sommer, Corelight)
This moves ownership of the cache over to the AST processing. That
way we will be able to re-use it elsewhere.
* Factor out logic & state for CFG caching into separate class. (Robin Sommer, Corelight)
This encapsulates the functionality, making it reusable; and will also
allow us to later pass the cache around to different parts of the
compiler pipeline.
While the internal structure of the cache is slightly different to the
previous implementation inside the optimizer, there's no functionality
change to the optimizer except for one fix: we were not correctly
discarding all affected block CFGs when invalidating a module. We do
now, with the result that we're computing some more CFGs overall,
costing us about 1-2% in performance.
* Correct guarantees provided by feature-requirements visitor. (Robin Sommer, Corelight)
The pass *is* changing the CFG. Doesn't really matter right now,
because it's always the first pass running, but still.
* Fix optimizer for `if`/`while` statements with initializers. (Robin Sommer, Corelight)
Constant folding was removing initializer expressions. For simplicity,
we now just don't fold if/while-statements at all if they have
initializers.
Also improving optimizer logic for `if` and `while`.
Baseline diffs are due to avoiding more unnecessary node copies now.
1.15.0-dev.172 | 2025-12-16 18:09:03 +0100
* GH-2214: Update documentation for #2214. (Benjamin Bannier, Corelight)
1.15.0-dev.170 | 2025-12-16 09:54:28 -0500
* Fix nondeterminism in CFG dead code elimination. (Evan Typanski, Corelight)
Co-authored-by: Benjamin Bannier <benjamin.bannier@corelight.com>
* GH-2231: Fix dead code removing methods when removing vars. (Evan Typanski, Corelight)
Fixes #2231
Given code like this:
```
local y = dat.test();
```
if `y` was never used, then the whole statement would get removed. This
makes it so that the RHS sticks around. Then, if the RHS should also be
removed, it gets removed later.
1.15.0-dev.167 | 2025-12-16 15:35:32 +0100
* Add validator rejecting invalid uses of capture groups. (Benjamin Bannier, Corelight)
* GH-2222: Add resolving for capture groups. (Benjamin Bannier, Corelight)
We would previously not perform proper resolving of capture groups.
Instead we emitted a magic C++ identifier so this only resolved at C++
compile time. This introduced a number of downsides:
- incorrect uses of capture groups via `$1`, `$2`, ... outside of field
hooks which defined the magic captures were impossible to diagnose;
instead they caused failed C++ compilation
- code rewrites needed to treat captures special since their use was not
correctly reflected on the AST
This patch lowers uses of capture keywords to the identifier name
much earlier so it behaves like any other use of an identifier.
Closes #2222.
1.15.0-dev.164 | 2025-12-11 16:47:37 +0100
* Add rvalue overloads for `Result` accessors. (Benjamin Bannier, Corelight)
A somewhat typical use case of results is to get them from some function
and then immediately dereference the returned value and bind its value to
a new value (i.e., not a reference). The binding to a new value prevents
us introducing a reference to a temporary which will go out of scope at
the end of the statement.
This is incorrectly diagnosed by Coverity as an unnecessary copy
(`AUTO_CAUSES_COPY`), but also exposes an API which makes it easy to
create dangling references. This patch introduces rvalue overloads for
`Result` accessors so we can directly obtain a value inside a `Result`
without having to go through a copy of a reference. With that we also
catch incorrect uses of temporary `Result` values which we also fix.
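The general technique can be sketched with ref-qualified member functions. This is an illustration only: `ResultSketch`, `hasValue()`, `value()`, and `compute()` are hypothetical names for this example and do not reproduce the actual `Result` interface.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a result type; not the actual `Result` class.
template<typename T>
class ResultSketch {
public:
    ResultSketch() = default;                                    // error state
    explicit ResultSketch(T value) : _value(std::move(value)) {} // success state

    bool hasValue() const { return _value.has_value(); }

    // Lvalue accessor: the result outlives the call, so a reference is safe.
    const T& value() const& {
        assert(_value.has_value());
        return *_value;
    }

    // Rvalue accessor: the result is a temporary, so return the contained
    // value itself (moved out) rather than a reference into the temporary.
    T value() && {
        assert(_value.has_value());
        return std::move(*_value);
    }

private:
    std::optional<T> _value;
};

// Hypothetical producer returning a result by value.
inline ResultSketch<std::string> compute() { return ResultSketch<std::string>("data"); }
```

With these overloads, `auto s = compute().value();` selects the rvalue accessor and moves the string out without an intermediate reference, while calling `value()` on a named result still yields a reference.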
* Remove dependency of global pass registry on global logger. (Benjamin Bannier, Corelight)
Both these objects represent singletons which live in different
translation units. That means that the order in which they are
constructed is unspecified and we shouldn't rely on a particular one.
This patch replaces a use of the logger for raising an internal error
with an assertion, breaking that dependency.
* Drop stray CFG edge for `while` loops with inits. (Benjamin Bannier, Corelight)
It looks like this was simply added by accident, and never caught since
we had no test. This patch fixes the code and adds a test.
* Assert the Spicy plugin is invoked with a `spicy::Builder`. (Benjamin Bannier, Corelight)
* Add missing scope computations in `flatten-blocks` pass. (Benjamin Bannier, Corelight)
We incorrectly assumed that a `Node` scope was always present when it
might need to be computed. This patch fixes the code to create a scope
if it is missing.
* Fix optimizer to use pre-existing builder instance. (Benjamin Bannier, Corelight)
The pre-existing builder is of the right kind: if we're running
through the Spicy driver, it'll be a Spicy builder (in contrast to a
HILTI builder).
1.15.0-dev.157 | 2025-12-11 09:24:35 +0100
* Remove `expression::BuiltInFunction`. (Robin Sommer, Corelight)
This wasn't used anywhere.
* Fix C++ compile error with tuple assignment expressions. (Robin Sommer, Corelight)
* GH-2175: Fix tuple assignment when coercing individual elements. (Robin Sommer, Corelight)
The tuple expression could end up being evaluated multiple times.
Closes #2175.
* Add helper to create a grouping with a temporary local. (Robin Sommer, Corelight)
The helper ensures the temporary's ID is unique.
Factors out some existing code for creating unique names to avoid
duplication.
* Extend HILTI group expression to support a local temporary. (Robin Sommer, Corelight)
This extends the HILTI `Grouping` expression (`(...)`) so that it
allows declaring a variable local to the group. To do that, one
prefixes the group's contained expression with a `local` declaration
inside `{...}`:
({local int i = 2;} i * i) # evaluates to 4
This isn't meant as a user-level feature but will be useful for
internal transformations. It's helpful for situations where an
expression is needed multiple times inside the group but must be
evaluated only once.
The HILTI syntax is pretty arbitrary and primarily meant to not
interfere with any code that's currently valid.
Due to the internal nature, we don't expose this in Spicy.
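As a loose C++ analogy (not necessarily what the code generator emits), a group with a local temporary behaves like an immediately invoked lambda: the temporary lives only inside the expression, and the value it holds is computed once even if used several times. The helper names `groupExample` and `squareOnce` are made up for this sketch.

```cpp
#include <utility>

// Analogy for `({local int i = 2;} i * i)`: the local exists only inside
// the expression, which evaluates to 4.
inline int groupExample() {
    return [] {
        int i = 2;
        return i * i;
    }();
}

// Evaluates `expr` exactly once, then uses the result twice.
template<typename F>
auto squareOnce(F&& expr) {
    return [&] {
        auto tmp = std::forward<F>(expr)(); // the group's local temporary
        return tmp * tmp;                   // reused without re-evaluation
    }();
}
```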
* Fix ambiguous C++ constructors for generated struct. (Robin Sommer, Corelight)
This issue has been hiding for a while but was eventually bound to
come up: If a struct had just a single field or parameter, the
generated C++ struct would get a constructor that likewise receives a
single parameter. However, we also generated copy and move
constructors, and if we're unlucky, then creating a struct with its
single argument would become ambiguous with those constructors.
The second code example in https://github.com/zeek/spicy/issues/2197
triggers just that.
Additionally, we could also run into trouble if the types of a
struct's parameters and its fields matched: then our constructors
could likewise become ambiguous.
This fixes things by adding additional tag parameters, one for
constructors receiving struct fields and one for constructors
receiving parameter values. That disambiguates them from each other,
and from the copy/move constructors.
We should have done this a long time ago already. The patch is pretty
technical but mostly straightforward. The main challenge is finding
all the places where the tags need to be added now.
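The tag approach can be sketched as follows. The tag types `tag::Fields` and `tag::Parameters` and the struct `Wrapper` are names assumed for this illustration; the generated code uses its own names and layout.

```cpp
#include <string>
#include <utility>

namespace tag {
// Hypothetical empty tag types that exist only to select a constructor.
struct Fields {};
struct Parameters {};
} // namespace tag

// Hypothetical struct whose single field and single parameter share a type.
struct Wrapper {
    // Constructor receiving field values.
    Wrapper(tag::Fields /* unused */, std::string data) : data(std::move(data)) {}

    // Constructor receiving struct parameter values.
    Wrapper(tag::Parameters /* unused */, std::string prefix) : prefix(std::move(prefix)) {}

    // Copy/move constructors take a single argument and thus cannot be
    // confused with the two-argument tagged constructors above.
    Wrapper(const Wrapper&) = default;
    Wrapper(Wrapper&&) = default;

    std::string data;
    std::string prefix;
};
```

A call site then states its intent explicitly, e.g., `Wrapper(tag::Fields{}, "x")` versus `Wrapper(tag::Parameters{}, "x")`, so overload resolution never has to choose between constructors with identical argument types.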
* GH-2197: Support coercion from empty list to struct. (Robin Sommer, Corelight)
Closes #2197.
1.15.0-dev.148 | 2025-12-10 17:29:38 +0100
* Make optimizer guarantees less granular. (Robin Sommer, Corelight)
This makes the guarantees that an optimizer pass can provide less
granular: it can either say the AST is fully resolved, or it's not. If
not, the full suite of
resolver/coercion/type-unification/scope-building will be run on any
changed parts of the AST. A pass can no longer just have some of those
parts run.
This is because it's too hard to understand when exactly what type of
guarantee is ok to give. For example, in an upcoming patch, the
coercer will start adding new temporaries. That then requires scope
building and a full resolver pass afterwards, meaning doing just
coercions will never be possible anymore.
More specifically, this commit includes:
- Remove `resolver::coerce()`.
- When re-resolving an AST during optimization, loop over a sequence
of scope-building, type unifications, and the standard resolver
processing.
- Remove `optimizer::Guarantees::{ResolvedExceptCoercions,
ScopesBuilt, TypesUnified}` and replace
`optimizer::Guarantees::FullyResolved` with
`optimizer::Guarantees::Resolved`. The latter now encompasses scope
building and type-unification as well.
Inside the resolver code, I'm leaving the coercion visitor (V3) split
out, even though strictly speaking that isn't necessary anymore. But
seems fine to keep that semantic distinction internally.
1.15.0-dev.146 | 2025-12-09 18:25:47 +0100
* GH-1425: Add pass which performs block flattening. (Benjamin Bannier, Corelight)
A number of other rewrites currently depend on a simpler flow without
additional scopes. This patch implements such a pass which removes
nested scopes. This involves first renaming any identifiers which would
clash when lifted to the parent scope, and then merging the scope's
contents into its parent.
We also perform extra handling where a value aliased a value from the
removed scope. The previous behavior was that the alias would have
become dangling after the scope. Since this is something we explicitly
allow and diagnose at runtime we preserve that behavior by now pointing
it to fresh data.
Closes #1425.
* Add `Block::removeStatements`. (Benjamin Bannier, Corelight)
1.15.0-dev.143 | 2025-12-09 17:25:36 +0100
* GH-2205: Uglify internal identifiers. (Benjamin Bannier, Corelight)
We prefixed the generated identifiers with double underscore so they do
not clash with user-defined identifiers, but identifiers with leading
underscores are reserved by C++. In fact, `__hlt` is already defined for
some ARM64 compilers where it refers to the `halt` instruction.
This patch replaces the leading `__` with a new reserved prefix
`hlt_internal` for identifiers in global scope, and else a `_`.
Closes #2205.
1.15.0-dev.141 | 2025-12-05 12:54:14 +0100
* Refactor optimizer. (Robin Sommer, Corelight)
This is a large change that refactors, streamlines, and speeds up
the optimizer and its individual passes.
Conceptually, the core change is that we now have a well-defined
contract between optimizer and optimization passes: before running a
pass, the optimizer ensures that the AST is in a consistent,
fully-resolved state; and when a pass finishes, the pass tells the
optimizer what needs to be done to get the new AST back into that
state. See `optimizer::PassInfo` for the details.
We also substantially simplify the optimizer loop: it's no longer
nested across multiple phases & levels, and just runs all passes in
order until the AST stabilizes. Optionally, an individual pass may
itself be repeated until stabilization, too, but that's the only nesting.
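The shape of such a loop can be sketched as below. `PassSketch`, its members, and `runUntilStable` are hypothetical names for this example; the real contract lives in `optimizer::PassInfo` and carries more information than a single "changed" flag.

```cpp
#include <functional>
#include <vector>

// Hypothetical pass description: `run` mutates the AST and reports whether
// it changed anything.
template<typename Ast>
struct PassSketch {
    std::function<bool(Ast&)> run;
    bool repeat_until_stable = false; // optionally repeat just this pass
};

// Runs all passes in order until a full round changes nothing. Returns the
// number of rounds executed.
template<typename Ast>
int runUntilStable(Ast& ast, const std::vector<PassSketch<Ast>>& passes) {
    int rounds = 0;
    bool changed = true;

    while ( changed ) {
        changed = false;
        ++rounds;

        for ( const auto& pass : passes ) {
            bool modified = pass.run(ast);

            // The only nesting: a pass may itself loop until it stabilizes.
            if ( modified && pass.repeat_until_stable ) {
                while ( pass.run(ast) ) {
                }
            }

            changed = changed || modified;
        }
    }

    return rounds;
}
```

The loop terminates only if every pass eventually reports no change, so each pass must be written to reach a fixed point.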
The passes themselves gain a new structure/API that clearly separates
between collection and mutation. We provide base visitor classes for
both kinds to derive from. The new structure no longer joins the
collection/mutation into a single visitor, which was hard to follow.
This new API enables the optimizer to track what parts of the AST were
changed by a pass, so that it can then re-resolve only the
corresponding subparts of the AST after each pass. We track changes
at the function-level so that we just need to recompute state for any
functions that have changed. The exception is global code/types, where
we re-resolve the whole module.
The optimizer's CFG state is centralized so all passes can share the
same, cached CFGs when possible, rather than each computing their own.
We use the function-tracking mentioned above to decide which CFGs need
re-computation.
The previous type/member/function passes got merged into a single
dead-code-static pass, with joint collection and mutation visitors.
The constant-folder pass now re-uses the AST resolver's corresponding
functionality. No more two implementations of constant folding.
In debug builds, we do extensive checks that after a pass has
finished, and the AST state updated as prescribed, the AST is indeed
fully resolved again. We do this by recomputing all AST state from
scratch and comparing it against the current one. If there's a
difference, that's a hard internal error and the compiler will abort.
This check is expensive and disabled in release builds. It can also be
disabled even in debug builds by setting a new CMake variable
`HILTI_SKIP_EXPENSIVE_DEBUG_CHECKS`.
This removes `HILTI_OPTIMIZER_PASSES` and
`HILTI_OPTIMIZER_ENABLE_CFG`. Neither seems that useful anymore, and
removing them simplifies the main optimizer loop. If we do still need
one of them, we can re-add later.
* Speed up the graph data structure. (Robin Sommer, Corelight)
We now maintain more pre-computed state about edges and neighbors,
making it slightly more expensive to build the graph, but much faster
to retrieve the information.
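The trade-off can be illustrated with a minimal sketch that maintains both successor and predecessor lists on insertion, so lookups need no scan over all edges. `GraphSketch` and its members are assumptions for this example and do not mirror the actual graph class.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical directed graph keeping pre-computed neighbor lists.
class GraphSketch {
public:
    using NodeId = int;

    // Slightly more work per insertion: both indices are updated.
    void addEdge(NodeId from, NodeId to) {
        _successors[from].push_back(to);
        _predecessors[to].push_back(from);
    }

    // Lookups are a single hash access instead of a scan over all edges.
    const std::vector<NodeId>& successors(NodeId n) const { return lookup(_successors, n); }
    const std::vector<NodeId>& predecessors(NodeId n) const { return lookup(_predecessors, n); }

private:
    using Index = std::unordered_map<NodeId, std::vector<NodeId>>;

    static const std::vector<NodeId>& lookup(const Index& index, NodeId n) {
        static const std::vector<NodeId> empty;
        auto it = index.find(n);
        return it != index.end() ? it->second : empty;
    }

    Index _successors;
    Index _predecessors;
};
```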
* Extend resolver APIs. (Robin Sommer, Corelight)
This primarily extends the resolver APIs for more flexibility for
external callers. There's no functionality change here, it's all at
the API level.
This commit:
- Enables the various resolver operators to operate on sub-trees
instead of the whole AST. There's no functionality change beyond
the APIs here, the resolver logic was already operating
independently of which node is started from. However, there's
one larger internal change inside the Resolver: we move all
coercion logic into its own visitor pass, so that we will be
able to call that individually later
(`detail::resolver::coerce()`). This change is the source of the
one baseline update I believe.
- Extends the constant folder with some options. These will later
be used by the optimizer, which has slightly different
requirements on what's to be folded.
- Adds a couple new APIs to validate that resolver properties are
satisfied by the current AST.