-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsrfi-44.html
2498 lines (2103 loc) · 103 KB
/
srfi-44.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content=
"HTML Tidy for Linux/x86 (vers 1st July 2003), see www.w3.org"
name="generator">
<title>SRFI 44: Collections</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link rel="stylesheet" href="/srfi.css" type="text/css" />
</head>
<body>
<H1>Title</H1>
Collections
<H1>Author</H1>
Scott G. Miller
<H1>Status</H1>
<p>This SRFI is currently in <em>final</em> status. Here is <a href="https://srfi.schemers.org/srfi-process.html">an explanation</a> of each status that a SRFI can hold. To provide input on this SRFI, please send email to <code><a href="mailto:srfi+minus+44+at+srfi+dotschemers+dot+org">srfi-44@<span class="antispam">nospam</span>srfi.schemers.org</a></code>. To subscribe to the list, follow <a href="https://srfi.schemers.org/srfi-list-subscribe.html">these instructions</a>. You can access previous messages via the mailing list <a href="https://srfi-email.schemers.org/srfi-44">archive</a>.</p>
<ul>
<li>Received: 2003-04-23</li>
<li>Draft: 2003-04-28--2003-07-28</li>
<li>Revised: 2003-07-02</li>
<li>Revised: 2003-07-24</li>
<li>Revised: 2003-08-10</li>
<li>Revised: 2003-08-19</li>
<li>Revised: 2003-08-27</li>
<li>Revised: 2003-09-11</li>
<li>Revised: 2003-10-07</li>
<li>Revised: 2003-10-08</li>
<li>Revised: 2003-11-24</li>
<li>Revised: 2003-11-28</li>
<li>Final: 2004-03-07</li>
</ul></p>
<H1>Abstract</H1>
<p>A Collections API which defines a common naming scheme and set
of operations for creating, accessing, and manipulating common
datastructures in Scheme. The API defines accessors, a common
protocol for value access via generic and specific enumeration,
and functions for inter-datastructure cooperation. Finally, a concrete
specification of a compliant set of operators for the standard
Scheme heterogenous datastructures (lists and vectors) and for
the homogeneous Scheme string is provided.</p>
<H1>Table of Contents</H1>
<blockquote>
<a href="#issues">Issues</a><br>
<a href="#rationale">Rationale</a><br>
<a href="#specification">Specification</a><br>
<blockquote>
<a href="#chier">Collections Hierarchy</a><br>
<a href="#storagemodel">Storage Model</a><br>
<a href="#dispatch">Dispatch Mechanism</a><br>
<a href="#typing">Typing</a><br>
<a href="#attribute">Collection Attributes</a><br>
<blockquote>
<a href="#ordered">Ordered Collections</a><br>
<a href="#ordered">Directional Collections</a><br>
<a href="#mutable">Purely Mutable Collections</a><br>
<a href="#limitedcollections">Limited Collections</a>
</blockquote><a href="#foldingenums">Folding
Enumerators</a><br>
<blockquote>
<a href="#stability">Enumeration Stability</a>
</blockquote><a href="#equivalence">Equivalence</a><br>
<a href="#immutablecols">Immutable Collections</a><br>
<a href="#homogeneity">Homogeneity</a><br>
<a href="#sizevslength">Size versus Length</a><br>
<a href="#infcoll">Infinite Collections</a><br>
<a href="#orderedcols">Ordered Collections</a><br>
<a href="#valueequality">Bags, Sets, Maps and Value
Equality</a><br>
<a href="#sequences">Sequences</a><br>
<blockquote>
<a href="#flexsequences">Flexible Sequences</a>
</blockquote><a href="#procedures">Procedures</a><br>
<blockquote>
<a href="#flup">Functional, Linear Update, and Purely
Mutable Collections</a><br>
<a href="#enumprocs">Enumeration</a><br>
<a href="#collprocs">Collections</a><br>
<a href="#orderedcollprocs">Ordered Collections</a><br>
<a href="#directionalcollprocs">Directional
Collections</a><br>
<a href="#limitedcollprocs">Limited Collections</a><br>
<a href="#mutablecollprocs">Purely Mutable
Collections</a><br>
<a href="#bagprocs">Bags</a><br>
<a href="#setprocs">Sets</a><br>
<a href="#sequenceprocs">Sequences</a><br>
<a href="#flexsequenceprocs">Flexible Sequences</a><br>
<a href="#mapprocs">Maps</a><br>
<!-- <a href="#dictprocs">Dictionaries</a><br>-->
</blockquote><a href="#schemecolls">Scheme
Collections</a><br>
<blockquote>
<a href="#lists">Lists</a><br>
<a href="#alists">Association List Maps</a><br>
<a href="#vectors">Vectors</a><br>
<a href="#strings">Strings</a>
</blockquote>
</blockquote><a href="#compat">Compatibility</a><br>
<blockquote>
<a href="#r5rsext">Extensions to R5RS</a><br>
<a href="#othersrfis">Relation to other SRFIs</a><br>
<a href="#impls">Relation to Scheme implementations</a>
</blockquote><a href="#implementation">Implementation</a><br>
<a href="#futurework">Future Work</a><br>
<a href="#references">References</a>
</blockquote>
<H1><a name="issues">Issues</a></H1>
<p>Several functions in this SRFI are required to take arbitrary
collections or major collection subtypes as input. This requires
that the function in question be able to determine the specific
type of the collection and dispatch to an appropriate
implementation.</p>
<p>As standard Scheme lacks the ability to automatically dispatch
on different function behaviors based on the types of arguments,
it is difficult to devise a portable implementation of this SRFI
which allows arbitrary future collections. It is expected that
Scheme system implementors will take advantage of generic
function, module system, or object-oriented features of their
systems to implement this SRFI efficiently. At the same time it
is hoped that a future SRFI will specify a mechanism which will
allow a portable and efficient implementation of collections to
exist. The reference implementation in this SRFI compromises by
using the portable Tiny-CLOS object system as a framework for
collection-function dispatch.</p>
<p>It should finally be noted that this SRFI does <i>not</i>
require that all function calls to collection supertypes remain
capable of dynamic dispatch. If a compiler or optimizer can infer
the types of a collection function's arguments, or a user
provides hints to that effect, it is permitted to monomorphize
the call. That is, to statically compile a more specific but
semantically equivalent collection function in place.</p>
<H1><a name="rationale">Rationale</a></H1>
<p>The Scheme language provides two basic datastructures for
containing any first-class value. Scheme pairs, and by extension
lists, provide a variable sized datastructure with constant time
extension and linear time access. Scheme vectors provide a fixed
sized datastructure with typically constant time access. From
these two datastructures it is possible to build up many other
types of datastructures, including hash tables, trees, and
sets.</p>
<p>However, no such datastructures are specified by the Scheme
standard, and preceding this SRFI, no attempt has been made to
specify a standard API for these most common datastructures. It
is anticipated that this will change in the near future. In
advance of these specifications, this SRFI will outline a
consistent naming scheme and set of operators and semantics that
future data structure specifications may follow. The intended
result is to allow current and future standard datastructures to
look and behave in a predictable fashion, and to allow the values
stored in them to be easily and efficiently transferred to and
from different datastructures as necessary.</p>
<p>This SRFI specifies a folding enumerator as the primary means
of traversing and obtaining all values of a collection. The
rationale for this paradigm is to support a diverse variety of
collections whose contents may reside in memory, slow secondary
storage, over networks, or as results of computational processes.
The folding enumerator allows for precise control over the
resources of the underlying collection, while still providing
flexible generic access to collections.</p>
<p>Allowing stepwise enumeration over a collection, as opposed to
a <tt>collection->list</tt> function allows the collection
enumerator to fetch its elements lazily from some slower or more
costly storage or retrieval mechanism. The canonical example is a
collection whose elements are rows from a complex database query.
While it may not be possible to store the entire results of the
query in memory on which a <tt>map</tt> or <tt>for-each</tt>
operation could be performed, it may be much more feasible to
allow the enumerator to fetch each row one at a time, saving both
memory and delay in transfering the results to the Scheme
program. Additionally, the enumerator may cooperate much more
cleanly with a garbage collector as references to each value
retrieved can be discarded after processing, allowing the garbage
collector to free large values immediately if the Scheme program
does not itself capture a reference to the collection value. It
has also been shown that enumeration can be the basis for other
traversal protocols such as cursors, through straightforward
conversion operators. Finally, the enumeration protocol described
allows for early termination by the accessing function, which may
obviate the need to transfer many unaccessed future elements.</p>
<H1><a name="specification">Specification</a></H1>
<blockquote>
<h2><a name="chier">Collections Hierarchy</a></h2>
<center>
<table>
<tbody>
<tr>
<td>
<pre>
Collection
|
+--------+---------+
| | |
Bag Set Map
|
Sequence
|
Flexible Sequence
</pre>
</td>
</tr>
</tbody>
</table>
</center>
<p>This SRFI first defines a hierarchy of possible
datastructure types. Each subtype inherits the behavior and
functionality of its parent (its interfaces). Any collection
may be immutable.</p>
<blockquote>
<h4>Collection</h4>
<p>The base type. All collections have a concept of size and
emptiness, and it is possible to perform a folding
enumeration over the values of any collection.</p>
<h4>Bag</h4>
<p>A bag is a collection of possibly ordered, possibly unique
values. Given a bag, it is only possible to determine if it
contains or does not contain any of a given value.</p>
<h4>Set</h4>
<p>A set is a collection of unique values. Similar to but
distinct from a bag, it is only possible to determine if a
set contains or does not contain a given value. Unlike a bag,
only one of any value can exist in the collection at a time.
Removing one value from the set means that the set no longer
contains the value at all.</p>
<h4>Sequence</h4>
<p>A sequence extends a bag with the ability to access and
replace all of its members using a contiguous sequence of
non-negative exact integers. Sequences are <i>stable.</i>
That is, if no mutation operation occurs on a sequence, its
elements must remain in the same order according to their
previously observed indices. When values are added to a base
Sequence, they may be added anywhere in the sequence.
Sequences may or may not allow multiple instances of a
value.</p>
<h4>Flexible Sequence</h4>
<p>A flexible sequence is a datastructure that allows new
values to be added and old values removed at any position in
the sequence, as specified by the programmer.</p>
<h4>Map</h4>
<p>A map is a collection which stores associations between
single keys and single values. Maps typically provide better
than linear time retrieval of the value given a key.</p><!--
<h4>
Dictionary
</h4>
<p>
The dictionary collection is similar to a map, but allows
multiple mappings to exist from one key to numerous values
(in other words an n-to-m mapping).
</p> -->
</blockquote>
<h2><a name="typing">Typing</a></h2>
<p>This SRFI uses the word type to refer to a class of
datastructures as a whole, and subtyping to refer to classes of
datastructures who have the properties of and pass the type-checking
predicates of their parents. This however is a
metaphor used to more concisely describe the signatures of
operators that each concrete collection must provide. It does
not necessarily map to any actual type system or inheritance
rules. Implementations are permitted to use type systems to
enforce or facilitate providing the correct set of operators
and behaviors, but this is not required.</p>
<h2><a name="attribute">Collection Attributes</a></h2>
<p>In addition to the conceptual type hierarchy which dictates
the available functions for a given collection, collections
also may possess any of four (in this SRFI) attributes which
specify whether certain global functions are well defined for
the collection. Attributes may also specify additional
parameters to the collection's constructor.</p>
<p>This SRFI specifies four such attributes:
<i>orderedness</i>, <i>directionality</i>, <i>mutability</i>,
and <i>limitedness</i>.</p>
<blockquote>
<h3><a name="ordered">Ordered Collections</a></h3>
<p>Ordered collections maintain an ordering to their values
(or in the case of maps, their keys) which is based on those
values' relationships to each other. The collection fold is
guaranteed to enumerate over the collection's values/keys in
increasing value precedence as defined by the collection's
<i>ordering function</i>.</p>
<h3><a name="directional">Directional Collections</a></h3>
<p>Directional collections have a notion of a left/right
handedness to their interface which is independent of any
relationships between their values. For example, sequences
are always directional collections, for which the left side
implies lower indices, while the right side implies higher
indices. Directional collections must provide operators to
retrieve the left and rightmost values, and may provide
update operators to add or delete values to/from the left and
right hand sides of the collection. This is useful for
non-sequences as well, for example to implement a queue
datastructure as a bag.</p>
<p>The collection left fold operator must enumerate
over the values of a directional collection from left to right
order as defined by the directional collection, and the right
fold must correspondingly enumerate from right to left.
</p>
<p>A collection cannot be both ordered and directional
simultaneously. The left and right accessors of ordered
collections access least-precedent and most-precedent values,
rather than the left and right values which are unrelated to
precedence in directional collections.</p>
<h3><a name="limitedcollections">Limited Collections</a></h3>
<p>Limited collections are collections with a fixed capacity
and thus a fixed maximum size. It may be possible to add and
remove values from limited collections by using a special
value to indicate unused slots in the underlying
representation, so limited collections do not necessarily
have fixed size. Update procedures which add or remove values
from a limited collection may be undefined on limited
collections.</p>
<h3><a name="mutable">Purely Mutable Collections</a></h3>
<p>This SRFI's API is accommodating to purely functional and
linear-updating collections by default. Some collections may
be <i>purely mutable</i>. That is, updates are done purely
through side effects within the collection. In such a
collection,</p>
<pre>
(eq? <input collection> (update-procedure <input collection>))
</pre>will always return <tt>#t</tt>. Much Scheme code is often
written to such purely mutable collections (hash tables are an
example). It is not possible to support these collections by
requiring that code use the side-effecting procedures defined in
this SRFI in the linear-update fashion, as it would require
significant changes to such code and could be expensive in terms of
unnecessarily allocated storage.<br>
<br>
<p>For these reasons, <i>pure mutability</i> is an additional
attribute that may be held by a collection. When so held, the
results of the side-effecting update procedures may be
ignored, Programmers may then utilize these procedures in the
usual manner and agnostic to the collection type so long as
the pure mutability attribute is checked. Note that purely
mutable collections exist comfortably within the protocol for
linear update procedures, so code written to the latter will
accomodate the former with no changes.</p>
</blockquote>
<h2><a name="storagemodel">Storage Model</a></h2>
<p>Collection instances, as specified in this SRFI, must be
treated as first-class values. This means they must be storable
in any variable or heterogenous datastructure. Collections may
be divisible, that is, may be composed as combinations of
like-typed collections. This SRFI defines modification
operators for both purely functional and any range of mutable
datastructures.</p>
<h2><a name="dispatch">Dispatch Mechanism</a></h2>
<p>The precise mechanism of dispatch for collection operators
capable of receiving multiple collection types is deliberately
unspecified. Many existing Scheme systems have generic
procedure functionality and/or object oriented extensions that
permit dispatch of functions based on the type of their
arguments. In other cases, self-contained programs may wish to
include a small, fixed set of collections and eschew the
overhead of a generic dispatch system, instead relying on
simple dispatch using explicit type checking (for example with
a <tt>cond</tt> statement). This SRFI is designed to allow
integration with any of those systems. Should a standard
dispatch system emerge in the future, this SRFI can be cleanly
integrated with it.</p>
<p>In the case where a generic dispatch system is used, it
needs only a minimum set of features. In particular, a single
dispatch system should be sufficient to implement collection
extension, as all operators where more than one generic type
may be passed can be implemented in terms of generic dispatch
on the first argument and utilization of a second generic
operator for its implementation. For example,
<tt>*-add-from</tt> need only dispatch on the primary
collection, and can use the fold operator to iterate through
the second collection's values to add them.</p>
<p>Future SRFIs may add additional abstract collection types,
but such types <i>must</i> inherit their interface singly from
an existing abstract type (including the base collection type)
in this SRFI. Concrete collection types can inherit from an
abstract type or other concrete types, but abstract types may
not inherit their interface from a concrete type. For example,
a new dictionary abstract type may yield <tt>#t</tt> when
applied from <tt>dictionary??</tt>, and
<tt>map?</tt> but must not for <tt>alist-map?</tt>.</p>
<h2><a name="foldingenums">Folding Enumerators</a></h2>
<p>Regardless of the collection type, all collections must be
supported by the generic folding enumeration procedures,
<tt>collection-fold-left</tt> and
<tt>collection-fold-right</tt> in addition to providing
concrete fold operators specific to the collection type
(<tt>%-fold-left, %-fold-right</tt>) (see <a href=
"#procedures">Procedures</a>). The collection fold procedures
take a collection and a folding function as arguments, as well
as any number of optional seed values. The collection fold
procedures then invoke the folding function on a value of the
collection and the seeds, and receive from the folding function
a <i>proceed</i> value followed by a like number of seed return
values. If the proceed value is non-false, the collection fold
procedure then invokes the same fold function again with the
next collection value and the newly returned seed values. This
continues until the elements of the collection are exhausted,
or the fold function returns false as the proceed value, at
which point the collection fold procedure returns the seed
values (if any) returned by the last call to the fold
function.</p>In an ordered collection, the left fold procedure
is required to apply the values in the order of the underlying
ordered collection. In addition, an ordered collection must
apply the values in the reverse order as the left fold when
enumerated by <tt>collection-fold-right</tt>. The
left fold procedure must proceed left to right
through the values of a directional collection such as a sequence
(increasing index order), while the right fold
proceeds right to left (decreasing index order.) <br>
<br>
<p>Finally, all Map and Sequence datastructures must be
supported by <tt>collection-fold-keys-left</tt> and
<tt>collection-fold-keys-right</tt> which during enumeration
applies the fold function to both the map's keys or the
sequence's indices and their corresponding values.</p>
<blockquote>
<h3><a name="stability">Enumeration Stability</a></h3>
<p>It is undefined in this SRFI whether the process of
enumeration over a collection is <i>stable</i>. A stable
enumeration will enumerate over the values that exist in a
given collection at a certain time. Removing, adding, or
changing values from the underlying collection while
enumerating will not cause those values to be missing,
discovered, or the new value observed respectively in future
steps of the enumeration. Modifying a collection while
enumerating is permitted to cause an error in either the
collection modification or in the enumeration, though this
behavior is discouraged.</p>
<p>Enumerations must, however, be stable in the absence of
updates. That is, once started, an enumeration should touch
all values in the collection (in the proper order as
described above) so long as no updates are performed on the
underlying collection. Enumerations need not be stable
between each other. That is, though an enumeration may be
internally stable, a second enumeration need not return the
values in the same order, unless the collection type and
attributes require this as described above.</p>
<p>Note that if a collection is purely functional, it will by
definition be stable in the presence of modification, as the
modified collection will be space-distinct from the
enumerated collection.</p>
</blockquote>
<h2><a name="equivalence">Equivalence</a></h2>
<p>Collections are considered equivalent with respect to
Scheme's <tt>equal?</tt> operator when they are of the same
type and contain a like number of values or mappings, and where
each value/mapping in one collection is <tt>equal?</tt> to a
value in the second collection.</p>
<p>For sequences and ordered collections the ordering of the
contained values must also be equivalent. For maps, each key in
the first map must be equal to a key in the second map, and the
value(s) mapped to by that key in each map must also be
equivalent. If the map is ordered, the order of the mappings
must also be the same.</p>
<p>Equivalence is checked with this SRFI's <tt>*=</tt>
operator, described later. Implementations may also extend
Scheme's <tt>equal?</tt> and <tt>eqv?</tt> operators to
collections, as long as the above semantics hold for
<tt>equal?</tt>. In other words,</p>
<pre>
(equal? <i>collection ...</i>)
</pre>must return the same value as
<pre>
(collection= equal? <i>collection ...</i>)
</pre>if the collections are of the same type, otherwise it must
return false.<br>
<br>
<h2><a name="immutablecols">Immutable Collections</a></h2>
<p>Any collection or instance of a collection may be immutable,
or made immutable at any point in its lifecycle. It is an error
to add, remove, or modify any values or mappings in an
immutable collection.</p>
<h2><a name="homogeneity">Homogeneity</a></h2>
<p>Collections may be homogeneous (capable of storing values of
only one type), though it is anticipated that the majority of
collections will be heterogenous. If a collection is
homogeneous, it is an error to attempt to store a value or key
of the wrong type within it.</p>
<h2><a name="sizevslength">Size versus Length</a></h2>
<p>Most collections possess a concept of size. The size of a
collection is the number of values or mappings it currently
contains. This differs from the concept of length in Scheme
datastructures, which corresponds to the number of cons cells
or vector slots the structure contains. A collection may
contain more cells or slots than required to contain its values
or mappings. An example might be a hashtable collection, which
may at any given time contain numerous unoccupied,
discontiguous cells. This matter is confused by the collections
specified in this API, whose size and length are the
same.</p>
<p>Put another way, <tt>collection-size</tt> should return the
number of enumeration steps that will occur, if known.</p>
<p>A collection may not have such a concept, in which case the
<tt>collection-size</tt> function must return <tt>#f</tt>.
Infinite collections (such as the collection of natural
numbers) or lazy collections (such as a network stream) are
examples of size-indeterminate collections.</p>
<h2><a name="infcoll">Infinite Collections</a></h2>
<p>Infinite collections are collections which contain or
generate a potentially unlimited number of values. Infinite
collections must return <tt>#f</tt> from the
<tt>collection-size</tt> function. Enumerations over an
infinite collection may proceed indefinitely, and may be
haltable only via the folding function.</p>
<p>Furthermore, left enumeration over an infinite collection
should whenever possible not require an unboundedly increasing
amount of storage as the enumeration proceeds. It should be
possible to enumerate indefinitely without consuming a fatal
amount of resources assuming the fold function does not
contribute to resource consumption. The behavior of
<tt>collection-fold-right</tt> is undefined on infinite
collections. Furthermore, the collection fold operators are
strict. Applying a fold operator to a stream will force each of
its values as they are applied to the fold function.</p>
<p>Some collections may use a self-referential representation,
such as the circular list for redundant value access. The
purpose of these collections is not to appear to contain an
infinite number of values, and are thus <i>not</i> considered
infinite. Naive iteration may proceed indefinitely, but the
folding enumerations of this SRFI must halt after enumerating
over the actual values of such a collection.
<tt>collection-size</tt> should return the number of actual
values in the collection unless it cannot efficiently do
so.</p>
<h2><a name="orderedcols">Ordered Collections</a></h2>
<p>Some collections maintain an ordering. These ordered
collections provide the additional property of guaranteeing a
left fold will progress over the collection in a least to
greatest value precedence fashion, as defined by the
collection's <i>ordering function</i>. An ordered collection's
constructor must take such a function as input. An ordering
function takes two arguments, and returns a boolean, indicating
whether the first value is to take precedence in the collection
over the second. As an example, an ordered collection of
numbers may use <tt><</tt> or <tt><=</tt> to order
numeric values added to the collection.</p>
<p>In order to ensure a consistent ordering in the collection,
the ordering function must return the same result for like
inputs over time (i.e. it must be functional). In most cases,
an ordering function should also treat like values as if tested
using <tt>equal?</tt> in order to ensure that duplicate values
are stored and retrieved consistently.</p>
<p>Many collections may derive the equivalence function from
the ordering function since the ordering function has the
property of being strict weak. It is suggested that ordered
collections do so if no equivalence function is provided by the
programmer. Otherwise the collection should document its
behavior in this respect.</p>
<h2><a name="valueequality">Bags, Sets, Maps and Value
Equality</a></h2>
<p>Maps store mappings from keys to values. Bags and Sets store
values in an opaque fashion which only indicates the presence
or absence of an object. In order to determine whether a given
value exists in a set or bag, or whether a given key matches
another in a map, these collections must use an <i>equivalence
function.</i> Other collections may use equivalence functions
for other purposes.</p>
<p>Any collection may require or accept an equivalence function
in its primary constructor (<tt>make-%</tt>) and in its
initializing constructor (<tt>%</tt>). The provided equivalence
function must take two values as input and return true if they
should be considered equivalent for the purposes of
<tt>contains?</tt>, value insertion, or removal. If provided,
an equivalence function follows an ordering function in an
ordered collection, but precedes any other arguments. If not
provided, <tt>eqv?</tt> is used as a default equivalence
function.</p>
<p>Maps additionally may take a second equivalence function for
comparing keys. Thus, the initializing constructor for a map
which accepts user-defined equivalence functions takes the form
<tt>(% [value-equivalence-function [key-equivalence-function]]
...)</tt> . If not provided, the map must use the value
equivalence function for comparing values and keys.</p>
<h2><a name="sequences">Sequences</a></h2>
<p>A sequence is a directional collection of values named by a
contiguous sequence of exact integers in the range
[0,<tt>(collection-size sequence)</tt>). Scheme vectors are a
natural example of a sequence. Sequences may or may not be limited.
If they are not, sequences allow new values to be
added at an undefined point in the sequence, after which some
value will exist at the index <tt>(- (collection-size
<i>sequence</i>) 1)</tt>.</p>
<p>Sequences may allow other indices outside the range
specified above to retrieve values that can also be retrieved
with indices inside the normal range. For example, a circular
list sequence may allow you to retrieve the last value at
position <tt>(- (collection-size seq) 1)</tt> and at position
<tt>-1</tt>. Such extensions must be well documented in the
specification of that sequence.</p>
<blockquote>
<h3><a name="flexsequences">Flexible Sequences</a></h3>
<p>Flexible sequences are sequences which allow insertion of
values at arbitrary points in the sequence. Inserting a value
at any position except the end of the sequence causes all
values with higher indices than the insertion point to
<i>shift right</i>, thus having an index increased by one.
Similarly, if values are removed from a sequence at any
position except the end, all values with higher indices are
<i>shifted left</i>, so their former indices are now
decreased by one.</p>
</blockquote>
<h2><a name="procedures">Procedures</a></h2>
<p>Below we describe the naming conventions and semantics of
Collections API functions, grouped by collection type. In each
function, the name of the collection would replace the
percentage mark in the function prototype. Function definitions
in child collection types or in attributes absolutely override
the same named function in the parent. For example
<tt>make-%</tt> in the ordered collection requires an
equivalence function while general collection constructors do
not.</p>
<p>When <tt>*</tt> is encountered in the definitions below, it
is implied that the asterisk is replaced with a function for
the name of the section type and all of its subtypes. For
example, if we had a 'list' flexible sequence collection, the
functions <tt>list-contains?</tt>,
<tt>flexible-sequence-contains?</tt>,
<tt>sequence-contains?</tt>, <tt>bag-contains?</tt> must all
exist, but <tt>collection-contains?</tt> does not. In addition,
it is an error to apply any such function to a collection whose
type does not satisfy that implied by the function name.</p>
<p>Encountering <tt>*</tt> as a function argument indicates
that the argument must be a collection of the type the function
is defined for, or any sub-type. These functions are
polymorphic on the input collection type.</p>
<p>When <tt>%</tt> is encountered in the definitions below, the
actual name of the collection is implied. Again, assuming a
'list' flexible sequence, <tt>make-%</tt> implies that the
function <tt>make-list</tt> exists. <tt>%</tt> as a return
value from a constructor indicates a collection of that specific type.</p>
<p>Encountering <tt>%</tt> as a return type indicates that a
collection of the same type as the input is returned.</p>
<p>Occasionally an operator will be specified in a collection
subtype which was already defined with the same signature
in a supertype or by an
attribute which the collection subtype is known to support.
This is done to clarify the semantics of the operator as
defined for that specific subtype.</p>
<p>Finally, some procedures defined below indicate more than
one item as a return value (following <tt>=></tt> in the
definition.) These procedures do in fact return more than one
value as multiple Scheme values. These values are returned as
if by the <tt>values</tt> Scheme function, <i>not</i> as a
Scheme list.</p>
<blockquote>
<h3><a name="flup">Functional, Linear Update, and Purely
Mutable Collections</a></h3>
<p>Functions in this SRFI which modify a collection are
provided in two flavors. Both <i>must</i> be implemented by
collections. Purely functional updating functions must not
side effect the input collection, but must return a new
collection which reflects the update. The returned collection
may share structure with the input collection, but the
effects of the update must <i>not</i> be reflected in the
input collection.</p>
<p>The linear update/purely mutable update operators share
the name of the functional update function but have the bang
(!) character appended. They too return a collection, which
is allowed to be the same as and/or share structure with the
input collection. However, side-effects to the input
collection are allowed. The caller <i>must</i> use the
returned collection to view the effects of the update. The
structure of the input collection after the modification is
undefined.</p>
<p>Purely mutable collections will <i>only</i> side effect
the input collection. They will therefor always return the
input collection as the output collection. The programmer may
ignore the resulting value if she is certain that the
collection is purely mutable. Operating on the input
collection after an update to a linear updating collection
may have unexpected results.</p>
<p>A collection may naturally be amenable to either purely
functional or purely side-effected updates. In the former
case, the linear updating version of the procedure may return
the purely functionally updated collection. This does not
conflict with the definition of the linear update procedure.
Conversely, the purely functional updating function can
return a distinct collection by cloning a collection which
cannot be functionally updated, and performing the
side-effecting modifications to the clone of the input
collection.</p>
<p>As the case of functionally updating a collection whose
structure is updatable only using side-effects can be
expensive (due to the worst-case need to clone a large
collection), the specifications of all collections are highly
encouraged to document the nature of their compatibility and
efficiency for both the functional and side-effecting update
functions.</p>
<h3><a name="enumprocs">Enumeration</a></h3><i>procedure:</i>
<b>collection-fold-left</b> collection fold-function seed ...
<b>=></b> seed ...<br>
<i>procedure:</i> <b>%-fold-left</b> % fold-function seed ...
<b>=></b> seed ...<br>
<blockquote>
<i>fold-function</i> is a procedure which accepts one more
than the number of seed values. The function accepts a
single collection value as its first argument, and the
seeds as remaining arguments. It must then return a
<i>proceed</i> value, which if false halts the enumeration,
as well as an equal number of returned seed values as
arguments. These seed values are then passed to the next
call to the fold function on the next collection value.<br>
<br>
When the collection values are exhausted or a false proceed
value is received from the fold function, the enumeration
ceases and the fold operator returns the last set of seed
values returned by the fold-function.
</blockquote><i>procedure:</i> <b>collection-fold-right</b>
collection fold-function seed ... <b>=></b> seed ...<br>
<i>procedure:</i> <b>%-fold-right</b> % fold-function seed
... <b>=></b> seed ...<br>
<blockquote>
Behaves like <tt>collection-fold-left</tt>, except that the
fold function is applied to the values of the collection in
reverse order.
</blockquote><i>procedure:</i>
<b>collection-fold-keys-left</b> collection fold-function
seed ... <b>=></b> seed ...<br>
<i>procedure:</i> <b>%-fold-keys-left</b> % fold-function
seed ... <b>=></b> seed ...<br>
<blockquote>
<p>Behaves like <tt>collection-fold-left</tt>, but
enumerates over the keys or indices of a map or sequence
respectively as well as the corresponding values. This
procedure is only defined for sequences and maps.</p>
<p>Also, the fold function of a key enumerator must accept
<i>two</i> more operands than the number of seed values.
The first two operands to the function are the key and
corresponding value at the current point in the
enumeration.</p><!--
<p>
Dictionaries should present one key/value pair at a
time to the fold function. In other words, if
two values are mapped by a single key, the fold function
is called twice, the first time with the key and the first
value, and the second time with the key and the second
value.
</p>-->
</blockquote><i>procedure:</i>
<b>collection-fold-keys-right</b> collection fold-function
seed ... <b>=></b> seed ...<br>
<i>procedure:</i> <b>%-fold-keys-right</b> % fold-function
seed ... <b>=></b> seed ...<br>
<blockquote>
Behaves like <tt>collection-fold-keys-left</tt>, but
enumerates in the reverse order over the keys or indices of
an ordered map or sequence and values. This procedure is
only defined for sequences and ordered maps.
</blockquote>
<h3><a name="collprocs">Collections</a></h3>
<p><i>procedure:</i> <b>collection?</b> value <b>=></b>
value</p>
<blockquote>
Returns a non-false value if the provided value is a
collection.
</blockquote>
<p><i>procedure:</i> <b>collection-name</b> collection
<b>=></b> symbol (%)</p>
<blockquote>
Returns the collection name of the provided collection. The
name is a symbol containing the type name of the specific
collection. A collection whose constructor is make-list,
for example, would have the symbol <tt>list</tt> returned
from <tt>collection-name</tt>
</blockquote>
<p><i>procedure:</i> <b>%?</b> value <b>=></b> value</p>
<blockquote>
Returns a non-false value iff the provided value is an
instance of the specific collection type.
</blockquote>
<p><i>procedure:</i> <b>*-size</b> * <b>=></b> exact
integer</p>
<blockquote>
If the collection has a concept of size, this function
returns the number of actual values in the collection (or
mapped values in a map) If it does not, <tt>#f</tt> must be
returned. When an integer is returned from this function,
it indicates that an enumeration will proceed exactly
<tt>*-size</tt> steps in the absence of any updates.
</blockquote>
<p><i>procedure:</i> <b>*-count</b> * value<b>=></b> exact
integer</p>
<blockquote>
Counts the number of times <tt>value</tt> occurs in the
collection, according to the collection's equivalence
function.
</blockquote>
<p><i>procedure:</i> <b>*-get-any</b> *
[absence-thunk]<b>=></b> value</p>
<blockquote>
Returns a value from the given collection. It is
unspecified which of its values is returned. If the
collection is empty, <tt>absence-thunk</tt> is invoked if
present and its value returned, otherwise <tt>#f</tt> is
returned.
</blockquote>
<p><i>procedure:</i> <b>*-empty?</b> * <b>=></b>
boolean</p>
<blockquote>
Returns a non-false value iff the given collection is known
to be empty. This function should return false if it is
known that there are values within the collection, or if it
is unknown whether any values exist.
</blockquote>
<p><i>procedure:</i> <b>*->list</b> * <b>=></b>
list</p>
<blockquote>
Returns a newly allocated list containing the values of the
collection. This can be done trivially with enumeration,
but an implementation may choose to allow this function to
behave more efficiently on certain collections. If the
collection is ordered, the list must contain the values of
the collection in the same order as the left collection
fold.
</blockquote>
<p><i>procedure:</i> <b>*-clear</b> * <b>=></b> %<br>
<i>procedure:</i> <b>*-clear!</b> * <b>=></b> %</p>
<blockquote>
Clears the collection. In all cases a collection of the
same type as the input is returned with no values or
mappings. It is an error to clear an immutable collection
and may be an error to clear a limited collection.
</blockquote>
<p><i>procedure:</i> <b>*=</b> elt= * ... <b>=></b>
boolean</p>
<blockquote>
<p>Compares the provided (zero or more) collections for
equivalence given the equivalence predicate <tt>elt=</tt>.
If fewer than two collections are provided, a non-false
value is returned. For any other number of collections, the
collections are compared pairwise from left to right. As
soon as any two collections contain different numbers of
values or mappings, or their values aren't equivalent as in
the section "Equivalence" above with the provided
<tt>elt=</tt> function used for value comparison, false is
returned. If all collections are equivalent, a non-false
value is returned.</p>
</blockquote>
<p><i>procedure:</i> <b>make-% =></b> %</p>
<blockquote>
Constructs a % collection.
</blockquote>
<p><i>procedure:</i> <b>%</b> value ... <b>=> %</b></p>
<blockquote>
Constructs a % collection with zero or more values provided
as its initial contents.
</blockquote>
<p><i>procedure:</i> <b>*-copy</b> * <b>=></b> %</p>
<blockquote>
Creates a new collection whose type and contents are the