Commit 8d40d10

add draft for eep-73
1 parent 44c891a commit 8d40d10

eeps/eep-0073.md

Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
Author: Isabell Huang <isabell(at)erlang(dot)org>
Status: Draft
Type: Standards Track
Created: 21-Sep-2024
Erlang-Version: 28
Post-History:
Replaces: 19
****
EEP 73: Zip generator
----

Abstract
========

This EEP proposes adding zip generators, written with `&&`, to
comprehensions in Erlang. The idea and syntax of zip generators
(comprehension multigenerators) were first proposed in EEP-19. Although the
syntax and usage of zip generators proposed by this EEP are mostly the same
as in EEP-19, the comprehension language of Erlang has undergone many changes
since EEP-19 was accepted. With an implementation that is compatible with all
existing comprehensions, this EEP defines the behavior of zip generators and
clarifies what is expected of the compiler.

Rationale
=========

A list comprehension creates a new list from one or more existing lists.
The lists are traversed in a dependent (nested) way. In the following
example, two input lists of length 2 produce a result of length 4.

    [{X,Y} || X <- [1,2], Y <- [3,4]] = [{1,3}, {1,4}, {2,3}, {2,4}].

In contrast, a parallel list comprehension (also known as a zip
comprehension) evaluates qualifiers (a generalization of lists) in parallel.
Qualifiers are first "zipped" together, and then evaluated. Many functional
languages ([Haskell][2], [Racket][3], etc.) and non-functional languages
(Python etc.) support this variation. If the two lists in the example above
were evaluated as a zip comprehension, the result would be `[{1,3}, {2,4}]`.

Zip comprehensions allow the user to conveniently iterate over several lists
at once. Without them, the standard way to accomplish the same task in Erlang
is to use `lists:zip` to zip two lists into two-tuples, or to use `lists:zip3`
to zip three lists into three-tuples. The `lists` module does not provide a
function to zip more than three lists. Functions like `lists:zip` always
create intermediate data structures. The compiler does not perform
deforestation to eliminate the unwanted tuples.

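As a minimal sketch of the status quo described above, the following module
pairs two lists with `lists:zip/2` and, for comparison, walks them in lockstep
by hand. The module and function names (`zip_today`, `add_pairs/2`,
`add_pairs_direct/2`) are hypothetical and only serve as an illustration; they
are not part of this proposal.

```erlang
-module(zip_today).
-export([add_pairs/2, add_pairs_direct/2]).

%% Pairwise sum via lists:zip/2: an intermediate list of 2-tuples
%% is built first and then traversed by the comprehension.
add_pairs(Xs, Ys) ->
    [X + Y || {X, Y} <- lists:zip(Xs, Ys)].

%% Hand-written recursion that walks both lists in lockstep
%% without building tuples. Lists of different lengths do not
%% match any clause, so the call fails.
add_pairs_direct([X | Xs], [Y | Ys]) ->
    [X + Y | add_pairs_direct(Xs, Ys)];
add_pairs_direct([], []) ->
    [].
```

Both functions return `[4,6]` for the inputs `[1,2]` and `[3,4]`. The second
form avoids the tuples but has to be rewritten for every combination of inputs,
which is the boilerplate a zip generator is meant to remove.
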
Zip generators are a generalization of zip comprehensions. Every set of
zipped generators is treated as one generator. Instead of constraining a
comprehension to be either zipped or non-zipped, any generator can be either
a zip generator (containing at least two generators zipped together) or a
non-zip generator (all existing generators are non-zip generators).
Therefore, zip generators can be mixed freely with all existing generators
and filters. A zip comprehension then becomes a special case of a
comprehension in which only zip generators are used.

In summary, zip generators remove the user's need to call a zip function
and allow any number of lists to be zipped at once. They can be used in
list, binary, and map comprehensions, and mixed freely with all existing
generators and filters. Internally, the compiler does not create any
intermediate data structure, which also removes the need for deforestation.

Specification
========================

Currently, Erlang supports three kinds of comprehensions: list
comprehensions, binary comprehensions, and map comprehensions. Their names
refer to the result of the comprehension: a list comprehension produces a
list, a binary comprehension produces a binary, and so on.

    [Expression || Qualifier(s)]     %% List Comprehension
    <<Expression || Qualifier(s)>>   %% Binary Comprehension
    #{Expression || Qualifier(s)}    %% Map Comprehension

A qualifier is one of the following kinds: filter, list generator, bitstring
generator, or map generator. Apart from filters, all kinds of qualifiers are
generators. Their names refer to the type on the right-hand side of `<-` or
`<=`. Generators have the following forms:

    Pattern <- List                  %% List Generator
    Pattern <= Bitstring             %% Bitstring Generator
    Pattern_1 := Pattern_2 <- Map    %% Map Generator

All generators and filters can be freely used and mixed in all three kinds
of comprehensions. The following example shows a list comprehension with a
list generator and a bitstring generator.

    [{X,Y} || X <- [1,2,3], <<Y>> <= <<4,5,6>>].

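To make the later contrast with zipping concrete, here is a small sketch of
what the comprehension above evaluates to in current Erlang. The module and
function names (`nested_gen`, `nested/0`) are hypothetical and used only for
illustration.

```erlang
-module(nested_gen).
-export([nested/0]).

%% The two generators are nested: for every X from the list, every
%% byte Y of the binary is visited, so 3 * 3 = 9 tuples are produced.
nested() ->
    [{X, Y} || X <- [1, 2, 3], <<Y>> <= <<4, 5, 6>>].
%% nested() returns
%% [{1,4},{1,5},{1,6},{2,4},{2,5},{2,6},{3,4},{3,5},{3,6}]
```
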
This EEP proposes the addition of zip generators. A zip generator is two or
more generators connected by `&&`. A zip generator can connect any number of
generators of the three kinds above. Zip generators can be used in list,
binary, or map comprehensions in the same way.

For example, if the two generators in the above example are combined into a
zip generator, the comprehension looks like this:

    [{X,Y} || X <- [1,2,3] && <<Y>> <= <<4,5,6>>].

A zip generator of the form `G1 && ... && Gn` is evaluated to have the same
result as `zip/n`, where

    zip([H1|T1], ..., [Hn|Tn]) ->
        [{H1,...,Hn} | zip(T1, ..., Tn)];
    zip([], ..., []) ->
        [].

Therefore, the above comprehension evaluates to `[{1,4}, {2,5}, {3,6}]`, which
is the same result as zipping the two sequences with `lists:zip/2`.

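The schematic `zip/n` above is not valid Erlang as written, so the following
sketch spells it out for n = 4 and also reproduces the example result without
`&&`. The names `zip_equiv`, `zip4/4`, and `example/0` are hypothetical, and the
code only emulates the specified result; unlike the proposed compiler support,
it does build tuples.

```erlang
-module(zip_equiv).
-export([zip4/4, example/0]).

%% zip/n from the text, written out for n = 4. The lists module
%% has no such function; this is an illustrative helper only.
zip4([H1 | T1], [H2 | T2], [H3 | T3], [H4 | T4]) ->
    [{H1, H2, H3, H4} | zip4(T1, T2, T3, T4)];
zip4([], [], [], []) ->
    [].

%% Emulation of [{X,Y} || X <- [1,2,3] && <<Y>> <= <<4,5,6>>]
%% in current Erlang: unpack the binary into a list of bytes,
%% then zip it with the list.
example() ->
    Ys = [Y || <<Y>> <= <<4, 5, 6>>],
    lists:zip([1, 2, 3], Ys).
%% example() returns [{1,4},{2,5},{3,6}]
```
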
A zip generator can also be used when a comprehension contains other non-zip
generators and/or filters. The `&&` symbol has a higher precedence than `,`.

The following example evaluates to `[{b,4}, {c,6}]`. The element `{a,2}` is
filtered out from the resulting list.

    [{X, Y} || X <- [a, b, c] && <<Y>> <= <<2, 4, 6>>, Y =/= 2].

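Because `&&` binds more tightly than `,`, the filter applies to the zipped
pairs as a whole. A rough equivalent in current Erlang, assuming a
hypothetical module `zip_filter`, zips first and filters second:

```erlang
-module(zip_filter).
-export([pairs/0]).

%% Emulates the zip generator followed by a filter: the two
%% sequences are zipped into pairs, then pairs with Y =:= 2
%% are dropped by the filter.
pairs() ->
    Xs = [a, b, c],
    Ys = [Y || <<Y>> <= <<2, 4, 6>>],
    [{X, Y} || {X, Y} <- lists:zip(Xs, Ys), Y =/= 2].
%% pairs() returns [{b,4},{c,6}]
```
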
Compared with using helper functions, a zip generator has one advantage: the
Erlang compiler does not generate any tuples when a zip generator is
translated into Core Erlang. The generated code reflects the programmer's
intent, which is to collect one element from every list at a time without
creating a list of tuples.

Error Behaviors
================

One would expect that when errors happen, a zip generator behaves the same
as `lists:zip/2`, `lists:zip3/3`, and the `zip/n` function above when more
than 3 lists are zipped together. The design and implementation of zip
generators aim to achieve that both for compiled code and for comprehensions
evaluated in the Erlang shell.

Generators of Different Lengths
--------------

`lists:zip/2` and `lists:zip3/3` fail if the given lists are not of the same
length, and `zip/n` crashes as well. Therefore, a zip generator raises a
`bad_generators` error when it discovers that the given generators are of
different lengths.

When a zip generator crashes because the generators it contains are of
different lengths, the error term is a tuple, where the first element is the
atom `bad_generators`, and the second element is a tuple that contains the
remaining data from all generators.

For example, this comprehension will crash at runtime.

    [{X,Y} || X <- [1,2,3] && Y <- [1,2,3,4]].

The resulting error tuple is `{bad_generators,{[],[4]}}`. This is because
when the comprehension crashes, the first list in the zip generator has
only the empty list `[]` left, while the second list in the zip generator
has `[4]` left.

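The shape of this error term can be illustrated in current Erlang with a
hand-written two-list zip that reports whatever is left of each input when
the lengths differ. This is only a sketch of the described behavior under the
assumption that the leftovers are reported as-is; the names `zip_error`,
`zip2/2`, and `demo/0` are hypothetical, and the real zip generator is
compiled differently and builds no tuples.

```erlang
-module(zip_error).
-export([zip2/2, demo/0]).

%% Zip two lists, raising {bad_generators, {Rest1, Rest2}} when
%% one list runs out before the other.
zip2(Xs, Ys) ->
    zip2(Xs, Ys, []).

zip2([X | Xs], [Y | Ys], Acc) ->
    zip2(Xs, Ys, [{X, Y} | Acc]);
zip2([], [], Acc) ->
    lists:reverse(Acc);
zip2(Xs, Ys, _Acc) ->
    %% Exactly one of Xs and Ys is exhausted here.
    error({bad_generators, {Xs, Ys}}).

%% Catches the error and returns its reason for inspection.
demo() ->
    try
        zip2([1, 2, 3], [1, 2, 3, 4])
    catch
        error:Reason -> Reason
    end.
%% demo() returns {bad_generators,{[],[4]}}
```
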
On the compiler's side, it is rather difficult to return the original zip
generator in the error message, or to point out which generator differs in
length from the others. The proposed error message aims to give the most
helpful information without imposing an extra burden on the compiler or
runtime.

Non-generator in a Zip Generator
-----------------

As the idea of zipping only makes sense for generators, a zip generator cannot
contain filters or any expression that is not a generator. Whenever it is
possible to catch such an error at compile-time, this error is caught by
the Erlang linter.

For example, the zip generator in the following comprehension contains a
non-generator.

    zip() -> [{X,Y} || X <- [1,2,3] && Y <- [1,2,3] && X > 0].

When the function is compiled, the linter points out that only generators are
allowed in a zip generator, together with the position of the non-generator.

    t.erl:6:55: only generators are allowed in a zip generator.
    % 6| zip() -> [{X,Y} || X <- [1,2,3] && Y <- [1,2,3] && X > 0].
    % | ^

Backwards Compatibility
========================

The operator `&&` is not currently used in Erlang, so no existing code is
affected by this addition.

Reference Implementation
========================

The [Git branch][1] contains the implementation for zip generators.

[1]: https://github.com/lucioleKi/otp/tree/isabell/compiler/zip-comprehension/OTP-19184
[2]: https://downloads.haskell.org/~ghc/5.00/docs/set/parallel-list-comprehensions.html
[3]: https://docs.racket-lang.org/reference/for.html

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.

[EmacsVar]: <> "Local Variables:"
[EmacsVar]: <> "mode: indented-text"
[EmacsVar]: <> "indent-tabs-mode: nil"
[EmacsVar]: <> "sentence-end-double-space: t"
[EmacsVar]: <> "fill-column: 70"
[EmacsVar]: <> "coding: utf-8"
[EmacsVar]: <> "End:"
[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "
