| 1 | + Author: Dániel Szoboszlay <[email protected]> |
| 2 | + Status: Draft |
| 3 | + Type: Standards Track |
| 4 | + Created: 01-Jul-2024 |
| 5 | + Erlang-Version: 28 |
| 6 | + Post-History: |
| 7 | +**** |
| 8 | +EEP 70: Strict and relaxed generators |
| 9 | +---- |
| 10 | + |
| 11 | +Abstract |
| 12 | +======== |
| 13 | + |
| 14 | +This EEP proposes the addition of a new, *strict* variant of all |
| 15 | +existing generators (list, bit string and map). Currently existing |
| 16 | +generators are *relaxed*: they ignore terms in the right-hand side |
| 17 | +expression that do not match the left-hand side pattern. Strict |
| 18 | +generators on the other hand shall fail with exception `badmatch`. |
| 19 | + |
| 20 | +Rationale |
| 21 | +========= |
| 22 | + |
| 23 | +The motivation for strict generators is that relaxed generators |
| 24 | +can hide the presence of unexpected elements in the input data of a |
| 25 | +comprehension. For example, consider the snippet below: |
| 26 | + |
| 27 | + [{User, Email} || #{user := User, email := Email} <- all_users()] |
| 28 | + |
| 29 | +This list comprehension silently filters out users that don't have an |
| 30 | +email address. This may be an issue if the input data is potentially |
| 31 | +incorrect, for example when `all_users/0` reads the users from a JSON |
| 32 | +file. Cautious code that prefers crashing over silently filtering out |
| 33 | +incorrect input therefore has to use a more verbose map |
| 34 | +function: |
| 35 | + |
| 36 | + lists:map(fun(#{user := User, email := Email}) -> {User, Email} end, |
| 37 | + all_users()) |
| 38 | + |
| 39 | +Unlike the generator, the anonymous function would crash on a user |
| 40 | +without an email address. Strict generators would allow similar |
| 41 | +semantics in comprehensions too: |
| 42 | + |
| 43 | + [{User, Email} || #{user := User, email := Email} <:- all_users()] |
| 44 | + |
| 45 | +This generator would crash (with a `badmatch` error) if the pattern |
| 46 | +wouldn't match an element of the list. |
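
The difference between the relaxed generator and the `lists:map/2` alternative can be checked on any recent Erlang/OTP release. The following is a minimal sketch only: the module name `relaxed_demo` and its function names are made up for illustration, and the user list is passed in as an argument instead of being read by `all_users/0`.

```erlang
-module(relaxed_demo).
-export([relaxed/1, cautious/1]).

%% Relaxed generator: elements that do not match the pattern are
%% silently skipped.
relaxed(Users) ->
    [{User, Email} || #{user := User, email := Email} <- Users].

%% lists:map/2 with a single-clause fun: an element that does not match
%% the fun head makes the call fail (with a function_clause error).
cautious(Users) ->
    lists:map(fun(#{user := User, email := Email}) -> {User, Email} end,
              Users).
```

Given a list that contains `#{user => bob}` (no `email` key), `relaxed_demo:relaxed/1` returns a result without `bob`, while `relaxed_demo:cautious/1` raises an error. The strict generator is meant to bring the second behaviour to comprehensions, reporting `badmatch` instead of `function_clause`.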
| 47 | + |
| 48 | +The proposed operators for strict generators are `<:-` (for lists |
| 49 | +and maps) and `<:=` (for bit strings) instead of `<-` and `<=`. This |
| 50 | +syntax was chosen because `<:-` and `<:=` somewhat resemble the `=:=` |
| 51 | +operator that tests whether two terms match, and at the same time keep |
| 52 | +the operators short and easy to type. Having the two types of operators |
| 53 | +differ by a single character, `:`, also makes the operators easy to |
| 54 | +remember as "`:` means strict." |
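
As an illustration of the proposed operators, the sketch below uses one strict generator of each kind. It assumes a compiler that already accepts the new syntax (Erlang/OTP 28 according to the header of this EEP, or a build of the reference implementation); the module name `strict_demo` and the function names are hypothetical.

```erlang
-module(strict_demo).
-export([from_list/1, from_map/1, from_bits/1]).

%% Strict list generator: every element must match {ok, X}.
from_list(List) ->
    [X || {ok, X} <:- List].

%% Strict map generator: every association must match K := {ok, V}.
from_map(Map) ->
    #{K => V || K := {ok, V} <:- Map}.

%% Strict bit string generator: the input must split into 16-bit parts
%% with nothing left over.
from_bits(Bits) ->
    [X || <<X:16>> <:= Bits].
```

Under the semantics proposed here, each function returns normally when all of its input matches and fails with a `badmatch` error otherwise, for example on `strict_demo:from_list([{ok, 1}, error])`.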
| 55 | + |
| 56 | +Alternate Designs |
| 57 | +================= |
| 58 | + |
| 59 | +There are numerous other ways of achieving the goal of crashing upon |
| 60 | +encountering non-matching input: |
| 61 | + |
| 62 | +1. Converting the comprehension to a `map` call, just as in the example |
| 63 | + of the previous section. |
| 64 | + |
| 65 | +2. Moving the pattern matching that may fail to the result expression of |
| 66 | + the comprehension: |
| 67 | + |
| 68 | + [begin |
| 69 | + #{user := User, email := Email} = Usr, |
| 70 | + {User, Email} |
| 71 | + end |
| 72 | + || Usr <- all_users()] |
| 73 | + |
| 74 | +3. Moving the pattern matching that may fail to a filter in the |
| 75 | + comprehension (which would still have to evaluate to a boolean, |
| 76 | + making this solution quite awkward looking): |
| 77 | + |
| 78 | + [{User, Email} |
| 79 | + || Usr <- all_users(), |
| 80 | + (#{user := User, email := Email} = Usr) > 0] |
| 81 | + |
| 82 | +4. The same as before, but using binders proposed in [EEP 12][]: |
| 83 | + |
| 84 | + [{User, Email} |
| 85 | + || Usr <- all_users(), |
| 86 | + #{user := User, email := Email} = Usr] |
| 87 | + |
| 88 | +Most of these solutions are much more verbose than the strict |
| 89 | +generator syntax, undermining the compactness of comprehensions. But |
| 90 | +there are more serious issues too: |
| 91 | + |
| 92 | +* 1 would not work for bit string comprehensions, because there isn't |
| 93 | + a `map` function available for bit strings. |
| 94 | + |
| 95 | +* 2 would make it impossible to use sub terms of the pattern (`User` |
| 96 | + and `Email` in this example) in subsequent generators or filters. |
| 97 | + |
| 98 | +* The intent of the code isn't clear in most of these solutions |
| 99 | + (definitely not in 3, and arguably not in 2 and 4 either). |
| 100 | + |
| 101 | +* None of these techniques can solve the problem of final non-matching |
| 102 | + bits of a bit string. |
| 103 | + |
| 104 | +The issue with bit string generators is that unlike lists and maps, bit |
| 105 | +strings don't have natural "elements" the generator could iterate over. |
| 106 | +Instead the pattern in the generator dictates where to split the bit |
| 107 | +string into parts. This can easily lead to situations where the |
| 108 | +bit string ends with some bits that don't match the pattern, like in |
| 109 | +this example: |
| 110 | + |
| 111 | + [X || <<X:16>> <= <<1,2,3>>] |
| 112 | + |
| 113 | +The existing generator would skip these final non-matching bits, and |
| 114 | +this behaviour cannot be changed due to backward compatibility reasons. |
| 115 | +It is only possible to guarantee that a bit string generator would fully |
| 116 | +consume its input by introducing a new type of bit string generator or |
| 117 | +some other new syntax that would alter the behaviour of the existing |
| 118 | +generator. |
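
The skipping of trailing bits by the existing generator can be reproduced with a small sketch like the one below. The module name `bits_demo` and the function name `words/1` are assumptions made for the example, and the code only uses the relaxed `<=` generator, so it does not depend on the syntax proposed here.

```erlang
-module(bits_demo).
-export([words/1]).

%% Relaxed bit string generator: the input is consumed in 16-bit parts,
%% and trailing bits that cannot fill a whole part are skipped.
words(Bits) ->
    [X || <<X:16>> <= Bits].
```

Calling `bits_demo:words(<<1,2,3>>)` returns `[258]`: the first two bytes form the 16-bit integer 258, and the final byte `3` is dropped without any error. With the strict `<:=` generator the same input is expected to fail with a `badmatch` error instead.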
| 119 | + |
| 120 | +Reference Implementation |
| 121 | +======================== |
| 122 | + |
| 123 | +Current implementation: [PR #8625][]. |
| 124 | + |
| 125 | +Backward Compatibility |
| 126 | +====================== |
| 127 | + |
| 128 | +There are no backward compatibility issues with the proposed new syntax, |
| 129 | +since the `<:-` and `<:=` operators used to be invalid syntax in |
| 130 | +previous versions of Erlang. |
| 131 | + |
| 132 | +[PR #8625]: https://github.com/erlang/otp/pull/8625 |
| 133 | + "Reference implementation PR" |
| 134 | + |
| 135 | +[EEP 12]: eep-0012.md |
| 136 | + "Extensions to comprehensions, O'Keefe" |
| 137 | + |
| 138 | +Copyright |
| 139 | +========= |
| 140 | + |
| 141 | +This document is placed in the public domain or under the CC0-1.0-Universal |
| 142 | +license, whichever is more permissive. |
| 143 | + |
| 144 | +[EmacsVar]: <> "Local Variables:" |
| 145 | +[EmacsVar]: <> "mode: indented-text" |
| 146 | +[EmacsVar]: <> "indent-tabs-mode: nil" |
| 147 | +[EmacsVar]: <> "sentence-end-double-space: t" |
| 148 | +[EmacsVar]: <> "fill-column: 70" |
| 149 | +[EmacsVar]: <> "coding: utf-8" |
| 150 | +[EmacsVar]: <> "End:" |
| 151 | +[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: " |