-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathlexical.tex
522 lines (446 loc) · 20.1 KB
/
lexical.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
%!TEX root = reference.tex
\chapter{Lexical Syntax}
\label{lexical}
\section{Characters}
\index{character set}
\index{Unicode}
\Sr source text is based on the Unicode\tm{} character set. This means that identifiers and string values may directly use any Unicode characters. However, all the standard operators and keywords fall in the ASCII subset of Unicode.
\section{Comments and White Space}
Input is tokenized according to rules that are similar to most modern programming languages: contiguous sequences of characters are assumed to belong to the same token unless the class of character changes -- for example, a punctuation mark separates sequences of letter characters. In addition, white space and comments serve as token boundaries; otherwise white space and comments are ignored by the higher-level semantics of the language.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{Ignorable}&\arrow&\ntRef{LineComment}\\
&\choice&\ntRef{BlockComment}\\
&\choice&\emph{WhiteSpace}
\end{eqnarray*}
\caption{Ignorable Characters}
\label{ignorableFig}
\end{figure}
There are two forms of comment: line comment and block comment.
\subsection{Line Comment}
\index{comment!line}
\index{line comment}
A line comment consists of a \q{--\spce} or a \q{--\bsl{}t} followed by all characters up to the next new-line. Here, \q{\spce} refers to the space character, and \q{\bsl{}t} refers to the Horizontal Tab.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{LineComment}&\arrow&(\q{--\spce}\choice\q{--\bsl{}t})\,\emph{char}\sequence{}\emph{char}\,\q{\bsl{}n}
\end{eqnarray*}
\caption{Line Comment}
\label{lineCommentFig}
\end{figure}
\subsection{Block Comment}
\index{comment!block}
\index{block comment}
A block comment consists of the characters \q{/*} followed by any characters and terminated by the characters \q{*/}.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{BlockComment}&\arrow&\q{/*}\ \emph{char}\sequence{}\emph{char}\,\q{*/}
\end{eqnarray*}
\caption{Block comments}
\label{blockCommentFig}
\end{figure}
\noindent
Each form of comment overrides the other: a \q{/*} sequence in a line comment is \emph{not} the start of a block comment, and a \q{--\spce} sequence in a block comment is similarly not the start of a line comment but the continuation of the block comment.
\section{Number Literals}
\label{numericLiteral}
\Sr supports integer values, floating point values, decimal values and character codes as numeric values.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{NumericLiteral}&\arrow&\ntRef{IntegerLiteral}\\
&\choice&\ntRef{Hexadecimal}\\
&\choice&\ntRef{FloatingPoint}\\
&\choice&\ntRef{Decimal}\\
&\choice&\ntRef{CharacterCode}
\end{eqnarray*}
\caption{Numeric Literals}
\label{numberFig}
\end{figure}
\subsection{Integral Literals}
\label{Integers}
\index{integer}
\index{number!integer}
\index{syntax!integer}
An integer is written using the normal decimal notation (see Figure~\vref{decimalFig}):
\begin{alltt}
1 34 -99
\end{alltt}
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{IntegerLiteral}&\arrow&[\q{-}]\,\emph{Digit}\ldots\emph{Digit}\plus[\q{L}\choice\q{l}]\\
\ntDef{Digit}&\arrow&\q{0}\choice\q{1}\choice\q{2}\choice\q{3}\choice\q{4}\choice\q{5}\choice\q{6}\choice\q{7}\choice\q{8}\choice\q{9}
\end{eqnarray*}
\caption{Integer Literals}
\label{decimalFig}
\end{figure}
A \q{long} integer is denoted by suffixing the number with a letter \q{L} or \q{l}:
\begin{alltt}
23L -99l
\end{alltt}
\begin{aside}
In general, \q{long} integers are \emph{not} interchangeable with \q{integer}s without explicit coercion.
\end{aside}
\begin{aside}
A \q{long} value is 64 bits, whereas an \q{integer} is 32 bits.
\end{aside}
\subsection{Hexadecimal Integers}
\label{Hexadecimals}
\index{hexadecimal}
\index{number!hexadecimal}
\index{syntax!hexadecimal}
A hexadecimal number is an integer written using hexadecimal notation. A hexadecimal number consists of a leading \q{0x} followed by a sequence of hex digits. For example,
\begin{alltt}
0x0 0xff 0x34fe
\end{alltt}
are all hexadecimals.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{Hexadecimal}&\arrow&\q{0x}\,\emph{Hex}\ldots\emph{Hex}\plus[\q{L}\choice\q{l}]\\
\ntDef{Hex}&\arrow&\q{0}\choice\q{1}\choice\q{2}\choice\q{3}\choice\q{4}\choice\q{5}\choice\q{6}\choice\q{7}\choice\q{8}\choice\q{9}\choice\q{a}\choice\q{b}\choice\q{c}\choice\q{d}\choice\q{e}\choice\q{f}
\end{eqnarray*}
\caption{Hexadecimal numbers}
\label{hexadecimalFig}
\end{figure}
Like \ntRef{IntegerLiteral} numbers, \emph{HexaDecimal}s may be suffixed by a \q{L} or \q{l} to indicate a \q{long} value.
\subsection{Floating Point Numbers}
\index{floating point}
\index{number! floating point}
\index{syntax! floating point number}
\label{FloatingPoint}
Floating point numbers are written using a notation that is familiar. For example,
\begin{alltt}
234.45 1.0e45
\end{alltt}
See Figure~\vref{floatingPointFig} for a complete syntax diagram for floating point numbers.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{FloatingPoint}&\arrow&\ntRef{Digit}\ldots\ntRef{Digit}\plus\q{.}\,\ntRef{Digit}\ldots\ntRef{Digit}\plus[\q{e}\,[\q{-}]\,\ntRef{Digit}\ldots\ntRef{Digit}\plus]
\end{eqnarray*}
\caption{Floating Point numbers}
\label{floatingPointFig}
\end{figure}
\begin{aside}
All floating point number are represented to a precision that is at least equal to 64-bit double precision. There is no equivalent of single-precision floating pointer numbers.
\end{aside}
\subsection{Decimal Numbers}
\index{arbitrary precision numbers}
\index{number!decimal}
\index{syntax! arbitrary precision number}
A decimal number is a decimal number with an arbitrary precision associated with it. It is intended to support calculations where conventional integral and floating point values become ineffective.
Decimal numbers are written as a normal decimal fraction number followed by the character \q{a} or \q{A}.
\begin{alltt}
123.45a
\end{alltt}
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{Decimal}&\arrow&\ntRef{Digit}\ldots\ntRef{Digit}\plus\,\q{.}\,\ntRef{Digit}\ldots\ntRef{Digit}\plus[\q{a}|\q{A}]
\end{eqnarray*}
\caption{Decimal Numbers}
\label{DecimalFig}
\end{figure}
\subsection{Character Codes}
\index{character code}
\index{number! character code}
\index{syntax! character code}
The character code notation allows a number to be based on the coding value of a character. Any Unicode character code point can be entered in this way:
\begin{alltt}
0cX 0c[ 0c\bsl{}n 0c
\end{alltt}
For example, \q{0c\bsl{}n} is the code point associated with the new line character, i.e., its value is \q{10}.
\begin{aside}
Unicode has the capability to represent up to one million character code points.
\end{aside}
\begin{figure}[htbp]
\label{characterCodeFig}
\begin{eqnarray*}
\ntDef{CharacterCode}&\arrow&\q{0c}\,\ntRef{CharRef}
\end{eqnarray*}
\caption{Character Codes}
\vspace{-2ex}
\end{figure}
A \ntRef{CharacterCode} has type \q{integer}. If necessary, it can be coerced to a \q{long} using a \ntRef{TypeCoercion} expression:
\begin{alltt}
(0cA as long)
\end{alltt}
\section{Strings and Characters}
\label{string}
\index{string literal}
\index{character reference}
\index{syntax! string literal}
A \q{string} consists of a sequence of characters. There is no specific type in \Sr for the characters themselves.
\begin{aside}
The reasons for this are due to the fact that Unicode \q{string} values cannot be easily represented as a unique sequence.
\end{aside}
\label{CharacterReference}
\index{character reference}
\index{syntax! character reference}
A \ntRef{CharRef} is a denotation of a single character.
\begin{figure}[h!]
\label{charRefFig}
\begin{eqnarray*}
\ntDef{CharRef}&\arrow&\emph{Char}\ \choice\ \ntRef{Escape}\\
\ntDef{Escape}&\arrow&\bsl\q{b}\choice\bsl\q{d}\choice\q{\bsl}\q{e}\choice\q{\bsl}\q{f}\choice\q{\bsl}\q{n}\choice\q{\bsl}\q{r}\choice\q{\bsl}\q{t}\choice\q{\bsl{}v}\choice\q{\bsl}\emph{Char}\choice\q{\bsl u}\ntRef{Hex}\sequence{}\ntRef{Hex}\q{;}
\end{eqnarray*}
\caption{Character Reference}
\end{figure}
For most characters, the character reference for that character is the character itself. For example, the string \q{"T"} contains the character \q{T}. However, certain standard characters are normally referenced by escape sequences consisting of a backslash character followed by other characters; for example, the new-line character is typically written \q{\bsl{}n}. The standard character references are shown in Table~\vref{CharRef}.
\begin{table}
\caption{\Sr Character References}\label{CharRef}
\begin{center}
\begin{tabular}{|ll|ll|}
\hline
Escape&Meaning&Escape&Meaning\\
\hline
\tt \bsl{}b&Back space&
\tt \bsl{}d&Delete\\
\tt \bsl{}e&Escape&
\tt \bsl{}f&Form Feed\\
\tt \bsl{}n&New line&
\tt \bsl{}r&Carriage return\\
\tt \bsl{}t&Tab&
\tt \bsl{}v&Vertical Tab\\
\tt \bsl{}u{\rm \emph{HexSeq}};&Unicode code point&
\tt \bsl{}\emph{Char}&The \emph{Char} itself\\
\hline
\end{tabular}
\end{center}
\end{table}
Apart from the standard character references, there is a hex encoding for directly encoding unicode characters that may not be available on a given keyboard:
\begin{alltt}
\bsl{}u34ff;
\end{alltt}
This notation accommodates the Unicode's varying width of character codes -- from 8 bits through to 20 bits.
\subsection{Quoted Strings}
\label{quotedString}
A string is a sequence of character references (see Section~\vref{CharacterReference}) enclosed in double quotes; alternately a string may take the form of a \ntRef{BlockString}.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{StringLiteral}&\arrow&\q{"}\ntRef{StrChar}\ldots\ntRef{StrChar}\q{"}\\
&\choice&\ntRef{BlockString}\\
\ntDef{StrChar}&\arrow&\ntRef{CharRef}\ \choice\,\ntRef{Interpolation}\\
\ntDef{Interpolation}&\arrow&[\,\q{\$}\ \choice\ \q{\#}\,]\ \ntRef{Identifier}\,[\,\q{:}\,\ntRef{FormattingSpec}\,\q{;}\,]\\
&\choice&[\,\q{\$}\ \choice\ \q{\#}\,]\ \q{(}\,\ntRef{Expression}\,\q{)}\,[\,\q{:}\,\ntRef{FormattingSpec}\,\q{;}\,]\\
\ntDef{FormattingSpec}&\arrow&\ntRef{StrChar}\ldots\ntRef{StrChar}
\end{eqnarray*}
\caption{Quoted String}\label{quotedStringFig}
\end{figure}
\begin{alltt}
"This is a string with a \bsl{}nnew line in the middle"
\end{alltt}
\begin{aside}
Strings are \emph{not} permitted to contain the new-line character -- other than as a character reference.
\end{aside}
\subsection{Block String}
\label{blockString}
\index{strings!block form of}
\index{block of data}
In addition to the `normal' notation for strings, there is a block form of string that permits raw character data to be processed as a string.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{BlockString}&\arrow&\q{"""}\emph{Char}\sequence{}\emph{Char}\q{"""}
\end{eqnarray*}
\caption{Block String Literal}
\label{blockStringFig}
\end{figure}
The block form of string allows any characters in the text, and performs no interpretation of those characters.
Block strings are written using triple quote characters at either end. Any new-line characters enclosed by the block quotes are considered to be part of the strings.
The normal interpretation of \q{\$} and \q{\#} characters as interpolation markers is suppressed within a block string.
\begin{alltt}
"""This is a block string with $ and
uninterpreted # characters"""
\end{alltt}
\begin{aside}
This form of string literal can be a convenient method for including block text into a program source.
\end{aside}
\section{Regular Expressions}
\label{regexps}
\index{regular expression}
A regular expression may be used to match against string values. Regular expressions are written using a regexp notation that is close to the common formats; with some simplifications and extensions.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{RegularExpression}&\arrow&\q{`}\ntRef{Regex}\q{`}\\
\ntDef{Regex}&\arrow&\q{.}\\
&\choice&\ntRef{CharRef}\\
&\choice&\ntRef{DisjunctiveGroup}\\
&\choice&\ntRef{CharacterClass}\\
&\choice&\ntRef{Binding}\\
&\choice&\ntRef{Regex}\ \ntRef{Cardinality}\\
&\choice&\ntRef{Regex}\ \ntRef{Regex}
\end{eqnarray*}
\caption{Regular Expressions}\label{regFig}
\end{figure}
Figure~\vref{regFig} shows the lexical syntax of regular expressions; however, see Section~\vref{regularExpressions} for a more detailed explanations of regular expression syntax and semantics.
\section{Identifiers}
\label{identifiers}
\index{identifier}
Identifiers are used to denote operators, keywords and variables.
\subsection{Alphanumeric Identifiers}
\index{alpha numeric identifier}
\label{alphaIdentifier}
Identifiers in \Sr are based on the Unicode definition of identifier. For the ASCII subset of characters, the definition corresponds to the common form of identifier -- a letter followed by a sequence of digits and letters. However, non-ASCII characters are also permitted in an identifier.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{Identifier}&\arrow&\ntRef{AlphaNumeric}\\
&\choice&\ntRef{MultiWordIdentifier}\\
&\choice&\ntRef{GraphicIdentifier}\\
\ntDef{AlphaNumeric}&\arrow&\ntRef{LeadChar}\,\ntRef{BodyChar}\sequence{}\ntRef{BodyChar}\\
\ntDef{LeadChar}&\arrow&\emph{LetterNumber}\\
&\choice&\emph{LowerCase}\\
&\choice&\emph{UpperCase}\\
&\choice&\emph{TitleCase}\\
&\choice&\emph{OtherNumber}\\
&\choice&\emph{OtherLetter}\\
&\choice&\emph{ConnectorPunctuation}\\
&\choice&\ntRef{Escape}\\
&\choice&\q{\_}\\
\ntDef{BodyChar}&\arrow&\emph{LeadChar}\\
&\choice&\emph{ModifierLetter}\\
&\choice&\emph{Digit}
\end{eqnarray*}
\caption{Identifier}
\label{identifierFig}
\end{figure}
The terms \emph{LetterNumber}, \emph{ModifierLetter} and so on; referred to in Figure~\vref{identifierFig} refer to standard character categories defined in Unicode 3.0.
\begin{aside}
This definition of \emph{Identifier} closely follows the standard definition of Identifier as contained in the Unicode specification.
\end{aside}
\subsection{Punctuation Symbols and Graphic Identifiers}
\label{punctuation}
The standard operators often have a graphic form -- such as \q{+}, and \q{=<}. Table~\vref{standardGraphicsTable} contains a complete listing of all the standard graphic-form identifiers.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{GraphicIdentifier}\ &\arrow&\ \ntRef{SymbolicChar}\sequence{}\ntRef{SymbolicChar}\\
\ntDef{SymbolicChar}\ &\arrow&\ \emph{Char}\ \text{{excepting}}\ \ntRef{BodyChar}
\end{eqnarray*}
\caption{Graphic Identifiers}
\label{graphicIdentifierFig}
\end{figure}
\begin{table}[h]
\begin{center}
\caption{Standard Graphic-form Identifiers}\label{standardGraphicsTable}
\begin{tabular}{|llllllll|}
\hline
\tt !&\tt \#<&\tt \%\%&\tt -->&\tt :!&\tt ;&\tt =>&\tt |\\
\tt !=&\tt \#<>&\tt *&\tt ->&\tt :\&&\tt ;*&\tt >&\tt |*\\
\tt \#&\tt \#@&\tt **&\tt .&\tt :*&\tt <&\tt >\#&\tt |>\\
\tt \#\#&\tt \#\tlda{}&\tt +&\tt ..,&\tt :+&\tt <=&\tt >=&\tt \tlda{}\\
\tt \#\$&\tt \$&\tt ++&\tt ./&\tt :-&\tt <=>&\tt ?&\tt \#*\\
\tt \$\$&\tt ,&\tt /&\tt ::&\tt <|&\tt @&\tt \#+&\tt \$=>\\
\tt ,..&\tt //&\tt :=&\tt =&\tt @@&\tt \#:&\tt \%&\tt -\\
\tt :&\tt :|&\tt ==>&\tt \_&\tt ?.&&&\\
\hline
\end{tabular}
\end{center}
\end{table}
\begin{aside}
Apart from their graphic form there is no particular semantic distinction between a graphic form identifier and a alphanumeric form identifier. In fact, new graphical tokens may be introduced as a result of declaring an operator -- see Section~\vref{symbolicOperators}.
\end{aside}
\subsection{Standard Keywords}
\label{keywords}
\index{standard keywords}
\index{keywords}
There are a number of keywords which are reserved by the language -- these may not be used as identifiers or in any other role.
Table~\vref{keywordTable} lists the standard keywords in the language.
\begin{longtable}[hb]{|llll|}
\caption{Standard Keywords\label{keywordTable}}\\
\hline
\endfirsthead
\multicolumn{4}{c}{
{Table \ref{keywordTable} Standard Keywords (cont.)}}\\
\hline
\endhead
\hline\multicolumn{4}{r}{\small\emph{continued\ldots}}\
\endfoot
\hline
\endlastfoot
\tt 'n&\tt fn&\tt memo&\tt request\\
\tt 's&\tt for&\tt merge&\tt spawn\\
\tt alias&\tt for\spce{}all&\tt not&\tt substitute\\
\tt all&\tt forall&\tt nothing&\tt such\spce{}that\\
\tt and&\tt from&\tt notify&\tt suchthat\\
\tt any&\tt fun&\tt of&\tt switch\\
\tt any\spce{}of&\tt function&\tt on&\tt sync\\
\tt anyof&\tt has&\tt on\spce{}abort&\tt then\\
\tt as&\tt has\spce{}kind&\tt open&\tt to\\
\tt assert&\tt has\spce{}type&\tt or&\tt try\\
\tt bound\spce{}to&\tt hastype&\tt order\spce{}by&\tt tuple\\
\tt case&\tt identifier&\tt order\spce{}descending\spce{}by&\tt type\\
\tt cast&\tt if&\tt otherwise&\tt unique\\
\tt catch&\tt ignore&\tt over&\tt unquote\\
\tt computation&\tt implementation&\tt package&\tt update\\
\tt contract&\tt implements&\tt pattern&\tt using\\
\tt counts\spce{}as&\tt implies&\tt perform&\tt valis\\
\tt def&\tt import&\tt prc&\tt valof\\
\tt default&\tt in&\tt private&\tt var\\
\tt delete&\tt instance\spce{}of&\tt procedure&\tt waitfor\\
\tt descending\spce{}by&\tt is&\tt ptn&\tt when\\
\tt determines&\tt is\spce{}tuple&\tt query&\tt where\\
\tt do&\tt java&\tt quote&\tt while\\
\tt down&\tt kind&\tt raise&\tt with\\
\tt else&\tt let&\tt reduction&\tt without\\
\tt exists&\tt matches&\tt ref&\tt yield\\
\tt extend&\tt matching&\tt remove&\\
\hline
\end{longtable}
\begin{aside}
On those occasions where it is important to have an identifier that is a keyword it is possible to achieve this by enclosing the keyword in parentheses.
For example, while \q{type} is a keyword in the language; enclosing the word in parentheses: \q{(type)} has the effect of suppressing the keyword interpretation.
Enclosing a name in parentheses also has the effect of suppressing any operator information about the name.
\end{aside}
\subsection{Multi-word Identifiers}
\label{multiword}
\index{identifier!multi-word}
\index{multiple word identifiers}
A \ntRef{MultiWordIdentifier} is an \ntRef{Identifier} that is written as a contiguous sequence of alphanumeric words. Although written as multiple words, a \ntRef{MultiWordIdentifier} is logically a \emph{single} identifier. For example, the combination of words:
\begin{alltt}
such that
\end{alltt}
is logically a single multi-word identifier whose name is ``\q{such\spce{}that}''.
\begin{figure}[htbp]
\begin{eqnarray*}
\ntDef{MultiWordIdentifier}&\arrow&\ntRef{AlphaNumeric}\plus\\
\end{eqnarray*}
\caption{Multi-Word Identifier}
\label{multiWordIentifierFig}
\end{figure}
There are a few standard \ntRef{MultiWordIdentifier}s, as outlined in Table~\vref{multiWords}. In addition, \ntRef{MultiWordIdentifier}s can be defined as operators (see Section~\vref{multiWordOperators}).
\begin{table}
\begin{center}
\begin{tabular}[htbp]{|llll|}
\hline
\tt any\spce{}of&\tt is\spce{}tuple&\tt group\spce{}by&\tt has\spce{}value\\
\tt such\spce{}that&\tt counts\spce{}as&\tt for\spce{}all&\tt order\spce{}by\\
\tt order\spce{}descending\spce{}by&\tt has\spce{}kind&\tt instance\spce{}of&\tt descending\spce{}by\\
\tt has\spce{}type&\tt bound\spce{}to&\tt or\spce{}else&\tt on\spce{}abort\\
\hline
\end{tabular}
\caption{Multi-Word Keywords\label{multiWords}}
\end{center}
\end{table}
Parts of a \ntRef{MultiWordIdentifier} may be separated by spaces and/or comments.
If a part of a \ntRef{MultiWordIdentifier} occurs out of sequence, i.e., not as part of the sequence that defines the identifier, then it is interpreted as a normal \ntRef{AlphaNumeric} identifier.
\subsection{Operator Defined Tokens}
\label{operatorDefinedTokens}
\index{identifer!operator defined}
\index{operator defined tokens}
When a new operator is defined -- see Section~\vref{newOperator} -- it may be that it takes the form of a normal identifier; as in:
\begin{alltt}
\#infix("hello",50)
\end{alltt}
However, it is also possible to define an operator from special characters:
\begin{alltt}
\#prefix("&&",80);
\end{alltt}
The operator `identifier' -- \q{\&\&} -- is not a normal alphanumeric identifier.
When such a declaration is processed, the tokenizer extends itself to include the new operator identifier as a valid token. Hence an operator may be constructed out of any characters.
\begin{aside}
It is permissible (if not advisable) to mix alphanumeric characters with non-alphanumeric characters in an operator. However, the tokenizer will not recognize such a mixed operator if the first character of the operator identifier is alphanumeric.
I.e., the operator declaration:
\begin{alltt}
\#postfix("alpha\%\&beta",90)
\end{alltt}
will not be processable as a single token. Hence such operators are not permitted. However, the operator:
\begin{alltt}
\#postfix("\&more",90)
\end{alltt}
would be valid -- again, still somewhat inadvisable.
\end{aside}