diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml new file mode 100644 index 0000000000..6dc93f06d9 --- /dev/null +++ b/.github/workflows/docs.yml @@ -0,0 +1,21 @@ +name: docs +on: + push: + branches: + - docs +permissions: + contents: write +jobs: + deploy: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v3 + - uses: actions/setup-python@v4 + with: + python-version: 3.x + - uses: actions/cache@v2 + with: + key: ${{ github.ref }} + path: .cache + - run: pip install mkdocs-material + - run: mkdocs gh-deploy --force \ No newline at end of file diff --git a/docs/cpp2/common.md b/docs/cpp2/common.md new file mode 100644 index 0000000000..2801f34cbf --- /dev/null +++ b/docs/cpp2/common.md @@ -0,0 +1,201 @@ +# Common programming concepts + +## `main` + +As always, `main` is the entry point of the program. + +`main` can have either: + +- No parameters:   **`#!cpp main: () /*etc.*/`** + +- One parameter of implicit type named `args`:   **`#!cpp main: (args) /*etc.*/`** + + - The type of `args` cannot be explicitly specified. It is always `cpp2::args_t`, which behaves similarly to a `#!cpp const std::array<std::string_view>`. + + - Using `args` performs zero heap allocations. Every `string_view` is directly bound to the string storage provided by the host environment. + + - `args.argc` and `args.argv` additionally provide access to the raw C/C++ `main` parameters. + +``` cpp title="main with (args)" hl_lines="5 9" +// Print out command line arguments, then invoke +// a Qt event loop for a non-UI Qt application +main: (args) -> int += { + for args do (arg) { + std::cout << arg << "\n"; + } + + app: QCoreApplication = (args.argc, args.argv); + return app.exec(); +} +``` + +`main` can return: + +- `#!cpp void`, the default return type for functions. No `#!cpp return` statement is allowed in the body. In this case, the compiled Cpp1 code behaves as if `main` returned `#!cpp int`. + +- `#!cpp int`. 
If the body has no `#!cpp return` statement, the default is to `#!cpp return 0;` at the end of the function body. + +- Some other type that your Cpp1 compiler(s) supports as a nonstandard extension. + + +## Comments + +The usual `#!cpp // line comments` and `#!cpp /* stream comments */` are supported. For example: + +``` cpp title="Writing comments" +// A line comment: After //, the entire +// rest of the line is part of the comment + +/* + A stream comment: After /*, everything until the + next * / (without a space between) is part of the + comment. Note that stream comments do not nest. + */ +``` + + +## Reserved keywords + +Cpp2 has very few globally reserved keywords. Nearly all keywords are contextual: they have their special meaning only when they appear in a particular place in the grammar. For example: + +- `new` is used as an ordinary function to do allocation (e.g., `shared.new(1, 2, 3)`). + +- `struct` and `enum` are used as function names in the metafunctions library. + +- `type` can be used as an ordinary name (e.g., `std::common_type::type`). + +In rare cases, usually when consuming code written in other languages, you may need to write a name that is a reserved keyword. The way to do that is to prefix it with `__identifier__`, which treats it as an ordinary identifier (without the prefix). + + +## Fundamental data types + +Cpp2 supports the same fundamental types as today's Cpp1, but additionally provides the following aliases in namespace `cpp2`: + +| Fixed-width types | Synonym for | +|---|---| +| `i8` | `std::int8_t` | +| `i16` | `std::int16_t` | +| `i32` | `std::int32_t` | +| `i64` | `std::int64_t` | +| `u8` | `std::uint8_t` | +| `u16` | `std::uint16_t` | +| `u32` | `std::uint32_t` | +| `u64` | `std::uint64_t` | + +| Variable-width types
(Cpp2-compatible single-word names) | Synonym for (these multi-word
names are not allowed in Cpp2) | +|---|---| +| `ushort` | `#!cpp unsigned short` | +| `uint` | `#!cpp unsigned int` | +| `ulong` | `#!cpp unsigned long` | +| `longlong` | `#!cpp long long` | +| `ulonglong` | `#!cpp unsigned long long` | +| `longdouble` | `#!cpp long double` | + +| For compatibility/interop only,
so deliberately ugly names | Synonym for (these multi-word
names are not allowed in Cpp2) | Notes | +|---|---|---| +| `_schar` | `#!cpp signed char` | Normally, prefer `i8` instead | +| `_uchar` | `#!cpp unsigned char` | Normally, prefer `u8` instead | + +## Type qualifiers + +Types can be qualified with `#!cpp const` and `#!cpp *`. Types are written left-to-right, so a qualifier always applies to what immediately follows it. For example, to declare a `#!cpp const` pointer to a non-`#!cpp const` pointer to a `#!cpp const i32` object, write: + +``` cpp title="Using type qualifiers" +// A const pointer to a non-const pointer to a const i32 object +p: const * * const i32; +``` + +## Literals + +Cpp2 supports the same `#!cpp 'c'`haracter, `#!cpp "string"`, binary, integer, and floating point literals as Cpp1, including most Unicode encoding prefixes and raw string literals. + +Cpp2 supports using Cpp1 user-defined literals for compatibility, to support seamlessly using existing libraries. However, because Cpp2 has unified function call syntax (UFCS), the preferred way to author the equivalent in Cpp2 is to just write a function or type name as a `.` call suffix. For example: + +- You can create a `u8` value by writing either `u8(123)` or **`123.u8()`**. [^u8using] + +- You can write a 'constexpr' function like `#!cpp nm: (value: i64) -> my_nanometer_type == { /*...*/ }` that takes an integer and returns a value of a strongly typed "nanometer" type, and then create a `nm` value by writing either `nm(123)` or **`123.nm()`**. + +Both **`123.nm()`** and **`123.u8()`** are very similar to user-defined literal syntax, and more general. + +## Operators + +Operators have the same precedence and associativity as in Cpp1, but some unary operators that are prefix (always or sometimes) in Cpp1 are postfix (always) in Cpp2. + +### Unary operators + +The operators `!`, `+`, and `-` are prefix, as in Cpp1. 
For example: + +``` cpp title="Using prefix operators" +if !vec.empty() { + vec.emplace_back( -123.45 ); +} +``` + +| Unary operator | Cpp2 example | Cpp1 equivalent | +|---|---|---| +| `!` | `!vec.empty()` | `!vec.empty()` | +| `+` | `#!cpp +100` | `#!cpp +100` | +| `-` | `#!cpp -100` | `#!cpp -100` | + +The operators `.`, `*`, `&`, `~`, `++`, `--`, `()`, `[]`, and `$` are postfix. For example: + +``` cpp title="Using postfix operators" +// Cpp1 examples, from cppfront's own source code: +// address = &(*tokens)[pos + num]; +// is_void = *(*u)->identifier == "void"; +// Cpp2 equivalents: + address = tokens*[pos + num]&; + is_void = u**.identifier* == "void"; +``` + +Postfix notation lets the code read fluidly left-to-right, in the same order in which the operators will be applied, and lets declaration syntax be consistent with usage syntax. For more details, see [Design note: Postfix operators](https://github.com/hsutter/cppfront/wiki/Design-note%3A-Postfix-operators). + +> Note: The function call syntax `f(x)` calls a namespace-scope function, or a function object, named `f`. The function call syntax `x.f()` is a unified function call syntax (aka UFCS) that calls a type-scope function in the type of `x` if available, otherwise calls the same as `f(x)`. For details, see [Design note: UFCS](https://github.com/hsutter/cppfront/wiki/Design-note%3A-UFCS). 
+ +| Unary operator | Cpp2 example | Cpp1 equivalent | +|---|---|---| +| `#!cpp .` | `#!cpp obj.f()` | `#!cpp obj.f()` | +| `#!cpp *` | `#!cpp pobj*.f()` | `#!cpp (*pobj).f()` or `#!cpp pobj->f()` | +| `#!cpp &` | `#!cpp obj&` | `#!cpp &obj` | +| `#!cpp ~` | `#!cpp val~` | `#!cpp ~val` | +| `#!cpp ++` | `#!cpp iter++` | `#!cpp ++iter` | +| `#!cpp --` | `#!cpp iter--` | `#!cpp --iter` | +| `(` `)` | `#!cpp f( 1, 2, 3)` | `#!cpp f( 1, 2, 3)` | +| `[` `]` | `#!cpp vec[123]` | `#!cpp vec[123]` | +| `$` | `val$` | _reflection — no Cpp1 equivalent yet_ | + +> Because `++` and `--` always have in-place update semantics, we never need to remember "use prefix `++`/`--` unless you need a copy of the old value." If you do need a copy of the old value, just take the copy before calling `++`/`--`. + +Unary suffix operators must not be preceded by whitespace. When `*`, `&`, and `~` are used as binary operators they must be preceded by whitespace. For example: + +| Unary postfix operators that
are also binary operators | Cpp2 example | Cpp1 equivalent | +|---|---|---| +| `#!cpp *` | `#!cpp pobj* * 42` | `#!cpp (*pobj)*42` | +| `#!cpp &` | `#!cpp obj& & mask`

(note: allowed in unsafe code only) | `#!cpp &obj & mask` | +| `#!cpp ~` | `#!cpp val~ ~ bitcomplement` | `#!cpp ~val ~ bitcomplement` | + +For more details, see [Design note: Postfix unary operators vs binary operators](https://github.com/hsutter/cppfront/wiki/Design-note%3A-Postfix-unary-operators-vs-binary-operators). + + +### Binary operators + +Binary operators are the same as in Cpp1. From highest to lowest precedence: + +| Binary operators grouped by precedence | +|---| +| `*`, `/`, `%` | +| `+`, `-` | +| `<<`, `>>` | +| `<=>` | +| `<`, `>`, `<=`, `>=` | +| `==`, `!=` | +| `&` | +| `^` | +| `|` | +| `&&` | +| `||` | +| `=` and compound assignment | + + +[^u8using]: Or `123.cpp2::u8()` if you aren't `using` the namespace or that specific name. diff --git a/docs/cpp2/contracts.md b/docs/cpp2/contracts.md new file mode 100644 index 0000000000..b72e003773 --- /dev/null +++ b/docs/cpp2/contracts.md @@ -0,0 +1,125 @@ + +# Contracts + +## Overview + +Cpp2 currently supports three kinds of contracts: + +- **Preconditions and postconditions.** A function declaration can include `pre(condition)` and `post(condition)` before the `= /* function body */`. Before entering the function body, preconditions are fully evaluated and postconditions perform their captures, if any. Immediately before exiting the function body via a normal return, postconditions are evaluated. If the function exits via an exception, postconditions are not evaluated. + +- **Assertions.** Inside a function body, you can write `assert(condition)` assertion statements. Assertions are evaluated when control flow passes through them. + +Notes: + +- `condition` is an expression that evaluates to `#!cpp true` or `#!cpp false`. + +- Optionally, `condition` may be followed by `, "message"`, a message to include if a violation occurs. For example, `pre(condition, "message")`. 
+ +- Optionally, a `<Group>` can be written inside `<` `>` angle brackets immediately before the `(`, to designate that this test is part of the [contract group](#contract-groups) named `Group`. If a violation occurs, `Group.report_violation()` will be called. For example, `pre<Group>(condition)`. + +For example: + +``` cpp title="Precondition and postcondition examples" hl_lines="2 3" +insert_at: (container, where: int, val: int) + pre<Bounds>( 0 <= where <= container.ssize(), "position (where)$ is outside 'container'" ) + post ( container.ssize() == container.ssize()$ + 1 ) += { + _ = container.insert( container.begin()+where, val ); +} +``` + +In this example: + +- The `$` captures are performed before entering the function. + +- The precondition is part of the `Bounds` safety contract group and is checked before entering the function. If the check fails, say because `where` is `#!cpp -1`, then `#!cpp cpp2::Bounds.report_violation("position -1 is outside 'container'")` is called. + +- The postcondition is part of the `Default` safety contract group. If the check fails, then `#!cpp cpp2::Default.report_violation()` is called. + + +## Contract groups + +Contract groups are useful for enabling, disabling, or [setting custom handlers](#violation-handlers) independently for different groups of contracts. A contract group `G` is just the name of an object that can be called with `G.report_violation()` and `G.report_violation(message)`, where `message` is a `* const char` C-style text string. + +You can create new contract groups just by creating new objects that have a `.report_violation` function. The object's name is the contract group's name. The object can be at any scope: local, global, or heap. 
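The "any object with a `.report_violation` function" rule can be modeled in plain C++ as a rough sketch. Everything named here is hypothetical and chosen only for illustration: `recording_group` is an invented group type, and `check` is an invented stand-in for what a contract test does on failure. Neither is part of Cpp2, and the code cppfront actually generates for contracts may look quite different.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical group type: it qualifies as a contract group only because
// it offers report_violation() and report_violation(message)
struct recording_group {
    std::vector<std::string> violations;

    void report_violation() { violations.emplace_back("(no message)"); }
    void report_violation(const char* message) { violations.emplace_back(message); }
};

// Hypothetical stand-in for a contract check: on failure, notify the group
template <typename Group>
void check(Group& group, bool condition, const char* message) {
    if (!condition) {
        group.report_violation(message);
    }
}

// Exercise the group once with a passing check and once with a failing one
void run_group_example() {
    recording_group my_group;

    check(my_group, 1 + 1 == 2, "arithmetic is broken");
    assert(my_group.violations.empty());

    check(my_group, false, "this check always fails");
    assert(my_group.violations.size() == 1);
    assert(my_group.violations[0] == "this check always fails");
}
```

Because the group is an ordinary object, its behavior on violation (recording, logging, terminating) is a property of the object rather than of the check itself, which is what lets different groups be handled independently.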
+ +For example, here are some ways to use contract groups, using [`cpp2::contract_group`](#violation_handlers), a convenient ready-made group type: + +``` cpp title="Using contract groups" hl_lines="1 4 6 10-12" +GroupA: cpp2::contract_group = (); // a global group + +func: () = { + GroupB: cpp2::contract_group = (); // a local group + + GroupC := new<cpp2::contract_group>(); // a dynamically allocated group + + // ... + + assert<GroupA >( some && condition ); + assert<GroupB >( another && condition ); + assert<GroupC*>( another && condition ); +} +``` + +You can make all the objects in a class hierarchy into a contract group by having a `.report_violation` function in a base class, and then writing contracts in that hierarchy using `<this>` as desired. This technique is used in cppfront's own reflection API: + +``` cpp title="Example of using 'this' as a contract group, from cppfront 'reflect.h2'" hl_lines="8 9" +function_declaration: @copyable type = +{ + // inherits from a base class that provides '.report_violation' + + // ... + + add_initializer: (inout this, source: std::string_view) + pre<this> (!has_initializer(), "cannot add an initializer to a function that already has one") + pre<this> (parent_is_type(), "cannot add an initializer to a function that isn't in a type scope") + = { /*...*/ } + + // ... + +} +``` + + +## `cpp2::contract_group`, and customizable violation handling + +The contract group object could also provide additional functionality. For example, Cpp2 comes with the `cpp2::contract_group` type which allows installing a customizable handler for each object. Each object can only have one handler at a time, but the handler can change during the course of the program. `contract_group` supports: + +- `.set_handler(pfunc)` accepts a pointer to a handler function with signature `#!cpp * (* const char)`. + +- `.get_handler()` returns the current handler function pointer, or null if none is installed. + +- `.has_handler()` returns whether there is a current handler installed. 
+ +- `.enforce(condition, message)` evaluates `condition`, and if it is `false` then calls `.report_violation(message)`. + +Cpp2 comes with five predefined `contract_group` global objects in namespace `cpp2`: + +- `Default`, which is used as the default contract group for contracts that don't specify a group. + +- `Type` for type safety checks. + +- `Bounds` for bounds safety checks. + +- `Null` for null safety checks. + +- `Testing` for general test checks. + +For these groups, the default handler is `cpp2::report_and_terminate`, which prints information about the violation to `std::cerr` and then calls `std::terminate()`. But you can customize it to do anything you want, including integrating with any third-party or in-house error reporting system your project is already using. For example: + +``` cpp title="Example of customized contract violation handler" hl_lines="2 8-9" +main: () -> int = { + cpp2::Default.set_handler(call_my_framework&); + assert(false, "this is a test, this is only a test"); + std::cout << "done\n"; +} + +call_my_framework: (msg: * const char) = { + // You can do anything you like here, including arbitrary work + // and integration with your current error reporting libraries + std::cout + << "sending error to my framework... [" + << msg << "]\n"; + exit(0); +} +``` diff --git a/docs/cpp2/declarations.md b/docs/cpp2/declarations.md new file mode 100644 index 0000000000..b829b0b12e --- /dev/null +++ b/docs/cpp2/declarations.md @@ -0,0 +1,282 @@ +# Declarations and aliases + +## Unified declarations + +All Cpp2 declarations are written as **"_name_ `:` _kind_ `=` _statement_"**. + +- The _name_ must be a valid identifier (start with a letter, and consist of letters, digits, or `_`). The name can be variadic (be a name for a list of zero or more things) by writing a `...` suffix at the end of the name. 
+ +- The `:` is pronounced **"is a."** + +- The _kind_ can start with [template parameters](#template-parameters) and end with [`#!cpp requires` constraints](#requires). + +- The `=` is pronounced **"defined as."** For the definition of something that will always have the same value, write `==`, pronounced **"defined as a synonym for"**. + +- The _statement_ is typically an expression statement (e.g., `#!cpp a + b();`) or a compound statement (e.g., `#!cpp { /*...*/ return c(d) / e; }`). + +Various parts of the syntax allow a `_` "don't care" wildcard or can be omitted entirely to accept a default (e.g., `#!cpp x: int = 0;` can be equivalently written `#!cpp x: _ = 0;` or `#!cpp x := 0;` both of which deduce the type). + +> Notes: +> +> - When the type is omitted, whitespace does not matter, and writing `#!cpp x: = 0;` or `#!cpp x : = 0;` or `#!cpp x := 0;` or other whitespace is just a stylistic choice. This documentation's style uses the last one, except when there are multiple adjacent declaration lines this style lines up their `:` and `=`. +> +> - `==` stresses that this name will always have the given value, to express [aliases](#aliases) and side-effect-free 'constexpr' [function aliases](#function-aliases). + + +### Unnamed declaration expressions + +In an expression, most declarations can be written without a name (just starting with `:`). Such unnamed declaration expressions are useful for single-use temporary variables or 'lambda' functions that don't need a name to be reused elsewhere. For example: + +- `#!cpp :widget = 42` is an unnamed expression-local (aka temporary) object of type `widget` defined as having the initial value `#!cpp 42`. It uses the same general syntax, just without declaring a name. + +- `#!cpp :(x) = std::cout << x` is an unnamed expression-local generic function expression (aka lambda) defined as having the given one-statement body. The body can include [captures](expressions.md/#captures). 
+ +Both just omit the name and make the final `;` optional. Otherwise, they have identical syntax and meaning as if you declared the same thing with a name outside expression scope (e.g., `w: widget = 42;` or `f: (x) = std::cout << x;`) and then used the name in the expression. + +> Note: Throughout Cpp2, every declaration is written with `:`, and every use of `:` is a declaration. + + + +### From functions to local scopes, and back again + +The function syntax is deliberately designed to be general, so you can omit parts. This means Cpp2 has no special "lambda function" syntax for unnamed functions; an unnamed function is really an unnamed function, written using the ordinary function syntax, just without a name. This scales all the way down to ordinary blocks and statements, which are written the same as functions that have no name or parameters. + +We can illustrate this in two directions. First, let's start with a full function, and successively omit optional parts that we aren't currently using: + +``` cpp title="Start with a full function, and successively omit optional parts if unused" hl_lines="1 5 9 13" +// Full named function +f:(x: int = init) = { /*...*/ } // x is a parameter to the function +f:(x: int = init) = statement; // same, { } is implicit + +// Omit name => anonymous function (aka 'lambda') + :(x: int = init) = { /*...*/ } // x is a parameter to the function + :(x: int = init) = statement; // same, { } is implicit + +// Omit declaration => local and immediate (aka 'let' in other languages) + (x: int = init) { /*...*/ } // x is a parameter to this + (x: int = init) statement; // compound or single-statement + +// Omit parameters => ordinary block or statement + { /*...*/ } // ordinary compound statement + statement; // ordinary statement +``` + +Conversely, we can start with an ordinary block or statement, and successively build it up to make it more powerful: + +``` cpp title="Start with an ordinary block or statement, and successively add parts" 
hl_lines="1 5 9 13" +// Ordinary block or statement + { /*...*/ } // ordinary compound statement + statement; // ordinary statement + +// Add parameters => more RAII locally-scoped variables + (x: int = init) { /*...*/ } // x is destroyed after this + (x: int = init) statement; // compound or single-statement + +// Add declaration => treat the code as a callable object + :(x: int = init) = { /*...*/ } // x is a parameter to the function + :(x: int = init) = statement; // same, { } is implicit + +// Add name => full named function +f:(x: int = init) = { /*...*/ } // x is a parameter to the function +f:(x: int = init) = statement; // same, { } is implicit + +``` + + +### Template parameters + +A template parameter list is enclosed by `<` `>` angle brackets, and the parameters are separated by commas. Each parameter is declared using the [same syntax as any type or object](declarations.md). If a parameter's **`:`** ***kind*** is not specified, the default is `: type`. + +For example: + +``` cpp title="Declaring template parameters" hl_lines="1-3 8-9" +array: <T: type, size: i32> type + // parameter T is a type + // parameter size is a 32-bit int += { + // ... +} + +tuple: <Ts...: type> type + // parameter Ts is a variadic list of zero or more types += { + // ... +} +``` + + +### `#!cpp requires` constraints + +A `#!cpp requires` ***condition*** constraint appears at the end of the ***kind*** of a templated declaration. If the condition evaluates to `#!cpp false`, that specialization of the template is ignored as if not declared. + +For example: + +``` cpp title="A requires constraint on a variadic function" hl_lines="3" +print: <Args...: type> + (inout out: std::ostream, args...: Args) + requires sizeof...(Args) >= 1u += { + (out << ... 
<< args); +} +``` + + +### Examples + +``` cpp title="Consistent declarations — name : kind = statement" linenums="1" hl_lines="2 6 10 15 24 28 32 43 49 53" +// n is a namespace defined as the following scope +n: namespace += { + // shape is a templated type with one type parameter T + // (equivalent to '<T: type>') defined as the following scope + shape: <T> type + = { + // point_type is a type defined as being always the same as + // (i.e., an alias for) T + point_type: type == T; + + // points is an object of type std::vector<point_type>, + // defined as having an empty default value + // (type-scope objects are private by default) + points: std::vector<point_type> = (); + + // draw is a function taking 'this' and 'canvas' parameters + // and returning bool, defined as the following body + // (type-scope functions are public by default) + // + // this is an object of type shape (as if written 'this: shape') + // + // where is an object of type canvas + draw: (this, where: canvas) -> bool + = { + // pen is an object of deduced (omitted) type 'color', + // defined as having initial value 'color::red' + pen := color::red; + + // success is an object of deduced (omitted) type bool, + // defined as having initial value 'false' + success := false; + + // ... + + return success; + } + + // count is a function taking 'this' and returning a type + // deduced from its body, defined as a single-expression body + // (equivalent to '= { return points.ssize(); }' but omitting + // syntax where we're using the language defaults) + count: (this) = points.ssize(); + + // ... 
+ } + + // color is an @enum type (see Note) defined as having these enumerators + color: @enum type = { red; green; blue; } + + // calc_next_year is a function defined as always returning the same + // value for the same input (i.e., 'constexpr', side effect-free) + calc_next_year: (year: i32) -> i32 == year + 1; +} +``` + +> Note: `@enum` is a metafunction, which provides an easy way to opt into a group of defaults, constraints, and generated functions. For details, see [`@enum`](metafunctions.md#enum). + + +## Aliases + +Aliases are pronounced **"synonym for"**, and written using the same **name `:` kind `=` value** [declaration syntax](../cpp2/declarations.md) as everything in Cpp2: + +- **name** is declared to be a synonym for **value**. + +- **kind** can be any of the kinds: `namespace`, `type`, a function signature, or a type. + +- **`==`**, pronounced **"defined as a synonym for"**, always precedes the value. The `==` syntax stresses that during compilation every use of the name could be equivalently replaced with the value. + +- **value** is the expression that the **name** is a synonym for. + + +### Namespace aliases + +A namespace alias is written the same way as a [namespace](namespaces.md), but using `==` and with the name of another namespace as its value. 
For example: + +``` cpp title="Namespace aliases" hl_lines="1 2 4 5 8 12 16" +// 'chr' is a namespace defined as a synonym for 'std::chrono' +chr : namespace == std::chrono; + +// 'chrlit' is a namespace defined as a synonym for 'std::chrono_literals' +chrlit : namespace == std::chrono_literals; + +main: () = { + using namespace chrlit; + + // The next two lines are equivalent + std::cout << "1s is (std::chrono::nanoseconds(1s).count())$ns\n"; + std::cout << "1s is (chr::nanoseconds(1s).count())$ns\n"; +} +// Prints: +// 1s is 1000000000ns +// 1s is 1000000000ns +``` + + +### Type aliases + +A type alias is written the same way as a [type](types.md), but using `==` and with the name of another type as its value. For example: + +``` cpp title="Type aliases" hl_lines="1 2 7 10" +// 'imap' is a type defined as a synonym for 'std::map' +imap : type == std::map; + +main: () = { + // The next two lines declare two objects with identical type + map1: std::map = (); + map2: imap = (); + + // Assert that they are the same type, using the same_as concept + static_assert( std::same_as< decltype(map1), decltype(map2) > ); +} +``` + + +### Function aliases + +A function alias is written the same way as a [function](functions.md), but using `==` and with a side-effect-free body as its value; the body must always return the same value for the same input arguments. For example: + +``` cpp title="Function aliases" hl_lines="1 2 6 9 12 15" +// 'square' is a function defined as a synonym for the value of 'i * i' +square: (i: i32) -> _ == i * i; + +main: () = { + // It can be used at compile time, with compile time values + ints: std::array = (); + + // Assertion that the size is the square of 4 + static_assert( ints.size() == 16 ); + + // And it can be used at run time, with run time values + std::cout << "the square of 4 is (square(4))$\n"; +} +// Prints: +// the square of 4 is 16 +``` + +> Note: A function alias is compiled to a Cpp1 `#!cpp constexpr` function. 
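To make the note concrete, here is a minimal hand-written sketch of what a Cpp1 `constexpr` counterpart of the `square` alias could look like. This is an assumption for illustration, not cppfront's actual generated output. The helper names `square_at_run_time` and `run_square_examples` are hypothetical, and the array element type `std::int32_t` is chosen only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical hand-written counterpart of the Cpp2 function alias
//     square: (i: i32) -> _ == i * i;
// (illustration only, not cppfront's actual output)
constexpr auto square(std::int32_t i) { return i * i; }

// Usable at compile time: as an array bound and in a static_assert
constexpr std::array<std::int32_t, square(4)> ints{};
static_assert(ints.size() == 16);

// Usable at run time too, with a value not known until the call
std::int32_t square_at_run_time(std::int32_t n) { return square(n); }

// Exercise both uses once
void run_square_examples() {
    assert(square(4) == 16);
    assert(square_at_run_time(5) == 25);
}
```

The same function serves both contexts because `constexpr` permits, but does not require, compile-time evaluation.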
+ + +### Object aliases + +An object alias is written the same way as an [object](objects.md), but using `==` and with a side-effect-free value. For example: + +``` cpp title="Object aliases" hl_lines="1 2 5 6" +// 'BufferSize' is an object defined as a synonym for the value 1'000'000 +BufferSize: i32 == 1'000'000; + +main: () = { + buf: std::array = (); + static_assert( buf.size() == BufferSize ); +} +``` + +> Note: An object alias is compiled to a Cpp1 `#!cpp constexpr` object. + diff --git a/docs/cpp2/expressions.md b/docs/cpp2/expressions.md new file mode 100644 index 0000000000..3be6718f11 --- /dev/null +++ b/docs/cpp2/expressions.md @@ -0,0 +1,307 @@ + +# Common expressions + +## Calling functions: `f(x)` syntax, and `x.f()` UFCS syntax + +A function call like `f(x)` is a normal function call that will call non-member functions only, as usual in C++. + +A function call like `x.f()` is a unified function call syntax (aka UFCS) call. It will call a member function if one is available, and otherwise will call `f(x)`. Having UFCS is important for generic code that may want to call a member or a non-member function, whichever is available. It's also important to enable fluid programming styles and natural IDE autocompletion support. + +An operator notation call like `#!cpp a + b` will call an overloaded operator function if one is available, as usual in C++. + +For example: + +``` cpp title="Function calls" hl_lines="3 7 11 16 19 20" +// Generic function to log something +// This calls operator<< using operator notation +log: (x) = clog << x; + +f: ( v : std::vector ) = { + // This calls log() with the result of std::vector::size() + log( v.size() ); + + // This calls log() with the result of std::ssize(v), because + // v doesn't have a .ssize member function + log( v.ssize() ); +} + +// Generic function to print "hello, ___!" 
for any printable type +hello: (name) = { + myfile := fopen("xyzzy.txt", "w"); + // Direct calls to C nonmember functions, using UFCS and safe + // string interpolation (instead of type-unsafe format strings) + myfile.fprintf( "Hello, (name)$!\n" ); + myfile.fclose(); + // The C and C++ standard libraries are not only fully available, + // but safer (and arguably nicer) when used from Cpp2 syntax code +} +``` + +To explicitly treat an object name passed as an argument as `move` or `out`, write that keyword before the variable name. + +- Explicit `move` is rarely needed. Every definite last use of a local variable will apply `move` by default. Writing `move` from an object before its definite last use means that later uses may see a moved-from state. + +- Explicit `out` is needed only when initializing a local variable separately from its declaration using a call to a function with an `out` parameter. For details, see [Guaranteed initialization](../cpp2/objects.md#init). + +For example: + + + +## `_` — the "don't care" wildcard, including explicit discard + +`_` is pronounced **"don't care"** and allowed as a wildcard in most contexts. For example: + +``` cpp title="Using the _ wildcard" hl_lines="2 5 11" +// We don't care about the guard variable's name +_ : std::lock_guard = mut; + +// If we don't care to write the variable's type, deduce it +x : _ = 42; + // in cases like this, _ can be omitted... + // this is equivalent to "x := 42;" + +return inspect v -> std::string { + is std::vector = "v is a std::vector"; + is _ = "unknown"; // don't care what else, match anything +}; +``` + +Cpp2 treats all function outputs (return values, and results produced via `inout` and `out` parameters) as important, and does not let them be silently discarded by default. To explicitly discard such a value, assign it to `_`. 
For example: + +``` cpp title="Using _ for explicit discard" hl_lines="1 8" +_ = vec.emplace_back(1,2,3); + // "_ =" is required to explicitly discard emplace_back's + // return value (which is non-void since C++17) + +{ + x := my_vector.begin(); + std::advance(x, 2); + _ = x; // required to explicitly discard x's new value, + // because std::advance modifies x's value +} +``` + +For details, see [Design note: Explicit discard](https://github.com/hsutter/cppfront/wiki/Design-note%3A-Explicit-discard). In Cpp2, data is always initialized, data is never silently lost, data flow is always visible. Data is precious, and it's always safe. + + +## `is` — safe type/value queries + +An `x is C` expression allows safe type and value queries, and evaluates to `#!cpp true` if `x` matches constraint `C`. It supports both static and dynamic queries, including customization, with support for standard library dynamic types like `std::variant`, `std::optional`, `std::expected`, and `std::any` provided out of the box. + +There are two kinds of `is`: + +- A **type query**, where `C` is a type constraint: a type, a template name, a concept, or a type predicate. Here `x` may be a type, or an object or expression; if it is an object or expression, the query refers to `x`'s type. + +| Type constraint kind | Example | +|---|---| +| Static type query | `x is int` | +| Dynamic type query | `ptr* is Shape` | +| Static template type query | `x is std::vector` | +| Static concept query | `x is std::integral` | + +- A **value query**, where `C` is a value constraint: a value, or a value predicate. Here `x` must be an object or expression. + +| Value constraint kind | Example | +|---|---| +| Value | `#!cpp x is 0` | +| Value predicate | `#!cpp x is (in(10, 20))` | + +`is` is useful throughout the language, including in `inspect` pattern matching alternatives. `is` is extensible, and works out of the box with `std::variant`, `std::optional`, `std::expected`, and `std::any`. 
For examples, see: + +- [`mixed-inspect-templates.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/mixed-inspect-templates.cpp2) +- [`mixed-inspect-values.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/mixed-inspect-values.cpp2) +- [`mixed-inspect-values-2.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/mixed-inspect-values-2.cpp2) +- [`mixed-type-safety-1.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/mixed-type-safety-1.cpp2) +- [`pure2-enum.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/pure2-enum.cpp2) +- [`pure2-inspect-expression-in-generic-function-multiple-types.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/pure2-inspect-expression-in-generic-function-multiple-types.cpp2) +- [`pure2-inspect-fallback-with-variant-any-optional.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/pure2-inspect-fallback-with-variant-any-optional.cpp2) +- [`pure2-type-safety-1.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/pure2-type-safety-1.cpp2) +- [`pure2-type-safety-2-with-inspect-expression.cpp2`](https://github.com/hsutter/cppfront/tree/main/regression-tests/pure2-type-safety-2-with-inspect-expression.cpp2) + +Here are some `is` queries with their Cpp1 equivalents. In this table, uppercase names are type names, lowercase names are objects, `v` is a `std::variant` where one alternative is `T`, `o` is a `std::optional<T>`, and `a` is a `std::any`: + +| Some sample `is` queries | Cpp1 equivalent | +|---|---| +| `X is Y && Y is X` | `std::is_same_v<X,Y>` | +| `D is B` | `std::is_base_of_v<B,D>` | +| `#!cpp pb is *D` | `#!cpp dynamic_cast<D*>(pb) != nullptr` | +| `v is T` | `std::holds_alternative<T>(v)` | +| `a is T` | `#!cpp a.type() == typeid(T)` | +| `o is T` | `o.has_value()` | + +> Note: `is` unifies a variety of differently-named Cpp1 language and library queries under one syntax, and supports only the type-safe ones.
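
To make the Cpp1 column of the table above concrete, here is a minimal C++17 sketch of the hand-written queries that `is` unifies. The types `B` and `D` and all helper function names are hypothetical, introduced only for illustration; this shows the Cpp1 spellings a programmer would write by hand, not the code that cppfront generates.

```cpp
#include <any>
#include <optional>
#include <string>
#include <type_traits>
#include <typeinfo>
#include <variant>

// Hypothetical polymorphic hierarchy, for illustration only
struct B { virtual ~B() = default; };
struct D : B { };

// "D is B": static base-class query
static_assert(std::is_base_of_v<B, D>);

// "X is Y && Y is X": static same-type query
static_assert(std::is_same_v<int, int>);

// "pb is *D": dynamic type query through a pointer to base
inline bool points_to_D(B* pb) {
    return dynamic_cast<D*>(pb) != nullptr;
}

// "v is T": which alternative a variant currently holds
inline bool variant_holds_int(std::variant<int, std::string> const& v) {
    return std::holds_alternative<int>(v);
}

// "a is T": the type currently stored in a std::any
inline bool any_holds_int(std::any const& a) {
    return a.type() == typeid(int);
}

// "o is T": whether an optional currently contains a value
inline bool optional_has_value(std::optional<int> const& o) {
    return o.has_value();
}
```

Each helper needs a different library facility and a different spelling, which is the inconsistency the single `is` syntax is meant to remove.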
+ + +## `as` — safe casts and conversions + +An `x as T` expression allows safe type casts. `x` must be an object or expression, and `T` must be a type. Like `is`, `as` supports both static and dynamic typing, including customization, with support for standard library dynamic types like `std::variant`, `std::optional`, `std::expected`, and `std::any` provided out of the box. For example: + +``` cpp title="Using as" hl_lines="4 6 14" +main: () = { + a: std::any = 0; // a's type is now int, value 0 + test(a); // prints "zero" + a = "plugh" as std::string; // a's type is now std::string, value "plugh" + test(a); // prints "plugh" + test("xyzzy" as std::string); // prints "xyzzy" +} + +// A generic function that takes an argument 'x' of any type, +// same as "void test( auto const& x )" in C++20 syntax +test: (x) = { + std::cout << inspect x -> std::string { + is 0 = "zero"; + is std::string = x as std::string; + is _ = "(no match)"; + } << "\n"; +} +``` + +Here are some `as` casts with their Cpp1 equivalents. In this table, uppercase names are type names, lowercase names are objects, `v` is a `std::variant` where one alternative is `T`, `o` is a `std::optional<T>`, and `a` is a `std::any`: + +| Some sample `as` casts | Cpp1 equivalent | +|---|---| +| `x as Y` | `Y{x}` | +| `#!cpp pb as *D` | `#!cpp dynamic_cast<D*>(pb)` | +| `v as T` | `std::get<T>(v)` | +| `a as T` | `std::any_cast<T>(a)` | +| `o as T` | `o.value()` | + +> Note: `as` unifies a variety of differently-named Cpp1 language and library casts and conversions under one syntax, and supports only the type-safe ones. + + +## `inspect` — pattern matching + +An `inspect expr -> Type` expression allows pattern matching using `is`. + +- `expr` is evaluated once. + +- Each alternative spelled `is C` is evaluated in order as if called with `expr is C`.
+ +- If an alternative evaluates to `#!cpp true`, then its `#!cpp = alternative;` body is used as the value of the entire `inspect` expression, and the meaning is the same as if the entire `inspect` expression had been written as just `#!cpp :Type = alternative;` — i.e., an unnamed object expression (aka 'temporary object') of type `Type` initialized with `alternative`. + +- A catchall `is _` is required. + +For example: + +``` cpp title="Using inspect" hl_lines="6-13" +// A generic function that takes an argument 'x' of any type +// and inspects various things about `x` +test: (x) = { + forty_two := 42; + std::cout + << inspect x -> std::string { + is 0 = "zero"; // == 0 + is (forty_two) = "the answer"; // == 42 + is int = "integer"; // is type int (and not 0 or 42) + is std::string = x as std::string; // is type std::string + is std::vector = "a std::vector"; // is a vector + is _ = "(no match)"; // is something else + } + << "\n"; +} + +// Sample call site +test(42); + // Behaves as if the following function were called: + // test: (x) = { std::cout << (:std::string = "the answer") << "\n"; } + // (and that's why inspect alternatives are introduced with '=') +``` + +For more examples, see also the examples in the previous two sections on `is` and `as`, many of which use `inspect`. + + +## `$` — captures, including interpolations + +Suffix `$` is pronounced **"paste the value of"** and captures the value of an expression at the point when the expression where the capture is written is evaluated. Depending on the complexity of the capture expression `expr$` and where it is used, parentheses `(expr)$` may be required for precedence or to show the boundaries of the expression. + +`x$` always captures `x` by value. To capture by reference, take the address and capture a pointer using `x&$`. If the value is immediately used, dereference again; for example `:(val) total&$* += val` adds to the `total` local variable itself, not a copy. 
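
As a rough Cpp1 analogy for the two capture forms just described, the following C++14 sketch contrasts a by-value lambda capture (comparable to `x$`) with capturing a pointer by value and dereferencing it (comparable to `x&$*`). The function names are hypothetical and the mapping to Cpp2 is an approximation for illustration, not the code cppfront emits.

```cpp
#include <algorithm>
#include <vector>

// By-value capture, comparable to `price$`: the value is copied at the
// point where the lambda is written, so later changes are not observed.
inline int snapshot_then_change() {
    int price = 100;
    auto read = [price] { return price; };  // copy of 'price' taken here
    price = 200;                            // does not affect the copy
    return read();
}

// Pointer capture, comparable to `total&$*`: the address is captured by
// value and dereferenced, so the original variable itself is updated.
inline int sum_through_pointer(std::vector<int> const& values) {
    int total = 0;
    std::for_each(values.begin(), values.end(),
                  [p = &total](int val) { *p += val; });
    return total;
}
```

With these definitions, `snapshot_then_change()` returns the value captured before the assignment, while `sum_through_pointer` accumulates into the caller-visible `total`.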
+ +Captures are evaluated at the point where they are written in function expressions, contract postconditions, and string literals. The stored captured value can then be used later when evaluating its context, such as when the function expression body containing the captured value is actually called later (one or more times), when the postcondition containing the captured value is evaluated later when the function returns, or when the string literal containing the captured value is read later. + +The design and syntax are selected so that capture is spelled the same way in all contexts. For details, see [Design note: Capture](https://github.com/hsutter/cppfront/wiki/Design-note%3A-Capture). + + +### Capture in function expressions (aka lambdas) + +Any capture in a function expression body is evaluated at the point where the function expression is written, at the declaration of the function expression. The function expression itself is then evaluated each time the function is invoked, and can reference the captured value. + +For example: + +``` cpp title="Capture in an unnamed function expression (aka lambda)" hl_lines="7 8 13-18" +main: () = { + s := "-ish\n"; + vec: std::vector = (1, 2, 3, 5, 8, 13 ); + + std::ranges::for_each( + vec, + :(i) = std::cout << i << s$ + // Function capture: Paste the value of 's' + ); +} + +// prints: +// 1-ish +// 2-ish +// 3-ish +// 5-ish +// 8-ish +// 13-ish +``` + +Another example: + +``` cpp title="Capture in a named function expression (aka lambda)" hl_lines="2 4 7 12 13" +main: () = { + price := 100; + func := :() = { + std::cout << "Price = " << price$ << "\n"; + }; + func(); + price = 200; + func(); +} + +// prints: +// Price = 100 +// Price = 100 +``` + + +### Capture in contract postconditions + +Any capture in a postcondition is evaluated at the point where the postcondition is written, at the beginning (entry) of the function. 
The postcondition itself is then evaluated when the function returns, and can reference the captured value. + +For example: + +``` cpp title="Capture in contract postconditions" hl_lines="2" +push_back: (coll, value) + [[post: coll.ssize() == coll.ssize()$ + 1]] + // Paste the value of `coll.ssize()` += { + // ... +} +``` + + +### Capture in string interpolation + +A string literal can capture the value of an expression `expr` by writing `(expr)$` inside the string literal. The `(` `)` are required, and cannot be nested. A string literal has type `std::string` if it performs any captures, otherwise it is a normal C/C++ string literal (array of characters). + +Any capture in a string literal is evaluated at the point where the string literal is written. The string literal can be used repeatedly later, and includes the captured value. + +For example: + +``` cpp title="Capture for string interpolation" hl_lines="2 5" +x := 0; +std::cout << "x is (x)$\n"; + // Paste the value of `x` +x = 1; +std::cout << "now x+2 is (x+2)$\n"; + // Paste the value of `x+2` + +// prints: +// x is 0 +// now x+2 is 3 +``` + +A string literal capture can include a `:suffix` where the suffix is a [standard C++ format specification](https://en.cppreference.com/w/cpp/utility/format/spec). For example, `#!cpp (x.price(): <10.2f)$` evaluates `x.price()` and converts the result to a string with 10-character width, 2 digits of precision, and left-justified. diff --git a/docs/cpp2/functions.md b/docs/cpp2/functions.md new file mode 100644 index 0000000000..59f7708ecc --- /dev/null +++ b/docs/cpp2/functions.md @@ -0,0 +1,387 @@ + +# Functions + +## Overview + +A function is defined by writing a function signature after the `:` and a statement (expression or `{` `}` compound statement) after the `=`. 
After the optional [template parameters](declarations.md#template-parameters) available for all declarations, a function signature consists of a possibly-empty [parameter list](#parameters), and one or more optional [return values](#return-values). + +For example, the minimal function named `func` that takes no parameters and returns nothing (`#!cpp void`) is: + +``` cpp title="A minimal function" +func: ( /* no parameters */ ) = { /* empty body */ } +``` + + +## Parameters + +The parameter list is enclosed by `(` `)` parentheses and the parameters are separated by commas. Each parameter is declared using the [same unified syntax](declarations.md) as used for all declarations. For example: + +``` cpp title="Declaring parameters" hl_lines="2-4" +func: ( + x: i32, // parameter x is a 32-bit int + y: std::string, // parameter y is a std::string + z: std::map // parameter z is a std::map + ) += { + // ... +} +``` + +There are six ways to pass parameters that cover all use cases: + +| Parameter ***kind*** | "Pass me an `x` I can ______" | Accepts arguments that are | Special semantics | ***kind*** `x: X` Compiles to Cpp1 as | +|---|---|---|---|---| +| `in` (default) | read from | anything | always `#!cpp const`

automatically passes by value if cheaply copyable | `X const x` or `X const& x` | +| `copy` | take a copy of | anything | acts like a normal local variable initialized with the argument | `X x` | +| `inout` | read from and write to | lvalues | | `X& x` | +| `out` | write to (including construct) | lvalues, including uninitialized lvalues | must `=` assign/construct before other uses | `cpp2::out` | +| `move` | move from | rvalues | automatically moves from every definite last use | `X&&` | +| `forward` | forward | anything | automatically forwards from every definite last use | `T&&` constrained to type `X` | + + +> Note: All parameters and other objects in Cpp2 are `#!cpp const` by default, except for local variables. For details, see [Design note: `#!cpp const` objects by default](https://github.com/hsutter/cppfront/wiki/Design-note%3A-const-objects-by-default). + + +## Return values + +A function can return either of the following. The default is `#!cpp -> void`. + +(1) **`#!cpp -> X`** to return a single unnamed value of type `X`, which can be `#!cpp void` to signify the function has no return value. If `X` is not `#!cpp void`, the function body must have a `#!cpp return /*value*/;` statement that returns a value of type `X` on every path that exits the function. 
For example: + +``` cpp title="Functions with an unnamed return value" hl_lines="2 4 7 9 12 14" +// A function returning no value (void) +increment_in_place: (inout a: i32) -> void = { a++; } +// Or, using syntactic defaults, the following has identical meaning: +increment_in_place: (inout a: i32) = a++; + +// A function returning a single value of type i32 +add_one: (a: i32) -> i32 = { return a+1; } +// Or, using syntactic defaults, the following has identical meaning: +add_one: (a: i32) -> i32 = a+1; + +// A generic function returning a single value of deduced type +add: (a:T, b:U) -> decltype(a+b) = { return a+b; } +// Or, using syntactic defaults, the following has identical meaning: +add: (a, b) -> _ = a+b; +``` + +(2) **`#!cpp -> ( /* parameter list */ )`** to return a list of named return parameters using the same [parameters](#parameters) syntax, but where the only passing styles are `out` (the default, which moves where possible) or `forward`. The function body must [initialize](objects.md#init) the value of each return-parameter `ret` in its body the same way as any other local variable. An explicit return statement is written just `#!cpp return;` and returns the named values; the function has an implicit `#!cpp return;` at the end. 
For example: + +``` cpp title="Function with multiple/named return values" hl_lines="1 3-4 7-8 14 16-17" +divide: (dividend: int, divisor: int) -> (quotient: int, remainder: int) = { + if divisor == 0 { + quotient = 0; // constructs quotient + remainder = 0; // constructs remainder + } + else { + quotient = dividend / divisor; // constructs quotient + remainder = dividend % divisor; // constructs remainder + } +} + +main: () = { + div := divide(11, 5); + std::cout << "(div.quotient)$, (div.remainder)$\n"; +} +// Prints: +// 2, 1 +``` + +This next example declares a [member function](types.md#this-parameter) with multiple return values in a [type](types.md) named `set`: + +``` cpp title="Member function with multiple/named return values" hl_lines="7 9 10 22-24" +set: type = { + container: std::set; + iterator : type == std::set::iterator; + + // A std::set::insert-like function using named return values + // instead of just a std::pair/tuple + insert: (inout this, value: Key) -> (where: iterator, inserted: bool) = { + set_returned := container.insert(value); + where = set_returned.first; + inserted = set_returned.second; + } + + ssize: (this) -> i64 = std::ssize(container); + + // ... +} + +use_inserted_position: (_) = { } + +main: () = { + m: set = (); + ret := m.insert("xyzzy"); + if ret.inserted { + use_inserted_position( ret.where ); + } + assert( m.ssize() == 1 ); +} +``` + + +### Function outputs are not implicitly discardable + +A function's outputs are its return values, and the "out" state of any `out` and `inout` parameters. + +Function outputs cannot be silently discarded. To explicitly discard a function output, assign it to `_`. 
For example: + +``` cpp title="No silent discard" hl_lines="9 11 13 17-18 23-24 29-30" +f: () -> void = { } +g: () -> int = { return 10; } +h: (inout x: int) -> void = { x = 20; } + +main: () += { + f(); // ok, no return value + + std::cout << g(); // ok, use return value + + _ = g(); // ok, explicitly discard return value + + g(); // ERROR, return value is ignored + + { + x := 0; + h( x ); // ok, x is referred to again... + std::cout << x; // ... here, so its new value is used + } + + { + x := 0; + h( x ); // ok, x is referred to again... + _ = x; // ... here where its value explicitly discarded + } + + { + x := 0; + h( x ); // ERROR, this is a definite last use of x + } // so x is not referred to again, and its + // 'out' value can't be implicitly discarded +} +``` + +> Cpp2 imbues Cpp1 code with nondiscardable semantics, while staying fully compatible as usual: +> +> - A function written in Cpp2 syntax that returns something other than `#!cpp void` is always compiled to Cpp1 with `[[nodiscard]]`. +> +> - A function call written in Cpp2 `x.f()` member call syntax always treats a non-`#!cpp void` return type as not discardable, even if the function was written in Cpp1 syntax that did not write `[[nodiscard]]`. + + +## Control flow + +## `#!cpp if`, `#!cpp else` — Branches + +`if` and `else` are like always in C++, except that `(` `)` parentheses around the condition are not required. Instead, `{` `}` braces around a branch body *are* required. For example: + +``` cpp title="Using if and else" hl_lines="1 4" +if vec.ssize() > 100 { + do_general_algorithm( container ); +} +else { + do_linear_scan( vec ); +} +``` + + +## `#!cpp for`, `#!cpp while`, `#!cpp do` — Loops + +**`#!cpp do`** and **`#!cpp while`** are like always in C++, except that `(` `)` parentheses around the condition are not required. Instead, `{` `}` braces around the loop body *are* required. 
+ +**`#!cpp for range do (e)`** ***statement*** says "for each element in `range`, call it `e` and perform the statement." The loop parameter `(e)` is an ordinary parameter that can be passed using any [parameter passing style](#parameters); as always, the default is `in`, which is read-only and expresses a read-only loop. The statement is not required to be enclosed in braces. + +Every loop can have a `next` clause that is performed at the end of each loop body execution. This makes it easy to have a counter for any loop, including a range `#!cpp for` loop. + +> Note: Whitespace is just a stylistic choice. This documentation's style generally puts each keyword on its own line and lines up what follows. + +For example: + +``` cpp title="Using loops" hl_lines="4 5 13 16 17 22-24" +words: std::vector<std::string> = ("Adam", "Betty"); +i := 0; + +while i < words.ssize() // while this condition is true +next i++ // and increment i after each loop body is run +{ // do this loop body + std::cout << "word: (words[i])$\n"; +} +// prints: +// word: Adam +// word: Betty + +do { // do this loop body + std::cout << "**\n"; +} +next i-- // and decrement i after each loop body is run +while i > 0; // while this condition is true +// prints: +// ** +// ** + +for words // for each element in 'words' +next i++ // and increment i after each loop body is run +do (inout word) // declare via 'inout' the loop can change the contents +{ // do this loop body + word = "[" + word + "]"; + std::cout << "counter: (i)$, word: (word)$\n"; +} +// prints: +// counter: 0, word: [Adam] +// counter: 1, word: [Betty] +``` + +There is no special "select" or "where" to perform the loop body for only a subset of matches, because this can naturally be expressed with `if`.
For example: + +``` cpp title="Using loops + if" hl_lines="7" +// Continuing the previous example +i = 0; + +for words +next i++ +do (word) +if i % 2 == 1 // if i is odd +{ // do this loop body + std::cout << "counter: (i)$, word: (word)$\n"; +} +// prints: +// counter: 1, word: [Betty] +``` + + +### Loop names, `#!cpp break`, and `#!cpp continue` + +Loops can be named using the usual **name `:`** syntax that introduces all names, and `#!cpp break` and `#!cpp continue` can refer to those names. For example: + +``` cpp title="Using named break and continue" hl_lines="1 3 6 10" +outer: while i Move/forward from definite last use + +In a function body, a **definite last use** of a local name is a single use of that name in a statement that is not in a loop, where no control flow path after that statement mentions the name again. + +For each definite last use: + +- If the name is a local object or a `copy` or `move` parameter, we know the object will not be used again before being destroyed, and so the object is automatically treated as an rvalue (move candidate). If the expression that contains the last use is able to move from the rvalue, the move will happen automatically. + +- If the name is a `forward` parameter, the object is automatically forwarded to preserve its constness and value category (`std::forward`-ed). + +For example: + +``` cpp title="Definite last uses" linenums="1" hl_lines="13 16 19 21" +f: ( + copy x: some_type, + move y: some_type, + forward z: some_type + ) += { + w: some_type = "y"; + + prepare(x); // NOT a definite last use + + if something() { + process(y); + z.process(x); // definite last uses of x and z + } + else { + cout << z; // definite last use of z + } + + transfer(y); // definite last use of y + + offload(w); // definite last use of w +} +``` + +In this example: + +- `x` has a definite last use on one path, but not another. Line 13 is a definite last use that automatically treats `x` as an rvalue. 
However, if the `#!cpp else` is taken, `x` gets no special automatic handling. Line 9 is not a definite last use because `x` could be used again where it is mentioned later on line 13. + +- `y` has a definite last use on every path, in this case the same on all executions of the function. Line 19 is a definite last use that automatically treats `y` as an rvalue. + +- `z` has a definite last use on every path, but unlike `y` it can be a different last use on different executions of the function. That's fine, each of lines 13 and 16 is a definite last use that automatically forwards the constness and value category of `z`. + +- `w` has a definite last use on every path, in this case the same on all executions of the function. Line 21 is a definite last use that automatically treats `w` as an rvalue. + + +## Generality note: Summary of function defaults + +There is a single function syntax, designed so we can just omit the parts we're not currently using. + +For example, let's express in full verbose detail that `equals` is a function template that has two type parameters `T` and `U`, two ordinary `in` parameters `a` and `b` of type `T` and `U` respectively, and a deduced return type, and its body returns the result of `a == b`: + +``` cpp title="equals: A generic function written in full detail (using no defaults)" +equals: <T: type, U: type> (in a: T, in b: U) -> _ = { return a == b; } +``` + +We can write all that, but we don't have to. + +First, `: type` is the default for template parameters, so we can omit it since that's what we want: + +``` cpp title="equals: Identical meaning, now using the :type default for template parameters" +equals: <T, U> (in a: T, in b: U) -> _ = { return a == b; } +``` + +So far, the return type is already using one common default available throughout Cpp2: the wildcard `_` (pronounced "don't care").
Since this function's body doesn't actually use the parameter type names `T` and `U`, we can just use wildcards for the parameter types too: + +``` cpp title="equals: Identical meaning, now using the _ wildcard also for the parameter types" +equals: (in a: _, in b: _) -> _ = { return a == b; } +``` + +Next, `: _` is also the default parameter type, so we don't need to write even that: + +``` cpp title="equals: Identical meaning, now using the :_ default parameter type" +equals: (in a, in b) -> _ = { return a == b; } +``` + +Next, `in` is the default [parameter passing mode](#parameters). So we can use that default too: + +``` cpp title="equals: Identical meaning, now using the 'in' default parameter passing style" +equals: (a, b) -> _ = { return a == b; } +``` + +We already saw that `{` `}` is the default for a single-line function that returns nothing. Similarly, `{ return` and `}` is the default for a single-line function that returns something: + +``` cpp title="equals: Identical meaning, now using the { return ... 
} default body decoration" +equals: (a, b) -> _ = a == b; +``` + +Next, `#!cpp -> _ =` (deduced return type) is the default for single-expression functions that return something and so can be omitted: + +``` cpp title="equals: Identical meaning, now using the -> _ = default for functions that return something" +equals: (a, b) a == b; +``` + +Finally, at expression scope (aka "lambda/temporary") functions/objects aren't named, and the trailing `;` is optional: + +``` cpp title="(not) 'equals': Identical meaning, but without a name as an unnamed function at expression scope" +:(a, b) a == b +``` + +Here are some additional examples of unnamed function expressions: + +``` cpp title="Some more examples of unnamed function expressions" +std::ranges::for_each( a, :(x) = std::cout << x ); + +std::ranges::transform( a, b, :(x) x+1 ); + +where_is = std::ranges::find_if( a, :(x) x == waldo$ ); +``` + +> Note: Cpp2 doesn't have a separate "lambda" syntax; you just use the regular function syntax at expression scope to write an unnamed function, and the syntactic defaults are chosen to make such function expressions convenient to write. And because in Cpp2 every local variable [capture](expressions.md#captures) (for example, `waldo$` above) is written in the body, it doesn't affect the function syntax.
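
For comparison, here is a minimal sketch of approximate Cpp1 counterparts to two of the unnamed function expressions above. It assumes classic iterator algorithms rather than `std::ranges` so that it compiles as C++14, and the wrapper function names are hypothetical. Note how the capture of `waldo` has to move out of the body and into a separate capture list, which is the syntactic difference the note describes.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Approximate counterpart of passing `:(x) x+1` to a transform algorithm
inline std::vector<int> add_one_to_each(std::vector<int> const& a) {
    std::vector<int> b;
    std::transform(a.begin(), a.end(), std::back_inserter(b),
                   [](auto const& x) { return x + 1; });
    return b;
}

// Approximate counterpart of `:(x) x == waldo$`: in Cpp1 the capture is
// declared in the lambda introducer instead of being written in the body
inline bool contains(std::vector<int> const& a, int waldo) {
    return std::find_if(a.begin(), a.end(),
                        [waldo](auto const& x) { return x == waldo; })
           != a.end();
}
```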
+ diff --git a/docs/cpp2/metafunctions.md b/docs/cpp2/metafunctions.md new file mode 100644 index 0000000000..f7075e2888 --- /dev/null +++ b/docs/cpp2/metafunctions.md @@ -0,0 +1,385 @@ + +# Metafunctions + +## Overview + +A metafunction is a compile-time function that can participate in interpreting the meaning of a declaration, and can: + +- apply defaults (e.g., `interface` makes functions virtual by default) + +- enforce constraints (e.g., `value` enforces that the type has no virtual functions) + +- generate additional functions and other code (e.g., `value` generates copy/move/comparison operations for a type if it didn't write them explicitly) + +The most important thing about metafunctions is that they are not hardwired language features; they are compile-time library code that uses the reflection and code generation API, which lets the author of an ordinary type easily opt into a named set of defaults, requirements, and generated contents. This approach is essential to making the language simpler, because it lets us avoid hardwiring special "extra" types into the language and compiler. + +## Applying metafunctions using `@` + +Metafunctions provide an easy way for a type author to opt into a group of defaults, constraints, and generated functions: Just write `@name` after the `:` of a declaration, where `name` is the name of the metafunction.
This lets the type author declare (and the human reader see) the intent up front: "This isn't just any `type`, this is a `@value type`" which automatically gives the type default/copy/move construction and assignment, `<=>` with `std::strong_ordering` comparisons, and guarantees that it has a public destructor and no protected or virtual functions: + +``` cpp title="Using the value metafunction when writing a type" hl_lines="1" +point2d: @value type = { + x: i32 = 0; + y: i32 = 0; + // @value automatically generates default/copy/move + // construction/assignment and <=> strong_ordering comparison, + // and emits an error if you try to write a non-public + // destructor or any protected or virtual function +} +``` + +## Generating source code at compile time + +A metafunction applied to a definition using `@` gets to participate in interpreting the meaning of the definition by inspecting and manipulating the definition's parse tree. For example: + +``` cpp title="shape.cpp2: Using @interface @print" hl_lines="1" +shape: @interface @print type = { + draw : (this); + move_by: (this, dx: double, dy: double); +} +``` + +The above code: + +- applies `@interface`, which makes functions pure virtual by default and defines a virtual destructor with a do-nothing body if there isn't already a virtual destructor (among other things), and + +- then applies `@print`, which pretty-prints the resulting parse tree as source code to the console so that we can see the results of what the first metafunction did. 
+ +The result of compiling this is the following cppfront output, which is the `@interface`-modified Cpp2 source code as printed by `@print`: + +``` cpp title="'cppfront shape.cpp2' output to the console, from @print" hl_lines="1" +shape:/* @interface @print */ type = +{ + public draw:(virtual in this); + + public move_by:( + virtual in this, + in dx: double, + in dy: double + ); + + operator=:(virtual move this) = + { + } +} +``` + +Finally, cppfront also emits the following in `shape.cpp`: + +``` cpp title="'cppfront shape.cpp' output to 'shape.cpp'" +class shape { + public: virtual auto draw() const -> void = 0; + public: virtual auto move_by(cpp2::in dx, cpp2::in dy) const -> void = 0; + public: virtual ~shape() noexcept; + + public: shape() = default; + public: shape(shape const&) = delete; /* No 'that' constructor, suppress copy */ + public: auto operator=(shape const&) -> void = delete; + +}; + +shape::~shape() noexcept{} +``` + + +## Built-in metafunctions + +The following metafunctions are provided in the box with cppfront. + + +### For regular value-like types (copyable, comparable) + + +#### `ordered`, `weakly_ordered`, `partially_ordered` + +An `ordered` (or `weakly_ordered` or `partially_ordered`) type has an `#!cpp operator<=>` three-way comparison operator that returns `std::strong_ordering` (or `std::weak_ordering` or `std::partial_ordering`, respectively). This means objects of this type can be used in all binary comparisons: `<`, `<=`, `==`, `!=`, `>=`, and `>`. + +If the user explicitly writes `operator<=>`, its return type must be the same as the one implied by the metafunction they chose. + +If the user doesn't explicitly write `operator<=>`, a default memberwise `operator<=>: (this, that) -> /* appropriate _ordering */;` will be generated for the type. 
+ +These metafunctions will emit a compile-time error if: + +- a user-written `operator<=>` returns a different type than the one implied by the metafunction they chose + +> Note: This feature derived from Cpp2 was already adopted into Standard C++ via paper [P0515](https://wg21.link/p0515), so most of the heavy lifting is done by the Cpp1 C++20/23 compiler, including the memberwise default semantics. In contrast, cppfront has to do the work itself for default memberwise semantics for operator= assignment as those aren't yet part of Standard C++. + + +#### `copyable` + +A `copyable` type has (copy and move) x (construction and assignment). + +If the user explicitly writes any of the copy/move `operator=` functions, they must also write the most general one that takes `(out this, that)`. + +If the user doesn't write any of the copy/move `operator=` functions, a default general memberwise `operator=: (out this, that) = { }` will be generated for the type. + +`copyable` will emit a compile-time error if: + +- there is a user-written `operator=` but no user-written `operator=: (out this, that)` + + +#### `basic_value`, `value`, `weakly_ordered_value`, `partially_ordered_value` + +A `basic_value` type is a regular type: [`copyable`](#copyable), default constructible, and not polymorphic (no protected or virtual functions). + +A `value` (or `weakly_ordered_value` or `partially_ordered_value`) is a `basic_value` that is also [`ordered`](#ordered) (or `weakly_ordered` or `partially_ordered`, respectively). 
+ +```mermaid +graph TD; + value---->basic_value; + weakly_ordered_value---->basic_value; + partially_ordered_value---->basic_value; + basic_value-->copyable; + value-->ordered; + partially_ordered_value-->partially_ordered; + weakly_ordered_value-->weakly_ordered; +``` + +These metafunctions will emit a compile-time error if: + +- any function is protected or virtual + +- the type has a destructor that is not public + + +#### `struct` + +A `struct` is a type with only public bases, objects, and functions, with no virtual functions, and with no user-defined constructors (i.e., no invariants) or assignment or destructors. + +`struct` is implemented in terms of [`cpp1_rule_of_zero`](#cpp1_rule_of_zero). + +`struct` will emit a compile-time error if: + +- any member is non-public + +- any function is virtual + +- there is a user-written `operator=` + + +### For polymorphic types (interfaces, base classes) + + +#### `interface` + +An `interface` type is an abstract base class having only pure virtual functions. + +Cpp2 has no `interface` feature hardwired into the language, as C# and Java do. Instead you apply the `@interface` metafunction when writing an ordinary `type`. For a detailed example, see [the `shape` example above](#generating-source-code-at-compile-time). + +`interface` will emit a compile-time error if: + +- the type contains a data object + +- the type has a copy or move function (the diagnostic message will suggest a virtual `clone` function instead) + +- any function has a body + +- any function is nonpublic + + +#### `polymorphic_base` + +A `polymorphic_base` type is a pure polymorphic base type that is not copyable, and whose destructor is either public and virtual or protected and nonvirtual. + +Unlike an [interface](#interface), it can have nonpublic and nonvirtual functions. 
+ +`polymorphic_base` will emit a compile-time error if: + +- the type has a copy or move function (the diagnostic message will suggest a virtual `clone` function instead) + +- the type has a destructor that is not public and virtual, and also not protected and nonvirtual + + +### For enumeration types + + +#### `enum` + +Cpp2 has no `enum` feature hardwired into the language. Instead you apply the `@enum` metafunction when writing an ordinary `type`. + +`enum` will emit a compile-time error if: + +- any member has the reserved name `operator=` or `operator<=>`, as these will be generated by the metafunction + +- an enumerator is not public or does not have a deduced type + +For example: + +``` cpp title="Using the @enum metafunction when writing a type" hl_lines="14" +// skat_game is declaratively a safe enumeration type: it has +// default/copy/move construction/assignment and <=> with +// std::strong_ordering, a minimal-size signed underlying type +// by default if the user didn't specify a type, no implicit +// conversion to/from the underlying type, in fact no public +// construction except copy construction so that it can never +// have a value different from its listed enumerators, inline +// constexpr enumerators with values that automatically start +// at 1 and increment by 1 if the user didn't write their own +// value, and conveniences like to_string()... 
the word "enum" +// carries all that meaning as a convenient and readable +// opt-in, without hardwiring "enum" specially into the language +// +skat_game: @enum<i16> type = { + diamonds := 9; + hearts; // 10 + spades; // 11 + clubs; // 12 + grand := 20; + null := 23; +} +``` + +Consider `hearts`: It's a member object declaration without a type (or a default value), which is normally illegal. Here it's okay because the `@enum` metafunction fills them in: It iterates over all the data members and gives each one the underlying type (here explicitly specified as `i16`, otherwise it would be computed as the smallest signed type that's big enough), and an initializer (by default one higher than the previous enumerator). + +Unlike C `#!cpp enum`, this `@enum` is scoped and strongly typed (does not implicitly convert to the underlying type). + +Unlike C++11 `#!cpp enum class`, it's "just a `type`" which means it can naturally also have member functions and other things that a type can have: + +``` cpp title="An @enum type with a member function" hl_lines="1" +janus: @enum type = { + past; + future; + + flip: (inout this) == { + if this == past { this = future; } + else { this = past; } + } +} +``` + + +#### `flag_enum` + +`flag_enum` is a variation on `enum` that has power-of-two default enumerator values, a default unsigned underlying type that is large enough to hold the values, and supports bitwise operators to combine and test values.
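
To make the description above concrete, here is a rough Cpp1 sketch of the kind of type a flag enumeration amounts to: power-of-two values, no implicit conversion to or from the underlying type, and `has`/`set`/`clear` helpers. It is an assumption-laden approximation written by hand, not cppfront's generated code. The enumerator names follow the `file_attributes` example in this section, and the choice of `std::uint8_t` is only illustrative.

```cpp
#include <cstdint>

// Hand-written approximation of a flag enumeration; not cppfront output.
class file_attributes {
    std::uint8_t bits_ = 0;

    // Private: outside code cannot build a value from a raw integer.
    constexpr explicit file_attributes(std::uint8_t bits) : bits_{bits} {}

public:
    constexpr file_attributes() = default;

    // Enumerators, defined below once the type is complete.
    static const file_attributes none;
    static const file_attributes cached;
    static const file_attributes current;
    static const file_attributes obsolete;

    friend constexpr file_attributes operator|(file_attributes a, file_attributes b) {
        return file_attributes{static_cast<std::uint8_t>(a.bits_ | b.bits_)};
    }
    friend constexpr bool operator==(file_attributes a, file_attributes b) {
        return a.bits_ == b.bits_;
    }

    // True if every flag in f is also set in *this.
    constexpr bool has(file_attributes f) const {
        return (bits_ & f.bits_) == f.bits_;
    }
    constexpr void set(file_attributes f) {
        bits_ = static_cast<std::uint8_t>(bits_ | f.bits_);
    }
    constexpr void clear(file_attributes f) {
        bits_ = static_cast<std::uint8_t>(bits_ & ~f.bits_);
    }
};

// Values start at 1 and rise by powers of two.
inline constexpr file_attributes file_attributes::none{0};
inline constexpr file_attributes file_attributes::cached{1};
inline constexpr file_attributes file_attributes::current{2};
inline constexpr file_attributes file_attributes::obsolete{4};
```

Even this reduced sketch needs a fair amount of boilerplate for three flags, which is the motivation for expressing the whole pattern as a single metafunction.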
+ +`flag_enum` will emit a compile-time error if: + +- any member has the reserved name `operator=`, `operator<=>`, `has`, `set`, `clear`, `to_string`, `get_raw_value`, or `none`, as these will be generated by the metafunction + +- an enumerator is not public or does not have a deduced type + +- the values are outside the range that can be represented by the largest default underlying type + +For example: + +``` cpp title="Using the @flag_enum metafunction when writing a type" hl_lines="11" +// file_attributes is declaratively a safe flag enum type: +// same as enum, but with a minimal-size unsigned underlying +// type by default, and values that automatically start at 1 +// and rise by powers of two if the user didn't write their +// own value, and bitwise operations plus .has(flags), +// .set(flags), and .clear(flags)... the word "flag_enum" +// carries all that meaning as a convenient and readable +// opt-in without hardwiring "[Flags]" specially into the +// language +// +file_attributes: @flag_enum type = { + cached; // 1 + current; // 2 + obsolete; // 4 + cached_and_current := cached | current; +} +``` + + +### For dynamic types + + +#### `union` + +`@union` declaratively opts into writing a safe discriminated union/variant dynamic type. + +`union` will emit a compile-time error if: + +- any alternative is not public or has an initializer + +- any member starts with the reserved name prefix `is_` or `set_`, as these will be generated by the metafunction + + +For example: + +``` cpp title="Using the @union metafunction when writing a type" hl_lines="10 18-20 25 26" +// name_or_number is declaratively a safe union/variant type: +// it has a discriminant that enforces only one alternative +// can be active at a time, members always have a name, and +// each member has .is_member(), .set_member(), and .member() +// accessors using the member name... 
the word "union" +// carries all that meaning as a convenient and readable +// opt-in without hardwiring "union" specially into the +// language +// +name_or_number: @union type = { + name: std::string; + num : i32; +} + +main: () = { + x: name_or_number = (); + + x.set_name("xyzzy"); // now x is a string + assert( x.is_name() ); + std::cout << x.name(); // prints the string + + // trying to use x.num() here would cause a Type safety + // contract violation, because x is currently a string + + x.set_num( 120 ); // now x is a number + std::cout << x.num() + 3; // prints 123 +} +``` + +Unlike C `#!cpp union`, this `@union` is safe to use because it always ensures only the active type is accessed. + +Unlike C++17 `std::variant`, this `@union` is easier to use because its alternatives are named rather than anonymous, and safer to use because each union type is a distinct type. [^variant] + +Each `@union` type has its own type-safe name, has clear and unambiguous named members, and safely encapsulates a discriminator to rule them all. Sure, it uses unsafe casts in the implementation, but they are fully encapsulated, where they can be tested once and be safe in all uses.
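
For comparison, the same named-accessor interface can be approximated by hand in Cpp1. The sketch below is an assumption for illustration, not cppfront's implementation: it wraps `std::variant` instead of using the encapsulated casts mentioned above, and it uses a plain `assert` where Cpp2 would report a Type safety contract violation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <variant>

// Hand-written approximation of the name_or_number union above.
// std::monostate models the initial "no alternative active" state.
class name_or_number {
    std::variant<std::monostate, std::string, std::int32_t> data_;

public:
    bool is_name() const { return std::holds_alternative<std::string>(data_); }
    bool is_num()  const { return std::holds_alternative<std::int32_t>(data_); }

    void set_name(std::string value) { data_.emplace<std::string>(std::move(value)); }
    void set_num(std::int32_t value) { data_.emplace<std::int32_t>(value); }

    // Accessors check that the requested alternative is the active one.
    std::string const& name() const {
        assert(is_name());
        return std::get<std::string>(data_);
    }
    std::int32_t num() const {
        assert(is_num());
        return std::get<std::int32_t>(data_);
    }
};
```

Because `name_or_number` is its own class, it cannot be confused with another wrapper that happens to hold the same two alternative types, which is the distinction the footnote draws against a bare `variant`.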
+ +Because a `@union type` is still a `type`, it can naturally have other things normal types can have, such as template parameter lists and member functions: + +``` cpp title="A templated custom safe union type" hl_lines="1" +name_or_other: @union type += { + name : std::string; + other : T; + + // a custom member function + to_string: (this) -> std::string = { + if is_name() { return name(); } + else if is_other() { return other() as std::string; } + else { return "invalid value"; } + } +} + +main: () = { + x: name_or_other = (); + x.set_other(42); + std::cout << x.other() * 3.14 << "\n"; + std::cout << x.to_string(); // prints "42" here, but is legal + // whichever alternative is active +} +``` + + +### Helpers and utilities + + +#### `cpp1_rule_of_zero` + +A `cpp1_rule_of_zero` type is one that has no user-written copy/move/destructor functions, and for which Cpp2 should generate nothing so that the Cpp1 defaults for generated special member functions are accepted. + +> C.20: If you can avoid defining default operations, do. +> Reason: It's the simplest and gives the cleanest semantics. +> This is known as "the rule of zero". +> — Stroustrup, Sutter, et al. (C++ Core Guidelines) + + +#### `print` + +`print` prints a pretty-printed visualization of the type to the console. + +This is most useful for debugging metafunctions, and otherwise seeing the results of applying previous metafunctions. + +For a detailed example, see [the `shape` example above](#generating-source-code-at-compile-time). + + +[^variant]: With `variant`, there's no way to distinguish in the type system between a `variant` that stores either an employee id or employee name, and a `variant` that stores either a lucky number or a pet unicorn's dominant color. 
diff --git a/docs/cpp2/namespaces.md b/docs/cpp2/namespaces.md new file mode 100644 index 0000000000..43d9e75462 --- /dev/null +++ b/docs/cpp2/namespaces.md @@ -0,0 +1,60 @@ + +# Namespaces + +## Overview + +A namespace `N` can contain declarations that are then accessed by writing `N::` or [`using`](#using) the namespace or declaration. For example: + +``` cpp title="Declaring some things in a namespace" hl_lines="2 8" +// A namespace to put all the names provided by a widget library +widgetlib: namespace = { + widget: type = { /*...*/ } + // ... more things ... +} + +main: () = { + w: widgetlib::widget = /*...*/; +} +``` + + +## `using` + +A `#!cpp using` statement brings names declared in another namespace into the current scope as if they had been declared in the current scope. It has two forms: + +- `#!cpp using a_namespace::a_name;` brings the single name `a_name` into scope. + +- `#!cpp using namespace a_namespace;` brings all the namespace's names into scope. + +For example: + +``` cpp title="using statements" hl_lines="13 14 20 21" +// A namespace to put all the names provided by a widget library +widgetlib: namespace = { + widget: type = { /*...*/ } + // ... more things ... +} + +main: () = { + // Explicit name qualification + w: widgetlib::widget = /*...*/; + + { + // Using the name, no qualification needed + using widgetlib::widget; + w2: widget = /*...*/; + // ... + } + + { + // Using the whole namespace, no qualification needed + using namespace widgetlib; + w3: widget = /*...*/; + // ... + } + + // ... +} +``` + + diff --git a/docs/cpp2/objects.md b/docs/cpp2/objects.md new file mode 100644 index 0000000000..0233b2bdbb --- /dev/null +++ b/docs/cpp2/objects.md @@ -0,0 +1,133 @@ +## Overview + +An object can be declared at any scope: in a namespace, in a `type`, in a function, in an expression. 
+ +Its declaration is written using the same **name `:` kind `=` value** [declaration syntax](../cpp2/declarations.md) as everything in Cpp2: + +- **name** starts with a letter and is followed by other letters, digits, or `_`. Examples: `count`, `skat_game`, `Point2D` are valid names. + +- **kind** is the object's type. In most places, except type scopes, you can write the `_` wildcard as the type (or omit the type entirely) to ask for the type to be deduced. When the type is a template, the template arguments can be inferred from the constructor (via [CTAD](../welcome/hello-world.md#ctad)). + +- **value** is the object's initial value. To use the default-constructed value, write `()`. + + +For example: + +``` cpp title="Declaring some objects" hl_lines="3 4 7-9 12 13" +// numbers is an object of type std::vector<int>, +// defined as having the initial contents 1, 2, 3 +numbers: std::vector<int> = (1, 2, 3); +numbers: std::vector = (1, 2, 3); // same, deducing the vector's type + +// count is an object of type int, defined as having initial value -1 +count: int = -1; +count: _ = -1; // same, deducing the object's type with the _ wildcard +count := -1; // same, deducing the object's type by just omitting it + +// pi is a variable template; == signifies the value never changes (constexpr) +pi: T == 3.14159'26535'89793'23846L; +pi: _ == 3.14159'26535'89793'23846L; // same, deducing the object's type +``` + + +## Guaranteed initialization + +Every object must be initialized using `=` before it is used. + +An object in any scope can be initialized at its declaration. For example: + +``` cpp title="Initializing objects when they are declared" hl_lines="4 10" +shape: type = { + // An object at type scope (data member) + // initialized with its type's default value + points: std::vector = (); + + draw: (this, where: canvas) -> bool + = { + // An object at function scope (local variable) + // initialized with color::red + pen := color::red; + + // ... + } + + // ... 
+} +``` + +Additionally, at function local scope an object `obj` can be initialized separately from its declaration. This can be useful when the object must be declared before a program-meaningful initial value is known (to avoid a dead write of a wrong 'dummy' value), and/or when the object may be initialized in more than one way depending on other logic (e.g., by using different constructors on different paths). The way to do this is: + +- Declare `obj` without an initializer, such as `obj: some_type;`. This allocates stack space for the object, but does not construct it. + +- `obj` must have a definite first use on every `#!cpp if`/`#!cpp else` branch path, and + +- that definite first use must be of the form `obj = value;` which is a constructor call, or else pass `obj` as an `out` argument to an `out` parameter (which is also effectively a constructor call, and performs the construction in the callee). + +For example: + +``` cpp title="Initializing local objects after they are declared" hl_lines="5 14 17 21" +f: () = { + buf: std::array; // uninitialized + // ... calculate some things ... + // ... no uses of buf here ... + buf = some_calculated_value; // constructs (not assigns) buf + // ... + std::cout << buf[0]; // ok, buf has been initialized +} + +g: () = { + buf: std::array; // uninitialized + if flip_coin_is_heads() { + if heads_default_is_available { + buf = copy_heads_default(); // constructs buf + } + else { + buf = (other, constructor); // constructs buf + } + } + else { + load_from_disk( out buf ); // constructs buf (*) + } + std::cout << buf[0]; // ok, buf has been initialized +} + +load_from_disk: (out buffer) = { + buffer = /* data read from disk */ ; // when `buffer` is uninitialized, +} // constructs it; otherwise, assigns +``` + +In the above example, note the simple rule for branches: The local variable must be initialized on both the `#!cpp if` and `#!cpp else` branches, or neither branch.
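
Cpp1 has no direct counterpart to this rule, so it may help to see the closest common workaround. The sketch below uses an immediately invoked lambda, which is a different technique from Cpp2's deferred initialization: the object is still initialized exactly once and never holds a dummy value, but the branching has to move inside the initializer. The function name `make_buffer`, its parameters, and the buffer contents are hypothetical.

```cpp
#include <array>

// Hypothetical Cpp1 workaround: choose the initial value on several
// paths without ever writing a placeholder value first.
std::array<int, 4> make_buffer(bool heads, bool use_default) {
    // The lambda is called immediately; buf is constructed once,
    // from whichever branch was taken.
    std::array<int, 4> const buf = [&]() -> std::array<int, 4> {
        if (heads) {
            if (use_default) {
                return {1, 1, 1, 1};
            }
            return {2, 2, 2, 2};
        }
        return {3, 3, 3, 3};
    }();

    return buf;
}
```

The Cpp2 form keeps the branches in the enclosing function, which is what lets an `out` argument such as `load_from_disk( out buf )` take part in the initialization.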
+ + +## Heap objects + +Objects can also be allocated on the heap using `#!cpp arena.new (/*initializer, arguments*/)` where `arena` is any object that acts as a memory arena and provides a `#!cpp .new` function template. Two memory arena objects are provided in namespace `cpp2`: + +- `#!cpp unique.new` calls `std::make_unique` and returns a `std::unique_ptr`. + +- `#!cpp shared.new` calls `std::make_shared` and returns a `std::shared_ptr`. + +The default is `#!cpp unique.new` if you don't specify an arena object. + +For example (see [types](types.md) for more details about writing types): + + +``` cpp title="Heap allocation" hl_lines="3-6 10-11" +f: () -> std::shared_ptr += { + // Dynamically allocate an object owned by a std::unique_ptr + // 'vec' is a unique_ptr> containing three values + vec := new>(1, 2, 3); + // shorthand for 'unique.new<...>(...)' + std::cout << vec*.ssize(); // prints 3 + // note that * dereference is a suffix operator + + // Dynamically allocate an object with shared ownership + wid := cpp2::shared.new(); + store_a_copy( wid ); // store a copy of 'wid' somewhere + return wid; // and move-return a copy too + +} // as always in C++, vec is destroyed here automatically, which + // destroys the heap vector and deallocates its dynamic memory +``` + diff --git a/docs/cpp2/types.md b/docs/cpp2/types.md new file mode 100644 index 0000000000..39cfd1dfae --- /dev/null +++ b/docs/cpp2/types.md @@ -0,0 +1,287 @@ + +# Types + +## Overview + +A user-defined `type` is written using the same **name `:` kind `=` value** [declaration syntax](../cpp2/declarations.md) as everything in Cpp2. The type's "value" is a `{}`-enclosed body containing more declarations. + +In a `type`, data members are private by default, and functions and nested types are public by default. To explicitly declare a type scope declaration `#!cpp public`, `#!cpp protected`, or `#!cpp private`, write that keyword at the beginning of the declaration. 
+ +``` cpp title="Writing a simple type" hl_lines="1" +mytype: type = +{ + // data members are private by default + x: std::string; + + // functions are public by default + protected f: (this) = { do_something_with(x); } + + // ... +} +``` + +## `#!cpp this` — The parameter name + +**`#!cpp this`** is a synonym for the current object. Inside the scope of a type that has a member named `member`, `member` by default means `#!cpp this.member`. + +> Note: In Cpp2, `#!cpp this` is not a pointer. + +The name `#!cpp this` may only be used for the first parameter of a type-scope function (aka member function). It is never declared with an explicit `: its_type` because its type is always the current type. + +`#!cpp this` can be an `in` (default), `inout`, `out`, or `move` parameter. Which you choose naturally determines what kind of member function is being declared: + +- **`#!cpp in this`**: Writing `#!cpp myfunc: (this /*...*/)`, which is shorthand for `#!cpp myfunc: (in this /*...*/)`, defines a Cpp1 `#!cpp const`-qualified member function, because `in` parameters are `#!cpp const`. + +- **`#!cpp inout this`**: Writing `#!cpp myfunc: (inout this /*...*/)` defines a Cpp1 non-`#!cpp const` member function. + +- **`#!cpp out this`**: Writing `#!cpp myfunc: (out this /*...*/)` defines a Cpp1 constructor... and more. (See below.) + +- **`#!cpp move this`**: Writing `#!cpp myfunc: (move this /*...*/)` defines a Cpp1 `#!cpp &&`-qualified member function, or if there are no additional parameters it defines the destructor. + +For example, here is how to write a read-only member function named `print` that takes a read-only string value and prints this object's data value and the string message: + +``` cpp title="The this parameter" hl_lines="4 6" +mytype: type = { + data: i32; // some data member (private by default) + + print: (this, msg: std::string) = { + std::cout << data << msg; + // "data" is shorthand for "this.data" + } + + // ... 
+} +``` + +## `#!cpp this` — Inheritance + +Base types are written as members named `#!cpp this`. For example, just as a type could write a data member as `#!cpp data: string = "xyzzy";`, which is pronounced "`data` is a `string` defined as having the default value `#!cpp "xyzzy"`," a base type is written as `#!cpp this: Shape = (default, values);`, which is pronounced "`#!cpp this` is a `Shape` defined as having these default values." + +> Cpp2 syntax has no separate base list or separate member initializer list. + +Because base and member subobjects are all declared in the same place (the type body) and initialized in the same place (an `#!cpp operator=` function body), they can be written in any order, including interleaved, and are still guaranteed to be safely initialized in declared order. This means that in Cpp2 you can declare a data member object before a base subobject, so that it naturally outlives the base subobject. + +> Cpp2 code doesn't need workarounds like Boost's `base_from_member`, because all of the motivating examples for that can be written directly. See [this explanation](https://github.com/hsutter/cppfront/issues/334#issuecomment-1500984173) for details. + +## `#!cpp virtual`, `#!cpp override`, and `#!cpp final` — Virtual functions + +A `#!cpp this` parameter can additionally be declared as one of the following: + +- **`#!cpp virtual`**: Writing `#!cpp myfunc: (virtual this /*...*/)` defines a new virtual function. + +- **`#!cpp override`**: Writing `#!cpp myfunc: (override this /*...*/)` defines an override of an existing base class virtual function. + +- **`#!cpp final`**: Writing `#!cpp myfunc: (final this /*...*/)` defines a final override of an existing base class virtual function. + +A pure virtual function is a function with a `#!cpp virtual this` or `#!cpp override this` parameter and no body. 
+ +For example: + +``` cpp title="Virtual functions" hl_lines="3 4 14 15" +abstract_base: type += { + // A pure virtual function: virtual + no body + print: (virtual this, msg: std::string); + + // ... +} + +derived: type += { + // 'this' is-an 'abstract_base' + this: abstract_base; + + // Explicit override + print: (override this, msg: std::string); + + // ... +} +``` + + +## `implicit` — Controlling conversion functions + +A `#!cpp this` parameter of an `#!cpp operator=` function can additionally be declared as: + +- **`implicit`**: Writing `#!cpp operator=: (implicit out this, /*...*/)` defines a function that will not be marked as "explicit" when lowered to Cpp1 syntax. + +> Note: This reverses the Cpp1 default, where constructors are not "explicit" by default, and you have to write "explicit" to make them explicit. + + +## `#!cpp operator=` — Construction, assignment, and destruction + +All value operations are spelled `#!cpp operator=`, including construction, assignment, and destruction. `#!cpp operator=` sets the value of `#!cpp this` object, so the `#!cpp this` parameter can be passed as anything but `in` (which would imply `#!cpp const`): + +- **`#!cpp out this`:** Writing `#!cpp operator=: (out this /*...*/ )` is naturally both a constructor and an assignment operator, because an `out` parameter can take an uninitialized or initialized argument. If you don't also write a more-specialized `#!cpp inout this` assignment operator, Cpp2 will use the `#!cpp out this` function also for assignment. + +- **`#!cpp inout this`:** Writing `#!cpp operator=: (inout this /*...*/ )` is an assignment operator (only), because an `inout` parameter requires an initialized modifiable argument. + +- **`#!cpp move this`:** Writing `#!cpp operator=: (move this)` is the destructor. No other parameters are allowed, so it connotes "move `#!cpp this` nowhere." + +Unifying `#!cpp operator=` enables usable `out` parameters, which is essential for composable guaranteed initialization. 
We want the expression syntax `#!cpp x = value` to be able to call a constructor or an assignment operator, so naming them both `#!cpp operator=` is consistent. + +An assignment operator always returns the same type as `#!cpp this` and automatically performs `#!cpp return this;`. + +> Note: Writing `=` always invokes an `#!cpp operator=` (in fact for a Cpp2-authored type, and semantically for a Cpp1-authored type). This avoids the Cpp1 inconsistency that "writing `=` calls `#!cpp operator=`, except when it doesn't" (such as in a Cpp1 variable initialization). Conversely, `#!cpp operator=` is always invoked by `=` in Cpp2. + + +### `that` — A source parameter + +All type-scope functions can have **`that`** as their second parameter, which is a synonym for the object to be copied/moved from. Like `this`, at type scope it is never declared with an explicit `: its_type` because its type is always the current type. + +`that` can be an `in` (default) or `move` parameter. Which you choose naturally determines what kind of member function is being declared: + +- **`in that`**: Writing `#!cpp myfunc: (/*...*/ this, that)`, which is shorthand for `#!cpp myfunc: (/*...*/ this, in that)`, is naturally both a copy and move function, because it can accept an lvalue or an rvalue `that` argument. If you don't write a more-specialized `move that` move function, Cpp2 will automatically use the `in that` function also for move. + +- **`move that`**: Writing `#!cpp myfunc: (/*...*/ this, move that)` defines a move function. + +Putting `this` and `that` together: The most general form of `#!cpp operator=` is **`#!cpp operator=: (out this, that)`**. It works as a unified general {copy, move} x { constructor, assignment } operator, and generates all four of those in the lowered Cpp1 code if you didn't write a more specific one yourself.
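
To show what "all four" means, here is a hand-written Cpp1 class that spells out the four functions a single memberwise `#!cpp operator=: (out this, that)` stands for. It is a sketch for orientation only and should not be read as the exact code cppfront emits. The type name `mytype`, its `name_` member, and the extra converting constructor are hypothetical.

```cpp
#include <string>
#include <utility>

// Hand-written Cpp1 equivalent of one general (out this, that) function.
class mytype {
    std::string name_;

public:
    // Hypothetical converting constructor, so there is a value to copy.
    explicit mytype(std::string name) : name_{std::move(name)} {}

    // { copy, move } x { constructor, assignment }, all memberwise.
    mytype(mytype const& that) : name_{that.name_} {}
    mytype(mytype&& that) noexcept : name_{std::move(that.name_)} {}

    mytype& operator=(mytype const& that) {
        name_ = that.name_;
        return *this;
    }
    mytype& operator=(mytype&& that) noexcept {
        name_ = std::move(that.name_);
        return *this;
    }

    std::string const& name() const { return name_; }
};
```

In Cpp1 each of these four is written separately (or defaulted separately), so they can drift apart over time. Writing the logic once as `(out this, that)` removes that opportunity.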
+ + +### `#!cpp operator=` can generalize (A)ssignment from construction, and (M)ove from copy + +As mentioned above: +- If you don't write an `#!cpp inout this` function, Cpp2 will use your `#!cpp out this` function in its place (if you wrote one). +- If you don't write a `move that` function, Cpp2 will use your `in that` function in its place (if you wrote one). + +> Note: When lowering to Cpp1, this just means generating the applicable special member functions from the appropriate Cpp2 function. + +This graphic summarizes these generalizations. For convenience I've numbered the (A)ssignment and (M)ove defaults. + +![image](https://user-images.githubusercontent.com/1801526/226261443-03125a35-7890-4cc7-bf7d-f23b3a0bb0df.png) + +In Cpp1 terms, they can be described as follows: + +- **(M)ove, M1, M2:** If you write a copy constructor or assignment operator, but not a corresponding move constructor or assignment operator, the latter is generated. + +- **(A)ssignment, A1, A2, A3:** If you write a copy or move or converting constructor, but not a corresponding copy or move or converting assignment operator, the latter is generated. + +- **The arrows are transitive.** For example, if you write a copy constructor and nothing else, the move constructor, copy assignment operator, and move assignment operator are generated. + +- **M2 is preferred over A2.** Both M2 and A2 can generate a missing `#!cpp (inout this, move that)` function. If both options are available, Cpp2 prefers to use M2 (generate move assignment from copy assignment, which could itself have been generated from copy construction) rather than A2 (generate move assignment from move construction). This is because M2 is a better fit: Move assignment is more like copy assignment than like move construction, because assignments are designed structurally to set the value of an existing `#!cpp this` object. + +The most general `#!cpp operator=` with `that` is `#!cpp (out this, that)`. 
In Cpp1 terms, it generates all four combinations of { copy, move } x { constructor, assignment }. This is often sufficient, so you can write all these value-setting functions just once. If you do want to write a more specific version that does something else, though, you can always write it too. + +> Note: Generating `#!cpp inout this` (assignment) from `#!cpp out this` also generates **converting assignment** from converting construction, which is a new thing. Today in Cpp1, if you write a converting constructor from another type `X`, you may or may not write the corresponding assignment from `X`; in Cpp2 you will get that by default, and it sets the object to the same state as the converting constructor from `X` does. + + + +### Minimal functions generated by default + +There are only two defaults the language will generate implicitly for a type: + +- The only special function every type must have is the destructor. If you don't write it by hand, a public nonvirtual destructor is generated by default. + +- If no `#!cpp operator=` functions other than the destructor are written by hand, a public default constructor is generated by default. + +All other `#!cpp operator=` functions are explicitly written, either by hand or by opting into applying a metafunction (see below). + +> Note: Because generated functions are always opt-in, you can never get a generated function that's wrong for your type, and so Cpp2 doesn’t need to support "=delete" for the purpose of suppressing unwanted generated functions. + +### Memberwise by default + +All copy/move/comparison `#!cpp operator=` functions are memberwise by default in Cpp2. That includes when you write memberwise construction and assignment yourself. + +In a hand-written `#!cpp operator=`: + +- The body must begin with a series of `member = value;` statements, one for each of the type's data members (including base classes) in declaration order. 
+ +- If the body does not mention a member in the appropriate place in the beginning section, by default the member's default initializer is used. + +- In an assignment operator (`#!cpp inout this`), you can explicitly skip setting a member by writing `member = _;` where it would normally be set if you know you have a reason to set its value later instead or if the existing value needs to be preserved. (This is rare; for an example, see the generated implementation of the [`union` metafunction](metafunctions.md#union).) + +For example: + +``` cpp title="Memberwise operator= semantics" hl_lines="9-11 20-22" +mytype: type += { + // data members (private by default) + name: std::string; + social_handle: std::string = "(unknown)"; + + // conversion from string + operator=: (out this, who: std::string) = { + name = who; + // if social_handle is not mentioned, defaults to: + // social_handle = "(unknown)"; + + // now that the members have been set, + // any other code can follow... + print(); + } + + // copy/move constructor/assignment + operator=: (out this, that) = { + // if neither data member is mentioned, defaults to: + // name = that.name; + // social_handle = that.social_handle; + print(); + } + + print: (this) = std::cout << "value is [(name)$] [(social_handle)$]\n"; +} + +// The above definition of mytype allows all of the following... +main: () = { + x: mytype = "Jim"; // construct from string + x = "John"; // assign from string + y := x; // copy construct + y = x; // copy assign + z := (move x); // move construct + z = (move y); // move assign + x.print(); // "value is [] []" - moved from + y.print(); // "value is [] []" - moved from +} +``` + +> Note: This makes memberwise semantics symmetric for construction and assignment. In Cpp1, only non-copy/move constructors have a default, which is to initialize a member with its default initializer. 
In Cpp2, both constructors and assignment operators default to using the default initializer if it's a conversion function (non-`that`, aka non-copy/move), and using memberwise `member = that.member;` for copy/move functions. + + +## `#!cpp operator<=>` — Unified comparisons + +To write comparison functions for your type, usually you just need to write either or both of `operator<=>` and `operator==` with a first parameter of `this` and a second parameter of any type (usually `that` which is of the same type). If you omit the function body, a memberwise comparison will be generated by default. + +`operator<=>` must return one of `std::strong_ordering`, `std::partial_ordering`, or `std::weak_ordering`. It makes `<`, `<=`, `>`, and `>=` comparisons available for your type. Prefer a strong ordering unless you have a reason to use a partial or weak ordering. If you write `operator<=>` without a custom function body, `operator==` is generated for you. + +`operator==` must return `bool`. It makes `==` and `!=` comparisons available for your type. + +For example: + +``` cpp title="Writing the <=> operator" hl_lines="5-7 13" +item: type = { + x: i32 = (); + y: std::string = (); + + operator<=>: (this, that) -> std::strong_ordering; + // memberwise by default: first compares x <=> that.x, + // then if those are equal compares y <=> that.y + + // ... +} + +test: (x: item, y: item) = { + if x != y { // ok + // ... + } +} +``` + +The above is the same as in Cpp1 because most of Cpp2's `#!cpp operator<=>` feature has already been merged into ISO C++ (Cpp1). In addition, in Cpp2 comparisons with the same precedence can be safely chained, and always have the mathematically sound transitive meaning or else are rejected at compile time: + +- **Valid chains: All `<`/`<=`, all `>`/`>=`, or all `==`.** All mathematically sound and safe chains like `a <= b < c` are supported, with efficient single evaluation of each term. 
They are "sound" because they are transitive; these chains imply a relationship between `a` and `c` (in this case, the chain implies that `a <= c` is also true). + +> Note: These valid chains always give mathematically expected results, even when invoking existing comparison operators authored in Cpp1 syntax. + +- **Invalid chains: Everything else.** Nonsense chains like `a >= b < c` and `a != b != c` are compile time errors. They are "nonsense" because they are non-transitive; these chains do not imply any relationship between `a` and `c`. + +- **Non-chains: Mixed precedence is not a chain.** Expressions like `a // Cpp1 +#include // Cpp1 + +N: namespace = { // Cpp2 + hello: (msg: std::string_view) = // Cpp2 + std::cout << "Hello, (msg)$!\n"; // Cpp2 +} // Cpp2 + +int main() { // Cpp1 + auto words = std::vector{ "Alice", "Bob" }; // Cpp1 + N::hello( words[0] ); // Cpp1 + N::hello( words[1] ); // Cpp1 + std::cout << "... and goodnight\n"; // Cpp1 +} // Cpp1 +``` + +## Not allowed: Nesting Cpp1 inside Cpp2 (and vice versa) + +However, the following source file is not valid, because it tries to nest Cpp2 code inside Cpp1 code, and vice versa: + +``` cpp title="ERROR.cpp2 — this is NOT allowed" linenums="1" hl_lines="5 6 9 14" +#include // Cpp1 +#include // Cpp1 + +namespace N { // Cpp1 + hello: (msg: std::string_view) = // Cpp2 (ERROR here) + std::cout << "Hello, (msg)$!\n"; // Cpp2 (ERROR here) +} // Cpp1 + +main: () = { // Cpp2 + auto words = std::vector{ "Alice", "Bob" }; // Cpp1 (ERROR here) + N::hello( words[0] ); // ? + N::hello( words[1] ); // ? + std::cout << "... and goodnight\n"; // ? +} // Cpp2 +``` + +The above nesting is not supported because it would create not just parsing problems but also semantic ambiguities. 
For example, lines 11-13 are syntactically valid as Cpp1 or as Cpp2, but if they are treated as Cpp2 then the `#!cpp words[0]` and `#!cpp words[1]` expressions' `#!cpp std::vector::operator[]` calls are bounds-checked and bounds-safe by default, whereas if they are treated as Cpp1 then they are not bounds-checked. And that's a pretty important difference to be sure about! + diff --git a/docs/cppfront/options.md b/docs/cppfront/options.md new file mode 100644 index 0000000000..d3cd967aeb --- /dev/null +++ b/docs/cppfront/options.md @@ -0,0 +1,122 @@ +# Cppfront command line options + +Cppfront is invoked using + + cppfront [options] file ... + +where + +- **options** is optional, and can include options described on this page + +- **file ...** is a list of one or more `.cpp2` filenames to be compiled + +Command line options are spelled starting with `-` or `/` followed by the option name. For example, `-help` prints help. + +For convenience, you can shorten the name to any unique prefix not shared with another option. For example: + +- `-help` can be equivalently written as `-hel`, `-he`, or `-h`, because no other option starts with `h`. +- `-import-std` and `-include-std` can be shortened to `-im` and `-in` respectively, but not `-i` which would be ambiguous with each other. + + +# Basic command line options + +## `-help`, `-h`, `-?` + +Prints an abbreviated version of this documentation page. + +## `-import-std`, `-im` + +Makes the entire C++ standard library (namespace `std::`) available via a module `import std.compat;` (which implies `import std;`). + +> When you use either `-import-std` or `-include-std`, your `.cpp2` program will not need to explicitly `import` any C++ standard library module or `#include` any C++ standard library header (it can still do that, but it would be redundant). + +This option is implicitly set if `-pure-cpp2` is selected. + +This option is ignored if `-include-std` is selected. 
If your Cpp1 compiler does not yet support standard library modules `std` and `std.compat`, this option automatically uses `-include-std` instead as a fallback. + +## `-include-std`, `-in` + +Makes the entire C++ standard library (namespace `std::`) available via an `#include` of every standard header. + +This option should always work with all standard headers, including draft-standard C++26 headers that are not yet in a published standard, because it tracks new headers as they are added and uses feature tests to not include headers that are not yet available on your Cpp1 implementation. + +## `-pure-cpp2`, `-p` + +Allow Cpp2 syntax only. + +This option also sets `-import-std`. + +## `-version`, `-vers` + +Print version, build, copyright, and license information. + + +# Additional dynamic safety checks and contract information + +## `-add-source-info`, `-a` + +Enable `source_location` information for contract checks. If this is supported by your Cpp1 compiler, the default contract failure messages will include exact file/line/function information. For example, if the default `Bounds` violation handler would print this without `-a`: + + Bounds safety violation: out of bounds access attempt detected - attempted access at index 2, [min,max] range is [0,1] + +then it would print something like this with `-a` (the exact text will vary with the Cpp1 standard library vendor's `source_location` implementation): + + demo.cpp2(4) int __cdecl main(void): Bounds safety violation: out of bounds access attempt detected - attempted access at index 2, [min,max] range is [0,1] + +## `-no-comparison-checks`, `-no-c` + +Disable mixed-sign comparison safety checks. If not disabled, mixed-sign comparisons are diagnosed by default. + +## `-no-null-checks`, `-no-n` + +Disable null safety checks. If not disabled, null dereference checks are performed by default. + +## `-no-subscript-checks`, `-no-s` + +Disable subscript bounds safety checks. 
If not disabled, subscript bounds safety checks are performed by default. + + +# Support for constrained target environments + +## `-fno-exceptions`, `-fno-e` + +Disable C++ exception handling. This should be used only if you must run in an environment that bans C++ exception handling, and so you are already using a similar command line option for your Cpp1 compiler. + +If this option is selected, a failed `as` for `std::variant` will assert. + +## `-fno-rtti`, `-fno-r` + +Disable C++ run-time type information (RTTI). This should be used only if you must run in an environment that bans C++ RTTI, and so you are already using a similar command line option for your Cpp1 compiler. + +If this option is selected, trying to use `as` for `*` (raw pointers) or `std::any` will assert. + + +# Other options + +## `-clean-cpp1`, `-c` + +Emit clean `.cpp` files without `#line` directives and other extra information that cppfront normally emits in the `.cpp` to light up C++ tools (e.g., to let IDEs integrate cppfront error message output, debuggers step to the right lines in Cpp2 source code, and so forth). In normal use, you won't need `-c`. + +## `-debug`, `-d` + +Emit compiler debug output. This is only useful when debugging cppfront itself. + +## `-emit-cppfront-info`, `-e` + +Emit cppfront version and build in the `.cpp` file. + +## `-format-colon-errors`, `-fo` + +Emit cppfront diagnostics using `:line:col:` format for line and column numbers, if that is the format better recognized by your IDE, so that it will pick up cppfront messages and integrate them in its normal error message output location. If not set, by default cppfront diagnostics use `(line,col)` format. + +## `-line-paths`, `-l` + +Emit absolute paths in `#line` directives. + +## `-output` _filename_, `-o` _filename_ + +Output to 'filename' (can be 'stdout'). 
If not set, the default output filename is the same as the input filename without the `2` (e.g., compiling `hello.cpp2` by default writes its output to `hello.cpp`, and `header.h2` to `header.h`). + +## `-verbose`, `-verb` + +Print verbose statistics and `-debug` output. diff --git a/docs/index.md b/docs/index.md index 398beded14..0df8e07143 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,78 +1,65 @@ -# A tour of Cpp2 ('C++ alt syntax 2') and the `cppfront` compiler +# Overview: What are Cpp2 and cppfront? How do I get and build cppfront? -## Preface: What is this? - -### Goal in a nutshell: 100% pure C++... just nicer +``` cpp title="hello.cpp2" +main: () = { + std::cout << "Hello, world!\n"; +} +``` -My goal for this project is to try to prove that Bjarne Stroustrup has long been right: That it's possible and desirable to have true C++ with all its expressive power and control and with full backward compatibility, but in a C++ that's **10x simpler** with fewer warts and special cases, and **50x safer** where it's far easier to not write vulnerability bugs by accident. +## What is Cpp2? -Stroustrup said it best: +"Cpp2," short for "C++ syntax 2," is my ([Herb Sutter's](https://github.com/hsutter)) personal project to try to make writing ordinary C++ types/functions/objects be much **simpler and safer**, without breaking backward compatibility. Bjarne Stroustrup said it best: -> "Inside C++, there is a much smaller and cleaner language struggling to get out."
— Bjarne Stroustrup, _The Design and Evolution of C++_ (D&E), 1994 +> "Inside C++, there is a much smaller and cleaner language struggling to get out."
  — Bjarne Stroustrup, _The Design and Evolution of C++_ (D&E), 1994 > -> "Say 10% of the size of C++ in definition and similar in front-end compiler size. ... most of the simplification would come from generalization."
— Bjarne Stroustrup, _ACM History of Programming Languages III_, 2007 +> "Say 10% of the size of C++ in definition and similar in front-end compiler size. ... most of the simplification would come from generalization."
  — Bjarne Stroustrup, _ACM History of Programming Languages III_, 2007 -But how? +My goal is to try to prove that Stroustrup is right: that it's possible and desirable to have true C++ with all its expressive power and control and with full backward compatibility, but in a flavor that's 10x simpler with fewer quirks and special cases to remember, [^simpler] and 50x safer where it's far easier to not write security bugs by accident. -### Approach in a nutshell: Alternative syntax + perfect compatibility +We can't make an improvement that large to C++ via gradual evolution to today's syntax, because some important changes would require changing the meaning of code written in today's syntax. For example, we can never change a language feature default in today's syntax, not even if the default creates a security vulnerability pitfall, because changing a default would break vast swathes of existing code. Having a distinct alternative syntax gives us a "bubble of new code" that doesn't exist today, and have: -This project explores creating an alternate "syntax 2" (Cpp2 for short) for C++ itself, that's unambiguous with today's syntax (Cpp1 for short). That gives us: +- **Freedom to make any desired improvement, without breaking any of today's code.** Cpp2 is designed to take all the consensus C++ best-practices guidance we already teach, and make them the default when using "syntax 2." Examples: Writing unsafe type casts is just not possible in Cpp2 syntax; and Cpp2 can change language defaults to make them simpler and safer. You can always "break the glass" when needed to violate the guidance, but you have to opt out explicitly to write unsafe code, so if the program has a bug you can grep for those places to look at first. For details, see [Design note: unsafe code](https://github.com/hsutter/cppfront/wiki/Design-note%3A-Unsafe-code). 
-- a "bubble of new code" that doesn't exist today, where we can make any change we want in a fully compatible way, without worrying about breaking existing code; -- a way to make any improvement, including to fix language defaults and make all the C++ best-practices guidance we already teach be the default; -- the power to freely use both syntaxes in the same file if we want full backward C++ source compatibility, or to freely use just the simpler syntax standalone if we want to write in a 10x simpler C++ (i.e., pay for source compatibility only if you use it); and -- perfect interoperability, because any type/function/object written in either Cpp1 or Cpp2 syntax is always still just a normal C++ type/function/object. +- **Perfect link compatibility always on, perfect source compatibility always available (but you pay for it only if you use it).** Any type/function/object/namespace written in either syntax is always still just a normal C++ type/function/object/namespace, so any code or library written in either Cpp2 or today's C++ syntax ("Cpp1" for short) can seamlessly call each other, with no wrapping/marshaling/thunking. You can write a "mixed" source file that has both Cpp2 and Cpp1 code and get perfect backward C++ source compatibility (even SFINAE and macros), or you can write a "pure" all-Cpp2 source file and write code in a 10x simpler syntax. -In the 1980s and 1990s, Stroustrup similarly ensured that C++ could be interleaved with C in the same source file, and C++ could always call any C code with no wrapping/marshaling/thunking. Stroustrup accomplished this and more by writing **cfront**, the original C++ compiler, to translate C++ to pure C. That way, people could start trying out C++ code in any existing C project with just another build step to translate the C++ to C, and the result Just Worked with existing C tools. +**What it isn't.** Cpp2 is not a successor or alternate language with its own divergent or incompatible ecosystem. 
For example, it does not have its own nonstandard incompatible modules/concepts/etc. that compete with the Standard C++ features; and it does not replace your Standard C++ compiler and other tools. -This project aims to follow Stroustrup's implementation approach, with a **cppfront** compiler that compiles Cpp2 syntax to Cpp1 syntax. You can start trying out Cpp2 syntax in any existing C++ project just by adding a build step to translate the Cpp2 to Cpp1 syntax, and the result Just Works with existing C++ tools. +**What it is.** Cpp2 aims to be another "skin" for C++ itself, just a simpler and safer way to write ordinary C++ types/functions/objects. It seamlessly uses Standard C++ modules and concepts requirements and other features, and it works with all existing C++20 or higher compilers and tools right out of the box with zero overhead. -What does it look like? -## Hello, world! +## What is cppfront? -Here is the usual starter program that prints "Hello, world!": +[**Cppfront**](https://github.com/hsutter/cppfront) is a compiler that compiles Cpp2 syntax to today's Cpp1 syntax. This lets you start trying out Cpp2 syntax in any existing C++ project and build system just by renaming a source file from `.cpp` to `.cpp2` and [adding a build step](#adding-cppfront-in-your-ide-build-system), and the result Just Works with every C++20 or higher compiler and all existing C++ tools (debuggers, build systems, sanitizers, etc.). -```cpp -// hello.cpp2 -main: () = { - std::cout << "Hello, world!"; -} -``` +This deliberately follows Bjarne Stroustrup's wise approach with [**cfront**](https://en.wikipedia.org/wiki/Cfront), the original C++ compiler: In the 1980s and 1990s, Stroustrup created cfront to translate C++ to pure C, and similarly ensured that C++ could be interleaved with C in the same source file, and that C++ could always call any C code with no wrapping/marshaling/thunking. 
By providing a C++ compiler that emitted pure C, Stroustrup ensured full compatibility with the C ecosystems that already existed, and made it easy for people to start trying out C++ code in any existing C project by adding just another build step to translate the C++ to C first, and the result Just Worked with existing C tools. -This is a complete program that prints `Hello, world!`. -Everything in Cpp2 is declared using the syntax **"_name_ `:` _kind_ `=` _statement_"**. The `:` is pronounced "is a." Here, `main` is a function that takes no arguments, and has a body that prints the string to `cout`. +## How do I get and build cppfront? -We can just use `std::cout` and `std::operator<<` directly. Cpp2 code works with any C++ code or library, using direct calls without any wrapping/marshaling/thunking. +The full source code for cppfront is at the [**Cppfront GitHub repo**](https://github.com/hsutter/cppfront). -We didn't need `#include ` or `import std;`. The full C++ standard library is always available by default if your source file contains only syntax-2 code and you compile with it `cppfront -p` (short for `-pure-cpp2`). +Cppfront builds with any recent C++ compiler. Go to the `/cppfront/source` directory, and run one of the following: -### Building and running the program + -Now use `cppfront` to compile `hello.cpp2` to a standard C++ file `hello.cpp`: +``` bash title="MSVC build instructions (Visual Studio 2019 version 16.11 or higher)" +cl cppfront.cpp -std:c++20 -EHsc +``` +``` bash title="GCC build instructions (GCC 10 or higher)" +g++ cppfront.cpp -std=c++20 -o cppfront ``` -cppfront hello.cpp2 -p # produces hello.cpp + +``` bash title="Clang build instructions (Clang 12 or higher)" +clang++ cppfront.cpp -std=c++20 -o cppfront ``` -and then build `hello.cpp` using your favorite C++20 compiler, where `CPPFRONT_INCLUDE` is the path to `/cppfront/include`: +That's it! 
-``` -# --- MSVC ----------------------------------------------- -> cl hello.cpp -std:c++20 -EHsc -I CPPFRONT_INCLUDE -> hello.exe -Hello, world! - -# --- GCC ------------------------------------------------ -$ g++ hello.cpp -std=c++20 -ICPPFRONT_INCLUDE -o hello -$ ./hello.exe -Hello, world! - -# --- Clang ---------------------------------------------- -$ clang++ hello.cpp -std=c++20 -ICPPFRONT_INCLUDE -o hello -$ ./hello.exe -Hello, world! -``` + +### ➤ Next: [Hello, world!](welcome/hello-world.md) + + +[^simpler]: I'd ideally love to obsolete ~90% of my own books. I know that Cpp2 can eliminate that much of the C++ guidance I've personally had to write and teach over the past quarter century, by removing inconsistencies and pitfalls and gotchas, so that they're either impossible to write or are compile-time errors (either way, we don't have to teach them). I love writing C++ code... I just want it to be easier and safer by default. diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css new file mode 100644 index 0000000000..16198feddb --- /dev/null +++ b/docs/stylesheets/extra.css @@ -0,0 +1,15 @@ +/* + the default font sizes look small and encourage zooming, + which loses the navigation side panels +*/ +p { font-size: 16px; } +td { font-size: 16px; } + +/* + todo: try to make the nav pane section labels larger + + for now, this at least adds space between sections + so that section starts are easier to see +*/ +.md-nav__item { font-size: 20pt; } +.md-nav__link { font-size: medium; } diff --git a/docs/welcome/hello-world.md b/docs/welcome/hello-world.md new file mode 100644 index 0000000000..1e9ab67718 --- /dev/null +++ b/docs/welcome/hello-world.md @@ -0,0 +1,154 @@ +# **Hello, world!** + +``` mermaid +graph LR + A["` hello.cpp**2** `"] ==> B(["` **cppfront** `"]); + B ==> C[hello.cpp]; + C ==> D([Your favorite
C++ compiler

... and IDE / libraries / build
system / in-house tools / ...]); +``` + +## A `hello.cpp2` program + +Here is the usual one-line starter program that prints `Hello, world!`. Note that this is a complete program, no `#!cpp #include` required: + +``` cpp title="hello.cpp2 — on one line" +main: () = std::cout << "Hello, world!\n"; +``` + +But let's add a little more, just to show a few things: + +``` cpp title="hello.cpp2 — slightly more interesting" +main: () = { + words: std::vector = ( "Alice", "Bob" ); + hello( words[0] ); + hello( words[1] ); +} + +hello: (msg: std::string_view) = + std::cout << "Hello, (msg)$!\n"; +``` + +This short program code already illustrates a few Cpp2 essentials. + +**Consistent context-free syntax.** Cpp2 is designed so that there is one general way to spell a given thing, that works consistently everywhere. All Cpp2 types/functions/objects/namespaces are written using the unambiguous and context-free [declaration syntax](../cpp2/declarations.md) **"_name_ `:` _kind_ `=` _statement_"**. The `:` is pronounced **"is a,"** and the `=` is pronounced **"defined as."** + +- `main` **is a** function that takes no arguments and returns nothing, and is **defined as** the code body shown. + +- `words` **is a** `std::vector`, initially **defined as** holding `#!cpp "Alice"` and `#!cpp "Bob"`. + +- `hello` **is a** function that takes a `std::string_view` it will only read from and that returns nothing, and is **defined as** code that prints the string to `cout` the usual C++ way. + +All grammar is context-free. In particular, we (the human reading the code, and the compiler) never need to do name lookup to figure out how to parse something — there is never a ["vexing parse"](https://en.wikipedia.org/wiki/Most_vexing_parse) in Cpp2. For details, see [Design note: Unambiguous parsing](https://github.com/hsutter/cppfront/wiki/Design-note%3A-Unambiguous-parsing). 
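To make the "vexing parse" remark concrete, here is a small sketch in today's Cpp1 syntax. It is not taken from the cppfront sources: `Widget` and both helper functions are hypothetical names invented for illustration. In Cpp1, `Widget w();` is parsed as a function declaration rather than an object definition, and the type traits below confirm that.

```cpp
#include <string>
#include <type_traits>

// Hypothetical type used only for this illustration.
struct Widget {
    std::string name = "widget";
};

// In Cpp1, empty parentheses after a name declare a function,
// so `w` below is a function returning Widget, not a Widget object.
inline bool empty_parens_declare_a_function() {
    Widget w();  // the "vexing parse": a function declaration
    return std::is_function<decltype(w)>::value;
}

// Braces avoid the ambiguity, so `w` below really is an object.
inline bool braces_declare_an_object() {
    Widget w{};
    return std::is_same<decltype(w), Widget>::value && w.name == "widget";
}
```

In Cpp2 the equivalent object declaration would be spelled `w: Widget = ();`, which follows the _name_ `:` _kind_ `=` _statement_ pattern and so cannot be read as a function declaration.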
+ +**Simple, safe, and efficient by default.** Cpp2 has contracts (tracking draft C++26 contracts), `inspect` pattern matching, string interpolation, automatic move from last use, and more. + +- Declaring `words` uses **"CTAD"** (C++'s normal [class template argument deduction](https://en.cppreference.com/w/cpp/language/class_template_argument_deduction)) to deduce the type of elements in the `vector`. + +- Calling `#!cpp words[0]` and `#!cpp words[1]` is **bounds-checked by default**. From Cpp2 code, ordinary `std::vector` subscript accesses are safely bounds-checked by default without requiring any upgrade to your favorite standard library, and that's true for any similar subscript of something whose size can be queried using `std::size()` and `std::ssize()`, and for which `std::begin()` returns a random access iterator, including any in-house integer-indexed container types you already have that can easily provide `std::size()` and `std::ssize()` if they don't already. + +- `hello` uses **string interpolation** to be able to write `#!cpp "Hello, (msg)$!\n"` instead of `#!cpp "Hello, " << msg << "!\n"`. String interpolation also supports [standard C++ format specifications](https://en.cppreference.com/w/cpp/utility/format/spec), so you won't need iostream manipulators. + +**Simplicity through generality + defaults.** A major way that Cpp2 delivers simplicity is by providing just one powerful general syntax for a given thing (e.g., one function definition syntax), but designing it so you can omit the parts you're not currently using (e.g., where you're happy with the defaults). We're already using some of those defaults above: + +- We can omit writing the `#!cpp -> void` return type for a function that doesn't return anything, as both of these functions do. + +- We can omit the `{` `}` around single-statement function bodies, as `hello` does. + +- We can omit the `in` on the `msg` parameter. 
Cpp2 has just six ways to pass parameters: The most common ones are `in` for read-only (the default so we can omit it, as `hello` does), and `inout` for read-write. The others are `copy`, `out`, `move`, and `forward`. + +For details, see [Design note: Defaults are one way to say the same thing](https://github.com/hsutter/cppfront/wiki/Design-note%3A-Defaults-are-one-way-to-say-the-same-thing). + +**Order-independent by default.** Did you notice that `main` called `hello`, which was defined later? Cpp2 code is order-independent by default — there are no forward declarations. + +**Seamless compatibility and interop.** We can just use `std::cout` and `#!cpp std::operator<<` and `std::string_view` directly as usual. Cpp2 code works with any C++ code or library, including the standard library, using ordinary direct calls without any wrapping/marshaling/thunking. + +**C++ standard library is always available.** We didn't need `#!cpp #include <iostream>` or `#!cpp import std;`. The full C++ standard library is always available by default if your source file contains only syntax-2 code and you compile using cppfront's `-p` (short for `-pure-cpp2`), or if you use `-im` (short for `-import-std`). Cppfront is regularly updated to be compatible with C++23 and the latest draft C++26 library additions as soon as the ISO C++ committee votes them into the C++26 working draft, so as soon as you have a C++ implementation that has a new standard (or bleeding-edge draft standard!) C++ library feature, you'll be able to fully use it in Cpp2 code. 
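The call-site bounds checking mentioned earlier in this section can be approximated in plain C++. The sketch below is an illustration only and is not cppfront's implementation: `checked_subscript` and `greet` are hypothetical helpers, and throwing `std::out_of_range` is an assumed failure policy chosen to keep the example small. It relies only on the container supporting `std::size()` and integer subscripting, which is the property the text above describes.

```cpp
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper for illustration; cppfront's real check is different.
// Verifies the index against std::size() before subscripting.
template <typename Container>
decltype(auto) checked_subscript(Container& c, std::ptrdiff_t index) {
    const auto size = static_cast<std::ptrdiff_t>(std::size(c));
    if (index < 0 || index >= size) {
        throw std::out_of_range(
            "index " + std::to_string(index) +
            " is outside [0," + std::to_string(size - 1) + "]");
    }
    return c[static_cast<std::size_t>(index)];
}

// Builds the greeting for words[index], checking the index first.
inline std::string greet(const std::vector<std::string>& words,
                         std::ptrdiff_t index) {
    return "Hello, " + checked_subscript(words, index) + "!";
}
```

With `words` holding `"Alice"` and `"Bob"`, `greet(words, 2)` throws instead of reading past the end of the vector. Because the check lives at the call site rather than inside the container, no change to `std::vector` itself is needed.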
+ + +## Building `hello.cpp2` + +Now use `cppfront` to compile `hello.cpp2` to a standard C++ file `hello.cpp`: + +``` bash title="Call cppfront to produce hello.cpp" +cppfront hello.cpp2 -p +``` + +The result is an ordinary C++ file that looks like this: [^clean-cpp1] + +``` cpp title="hello.cpp — created by cppfront" linenums="1" +#define CPP2_IMPORT_STD Yes + +#include "cpp2util.h" + +auto main() -> int; + +auto hello(cpp2::in<std::string_view> msg) -> void; +auto main() -> int{ + std::vector words {"Alice", "Bob"}; + hello(CPP2_ASSERT_IN_BOUNDS_LITERAL(words, 0)); + hello(CPP2_ASSERT_IN_BOUNDS_LITERAL(std::move(words), 1)); +} + +auto hello(cpp2::in<std::string_view> msg) -> void { + std::cout << ("Hello, " + cpp2::to_string(msg) + "!\n"); } +``` + +Here we can see more of how Cpp2 makes its features work. + +**How: Consistent context-free syntax.** + +- **All compiled lines are portable C++20 code** we can build with pretty much any C++ compiler released circa 2019 or later. Cpp2's context-free syntax converts directly to today's Cpp1 syntax. We can write and read our C++ types/functions/objects in simpler Cpp2 syntax without wrestling with context sensitivity and ambiguity, and they're all still just ordinary types/functions/objects. + +**How: Simple, safe, and efficient by default.** + +- **Line 9: CTAD** just works, because it turns into ordinary C++ code which already supports CTAD. +- **Lines 10-11: Automatic bounds checking** is added to `#!cpp words[0]` and `#!cpp words[1]` nonintrusively at the call site by default. Because it's nonintrusive, it works seamlessly with all existing container types that are `std::size` and `std::ssize`-aware, when you use them from safe Cpp2 code. +- **Line 11: Automatic move from last use** ensures the last use of `words` will automatically avoid a copy if it's being passed to something that's optimized for rvalues. +- **Line 15: String interpolation** performs the string capture of `msg`'s current value via `cpp2::to_string`. 
That uses `std::to_string` when available, and it also works for additional types (such as `#!cpp bool`, to print `#!cpp false` and `#!cpp true` instead of `0` and `1`, without having to remember to use `std::boolalpha`). + +**How: Simplicity through generality + defaults.** + +- **Line 7: `in` parameters** are implemented using `#!cpp cpp2::in<>`, which is smart enough to pass by `#!cpp const` value when that's safe and appropriate, otherwise by `#!cpp const&`, so you don't have to choose the right one by hand. + +**How: Order-independent by default.** + +- **Lines 5 and 7: Order independence** happens because cppfront generates all the type and function forward declarations for you, so you don't have to. That's why `main` can just call `hello`: both are forward-declared, so they can both see each other. + +**How: Seamless compatibility and interop.** + +- **Lines 9-11 and 15: Ordinary direct calls** to existing C++ code, so there's never a need for wrapping/marshaling/thunking. + +**How: C++ standard library always available.** + +- **Lines 1 and 3: `std::` is available** because cppfront was invoked with `-p`, which implies either `-im` (short for `-import-std`) or `-in` (short for `-include-std`, for compilers that don't support modules yet). The generated code tells `cpp2util.h` to `#!cpp import` the entire standard library as a module (or do the equivalent via headers if modules are not available). + + +## Building and running `hello.cpp` with any recent C++ compiler + +Finally, just build `hello.cpp` using your favorite C++20 compiler, where `CPPFRONT_INCLUDE` is the path to `/cppfront/include`: + + + +``` title="MSVC (Visual Studio 2019 version 16.11 or higher)" +> cl hello.cpp -std:c++20 -EHsc -I CPPFRONT_INCLUDE +> hello.exe +Hello, world! +``` + +``` bash title="GCC (GCC 10 or higher)" +$ g++ hello.cpp -std=c++20 -ICPPFRONT_INCLUDE -o hello +$ ./hello +Hello, world! 
+``` + +``` bash title="Clang (Clang 12 or higher)" +$ clang++ hello.cpp -std=c++20 -ICPPFRONT_INCLUDE -o hello +$ ./hello +Hello, world! +``` + + +### ➤ Next: [Adding cppfront to your existing C++ project](integration.md) + + +[^clean-cpp1]: For presentation purposes, this documentation generally shows the `.cpp` as generated when using cppfront's `-c` (short for `-clean-cpp1`), which suppresses extra information cppfront normally emits in the `.cpp` to light up C++ tools (e.g., to let IDEs integrate cppfront error message output, debuggers step to the right lines in Cpp2 source code, and so forth). In normal use, you won't need or even want `-c`. diff --git a/docs/welcome/integration.md b/docs/welcome/integration.md new file mode 100644 index 0000000000..16c1151555 --- /dev/null +++ b/docs/welcome/integration.md @@ -0,0 +1,45 @@ + +# Adding cppfront in your IDE / build system + +To start trying out Cpp2 syntax in any existing C++ project, just add a build step to translate the Cpp2 to Cpp1 syntax: + +- Copy the `.cpp` file to the same name with a `.cpp2` extension. +- Add the `.cpp2` file to the project, and ensure the `.cpp` is in C++20 mode. +- Tell the IDE to build that file using a custom build tool to invoke cppfront. + +That's it... The result Just Works with every C++20 or higher compiler and all existing C++ tools (debuggers, build systems, sanitizers, etc.). The IDE build should just pick up the `.cpp2` file source locations for any error messages, and the debugger should just step through the `.cpp2` file. + +The following uses Visual Studio as an example, but others have done the same in Xcode, Qt Creator, CMake, and other IDEs and build systems. + +## 1. Add the `.cpp2` file to the project, and ensure the `.cpp` is in C++20 mode + +For Visual Studio: In the Solution Explorer, right-click on Source Files and pick Add to add the file to the project. + +

+ +Also in Solution Explorer, right-click on the `.cpp` file Properties and make sure it's in C++20 (or C++latest) mode. + +

+ + +## 2. Tell the project system to build that file using a custom build tool to invoke cppfront, and add `cppfront/include` to the include path + +For Visual Studio: In Solution Explorer, right-click on the `.cpp2` file and select Properties, and add the custom build tool. Remember to also tell it that the custom build tool produces the `.cpp` file, so that it knows about the build dependency: + +

+ +Finally, put the `/cppfront/include` directory on your `INCLUDE` path. In Solution Explorer, right-click the app and select Properties, and add it to the VC++ Directories > Include Directories: + +

+ + +## That's it: Error message outputs, debuggers, visualizers, and other tools should just work + +That's enough to enable builds, and the IDE just picks up the rest from the `.cpp` file that cppfront generated: + +- **The cppfront error messages in `filename(line, col)` format.** Most C++ IDEs recognize these, and usually automatically merge any diagnostic output wherever compiler error output normally appears. If your IDE prefers `filename:line:col`, just use the cppfront `-format-colon-errors` command line option. + +- **The `#line` directives cppfront emits in the generated `.cpp` file.** Most C++ debuggers recognize these and will know to step through the `.cpp2` file. Note that `#line` emission is on by default, but if you choose `-c` (short for `-clean-cpp1`) these will be suppressed and then the debugger will step through the generated C++ code instead. If your debugger can't find the files, you may need to use `-line-paths` to have absolute paths instead of relative paths in the `#line` directives. + +- **Regardless of syntax, every type/function/object/namespace/etc. is still just an ordinary C++ type/function/object/namespace/etc.** Most C++ debugger visualizers will just work and show beautiful output for the types your program uses, including to use any in-the-box visualizers for all the `std::` types (since those are used directly as usual) and any custom visualizers you may have already written for your own types or popular library types. + diff --git a/mkdocs.yml b/mkdocs.yml new file mode 100644 index 0000000000..cac325c0c6 --- /dev/null +++ b/mkdocs.yml @@ -0,0 +1,91 @@ +# To view the documentation locally on your machine, use the following steps +# +# Clone the GitHub cppfront repo locally, then on the command line: +# +# cd /github/cppfront +# python -m venv venv +# source venv/bin/activate +# pip install mkdocs-material +# mkdocs new . 
+# mkdocs serve +# +# The last command should eventually print something like +# Serving on http://127.0.0.1:8000/ +# and you can open that URL in a local browser. If you are locally editing +# the documentation, leave the server process running and the browser +# pages will auto-reload as you save edits. +# +site_name: "Cpp2 and cppfront — An experimental 'C++ syntax 2' and its first compiler" +theme: + name: material + features: + - navigation.sections + - navigation.expand + - navigation.instant + - navigation.instant.preview + - navigation.top + - search.suggest + - search.highlight + - content.tabs.link + - content.code.annotate + - content.code.annotation + - content.code.copy + - content.footnote.tooltips + language: en + palette: + - scheme: default + toggle: + icon: material/toggle-switch-off-outline + name: Switch to dark mode + primary: teal + accent: purple + - scheme: slate + toggle: + icon: material/toggle-switch + name: Switch to light mode + primary: teal + accent: lime + +extra_css: + - stylesheets/extra.css + +nav: + - 'Welcome & getting started': + - 'Overview: What are Cpp2 and cppfront? 
How do I get and build cppfront?': index.md + - 'Hello, world!': welcome/hello-world.md + - 'Adding cppfront to your existing C++ project': welcome/integration.md + - 'Cpp2 reference': + - 'Common concepts': cpp2/common.md + - 'Expressions': cpp2/expressions.md + - 'Declarations and aliases': cpp2/declarations.md + - 'Objects, initialization, and memory': cpp2/objects.md + - 'Functions, branches, and loops': cpp2/functions.md + - 'Contracts': cpp2/contracts.md + - 'Types and inheritance': cpp2/types.md + - 'Metafunctions and reflection': cpp2/metafunctions.md + - 'Namespaces': cpp2/namespaces.md + # - 'Modules': cpp2/modules.md + - 'Cppfront reference': + - 'Using Cpp1 (today''s syntax) and Cpp2 in the same source file': cppfront/mixed.md + - 'Cppfront command line options': cppfront/options.md + +markdown_extensions: + - pymdownx.highlight: + anchor_linenums: true + - pymdownx.inlinehilite + - pymdownx.snippets + - admonition + - pymdownx.arithmatex: + generic: true + - footnotes + - pymdownx.details + - pymdownx.superfences: + custom_fences: + - name: mermaid + class: mermaid + format: !!python/name:pymdownx.superfences.fence_code_format + - pymdownx.mark + - attr_list + +copyright: | + © Herb Suttercppfront license