This repository has been archived by the owner on Jan 23, 2021. It is now read-only.

Merge pull request #114 from Nu-SCPTheme/FTML-49
Substitute include blocks before preprocessing
Ammon Smith authored Jan 19, 2021
2 parents 9b7923c + d6cf067 commit b1e881c
Showing 128 changed files with 4,753 additions and 430 deletions.
690 changes: 677 additions & 13 deletions Cargo.lock

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -34,7 +34,9 @@ strum = "0.20"
strum_macros = "0.20"
tinyvec = "1"
unicase = "2"
void = "1"
wikidot-normalize = "0.6"

[dev-dependencies]
maplit = "1"
sloggers = "1"
54 changes: 43 additions & 11 deletions README.md
@@ -57,15 +57,27 @@ While the expanded form of the initialism is never explicitly stated, it is clea
name similarity to HTML.

### Usage
There are three exported functions, which correspond to each of the main steps in the wikitext process.
There are several main exported functions, each corresponding to one of the main steps in the wikitext process.

First is `preprocess`, which will perform Wikidot's various minor text substitutions.
First is `include`, which replaces all `[[include]]` blocks with the content of the pages they reference. This returns the substituted wikitext as a new string, as well as the names of all the pages that were used. It requires an object that implements `Includer`, which handles the process of retrieving pages and generating missing page messages.

Second is `tokenize`, which takes the input string and returns a wrapper type. This can be `.into()`-ed into a `Vec<ExtractedToken<'t>>` should you want the token extractions it produced. This is used as the input for `parse`.
Second is `preprocess`, which will perform Wikidot's various minor text substitutions.

Third is `tokenize`, which takes the input string and returns a wrapper type. This can be `.into()`-ed into a `Vec<ExtractedToken<'t>>` should you want the token extractions it produced. This is used as the input for `parse`.

Then, borrowing a slice of said tokens, `parse` consumes them and produces a `SyntaxTree` representing the full structure of the parsed wikitext.

Finally, you `render` the syntax tree with whatever `Render` instance you need at the time. Most likely you want `HtmlRender`.
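
To make the first step above more concrete, here is a minimal, std-only sketch of the idea behind include substitution. It is not ftml's implementation: it assumes a simplified `[[include page-name]]` syntax with no variables, and `PageStore` and `substitute_includes` are hypothetical names standing in for a real `Includer`. Like `include`, it returns the substituted text together with the names of the pages that were referenced.

```rust
use std::collections::HashMap;

/// Toy stand-in for a page source: maps page names to their wikitext.
/// (Hypothetical; ftml's real `Includer` trait is richer than this.)
pub type PageStore = HashMap<String, String>;

/// Replace every `[[include page-name]]` block with that page's content.
/// Returns the substituted text and the referenced page names, in order
/// of appearance. Missing pages are replaced by a placeholder message.
pub fn substitute_includes(input: &str, pages: &PageStore) -> (String, Vec<String>) {
    const OPEN: &str = "[[include ";
    const CLOSE: &str = "]]";

    let mut output = String::with_capacity(input.len());
    let mut used = Vec::new();
    let mut rest = input;

    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        let end = match after_open.find(CLOSE) {
            Some(end) => end,
            // Unterminated block: leave the remaining text untouched.
            None => break,
        };

        let name = after_open[..end].trim();
        // Copy the text before the block, then the replacement.
        output.push_str(&rest[..start]);
        match pages.get(name) {
            Some(content) => output.push_str(content),
            None => output.push_str(&format!("No page {} exists!", name)),
        }
        used.push(name.to_string());
        rest = &after_open[end + CLOSE.len()..];
    }

    output.push_str(rest);
    (output, used)
}
```

Included content is not rescanned in this sketch, so nested includes are left as plain text.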

```rust
fn include<'t, I, E>(
log: &slog::Logger,
input: &'t str,
includer: I,
) -> Result<(String, Vec<PageRef<'t>>), E>
where
I: Includer<'t, Error = E>;

fn preprocess(
log: &slog::Logger,
text: &mut String,
@@ -96,8 +108,23 @@ store the results in a `struct`.
// journalled messages are outputted to.
let log = slog::Logger::root(/* drain */);

// Get an `Includer`.
//
// See trait documentation for what this requires, but
// essentially it is some abstract handle that gets the
// contents of a page to be included.
//
// Two sample includers you could try are `NullIncluder`
// and `DebugIncluder`.
let includer = MyIncluderImpl::new();

// Get our source text
let mut input = "**some** test <<string?>>";

// Substitute page inclusions
let (mut text, included_pages) = ftml::include(&log, input, includer);

// Perform preprocess substitutions
let mut text = str!("**some** test <<string?>>");
ftml::preprocess(&log, &mut text);

// Generate token from input text
@@ -121,13 +148,13 @@ let (tree, warnings) = result.into();
See [`Serialization.md`](Serialization.md).

### Server
If you wish to build the `ftml-server` subcrate, use the following:
If you wish to build the `ftml-http` subcrate, use the following:
Note that it was primarily designed for UNIX-like platforms, but with
some minor changes could be modified to work on Windows.

```sh
$ cargo build -p ftml-server --release
$ cargo run -p ftml-server
$ cargo build -p ftml-http --release
$ cargo run -p ftml-http
```

This will produce an HTTP server which a REST client can query to perform ftml operations.
@@ -142,12 +169,12 @@ Its usage message (produced by adding `-- --help` to the above `cargo run` invoc
is reproduced below:

```
ftml ftml-server v0.3.1 [8a42fccd]
ftml ftml-http v0.3.1 [8a42fccd]
Wikijump Team
REST server to parse and render Wikidot text.
USAGE:
ftml-server [FLAGS] [OPTIONS]
ftml-http [FLAGS] [OPTIONS]
FLAGS:
-h, --help Prints help information.
@@ -169,6 +196,11 @@ $ curl \
-X POST \
-H 'Content-Type: application/json' \
--compressed \
--data '{"text": "<your input here>"}' \
http://localhost:3865/parse
--data '
{
"text": "<your input here>",
"callback-url": "http://localhost:8000/included-pages",
"missing-include-template": "No page {{ page }} {% if site %}on site {{ site }} {% endif %}exists!"
}' \
http://localhost:3865/parse
```
181 changes: 164 additions & 17 deletions ServerRoutes.md
@@ -1,26 +1,173 @@
[<< Return to the README](README.md)

## ftml-server Routes

Note that input text are really simple JSON objects in the following form:
```json
{
"text": "<your input string>"
}
```
## ftml-http Routes

The currently available API routes in the server are:

| Method | Route | Input | Output | Description |
|--------|-------|-------|--------|-------------|
| Any | `/ping` | None | `String` | See if you're able to connect to the server. |
| Any | `/version` | None | `String` | Outputs what version of ftml is being run. |
| `POST` | `/preprocess` | Text | `String` | Runs the preprocessor on the given input string. |
| `POST` | `/tokenize` | Text | `Vec<ExtractedToken>` | Runs the tokenizer on the input string and returns the extracted tokens. |
| `POST` | `/tokenize/only` | Text | `Vec<ExtractedToken>` | Same as above, but the preprocessor is not run first. |
| `POST` | `/parse` | Text | `ParseOutcome<SyntaxTree>` | Runs the parser on the input string and returns the abstract syntax tree. |
| `POST` | `/parse/only` | Text | `ParseOutcome<SyntaxTree>` | Same as above, but the preprocessor is not run first. |
| `POST` | `/render/html` | Text | `ParseOutcome<HtmlOutput>` | Performs the full rendering process, from preprocessing, tokenization, parsing, and then rendering. |
| `POST` | `/render/html/only` | Text | `ParseOutcome<HtmlOutput>` | Same as above, but the preprocessor is not run first. |
| `POST` | `/render/debug` | Text | `ParseOutcome<String>` | Performs rendering, as above, but uses `ftml::DebugRender`. |
| `POST` | `/render/debug/only` | Text | `ParseOutcome<String>` | Same as above, but the preprocessor is not run first. |
| `POST` | `/include` | `TextInput` | `Response<IncludeOutput>` | Substitutes all include blocks in the input string. |
| `POST` | `/preprocess` | `TextInput` | `Response<PreprocessOutput>` | Runs the preprocessor on the given input string. |
| `POST` | `/tokenize` | `TextInput` | `Response<TokenizeOutput>` | Runs the tokenizer on the input string and returns the extracted tokens. |
| `POST` | `/parse` | `TextInput` | `Response<ParseOutput>` | Runs the parser on the input string and returns the abstract syntax tree. |
| `POST` | `/render/html` | `TextInput` | `Response<HtmlRenderOutput>` | Performs the full rendering process: inclusion, preprocessing, tokenization, parsing, and then rendering. |
| `POST` | `/render/debug` | `TextInput` | `Response<DebugRenderOutput>` | Performs rendering, as above, but uses `ftml::render::DebugRender`. |

Where the structures expected are the following:

**`TextInput`** is the object describing a text input, and the specifications necessary to perform include substitution.

* `text` is the input wikitext to be processed.
* `callback-url` is the URL that ftml-http will POST to with an `IncludeRequest`, to get the pages to be included.
* `missing-include-template` is the template used to generate the "missing include" string if the `callback-url` does not return a result for a page. This allows jinja2-like syntax, backed by the crate [`tera`](https://crates.io/crates/tera). Three context variables are provided: `site` (nullable), `page`, `path`.

```json
{
"text": "**My** //wikitext//!",
"callback-url": "http://localhost:8000/includes",
"missing-include-template": "Page '{{ page }}' is missing!"
}
```

**`IncludeRequest`** is the object sent to request that a foreign server return the contents of each listed page. It consists of a single field, `includes`, containing a list of `IncludeRef`s.

**`IncludeRef`** is the object describing one particular page to be included. It has two fields: `page-ref`, which specifies the page being included, and `variables`, a map of all the variables to substitute.

Page references are composed of an optional site, then the page name. For instance `component:blah` would be on-site (`null`), and `:scp-wiki:main` would be off-site (site would be `scp-wiki`).

```json
{
"page-ref": {
"site": null,
"page": "page-name"
},
"variables": {
"each": "variable",
"here!": ""
}
}
```
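
The page reference convention described above (an optional leading `:site:` followed by the page name) can be sketched as a small parser. This is only an illustration under that reading of the rule: `PageRef` here is a simplified, owned stand-in rather than ftml's actual type, and `parse_page_ref` is a hypothetical helper.

```rust
/// Simplified stand-in for a page reference: an optional site plus a page name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PageRef {
    pub site: Option<String>,
    pub page: String,
}

/// Parse a reference such as `component:blah` or `:scp-wiki:main`.
/// Returns `None` if the site or the page name is empty.
pub fn parse_page_ref(s: &str) -> Option<PageRef> {
    match s.strip_prefix(':') {
        // Leading colon means off-site: `:site:page`
        Some(rest) => {
            let (site, page) = rest.split_once(':')?;
            if site.is_empty() || page.is_empty() {
                return None;
            }
            Some(PageRef {
                site: Some(site.to_string()),
                page: page.to_string(),
            })
        }
        // No leading colon means on-site; any colon is part of the page name.
        None => {
            if s.is_empty() {
                return None;
            }
            Some(PageRef {
                site: None,
                page: s.to_string(),
            })
        }
    }
}
```

With this sketch, `component:blah` yields no site and the page `component:blah`, while `:scp-wiki:main` yields the site `scp-wiki` and the page `main`.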

**`IncludeResponse`** is the object expected from the foreign server returning contents of the fetched pages. It is a list of `FetchedPage` objects.

**`FetchedPage`** is the object describing one retrieved page. The first field, `page-ref`, describes which page it has content for. The second, `content`, has the content to substitute in, or `null` if the page was not found.

The returned pages must exactly match the requested pages in both count and order: the entry at each index of the response must carry the same `PageRef` as the entry at that index of the request.

```json
{
"page-ref": {
"site": null,
"page": "theme:black-highlighter-theme"
},
"content": "[[module CSS]]\n...\n[[/module]]"
}
```
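
A callback server (or a client checking its output) could verify the count and order requirement with a check along these lines. This is a sketch only: the types are simplified stand-ins for the JSON objects described here, and `check_response` is a hypothetical helper, not part of ftml-http.

```rust
/// Simplified stand-in for a page reference.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PageRef {
    pub site: Option<String>,
    pub page: String,
}

/// Simplified stand-in for one fetched page; `content` is `None`
/// when the page was not found.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FetchedPage {
    pub page_ref: PageRef,
    pub content: Option<String>,
}

/// Check that the response has one entry per requested page,
/// with the same page reference at every index.
pub fn check_response(requested: &[PageRef], fetched: &[FetchedPage]) -> Result<(), String> {
    if requested.len() != fetched.len() {
        return Err(format!(
            "expected {} pages, got {}",
            requested.len(),
            fetched.len()
        ));
    }

    for (index, (want, got)) in requested.iter().zip(fetched).enumerate() {
        if *want != got.page_ref {
            return Err(format!("page mismatch at index {}", index));
        }
    }

    Ok(())
}
```

A missing page still occupies its slot in the response (with no content), so the index correspondence holds even when some pages are not found.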

**`Response`** is a wrapper to describe the state of an API call. It takes one of two forms:

Success:
```json
{
"result": [ "data", "here" ]
}
```

Error:
```json
{
"error": "Error message here"
}
```

This is a generic type, so what is inside depends on what is being wrapped. Errors will always be strings.
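
One way to model such a wrapper in Rust is a generic two-variant enum. The sketch below is an assumption about the shape, not the definition used by ftml-http; the variant names simply mirror the `result` and `error` keys, and serialization is left out to keep it std-only.

```rust
use std::fmt::Display;

/// Illustrative wrapper mirroring the two JSON forms:
/// `{"result": ...}` on success and `{"error": "..."}` on failure.
#[derive(Debug, Clone, PartialEq)]
pub enum Response<T> {
    Result(T),
    Error(String),
}

/// Any `Result` whose error can be displayed converts into a `Response`,
/// with the error flattened to a string.
impl<T, E: Display> From<Result<T, E>> for Response<T> {
    fn from(result: Result<T, E>) -> Self {
        match result {
            Ok(value) => Response::Result(value),
            Err(error) => Response::Error(error.to_string()),
        }
    }
}
```

Flattening every error to a string matches the statement that errors are always strings, whatever type is being wrapped.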

**`IncludeOutput`** is the object describing the result of a successful `/include` call.

The `text` field contains the wikitext after substitution. The `pages-included` field is a list of `PageRef` instances, describing the pages that were included in the text.

```json
{
"text": "Wikidot text following replacement",
"pages-included": [
{
"site": null,
"page": "some-page"
}
]
}
```

**`PreprocessOutput`** is the object describing the result of a successful `/preprocess` call.

It has the same structure as `IncludeOutput`, except that `text` also reflects the preprocessing step applied after inclusion.

```json
{
"text": "My //wikitext// here!",
"pages-included": []
}
```

**`ParseOutput`** is the object describing the result of a successful `/parse` call.

It extends `PreprocessOutput`, with two added fields.

* `syntax-tree` is the JSON representation of the abstract syntax tree (AST) created by the parser, a recursively nested series of elements which describe its structure.
* `warnings` is a list of warning objects, describing parsing issues.

```json
{
"text": "My //wikitext// here!",
"pages-included": [],
"syntax-tree": {
"elements": [],
"styles": []
},
"warnings": []
}
```

**`HtmlRenderOutput`** is the object describing the result of a successful `/render/html` call.

It extends `ParseOutput`, with three new fields.

* `html` is the generated HTML body, corresponding to the wikitext.
* `style` is the full collected stylesheet, as specified through CSS in the wikitext.
* `meta` is the list of HTML meta tags to add to the HTML document's `<head>`.

```json
{
"text": "My //wikitext// here!",
"pages-included": [],
"syntax-tree": {
"elements": [],
"styles": []
},
"warnings": [],
"html": "<strong>test</strong>",
"style": "a { display: none }",
"meta": []
}
```

**`DebugRenderOutput`** is the object describing the result of a successful `/render/debug` call.

It extends `ParseOutput`, with one new field.

* `output` is the string output of the `DebugRender` implementation.

```json
{
"text": "My //wikitext// here!",
"pages-included": [],
"syntax-tree": {
"elements": [],
"styles": []
},
"warnings": [],
"output": "< Debug! >"
}
```
3 changes: 3 additions & 0 deletions ftml-http/Cargo.toml
@@ -17,12 +17,15 @@ clap = "2"
ftml = { path = ".." }
hostname = "0.3"
lazy_static = "1"
reqwest = { version = "0.11", features = ["blocking", "json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
slog = "2.7"
slog-bunyan = "2"
sloggers = "1"
str-macro = "0.1"
tera = "1.6"
thiserror = "1"
tokio = { version = "0.2", features = ["macros"] }
users = "0.11"
warp = { version = "0.2", features = ["compression"] }
6 changes: 6 additions & 0 deletions ftml-http/build.rs
@@ -3,9 +3,15 @@ extern crate built;
use std::env;

fn main() {
// Generate build information
if let Ok(profile) = env::var("PROFILE") {
println!("cargo:rustc-cfg=build={:?}", &profile);
}

built::write_built_file().expect("Failed to compile build information!");

// Set openssl library
if env::var("CARGO_CFG_UNIX").is_ok() {
println!("cargo:rustc-flags=-L /usr/lib/openssl-1.0");
}
}