Billion laughs attack protection

As noted in our [security docs](https://fluent-compiler.readthedocs.io/en/latest/security.html) we currently have no protection against a [billion laughs attack](https://en.wikipedia.org/wiki/Billion_laughs_attack) by FTL authors, either at compile-time or run-time.

Note the attack vector here is a malicious FTL author, which might be unlikely but should be considered for some usage scenarios. We are not talking about runtime issues where the attacker controls only the substitution, not the FTL message.

# Compile-time 

Example:
```ftl
-term1 = lol
-term2 = {-term1}{-term1}{-term1}{-term1}{-term1}{-term1}{-term1}{-term1}{-term1}{-term1}
-term3 = {-term2}{-term2}{-term2}{-term2}{-term2}{-term2}{-term2}{-term2}{-term2}{-term2}
# etc
message = {-term9}
```

Due to our current strategy of inlining all terms and simplifying, this will attempt to generate a function like:
```python
def message(args, errors):
    return "lollollollollollollollollollollollollollollollollollol..."
```

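and you'll use up a lot of memory at compile time: if the elided terms follow the same pattern, `-term9` expands to 10^8 copies of `lol`, a string of roughly 300 MB.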

We could protect against this with a combination of a depth counter and a reference counter in the compiler, bailing out when either limit is hit. Real-world FTL very rarely needs lots of references to other items, or deeply nested references.
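
A minimal sketch of the depth counter plus reference counter idea, not taken from the fluent-compiler code base. The names `Ref`, `Budget`, `inline`, `MAX_DEPTH` and `MAX_REFERENCES` are all hypothetical, and a message or term is modelled simply as a list of literal strings and term references:

```python
from dataclasses import dataclass, field

MAX_DEPTH = 4         # hypothetical limit on nesting of term references
MAX_REFERENCES = 100  # hypothetical limit on references expanded per message


@dataclass
class Ref:
    """A reference to a term, e.g. {-term1} (hypothetical model)."""
    name: str


@dataclass
class Budget:
    """Counters shared across one message's expansion."""
    references: int = 0
    errors: list = field(default_factory=list)


def inline(parts, terms, budget, depth=0):
    """Inline term references in `parts`, stopping once a limit is hit."""
    out = []
    for part in parts:
        if budget.errors:
            break  # a limit was hit somewhere: stop expanding everywhere
        if isinstance(part, str):
            out.append(part)
        elif depth >= MAX_DEPTH:
            budget.errors.append(f"Term nesting too deep at -{part.name}")
        elif budget.references >= MAX_REFERENCES:
            budget.errors.append(f"Too many term references at -{part.name}")
        else:
            budget.references += 1
            out.append(inline(terms[part.name], terms, budget, depth + 1))
    return "".join(out)


# Build the -term1 .. -term9 chain from the example above.
terms = {"term1": ["lol"]}
for n in range(2, 10):
    terms[f"term{n}"] = [Ref(f"term{n - 1}")] * 10

budget = Budget()
result = inline([Ref("term9")], terms, budget)
```

With these limits the expansion of `{-term9}` stops after a handful of steps and records one error, instead of building the full string. The limits themselves are placeholders and would need tuning against real FTL files.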

# Run-time

We don't inline messages at the call site, so the equivalent with messages would produce a run-time issue:
```ftl
msg1 = lol
msg2 = {msg1}{msg1}{msg1}{msg1}{msg1}{msg1}{msg1}{msg1}{msg1}{msg1}
msg3 = {msg2}{msg2}{msg2}{msg2}{msg2}{msg2}{msg2}{msg2}{msg2}{msg2}
# etc.
```

This compiles to something like:
```python
def msg1(args, errors):
    return "lol"

def msg2(args, errors):
    return f'{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}'

# etc
```
Attempting to use the last function in the chain would produce a very large string at runtime.
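
To make the growth concrete, here is a small sketch of the arithmetic. The helper `expanded_length` is hypothetical, and it assumes every message in the chain references the previous one exactly ten times, as `msg2` and `msg3` do above:

```python
def expanded_length(level, base="lol", fanout=10):
    """Length of the string returned by msg<level> in the chain above.

    msg1 returns `base`; each later message repeats the previous
    message's output `fanout` times.
    """
    return len(base) * fanout ** (level - 1)


for level in (1, 2, 3):
    print(f"msg{level}: {expanded_length(level)} characters")
```

Output size grows by a factor of ten per level, while the FTL source grows only linearly, which is what makes the attack cheap for the author.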

We could address this in two ways:
1. At run-time: the compiled code for each message could check call depth in some way (e.g. via a passed-in `current_depth` parameter). This would be a performance hit on every message and, relatively speaking, a very large one for the common case.

2. At compile time, by:

   * Noting that we already disallow cycles, i.e. recursion or mutual recursion.
   * We can therefore produce a total ordering of the functions we need to generate, in terms of their dependencies on other functions. (See elm-fluent, which shares a lot of code with python-compiler in the copy-paste sense, and does this ordering of functions. Some of this could be copied over easily: https://github.com/elm-fluent/elm-fluent/blob/777477bea84b475d0489032fa923b71f32f15c88/src/elm_fluent/compiler.py#L212 and https://github.com/elm-fluent/elm-fluent/blob/777477bea84b475d0489032fa923b71f32f15c88/src/elm_fluent/compiler.py#L580)
   * We go to the bottom (functions that call no other functions) and label those functions with `calls_others_depth = 0`.
   * We then go up the chain, setting `calls_others_depth = 1 + max(function.calls_others_depth for function in this_function.functions_that_i_call)` on each function.
   * We can then impose some kind of low limit on this depth (e.g. 4).
   * In addition, we apply a low limit on the number of substitutions allowed per message (e.g. 10).
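
The compile-time option could be sketched roughly as follows. This is an illustration only: it uses the standard library's `graphlib.TopologicalSorter` rather than the elm-fluent ordering code, and `label_depths`, `check_limits` and the two limit constants are hypothetical names. As a simplification it counts only references to other messages as substitutions, and it adds 1 per level so that the depth actually increases up the chain:

```python
from graphlib import TopologicalSorter  # Python 3.9+

MAX_CALL_DEPTH = 4      # hypothetical limit, matching the example value above
MAX_SUBSTITUTIONS = 10  # hypothetical per-message limit


def label_depths(calls):
    """Map each function name to its calls_others_depth.

    `calls` maps a function name to the list of functions it calls,
    one entry per call site (so repeats are expected).
    """
    graph = {name: set(callees) for name, callees in calls.items()}
    depths = {}
    # static_order() yields callees before callers. It raises CycleError on
    # cycles, which we assume the compiler has already rejected.
    for name in TopologicalSorter(graph).static_order():
        callees = calls.get(name, [])
        depths[name] = (1 + max(depths[c] for c in callees)) if callees else 0
    return depths


def check_limits(calls):
    """Return a list of error strings for functions that exceed a limit."""
    depths = label_depths(calls)
    errors = []
    for name, callees in calls.items():
        if depths[name] > MAX_CALL_DEPTH:
            errors.append(f"{name}: call depth {depths[name]} exceeds limit")
        if len(callees) > MAX_SUBSTITUTIONS:
            errors.append(f"{name}: {len(callees)} substitutions exceeds limit")
    return errors


# The msg1 .. msg6 chain: each message references the previous one 10 times.
calls = {"msg1": []}
for n in range(2, 7):
    calls[f"msg{n}"] = [f"msg{n - 1}"] * 10

depths = label_depths(calls)
errors = check_limits(calls)
```

Because the check runs once per compiled bundle, it adds no cost to calling a message at run-time, which is the main advantage over option 1.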


We may need to make some of these limits configurable.

As per normal fluent rules, we should not bail out with exceptions in these cases, but produce message functions that:
* have truncated output
* emit errors at compile-time and/or run-time as appropriate (normally both)
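
One way this behaviour might look, as a sketch under stated assumptions rather than a description of the real compiler: `compile_message`, `LimitExceededError` and `MAX_SUBSTITUTIONS` are hypothetical, and a message is modelled as a list of literal strings and callables taking `(args, errors)`:

```python
MAX_SUBSTITUTIONS = 10  # hypothetical limit


class LimitExceededError(Exception):
    """Hypothetical error type, used only in this sketch."""


def compile_message(name, parts):
    """Return (message_function, compile_errors) for a list of parts.

    Parts beyond the substitution limit are dropped rather than
    raising, so the resulting function still returns usable output.
    """
    compile_errors = []
    kept = []
    substitutions = 0
    truncated = False
    for part in parts:
        if callable(part):
            if substitutions == MAX_SUBSTITUTIONS:
                truncated = True
                break
            substitutions += 1
        kept.append(part)
    if truncated:
        compile_errors.append(
            LimitExceededError(f"{name}: too many substitutions")
        )

    def message(args, errors):
        if truncated:
            # Report the problem again at run-time, alongside the output.
            errors.append(LimitExceededError(f"{name}: output truncated"))
        return "".join(p(args, errors) if callable(p) else p for p in kept)

    return message, compile_errors


def msg1(args, errors):
    return "lol"


# Twelve references to msg1: two more than the limit allows.
msg2, compile_errors = compile_message("msg2", [msg1] * 12)
run_errors = []
output = msg2({}, run_errors)
```

Here `output` contains only the first ten substitutions, and both `compile_errors` and `run_errors` hold one error each, so the caller sees truncated text plus a report at both stages.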
