Billion laughs attack protection #4

@spookylukey

As noted in our security docs, we currently have no protection against a billion laughs attack by FTL authors, either at compile-time or at run-time.

Note that the attack vector here is a malicious FTL author, which may be unlikely but should be considered for some usage scenarios. We are not talking about run-time issues where the attacker controls only the substitution and not the FTL message.

Compile-time

Example

-term1 = lol
-term2 = {-term1}{-term1}{-term1}{-term1}{-term1}{-term1}{-term1}{-term1}{-term1}{-term1}
-term3 = {-term2}{-term2}{-term2}{-term2}{-term2}{-term2}{-term2}{-term2}{-term2}{-term2}
# etc
message = {-term9}

Due to our current strategy of inlining all terms and simplifying, this will attempt to generate a function like:

def message(args, errors):
    return "lollollollollollollollollollollollollollollollollollol..."

and you'll use up a lot of memory at compile time.

We could protect against this with a combination of some kind of depth counter and reference counter in the compiler, bailing out when we hit the limits. In real-world FTL, there is very rarely a need for lots of references to other items, or for deeply nested references.
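
A minimal sketch of such a check, assuming a simplified model in which each term is a list of literal strings and references to other terms. The names `Ref`, `LimitExceeded`, `expanded_size`, `MAX_DEPTH` and `MAX_EXPANDED_SIZE` are hypothetical and not part of the compiler, and the limit values are arbitrary. It computes the size a term would have once inlined, without building the string, and stops as soon as a limit is hit:

```python
from dataclasses import dataclass

MAX_DEPTH = 20               # hypothetical limit on reference nesting
MAX_EXPANDED_SIZE = 100_000  # hypothetical limit on inlined characters


@dataclass(frozen=True)
class Ref:
    """A reference to another term, e.g. {-term1}."""
    name: str


class LimitExceeded(Exception):
    """Raised internally; a real compiler would record an error instead."""


def expanded_size(name, terms, cache=None, depth=0):
    """Return the length `name` would have once fully inlined.

    `terms` maps a term name to a list of str literals and Ref objects.
    Sizes are memoized, so each term is measured only once.
    """
    if cache is None:
        cache = {}
    if depth > MAX_DEPTH:
        # Also catches cyclic references, which never reach the cache.
        raise LimitExceeded(f"reference depth limit exceeded at {name}")
    if name in cache:
        return cache[name]
    total = 0
    for part in terms[name]:
        if isinstance(part, Ref):
            total += expanded_size(part.name, terms, cache, depth + 1)
        else:
            total += len(part)
        if total > MAX_EXPANDED_SIZE:
            raise LimitExceeded(f"expanded size limit exceeded at {name}")
    cache[name] = total
    return total


# Build the chain from the example: term1 = lol, termN = 10 x term(N-1).
terms = {"term1": ["lol"]}
for i in range(2, 10):
    terms[f"term{i}"] = [Ref(f"term{i - 1}")] * 10
```

With these arbitrary limits, `expanded_size("term5", terms)` returns 30000, while `expanded_size("term9", terms)` raises `LimitExceeded` long before any large string is allocated.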

Run-time

We don't inline messages at the call site, so the equivalent with messages would produce a run-time issue:

msg1 = lol
msg2 = {msg1}{msg1}{msg1}{msg1}{msg1}{msg1}{msg1}{msg1}{msg1}{msg1}
msg3 = {msg2}{msg2}{msg2}{msg2}{msg2}{msg2}{msg2}{msg2}{msg2}{msg2}
# etc.

Which compiles to something like:

def msg1(args, errors):
    return "lol"

def msg2(args, errors):
    return f'{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}{msg1(args, errors)}'

# etc

Attempting to use the last function in the chain would produce a very large string at runtime.
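
To put numbers on this: each level multiplies the output length by ten, so message N in the chain yields 3 × 10^(N-1) characters. The sketch below hand-writes the first three levels in the shape of the compiled output above (the loop is shorthand for the ten separate calls) and projects the size of a deeper chain without building it. `projected_length` is a hypothetical helper, not something the compiler provides.

```python
def msg1(args, errors):
    return "lol"


def msg2(args, errors):
    # Shorthand for ten separate calls to msg1, as in the compiled output.
    return "".join(msg1(args, errors) for _ in range(10))


def msg3(args, errors):
    return "".join(msg2(args, errors) for _ in range(10))


def projected_length(level, base_len=3, fan_out=10):
    """Length of the message at `level` in the chain, without building it."""
    return base_len * fan_out ** (level - 1)


errors = []
print(len(msg3({}, errors)))  # 300
print(projected_length(9))    # 300000000
```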

We could address this in two ways:

  1. At run-time - the compiled code for each message could check call depth in some way (e.g. via a passed-in current_depth parameter). This would be a performance hit on every message and, relatively speaking, a very large one for the common case.

  2. At compile time, by:

We may need to make some of these limits configurable.

As per normal Fluent rules, we should not bail out with exceptions in these cases, but produce message functions that:

  • have truncated output
  • emit errors at compile-time/run-time as appropriate (normally both)
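
As a rough illustration of that behaviour, combined with the run-time depth check from option 1, the sketch below truncates output and appends to `errors` instead of raising. Everything here is an assumption for the sake of the example: `call_limited`, `FluentLimitError`, the extra `current_depth` parameter and both limits are hypothetical, and the limits are kept tiny so the effect is visible.

```python
MAX_CALL_DEPTH = 3      # hypothetical limit, kept small for the demo
MAX_OUTPUT_LENGTH = 50  # hypothetical limit, kept small for the demo


class FluentLimitError(Exception):
    """Hypothetical error type recorded when a limit is hit."""


def call_limited(name, callee, args, errors, current_depth, times=10):
    """Call `callee` `times` times, enforcing both limits without raising."""
    if current_depth >= MAX_CALL_DEPTH:
        errors.append(FluentLimitError(f"{name}: call depth limit reached"))
        return ""
    out = ""
    for _ in range(times):
        out += callee(args, errors, current_depth + 1)
        if len(out) > MAX_OUTPUT_LENGTH:
            # Truncate and report, rather than bailing out with an exception.
            errors.append(FluentLimitError(f"{name}: output truncated"))
            return out[:MAX_OUTPUT_LENGTH]
    return out


def msg1(args, errors, current_depth=0):
    return "lol"


def msg2(args, errors, current_depth=0):
    return call_limited("msg2", msg1, args, errors, current_depth)


def msg3(args, errors, current_depth=0):
    return call_limited("msg3", msg2, args, errors, current_depth)


errors = []
result = msg3({}, errors)
print(len(result), len(errors))  # 50 1
```

The caller still gets a usable string plus an entry in `errors`, which matches how other Fluent failures are reported. The extra argument and length checks on every call are exactly the overhead that makes option 1 unattractive for the common case.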

Labels: enhancement (New feature or request)