The goal of this page is to raise awareness and understanding of the log injection vulnerability. Watch the following one-minute videos to see different log injection scenarios and exploits. Details and explanations are below.
The demonstrations are built with Java, Spring Boot, and Log4j, but the vulnerability is not tied to these specific technologies.
- Ep.1 "Newlines" (0:53)
- Ep.2 "ANSI Sequences" (0:58)
- Ep.3 "JavaScript" (1:11)
log-injection-ep01.mp4
The attack in Ep. 1 only works with pattern layout logging. For an attack that also works with structured layout (e.g., JSON layout), see Ep. 3.
log-injection-ep02.mp4
log-injection-ep03.mp4
The video demonstrates the attack using a pattern layout of the log, but the attack works just as well with a structured layout (e.g., JSON).
The video shows a real XSS vulnerability in an older version of Kibana. Both Kibana and Splunk have seen a stream of XSS vulnerabilities over the past years. XSS can be exploited toward a variety of malicious ends, e.g., password stealing with fake login screens.
A log injection is a generic vulnerability that occurs when an application receives untrusted data (often data in an HTTP request) and writes this data to a log without first making sure that it is safe for whatever will later read the log. The sequence of events is typically as follows:
- an attacker submits malicious input to an application
- the application writes that input to a log
- the log is (later) processed by one or more log-processing systems (which may have vulnerabilities)
- and/or the log is reviewed by a human
The malicious log content can mislead the human reviewer and/or compromise log processing. The goals of the attack are to undermine forensics and to bypass logging-related security mechanisms. Furthermore, exploiting a vulnerability in a log processor can be used to further escalate damage.
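The sequence above can be illustrated with a minimal sketch of the newline scenario. The `render` method is a hypothetical stand-in for a pattern layout (in the spirit of `%d %p %m%n`), not a real logging API, and the timestamps and messages are invented for illustration.

```java
final class ForgedEntryDemo {

    // Hypothetical stand-in for a pattern layout: one line per entry,
    // "<timestamp> <level> <message>", with the message written as-is.
    static String render(String timestamp, String level, String message) {
        return timestamp + " " + level + " " + message + "\n";
    }

    public static void main(String[] args) {
        // Attacker-controlled value: a harmless-looking prefix, a newline,
        // and then text shaped like a genuine log entry.
        String username = "guest\n2024-01-01T12:00:01Z INFO Login succeeded for user admin";

        // A single logging call with the untrusted value in the message.
        String log = render("2024-01-01T12:00:00Z", "WARN",
                "Login failed for user " + username);
        System.out.print(log);

        // The single call produced two lines that both look like entries.
        System.out.println("lines written: " + log.split("\n").length);
    }
}
```

A reviewer reading this output line by line has no way to tell that the second line was supplied by the attacker rather than written by the application.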
Log injection can be used to place various kinds of malicious input into the logs. Here is an incomplete list of common examples of unsafe log content and their impact.
character class | potential impact | example (plain text) |
---|---|---|
newlines | create fake log entries | |
ANSI sequences | hide log entries in a terminal | ^[[2K^[[1A |
JavaScript | exploit potential XSS in log dashboards | <img src=1 onerror="javascript:alert('pwned')"> |
Unicode homographs | undermine forensics (via strings that differ but look the same) | admin vs аdmin |
lookup expressions | exploit potential vulnerabilities in the logging library | ${jndi:ldap:...} |
Each of these scenarios may or may not apply to you. For example, if your application writes logs in a JSON format, then you typically already have protection against newline injection and the creation of fake log entries. At the same time, you are typically still not protected against JavaScript or Unicode homograph injection.
In general, there is no ultimate definition of what is safe or unsafe to log; it depends on how your logs are processed and on your risk appetite. Unfortunately, experience shows that in practice logs are often "promiscuous": they can move around and be processed in a variety of systems, sometimes in surprising ways and a long time after their creation. It is thus better to implement more protections rather than fewer. Luckily, it is possible to do so.
Unfortunately, there is currently no "push-button" mitigation for log injection. I will explain different approaches and trade-offs below. The explanation will feature Java, but the concepts apply to other technologies as well.
I strongly advise against trying to sanitize the input (i.e., trying to delete potentially dangerous characters from it with s.replace("\n", "") and the like). There are several reasons, but the main one is that such sanitization is not reversible. For forensic purposes, we always want to preserve the original input while still neutralizing the danger it poses.
The best mitigation is thus encoding. Various encodings are possible; URL-encoding (also known as percent-encoding) is a good choice. It is reversible, easy to implement, and protects against a very wide range of dangerous inputs.
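The difference between sanitizing and encoding can be seen in a small sketch. It assumes Java 10 or later (for the `Charset` overloads of `URLEncoder.encode` and `URLDecoder.decode`); the class and method names are chosen for illustration only.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class SanitizeVsEncode {

    // Sanitizing: deletes newline characters, so information is lost.
    static String sanitize(String s) {
        return s.replace("\n", "");
    }

    // Encoding: replaces unsafe characters with percent escapes.
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Decoding: recovers the original input from the encoded form.
    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String benign = "ab";
        String malicious = "a\nb";

        // Two different inputs collapse to the same sanitized string,
        // so the log no longer shows that an attack was attempted.
        System.out.println(sanitize(benign).equals(sanitize(malicious)));

        // The encoded forms stay distinct ...
        System.out.println(encode(benign).equals(encode(malicious)));

        // ... and the original input can be recovered exactly.
        System.out.println(decode(encode(malicious)).equals(malicious));
    }
}
```

After sanitizing, an investigator cannot distinguish the attack from ordinary traffic, whereas the encoded log entry can be decoded back to the exact bytes the attacker sent.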
This variant is conceptually simple and flexible. It also protects against any vulnerabilities in the logging library itself (famous example: CVE-2021-44228 Log4Shell).
Define a helper method as follows:
    public static String encode(String s) {
        return java.net.URLEncoder.encode(s,
                java.nio.charset.StandardCharsets.UTF_8);
    }
and utilize this encoder when logging untrusted data, e.g.:
    @GetMapping("/endpoint")
    public String endpoint(@RequestParam("data") String s) {
        Logger logger = LoggerFactory.getLogger(Controller.class);
        logger.info("Received data item {}", encode(s));
        return "OK";
    }
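To get a feeling for what ends up in the log, the following sketch runs the same kind of encoder over inputs modelled on the examples in the table above. The class name and the concrete input strings are chosen for illustration; the sketch assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class EncodedExamples {

    // Same helper as above, repeated so the sketch is self-contained.
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] inputs = {
            "line1\nline2",                                      // newline
            "\u001b[2K\u001b[1A",                                // ANSI sequences (ESC is U+001B)
            "<img src=1 onerror=\"javascript:alert('pwned')\">", // JavaScript
            "\u0430dmin",                                        // Cyrillic U+0430 followed by Latin "dmin"
            "${jndi:ldap:...}"                                   // lookup expression (elided as in the table)
        };
        for (String input : inputs) {
            System.out.println(encode(input));
        }
    }
}
```

The newline becomes `%0A`, the ANSI sequences become `%1B%5B2K%1B%5B1A`, and the homograph becomes `%D0%B0dmin`, which no longer looks like `admin`. Note that `URLEncoder` implements form encoding, so a space is written as `+` rather than `%20`.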
Note that untrusted data must also be encoded when storing it in a ThreadContext or MDC (these are examples of a user-constructed key-value map that a logging library will include in the log entry). The same recommendation applies when constructing exceptions.
Don't: throw new IllegalArgumentException(s);
Do: throw new IllegalArgumentException(encode(s));
Consistent usage of encoding can be enforced by static analysis (or even grep).
Configuring a structured log layout (e.g., a JSON layout) already protects against some log injection scenarios, as the logging library will typically encode the characters that have a special meaning in JSON. An attacker will therefore not be able to change the structure of the log record and create fake log entries. Injection of JavaScript and misdirection with homographs are typically still possible, though. With some logging libraries, users can supply additional encoders, which would help to mitigate these attack scenarios as well.
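The following sketch illustrates why a JSON layout stops fake entries but not JavaScript or homographs. The `escapeJson` and `toJsonLine` methods are hand-written, simplified stand-ins for what a JSON layout does; they are not the API of any logging library, and real layouts may escape more characters.

```java
final class JsonEscapeDemo {

    // Minimal JSON string escaping: quotes, backslashes, and control characters.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters, e.g. ESC (U+001B).
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Hypothetical one-line JSON log record with a fixed level.
    static String toJsonLine(String message) {
        return "{\"level\":\"INFO\",\"message\":\"" + escapeJson(message) + "\"}";
    }

    public static void main(String[] args) {
        // The newline is escaped, so the record stays on one line.
        System.out.println(toJsonLine("guest\nforged entry"));
        // The markup passes through unchanged and may reach a dashboard.
        System.out.println(toJsonLine("<img src=1 onerror=alert(1)>"));
        // The Cyrillic homograph passes through unchanged as well.
        System.out.println(toJsonLine("\u0430dmin"));
    }
}
```

Applying URL-encoding to the untrusted value before it reaches the layout closes the remaining gaps, at the cost of log messages that need decoding before they are read.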