content/datasources/kafka.md
+31 -1
@@ -11,7 +11,32 @@ Fluentd can setup to collect messages from Kafka. Applications include:
11
11
1. Sending Kafka messages into HDFS for analysis
12
12
2. Sending Kafka messages into Elasticsearch for analysis
13
13
14
-
## Setup
14
+
There are two options for this purpose: `in_kafka` or `kafka-fluentd-consumer`.
15
+
16
+
## Setup: fluent-plugin-kafka
17
+
18
+
1. Install the [Kafka input plugin](https://github.com/htgc/fluent-plugin-kafka) by running the following command:
19
+
20
+
```
$ fluent-gem install fluent-plugin-kafka
```
23
+
24
+
2. Open your Fluentd configuration file and add the following lines:
25
+
26
+
```
<source>
  @type kafka
  host <broker host>
  port <broker port: default=9092>
  topics <listening topics(separate with comma',')>
  format <input text type (text|json|ltsv|msgpack)>
  message_key <key (Optional, for text format only, default is message)>
</source>
```
36
+
37
+
With the above setup, Fluentd consumes Kafka messages via the `in_kafka` plugin.
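
As a concrete illustration of the template above, a filled-in configuration might look like the following minimal sketch. The broker address `localhost:9092`, the topic names `app_logs` and `access_logs`, and the `<match **>` block that prints events to stdout are assumptions chosen for illustration; they are not values taken from the plugin documentation.

```
# Hypothetical values: replace host, port, and topics with your own.
<source>
  @type kafka
  host localhost
  port 9092
  topics app_logs,access_logs
  format json
</source>

# Assumed for illustration: print every consumed event to stdout
# so you can check that messages are arriving.
<match **>
  @type stdout
</match>
```

Since `format` is `json` here, `message_key` is omitted; according to the template it applies to the text format only.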
38
+
39
+
## Setup: kafka-fluentd-consumer
15
40
16
41
1. Download the latest [kafka-fluentd-consumer jar](https://github.com/treasure-data/kafka-fluentd-consumer/releases).
17
42
@@ -29,3 +54,8 @@ Fluentd can setup to collect messages from Kafka. Applications include:
29
54
```
30
55
31
56
With the above setup, Fluentd consumes Kafka messages from the topics specified in `fluentd-consumer.properties` via the `in_exec` plugin.
57
+
58
+
### Note
59
+
60
+
For simplicity, you can use the `in_kafka` plugin to retrieve Kafka messages.
61
+
If you expect heavy Kafka traffic in production, we recommend using `kafka-fluentd-consumer` instead of `in_kafka`, because `in_kafka` has been reported to show high CPU usage in an environment handling around 1000 req/sec. For more detail, please refer to [the issue](https://github.com/htgc/fluent-plugin-kafka/issues/16).