- Introduction
- Setting Up ELK Stack
- Loading Test Data
- Viewing the Data
- Kibana Query Language (KQL)
- Analyzing Data
- Integrating ELK with Filebeat
- Python Logging Levels
- Hands-on: Logging in Python
- Open Questions
- Logging Basics
- What to Log
- How to Log
- Nginx Logs
- Fluentd
- A Taste of Splunk
This guide provides a hands-on approach to setting up and using the ELK (Elasticsearch, Logstash, Kibana) stack for logging. It covers everything from cloning the repository to configuring and analyzing logs.
First, clone the ELK Docker repository:
git clone https://github.com/deviantony/docker-elk
Quickly read through the repository and identify logstash.conf.
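For orientation, a minimal Logstash pipeline of the kind logstash.conf defines might look roughly like the sketch below. This is illustrative only, not the repository's exact file; check logstash.conf itself for the real input ports and credentials.

```
input {
  tcp {
    port => 5000
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}
```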
Navigate to the docker-elk directory and start the ELK stack:
cd docker-elk
docker-compose up setup
docker-compose up
Expect to see something like this:
...
Creating docker-elk_elasticsearch_1 ... done
Creating docker-elk_kibana_1 ... done
Creating docker-elk_logstash_1 ... done
Attaching to docker-elk_elasticsearch_1, docker-elk_logstash_1, docker-elk_kibana_1
logstash_1 | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
elasticsearch_1 | Created elasticsearch keystore in /usr/share/elasticsearch/config/elasticsearch.keystore
kibana_1 | {"type":"log","@timestamp":"2020-07-20T11:03:35Z","tags":["warning","plugins-discovery"],"pid":7,"message":"Expect plugin \"id\" in camelCase, but found: apm_oss"}
The stack is pre-configured with the following privileged bootstrap user:
user: elastic
password: changeme
Execute the following commands to change the passwords for the stack users:
docker-compose exec elasticsearch bin/elasticsearch-reset-password --batch --user elastic
docker-compose exec elasticsearch bin/elasticsearch-reset-password --batch --user logstash_internal
docker-compose exec elasticsearch bin/elasticsearch-reset-password --batch --user kibana_system
Sample output:
❯ docker-compose exec elasticsearch bin/elasticsearch-reset-password --batch --user elastic
Password for the [elastic] user successfully reset.
New value: Wb7Yv6niUXEayOtNMCl*
❯ docker-compose exec elasticsearch bin/elasticsearch-reset-password --batch --user logstash_internal
Password for the [logstash_internal] user successfully reset.
New value: 3cHnndylWE0Dm3eHNx6N
❯ docker-compose exec elasticsearch bin/elasticsearch-reset-password --batch --user kibana_system
Password for the [kibana_system] user successfully reset.
New value: DzA=zWV8dxvTqsm=MVNS
Update the .env file in docker-elk with the new passwords:
ELASTIC_VERSION=8.2.3
## Passwords for stack users
ELASTIC_PASSWORD='Wb7Yv6niUXEayOtNMCl*'
LOGSTASH_INTERNAL_PASSWORD='3cHnndylWE0Dm3eHNx6N'
KIBANA_SYSTEM_PASSWORD='DzA=zWV8dxvTqsm=MVNS'
Restart the service:
docker-compose down
docker-compose up
Ensure the containers are running and check their logs:
docker ps
Example output:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7054d9fafeb9 docker-elk_logstash "/usr/local/bin/dock…" 7 minutes ago Up 7 minutes 0.0.0.0:5000->5000/tcp, 0.0.0.0:5044->5044/tcp, 0.0.0.0:9600->9600/tcp, 0.0.0.0:5000->5000/udp docker-elk_logstash_1
d506ff044997 docker-elk_kibana "/bin/tini -- /usr/l…" 7 minutes ago Up 7 minutes 0.0.0.0:5601->5601/tcp docker-elk_kibana_1
2dba2aed0b46 docker-elk_elasticsearch "/bin/tini -- /usr/l…" 7 minutes ago Up 7 minutes 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp docker-elk_elasticsearch_1
The full pipeline consists of Filebeat (installed on your webapp servers), Logstash, Elasticsearch, and Kibana.
- Filebeat: Lightweight shipper for forwarding and centralizing log data.
- Logstash: Server-side data processing pipeline.
- Elasticsearch: Distributed search and analytics engine.
- Kibana: Frontend application for search and data visualization.
- Log in to localhost:5601.
- Go to Home (http://localhost:5601/app/kibana#/home).
- Add Sample Data -> Sample web logs -> Add data.
Send log entries via TCP:
# Using GNU netcat (CentOS, Fedora, MacOS Homebrew, ...)
$ cat /path/to/logfile.log | nc -c localhost 5000
# Using BSD netcat (Debian, Ubuntu, MacOS system, ...)
$ cat /path/to/logfile.log | nc -q0 localhost 5000
Create an index pattern via the Kibana API:
$ curl -XPOST -D- 'http://localhost:5601/api/saved_objects/index-pattern' \
-H 'Content-Type: application/json' \
-H 'kbn-version: 7.8.0' \
-u elastic:<your generated elastic password> \
-d '{"attributes":{"title":"logstash-*","timeFieldName":"@timestamp"}}'
(Make sure the kbn-version header matches your Kibana version.) Or via the UI -> "Connect to your Elasticsearch index" -> enter the prefix "logstash-*" and select @timestamp.
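As an aside, the netcat step above can also be done from Python, and exercised without a running stack. In this sketch a tiny local server stands in for Logstash's TCP input; to target the real Logstash, connect to localhost:5000 instead of the ephemeral port used here.

```python
import socket
import threading

# Lines received by the stand-in server end up here.
received = []

def collect(server):
    # Accept one connection and read until the sender closes it.
    conn, _ = server.accept()
    with conn:
        data = b""
        while chunk := conn.recv(4096):
            data += chunk
    received.extend(data.decode().splitlines())

server = socket.socket()
server.bind(("127.0.0.1", 0))  # ephemeral port; Logstash listens on 5000
server.listen(1)
worker = threading.Thread(target=collect, args=(server,))
worker.start()

# The equivalent of `cat logfile.log | nc localhost 5000`:
with socket.create_connection(server.getsockname()) as client:
    client.sendall(b"127.0.0.1 - - GET /test 200\nanother log line\n")

worker.join(timeout=5)
server.close()
```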
Navigate to http://localhost:5601/app/kibana#/discover.
Explore the sample web logs or the logstash-* indices.
Kibana Query Language (KQL) filters:
request: /kibana
response: 5*
host: *elastic* and not response:200
referer: http\://facebook.com*
Escape special characters with \.
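KQL itself runs inside Kibana, but the wildcard semantics above can be sketched in plain Python with fnmatch to build intuition. The documents and the kql_like helper are hypothetical, shaped loosely like the sample web logs:

```python
from fnmatch import fnmatch

# Hypothetical documents shaped like Kibana's sample web logs.
logs = [
    {"request": "/kibana", "response": "200", "host": "www.elastic.co"},
    {"request": "/login", "response": "503", "host": "www.elastic.co"},
    {"request": "/kibana", "response": "404", "host": "cdn.example.com"},
]

def kql_like(doc, field, pattern):
    # Rough analogue of a KQL `field: pattern` wildcard match.
    return fnmatch(str(doc.get(field, "")), pattern)

# response: 5*  -> server errors
errors = [d for d in logs if kql_like(d, "response", "5*")]

# host: *elastic* and not response: 200
filtered = [d for d in logs
            if kql_like(d, "host", "*elastic*")
            and not kql_like(d, "response", "200")]
```

Real KQL also handles quoted phrases, ranges, and nested fields; this only mimics the wildcard matching.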
- Go to Dashboards: http://localhost:5601/app/kibana#/dashboard.
- Create New -> Lens.
Task: Understand the health of the webapp by analyzing request statuses.
Questions:
- Where do most requests come from over the last 7 days? (Location and IP wise)
- What is the percentage of error logs?
- What are the top requests?
- What extension is used the most?
- What browser do most customers use?
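In Kibana these questions map to Lens aggregations, but the underlying arithmetic is simple. A small Python sketch over hypothetical records shows the kind of computation behind "percentage of error logs" and "top requests":

```python
from collections import Counter

# Hypothetical log records, standing in for the Kibana sample web logs.
records = [
    {"request": "/", "response": 200, "extension": ""},
    {"request": "/kibana", "response": 200, "extension": "css"},
    {"request": "/kibana", "response": 503, "extension": "css"},
    {"request": "/download", "response": 404, "extension": "zip"},
]

# Percentage of responses in the 5xx range.
error_pct = 100 * sum(1 for r in records if r["response"] >= 500) / len(records)

# Most frequently requested path.
top_requests = Counter(r["request"] for r in records).most_common(1)
```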
In the docker-elk folder, edit logstash.conf to add:
index => "%{[@metadata][beat]}-%{[@metadata][version]}"
Example:
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    user => "elastic"
    password => "changeme"
    index => "%{[@metadata][beat]}-%{[@metadata][version]}"
  }
}
docker-compose build
docker-compose up
Check the running containers:
docker-compose ps
Build and run Filebeat:
cd filebeat
docker build -t fcc/filebeat .
docker run --rm -v '/var/lib/docker/containers:/usr/share/dockerlogs/data:ro' -v '/var/run/docker.sock:/var/run/docker.sock' --name filebeat fcc/filebeat:latest
- Log in to Kibana.
- Go to Stack Management -> Kibana -> Data View -> Create.
- Type filebeat-* and select @timestamp.
- Click "Create data view".
- Research how to set up an alert for the logging system, e.g., alert when the number of 5xx > 10.
- Investigate setting up dashboards with Terraform.
Six log levels in Python:
- NOTSET = 0
- DEBUG = 10
- INFO = 20
- WARNING = 30 (WARN is an alias)
- ERROR = 40
- CRITICAL = 50
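A logger only emits records at or above its configured level, which is how these numeric values are used in practice. A minimal demonstration, logging to an in-memory stream so the output is easy to inspect:

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger = logging.getLogger("level_demo")
logger.addHandler(handler)
logger.propagate = False

logger.setLevel(logging.WARNING)  # numeric value 30
logger.debug("ignored: 10 < 30")
logger.info("ignored: 20 < 30")
logger.warning("emitted: 30 >= 30")
logger.error("emitted: 40 >= 30")

output = stream.getvalue()
```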
Create my_logger.py:
import logging
import sys
from logging.handlers import TimedRotatingFileHandler
FORMATTER = logging.Formatter("%(asctime)s — %(name)s — %(levelname)s — %(message)s")
LOG_FILE = "my_app.log"
def get_console_handler():
    console_handler = logging.StreamHandler(sys.stdout)
    console_handler.setFormatter(FORMATTER)
    return console_handler

def get_file_handler():
    file_handler = TimedRotatingFileHandler(LOG_FILE, when='midnight')
    file_handler.setFormatter(FORMATTER)
    return file_handler

def get_logger(logger_name):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)  # better to have too much log than not enough
    logger.addHandler(get_console_handler())
    logger.addHandler(get_file_handler())
    logger.propagate = False
    return logger
In flask_app.py, add:
from my_logger import get_logger
log = get_logger(__name__)
@app.route('/test/')
def test():
    log.info('hitting /test/ endpoint')
    return 'test'

@app.errorhandler(500)
def handle_500(error):
    log.error(f'something went wrong {error}')
    return str(error), 500
Run locally:
python flask_app.py
Or via Docker Compose.
Ensure the my_app.log file captures stack traces by wrapping functions with try-except.
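Inside the except block, logger.exception (or log.error with exc_info=True) is what appends the stack trace to the record. A self-contained sketch, writing to a string buffer instead of my_app.log so the behavior is easy to inspect:

```python
import io
import logging

# Minimal stand-in for the get_logger() helper above.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
logger = logging.getLogger("traceback_demo")
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    # logger.exception logs at ERROR level and appends the stack trace.
    logger.exception("division failed")

log_text = stream.getvalue()
```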
Modify my_logger.py:
class OneLineExceptionFormatter(logging.Formatter):
    def formatException(self, exc_info):
        result = super().formatException(exc_info)
        return repr(result)

    def format(self, record):
        result = super().format(record)
        if record.exc_text:
            result = result.replace("\n", "")
        return result
Update get_logger to use OneLineExceptionFormatter:
def get_logger(logger_name):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    one_line_formatter = OneLineExceptionFormatter("%(asctime)s — %(name)s — %(levelname)s — %(message)s")
    # The handler helpers above attach the default FORMATTER, so override it here.
    for handler in (get_console_handler(), get_file_handler()):
        handler.setFormatter(one_line_formatter)
        logger.addHandler(handler)
    logger.propagate = False
    return logger
Rebuild and rerun to see one-line logs.
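To see the effect in isolation, the formatter can be exercised against an in-memory stream. This is a self-contained copy of the class above plus a throwaway logger; the whole multi-line traceback ends up on a single line:

```python
import io
import logging

class OneLineExceptionFormatter(logging.Formatter):
    def formatException(self, exc_info):
        # repr() escapes the newlines inside the traceback text.
        return repr(super().formatException(exc_info))

    def format(self, record):
        result = super().format(record)
        if record.exc_text:
            result = result.replace("\n", "")
        return result

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(OneLineExceptionFormatter("%(levelname)s %(message)s"))
logger = logging.getLogger("oneline_demo")
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.propagate = False

try:
    1 / 0
except ZeroDivisionError:
    logger.error("division failed", exc_info=True)

# Strip only the handler's trailing newline; the record itself has none.
record_text = stream.getvalue().rstrip("\n")
```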
Run your app or container and generate traffic to see logs in Kibana.
- What is logging?
- What log types do we know?
- If you are a developer, what do you use the logs for?
- Which tool do you use for logging?
- How do you tell from the logs whether your application is buggy?
- How frequently do you collect the logs?
- If you have millions of log lines, how do you find the useful ones?
- Visibility is key for your application.
Example request/access logs:
- TIMESTAMP
- ACCESS_METHOD
- STATUS_CODE
- RESPONSE_TIME
Example system logs:
- OS image version update
- Network/interface update
- Kernel update
Example application lifecycle logs:
- Healthcheck
- Startup/shutdown logs
Example resource saturation logs:
- Exhausted threads/connections
- Exhausted CPU/disk/memory log
Example security and user-behavior logs:
- Hacker input
- Common User Behavior
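The request/access fields listed above are easiest to search later if each event is emitted as one structured line. A small sketch, emitting JSON; the field names here are illustrative, not a standard schema:

```python
import json
from datetime import datetime, timezone

def access_log_line(method, path, status, response_time_ms):
    """Render one access-log event with the request fields listed above."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "access_method": method,
        "path": path,
        "status_code": status,
        "response_time_ms": response_time_ms,
    }
    return json.dumps(event)

line = access_log_line("GET", "/health", 200, 12.5)
```

One JSON object per line keeps the logs grep-able and lets Logstash or Fluentd parse them without custom grok patterns.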
Modern logging systems:
- Logging Aggregator/Forwarder
- Search
- Visualization
Create logs folder and files:
sudo mkdir /usr/share/nginx/logs
sudo touch /usr/share/nginx/logs/error.log
sudo touch /usr/share/nginx/logs/access.log
Reload nginx:
sudo systemctl reload nginx
Check the logs:
less /usr/share/nginx/logs/error.log
less /usr/share/nginx/logs/access.log
Install Fluentd:
sudo apt-get install ruby-full
gem install fluentd
fluentd -s conf
fluentd -c conf/fluent.conf &
Install the Fluentd UI:
gem install fluentd-ui
fluentd-ui setup
fluentd-ui start --daemonize
Install the multiline parser and configure Fluentd for the Nginx logs:
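The fluent.conf copied below might, assuming the Nginx log paths created earlier, look roughly like this sketch (paths, tags, and the stdout output are illustrative, not the tutorial's exact file):

```
<source>
  @type tail
  path /usr/share/nginx/logs/access.log
  pos_file /tmp/nginx-access.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match nginx.**>
  @type stdout
</match>
```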
fluent-gem install fluent-plugin-multi-format-parser
cp fluent.conf /conf/fluent.conf
sudo pkill -f fluentd
fluentd -c conf/fluent.conf -vv &
Fill in the required details on the Splunk download page and download the .deb package.
If there is a GUI, click install; otherwise, run
sudo dpkg -i splunk_xxx.deb
The first time you run Splunk, it will prompt you to set up the admin account.
cd /opt/splunk/bin/
sudo ./splunk start --accept-license
Use the username and password you set in the previous step to log in.
Let us download some sample data from https://data.montgomerycountymd.gov/, e.g. the County Spending dataset: https://data.montgomerycountymd.gov/Finance-Tax-Property/County-Spending/vpf9-6irq
Add Data -> upload from my computer -> select *.csv from ~/Download
Source type: this helps Splunk learn what kind of data you have. Options include:
- You specify a directory as a data source.
- You specify a network input as a data source.
- You specify a data source that has been forwarded from another Splunk instance.
Let us select "Automatic" in this short example.
Host: When the Splunk platform indexes data, each event receives a "host" value. The host value should be the name of the machine from which the event originates. The type of input you choose determines the available configuration options.
Index: The Splunk platform stores incoming data as events in the selected index. Consider using a "sandbox" index as a destination if you have problems determining a source type for your data. A sandbox index lets you troubleshoot your configuration without impacting production indexes. You can always change this setting later.
Let us use the default.
Now choose Review.
The error is that Splunk cannot recognize the time format.
Go to /opt/splunk/etc/system/local/props.conf and add the following lines (under the stanza for your source type):
TIMESTAMP_FIELDS = Invoice Date
TIME_FORMAT=%m/%d/%Y
Then restart Splunk:
cd /opt/splunk/bin/
./splunk restart
Now there should be no error. Let us submit the data and start searching!
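TIME_FORMAT uses strptime-style conversion specifiers, so %m/%d/%Y can be sanity-checked in Python against an "Invoice Date"-style value (the sample date below is hypothetical):

```python
from datetime import datetime

# Verify the %m/%d/%Y format string parses a month/day/year date.
parsed = datetime.strptime("07/20/2020", "%m/%d/%Y")
```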