A flexible, CLI-based log analysis tool built in Python. Designed for developers, sysadmins, and automation engineers who need to parse, filter, and summarize structured log files quickly and cleanly.
- Filter logs by level (`ERROR`, `INFO`, `DEBUG`, etc.)
- Output parsed summaries in either CSV or human-readable `.log` format
- Extract key metadata: timestamp, level, message, and user (if present)
- Save filtered log lines to `triage.log`
- Optional `ENV_TAG` injection from environment variables (e.g. `dev`, `prod`)
- Auto-generates `level_summary.log` to show the distribution of log levels
- Supports limiting the number of lines in summary output
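Conceptually, the level filter boils down to keeping only the lines whose level token is in a requested set. The sketch below is an illustration under assumed conventions (the function name `filter_by_level` and the "level is the third whitespace-separated token" log shape are assumptions, not the tool's actual code):

```python
# Sketch: keep only lines whose level is in a comma-separated allow-list.
# Assumes lines look like "2024-01-01 12:00:00 ERROR disk full" -- an
# illustrative format, not necessarily what this tool parses.
def filter_by_level(lines, levels):
    wanted = {lvl.strip().upper() for lvl in levels.split(",")}
    kept = []
    for line in lines:
        parts = line.split()
        # Assumed convention: the level is the third token on the line.
        if len(parts) >= 3 and parts[2].upper() in wanted:
            kept.append(line)
    return kept

logs = [
    "2024-01-01 12:00:00 ERROR disk full",
    "2024-01-01 12:00:01 INFO started",
    "2024-01-01 12:00:02 WARNING low memory",
]
print(filter_by_level(logs, "ERROR,WARNING"))
```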
You'll need Python 3 installed on your system.
- Clone the repository (or download the files):

  ```bash
  git clone <repository_url>
  cd <repository_name>
  ```
- Run the `analyzer.py` script from your terminal:

  ```bash
  python analyzer.py --file <path_to_your_log_file> [options]
  ```
Here are the arguments you can use:
- `--file <path>` (Required): Specifies the path to your input log file.
- `--level <level(s)>` (Optional): Filters the log by specific levels. You can provide a single level (e.g., `INFO`) or multiple levels separated by commas (e.g., `ERROR,WARNING`). If omitted, all lines are processed.
- `--format <format>` (Optional): Sets the output format for the summary. Choose between `log` and `csv`. If not specified, it defaults to `csv`.
- `--tag_env` (Optional): A flag to add an `ENV_TAG` (environment tag) to your output file.
- `--max_lines <number>` (Optional): Limits the number of lines written to the summary file. The default is a very large number (1,000,000).
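These flags map naturally onto `argparse`. The following is a hedged sketch of what the setup might look like given the documented flags and defaults, not the script's verbatim code:

```python
import argparse

# Sketch of an argument parser matching the flags documented above.
# Defaults mirror the README: csv format, 1,000,000-line summary cap.
def build_parser():
    p = argparse.ArgumentParser(description="CLI log analyzer")
    p.add_argument("--file", required=True,
                   help="path to the input log file")
    p.add_argument("--level",
                   help="comma-separated levels, e.g. ERROR,WARNING")
    p.add_argument("--format", choices=["log", "csv"], default="csv",
                   help="summary output format")
    p.add_argument("--tag_env", action="store_true",
                   help="add an ENV_TAG to the output file")
    p.add_argument("--max_lines", type=int, default=1_000_000,
                   help="max lines written to the summary file")
    return p

args = build_parser().parse_args(
    ["--file", "app.log", "--level", "ERROR", "--tag_env"]
)
print(args.file, args.format, args.max_lines)
```

`--tag_env` is declared with `action="store_true"` because the README describes it as a bare flag rather than a flag that takes a value.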
- Analyze a log file and get a CSV summary of all levels:

  ```bash
  python analyzer.py --file my_application.log --format csv
  ```
- Filter for `ERROR` and `WARNING` messages and output in the custom log format:

  ```bash
  python analyzer.py --file server.log --level ERROR,WARNING --format log
  ```
- Get a CSV summary of `INFO` messages, include an environment tag, and limit to 500 lines:

  ```bash
  python analyzer.py --file debug.log --level INFO --format csv --tag_env --max_lines 500
  ```
The script creates an output directory in the same location where you run the script. Inside, you'll find:
- `triage.log`: Contains the lines filtered by the specified `--level`.
- `summary.csv` or `summary.log`: The summarized log information in your chosen format.
- `level_summary.log`: A file listing the counts of each log level found in the original log file.
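The level distribution written to `level_summary.log` is essentially a per-level count, which `collections.Counter` expresses directly. The parsing convention below (level as the third token) is an illustrative assumption:

```python
from collections import Counter

# Sketch: count how many lines appear at each log level.
# Assumes the level is the third whitespace-separated token
# (an illustrative convention, not necessarily the tool's).
def summarize_levels(lines):
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            counts[parts[2].upper()] += 1
    return counts

lines = [
    "2024-01-01 12:00:00 ERROR disk full",
    "2024-01-01 12:00:01 INFO started",
    "2024-01-01 12:00:02 ERROR timeout",
]
# most_common() sorts by count, handy for a readable summary file.
for level, n in summarize_levels(lines).most_common():
    print(f"{level}: {n}")
```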
- `analyzer.py`: The main script that handles command-line arguments and orchestrates the log analysis.
- `parser.py`: Contains the `LogAnalyzer` class, which handles file reading, filtering, detail extraction, and writing summaries. This is where most of the log parsing logic resides.
- `utils.py`: A small utility file for common functions like ensuring the output directory exists and retrieving environment tags.
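The two utilities described for `utils.py` could be as small as the sketch below. The variable name `ENV_TAG` comes from the README; the directory name `output` and the `dev` fallback are assumptions for illustration:

```python
import os
from pathlib import Path

# Sketch of the helpers the README attributes to utils.py.

def ensure_output_dir(path="output"):  # "output" is an assumed default
    """Create the output directory if it doesn't exist and return it."""
    out = Path(path)
    out.mkdir(parents=True, exist_ok=True)
    return out

def get_env_tag(default="dev"):  # the "dev" fallback is an assumption
    """Read ENV_TAG from the environment, falling back to a default."""
    return os.environ.get("ENV_TAG", default)

print(get_env_tag())
```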
This project was a great way to practice:
- File I/O: Reading from and writing to different file formats.
- Regular Expressions: Using the `re` module for pattern matching and extracting specific information from log lines. This was a fun challenge!
- Command-line Arguments: Using `argparse` to make my script more flexible and user-friendly.
- Modular Programming: Breaking down the problem into smaller, manageable functions and classes across different files.
- Error Handling: Basic checks for file existence and argument validation.
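The regex-based metadata extraction mentioned above might look like the following sketch. The log-line shape, the `[user=...]` convention, and the field names are assumptions for illustration rather than the project's actual pattern:

```python
import re

# Sketch: pull timestamp, level, message, and an optional user from a line.
# Assumed shape: "2024-01-01 12:00:00 ERROR [user=alice] disk full"
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?:\[user=(?P<user>\w+)\]\s+)?"  # user is optional
    r"(?P<message>.*)$"
)

def extract_details(line):
    """Return a dict of fields, or None if the line doesn't match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None

print(extract_details("2024-01-01 12:00:00 ERROR [user=alice] disk full"))
```

Named groups (`(?P<name>...)`) keep the extraction readable, and `groupdict()` returns `None` for the optional `user` field when it is absent instead of raising.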
- More Robust Error Parsing: Right now, it's pretty basic. I'd like to improve how it extracts different types of errors.
- JSON Output: Add an option to output summaries in JSON format.
- Interactive Mode: Maybe a simple command-line interface for selecting options.
- Date/Time Filtering: Filter logs based on a date range.
- Configuration File: Allow users to specify default settings in a config file.