An intelligent text summarization tool that preserves the emotional tone and sentiment of your original content using state-of-the-art NLP models.
The Emotion-Aware Text Summarizer is a sophisticated Python tool that goes beyond traditional summarization by maintaining the emotional context of your text. Built with reliability and simplicity in mind, it leverages powerful transformer models to deliver high-quality, sentiment-preserving summaries.
- Emotion Preservation: Analyzes and maintains the original emotional tone (positive, negative, neutral)
- State-of-the-Art Summarization: Powered by BART-large-CNN for superior text understanding
- GPU Acceleration: Automatic CUDA detection for enhanced performance
- Hierarchical Processing: Intelligent chunking for long documents with coherent output
- Flexible Input: Support for both direct text input and file processing
- CLI-First Design: Clean command-line interface with comprehensive options
- Multi-Language Ready: Basic support for non-English text (optimized for English)
- Python 3.8 or higher
- pip package manager
- CUDA-compatible GPU (optional, for acceleration)
# Clone the repository
git clone https://github.com/swamy18/emotion-aware-summarizer.git
cd emotion-aware-summarizer
# Install dependencies
pip install -r requirements.txt
# Or install manually
pip install torch transformers textblob tqdm
# Install with development dependencies
pip install -e ".[dev]"
# Install pre-commit hooks (recommended)
pre-commit install
# Summarize a file
python summarizer.py --input article.txt --output summary.txt --max-length 150
# Summarize text directly
python summarizer.py --text "Your amazing text here!" --max-length 100
# Quick summary to console
python summarizer.py -i document.txt
# Process large document with custom parameters
python summarizer.py \
--input large_document.txt \
--output summary.txt \
--max-length 300 \
--preserve-emotion
# Batch processing (coming soon)
python summarizer.py --batch-dir ./documents --output-dir ./summaries
| Option | Short | Description | Default |
|---|---|---|---|
| `--input` | `-i` | Path to input text file | - |
| `--text` | `-t` | Direct text input (alternative to `--input`) | - |
| `--output` | `-o` | Output file path (optional) | Console output |
| `--max-length` | `-l` | Maximum words in summary | 150 |
The tool automatically handles various input scenarios:
- Empty input detection
- Minimum length validation (30 words)
- Maximum file size limit (5MB default)
- Encoding detection and handling
- Format validation
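The checks above can be sketched roughly as follows. This is an illustrative stand-in, not the actual code in summarizer.py: the function name is hypothetical, and only the documented limits (30-word minimum, 5MB maximum) are taken from this README.

```python
import os

MIN_WORDS = 30                     # documented minimum input length
MAX_FILE_BYTES = 5 * 1024 * 1024   # documented 5MB default limit

def validate_input(text=None, path=None):
    """Return usable text, or raise ValueError mirroring the checks above."""
    if path is not None:
        if os.path.getsize(path) > MAX_FILE_BYTES:
            raise ValueError("File exceeds the 5MB size limit")
        # Degrade gracefully if the file is not valid UTF-8
        with open(path, encoding="utf-8", errors="replace") as fh:
            text = fh.read()
    if text is None or not text.strip():
        raise ValueError("Empty input")
    if len(text.split()) < MIN_WORDS:
        raise ValueError("Input is shorter than 30 words")
    return text
```

Each failure mode raises early with a specific message, which is what lets the CLI report a clear error instead of feeding bad input to the model.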
# Test edge cases
python summarizer.py --text "" # Empty input handling
python summarizer.py --text "Short text." # Minimum length check
# Test large files
python summarizer.py --input large_file.txt # Size limit validation (5MB max)
# Test non-English content
python summarizer.py --text "La vida es bella y el sol brilla."
# Test emotional content
python summarizer.py --text "This is absolutely terrible and disappointing!" --max-length 30
python summarizer.py --text "I'm so incredibly happy and excited about this!" --max-length 30
# Run basic functionality test
python -c "
import subprocess
result = subprocess.run(['python', 'summarizer.py', '--text', 'This is a simple test.'],
                        capture_output=True, text=True)
print('Basic test passed' if result.returncode == 0 else 'Test failed')
"
# Performance test with timing
time python summarizer.py --input large_document.txt --max-length 200
emotion-aware-summarizer/
├── summarizer.py          # Main application (single file!)
├── requirements.txt       # Python dependencies
├── LICENSE                # MIT License
├── README.md              # This file
└── examples/              # Sample text files for testing
    ├── positive_article.txt
    ├── negative_review.txt
    └── neutral_news.txt
Note: This is a single-file application - all functionality is contained in summarizer.py
for simplicity and ease of deployment!
- Summarization: BART-large-CNN (Facebook AI) - State-of-the-art transformer model
- Sentiment Analysis: TextBlob - Fast, lightweight sentiment detection
- Text Processing: Intelligent word-based chunking for long documents
- Hardware: Automatic CPU/GPU detection with CUDA optimization
- Memory Management: Smart lazy loading and GPU cache cleanup
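The lazy-loading behavior mentioned above can be illustrated with a small, model-agnostic pattern. The `LazyModel` class and the stand-in loader are hypothetical; in summarizer.py the loader would be whatever builds the BART pipeline.

```python
class LazyModel:
    """Defer an expensive model load until first use."""

    def __init__(self, loader):
        self._loader = loader   # e.g. a function that initializes BART
        self._model = None

    @property
    def model(self):
        if self._model is None:          # load only on first access
            self._model = self._loader()
        return self._model               # reuse the cached object afterwards

# Stand-in loader that records when it runs; the real code would
# construct the transformers pipeline here instead.
calls = []
lazy = LazyModel(lambda: calls.append("loaded") or "bart-large-cnn")
```

Because the loader runs at most once, startup stays fast and the multi-hundred-megabyte model weights are only pulled into memory if a summary is actually requested.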
- Lazy Loading: Models load only when needed, improving startup time
- Chunked Processing: Handles documents larger than model limits
- Hierarchical Summarization: Summarizes chunks, then summarizes summaries
- Emotion-Guided Generation: Uses sentiment-aware prefixes for tone preservation
- Error Recovery: Graceful handling of model failures and edge cases
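The chunked, hierarchical pass described above can be sketched as follows. The chunk size and the `summarize` callable are placeholders; the real code calls the BART model and picks limits that fit its input window.

```python
def chunk_words(text, chunk_size=500):
    """Split text into word-based chunks that fit the model's input limit."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def hierarchical_summarize(text, summarize, chunk_size=500):
    """Summarize each chunk, then summarize the concatenated summaries."""
    chunks = chunk_words(text, chunk_size)
    if len(chunks) == 1:
        return summarize(chunks[0])      # short input: single pass
    partial = " ".join(summarize(c) for c in chunks)
    return summarize(partial)            # second pass over the summaries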
| Document Size | Processing Time\* | Memory Usage\*\* |
|---|---|---|
| < 1KB | ~0.5s | ~200MB |
| 1-10KB | ~2-5s | ~300MB |
| 10-100KB | ~10-30s | ~500MB |
| 100KB-1MB | ~30-120s | ~800MB |

\* Times measured on an RTX 3080; varies by hardware
\*\* Peak memory usage during processing
- Single-file implementation for easy deployment
- Emotion-aware summarization with TextBlob
- BART-large-CNN integration
- GPU acceleration with automatic detection
- Intelligent chunking for long documents
- File size validation (5MB limit)
- Comprehensive error handling
- Progress bars for model loading and processing
- Web interface using Flask
- Batch processing built-in command
- Configuration file support (YAML/JSON)
- Additional emotion models (VADER, RoBERTa)
- Docker containerization
- Output format options (JSON, XML, HTML)
- REST API with FastAPI
- Multiple summarization models (T5, Pegasus)
- Custom emotion training capabilities
- Real-time processing for streaming text
- Multi-language emotion detection
- Plugin architecture for custom models
We welcome contributions! Please see our Contributing Guide for details.
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Add tests for new functionality
5. Ensure all tests pass (`pytest`)
6. Commit your changes (`git commit -m 'Add amazing feature'`)
7. Push to the branch (`git push origin feature/amazing-feature`)
8. Open a Pull Request
We use Black for code formatting and isort for import sorting.
# Format code
black summarizer.py
isort summarizer.py
# Check linting
flake8 summarizer.py
- Emotion Detection: TextBlob may miss nuanced emotions like sarcasm
- Language Support: Optimized for English; other languages may have reduced accuracy
- Memory Usage: Large documents (>1MB) require significant RAM
- Processing Time: Very long texts may take considerable time on CPU-only systems
# For integration into other Python projects
import subprocess

def summarize_with_emotion(text, max_length=150):
    """Helper function to use the summarizer from other Python code."""
    result = subprocess.run([
        'python', 'summarizer.py',
        '--text', text,
        '--max-length', str(max_length)
    ], capture_output=True, text=True)
    if result.returncode == 0:
        return result.stdout
    raise RuntimeError(f"Summarization failed: {result.stderr}")
# Example usage
news_text = """
The latest breakthrough in artificial intelligence has researchers excited about the potential
applications. Scientists have developed a new model that can understand context better than
ever before, leading to more accurate and helpful AI assistants...
"""
summary = summarize_with_emotion(news_text, max_length=100)
print(summary)
import subprocess
from pathlib import Path

def process_directory(input_dir, output_dir, max_length=150):
    """Process all .txt files in a directory."""
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    output_path.mkdir(exist_ok=True)
    for txt_file in input_path.glob("*.txt"):
        output_file = output_path / f"{txt_file.stem}_summary.txt"
        subprocess.run([
            'python', 'summarizer.py',
            '--input', str(txt_file),
            '--output', str(output_file),
            '--max-length', str(max_length)
        ], check=True)  # raise if the summarizer fails on a file
        print(f"Processed: {txt_file.name}")

# Usage
process_directory("./articles", "./summaries")
This project is licensed under the MIT License - see the LICENSE file for details.
- Hugging Face Transformers for the BART model
- TextBlob for sentiment analysis
- The open-source community for inspiration and support
- Email: [email protected]
Made by Swami Gadila
If this project helped you, please consider giving it a star!