Smart Summary App

AI-powered text summarization with real-time streaming, built with FastAPI and Next.js


Business Description

Smart Summary App is an enterprise-ready solution that transforms lengthy documents into concise, actionable summaries using artificial intelligence. The application addresses a common business challenge: information overload. Teams across organizations spend significant time reading and processing large volumes of text—reports, articles, research papers, and documentation.

This solution enables employees to paste any text and receive an intelligent summary within seconds. The AI processes the content in real-time, showing results as they are generated rather than requiring users to wait for completion. This streaming approach provides immediate feedback and improves productivity.

Key Business Benefits:

  • Reduce time spent reading lengthy documents by up to 80%
  • Maintain consistency in how information is summarized across teams
  • Scale document processing without increasing headcount
  • Authenticated access restricts the system to authorized users
  • Flexible deployment options support cloud or on-premise infrastructure

The application is designed to integrate with existing enterprise workflows and can be customized to meet specific organizational needs. It supports multiple AI providers, allowing organizations to choose the model that best fits their requirements and budget.


Table of Contents

  1. Technical Requirements
  2. Features
  3. Architecture
  4. Summarization Strategies
  5. Compression Ratio
  6. Technology Stack
  7. Project Structure
  8. Getting Started
  9. Deployment
  10. API Documentation
  11. Testing
  12. Security
  13. Scaling Considerations
  14. Future Roadmap
  15. Design Decisions
  16. Assumptions and Constraints

Technical Requirements

The application fulfills the following technical specifications:

| Requirement | Implementation |
| --- | --- |
| Frontend | React with Next.js 14 (App Router) |
| Backend | FastAPI with Python 3.11+ |
| LLM Integration | LangChain with Anthropic Claude (switchable to OpenAI/Gemini) |
| Streaming | Server-Sent Events (SSE) for progressive summary generation |
| Deployment | Configured for Vercel (frontend) and Render (backend) |

Demo Credentials (Testing Only):

  • Username: demo
  • Password: As configured in environment variables

Important: For production environments, create users with strong passwords. Demo credentials should never be used in production.


Features

The application provides two categories of functionality: core features essential for summarization and advanced features that enhance the user experience.

Core Functionality

Real-time Streaming enables users to see the summary being generated token by token, providing immediate feedback rather than waiting for the entire response.

Multiple Summarization Strategies allow users to choose between simple, hierarchical, or detailed approaches depending on their needs.

JWT Authentication secures all API endpoints with industry-standard token-based authentication.

LLM Flexibility allows organizations to switch between Anthropic, OpenAI, or Gemini models without modifying application code.

Responsive Interface works seamlessly across desktop and mobile devices using Tailwind CSS.

Docker Support provides full containerization for consistent development and deployment environments.

Advanced Features

  • Progress indicators during summarization
  • Adjustable compression ratio (5% to 50% of original text)
  • Real-time character and word counting
  • One-click copy to clipboard
  • Input validation and prompt injection protection

Architecture

The application follows a modern three-tier architecture separating the user interface, business logic, and AI processing layers. This separation enables independent scaling and maintenance of each component.

┌────────────────────────────────────────────────────┐
│         Next.js Frontend (Vercel)                  │
│  - React Components (TypeScript)                   │
│  - SSE Client for Streaming                        │
│  - JWT Token Management                            │
└─────────────────┬──────────────────────────────────┘
                  │ HTTPS + SSE
┌─────────────────▼──────────────────────────────────┐
│       FastAPI Backend (Render.com)                 │
│  ┌──────────────────────────────────────────────┐  │
│  │  API Routes (JWT Protected)                  │  │
│  │  - /api/auth/* - Login/Register              │  │
│  │  - /api/summary/* - Streaming/Sync           │  │
│  └──────────────────┬───────────────────────────┘  │
│  ┌──────────────────▼───────────────────────────┐  │
│  │  Services Layer                              │  │
│  │  - SummarizerService (3 strategies)          │  │
│  │  - LLMService (LangChain abstraction)        │  │
│  │  - TextProcessor (chunking, cleaning)        │  │
│  └──────────────────┬───────────────────────────┘  │
│  ┌──────────────────▼───────────────────────────┐  │
│  │  LangChain + LLM APIs                        │  │
│  │  - Anthropic Claude Sonnet 4.5               │  │
│  └──────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────┘

Technical Details:

  • The frontend communicates with the backend over HTTPS using Server-Sent Events for streaming responses
  • All API routes are protected by JWT authentication except for login and registration
  • The Services Layer implements business logic independently from the API layer
  • LangChain provides an abstraction layer that enables switching between AI providers without code changes

Summarization Strategies

Users can choose from three summarization approaches depending on the length and complexity of their source material. Each strategy is optimized for different use cases.

Simple Strategy

Best for quick summaries of shorter documents. The system sends the entire text to the AI model in a single request and returns the summary.

| Attribute | Value |
| --- | --- |
| Recommended text length | Under 50,000 characters |
| Speed | Fast (single LLM call) |
| Use cases | Quick summaries, simple documents |

Hierarchical Strategy (Recommended)

Best for most business documents. The system breaks the text into semantic chunks, summarizes each chunk in parallel, then combines the results into a coherent final summary.

| Attribute | Value |
| --- | --- |
| Recommended text length | 50,000 to 300,000 characters |
| Method | Semantic chunking, parallel summarization, combination |
| Use cases | Articles, reports, documentation |
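The hierarchical flow can be sketched as a map-reduce over chunks. The chunking and `summarize_chunk` below are illustrative stubs, not the project's actual TextProcessor or LLMService:

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_text(text: str, chunk_size: int = 2000) -> list[str]:
    """Naive fixed-size chunking; the real TextProcessor splits on semantic boundaries."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def summarize_chunk(chunk: str) -> str:
    """Placeholder for an LLM call (e.g. via LangChain); returns a stub 'summary'."""
    return chunk[:100]

def hierarchical_summary(text: str) -> str:
    chunks = chunk_text(text)
    # Map: summarize chunks in parallel; Reduce: combine the partial summaries
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(summarize_chunk, chunks))
    return summarize_chunk(" ".join(partials))  # final combining pass
```

The combining pass is what keeps the final summary coherent rather than a concatenation of disjoint chunk summaries.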

Detailed Strategy

Best for comprehensive analysis where preserving key details is critical. The system first extracts the most important sentences using the TextRank algorithm, then generates an abstractive summary from those extractions.

| Attribute | Value |
| --- | --- |
| Recommended text length | Any size |
| Method | Extractive sentence extraction plus abstractive LLM summary |
| Use cases | Research papers, detailed analysis |
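A greatly simplified version of the extractive step, using word-frequency scoring instead of the full TextRank similarity graph, purely for illustration:

```python
import re
from collections import Counter

def extract_key_sentences(text: str, top_k: int = 3) -> list[str]:
    """Score sentences by the frequency of the words they contain.
    TextRank instead runs PageRank over a sentence-similarity graph."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in re.findall(r"\w+", s.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"\w+", sentence.lower())
        return sum(freq[w] for w in words) / (len(words) or 1)

    ranked = set(sorted(sentences, key=score, reverse=True)[:top_k])
    # Preserve original order so the extraction reads coherently
    return [s for s in sentences if s in ranked]
```

The extracted sentences then become the input to the abstractive LLM pass, keeping the prompt small while preserving the highest-signal content.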

Compression Ratio

The compression ratio determines the target length of the summary as a percentage of the original text. Users can adjust this setting based on their needs.

| Ratio | Description | Use Case |
| --- | --- | --- |
| 5% | Ultra-brief | Executive summaries, key points only |
| 15% | Brief | Main ideas and conclusions |
| 20% | Balanced (default) | General-purpose summaries |
| 30% | Moderate detail | Comprehensive overviews |
| 50% | Comprehensive | Detailed summaries preserving nuance |

Note: Actual compression may vary based on content complexity and the selected strategy.
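Concretely, the target length is the ratio applied to the input size; a sketch (the function name is illustrative, and the clamping bounds mirror the 5-50% range above):

```python
def target_length(text: str, compression_ratio: float = 0.20) -> int:
    """Compute the target summary length in characters for a given ratio."""
    ratio = min(max(compression_ratio, 0.05), 0.50)  # clamp to the supported 5-50% range
    return int(len(text) * ratio)
```

For example, a 10,000-character document at the default 20% ratio targets roughly 2,000 characters.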


Technology Stack

This section details the technologies used in each layer of the application.

| Layer | Technology | Version | Purpose |
| --- | --- | --- | --- |
| Frontend | Next.js | 14+ | App Router, server-side rendering, streaming |
| UI Framework | Tailwind CSS | Latest | Responsive styling |
| Backend | FastAPI | 0.109+ | High-performance async API |
| LLM Integration | LangChain | Latest | Provider abstraction |
| AI Models | Anthropic Claude | Sonnet 4.5 | Text summarization |
| Authentication | python-jose | Latest | JWT token management |
| Password Security | bcrypt | Latest | Password hashing |
| Testing | Pytest, Jest | Latest | Unit and integration tests |
| Containerization | Docker | Latest | Development and deployment |

Project Structure

The repository is organized into separate frontend and backend directories, each with its own configuration and deployment settings.

smart-summary/
├── backend/
│   ├── app/
│   │   ├── api/routes/        # API endpoints
│   │   │   ├── auth.py        # Login/register
│   │   │   └── summary.py     # Summarization endpoints
│   │   ├── services/          # Business logic
│   │   │   ├── llm_service.py      # LangChain wrapper
│   │   │   ├── summarizer.py       # Strategies
│   │   │   └── text_processor.py   # Text utilities
│   │   ├── core/              # Configuration
│   │   │   ├── config.py      # Settings
│   │   │   └── security.py    # JWT & passwords
│   │   ├── models/
│   │   │   └── schemas.py     # Pydantic models
│   │   └── main.py            # FastAPI app
│   ├── tests/
│   │   ├── test_api.py        # Endpoint tests
│   │   └── test_services.py   # Service tests
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── render.yaml            # Deployment config
│   └── .env.example
├── frontend/
│   ├── app/
│   │   ├── page.tsx           # Main page
│   │   ├── layout.tsx         # Root layout
│   │   └── globals.css        # Global styles
│   ├── components/
│   │   ├── SummaryForm.tsx    # Input form
│   │   ├── SummaryDisplay.tsx # Output display
│   │   └── AuthModal.tsx      # Login/register
│   ├── lib/
│   │   └── api.ts             # API client
│   ├── Dockerfile
│   ├── vercel.json            # Deployment config
│   └── .env.example
└── docker-compose.yml         # Local development

Getting Started

This section provides instructions for setting up the application locally. For a faster setup experience, refer to the Quick Start Guide.

Prerequisites

Before beginning, ensure the following software is installed:

  • Node.js 20 or later
  • Python 3.11 or later
  • Docker and Docker Compose
  • An Anthropic API key (obtain from console.anthropic.com)

Docker Setup (Recommended)

Docker provides the simplest path to running the application locally. Follow these steps:

Step 1: Clone the repository

git clone https://github.com/ebertolo/smart-summary.git
cd smart-summary

Step 2: Configure the backend

cp backend/.env.example backend/.env

Edit backend/.env and configure the following values:

  • ANTHROPIC_API_KEY: Your Anthropic API key
  • JWT_SECRET: A secure random string for token signing
  • DEMO_USER_PASSWORD: Password for the demo user account

Step 3: Configure the frontend

cp frontend/.env.example frontend/.env.local

The default value NEXT_PUBLIC_API_URL=http://localhost:8000 is correct for local development.

Step 4: Start the application

docker-compose up --build

Step 5: Access the application

Open http://localhost:3000 in your browser, then log in with username demo and the password configured in your .env file.

Important: The .env files must be created before running Docker. The application will not start without proper configuration.

Local Development (Without Docker)

For development work where you need to modify code and see changes immediately, you may prefer running the services directly.

Backend Setup:

cd backend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env and add ANTHROPIC_API_KEY
python scripts/init_db.py  # Creates demo user
uvicorn app.main:app --reload

Backend API documentation: http://localhost:8000/docs

Frontend Setup:

cd frontend
npm install
cp .env.example .env.local
# Verify NEXT_PUBLIC_API_URL=http://localhost:8000
npm run dev

Frontend: http://localhost:3000

For detailed backend setup instructions, see backend/QUICKSTART.md.


Deployment

This section covers deploying the application to production environments.

Pre-Deployment Security Checklist

Before deploying to production, complete the following security tasks:

  • Change JWT_SECRET to a cryptographically secure random value
  • Create production users with strong passwords (never use demo credentials)
  • Update CORS_ORIGINS with your production frontend URL
  • Set PYTHON_ENV=production in environment variables
  • Review and rotate all API keys
  • Consider migrating from SQLite to PostgreSQL for production workloads

Frontend Deployment (Vercel)

Option 1: Vercel Dashboard

  1. Navigate to vercel.com and sign in
  2. Import your GitHub repository
  3. Set the root directory to frontend
  4. Add environment variable: NEXT_PUBLIC_API_URL with your backend URL
  5. Deploy

Option 2: Vercel CLI

cd frontend
npm install -g vercel
vercel --prod

Backend Deployment (Render.com)

Option 1: Render Dashboard

  1. Navigate to render.com and sign in
  2. Create a new Web Service and connect your GitHub repository
  3. Set the root directory to backend
  4. Render will automatically detect the render.yaml configuration
  5. Add environment variable: ANTHROPIC_API_KEY
  6. Deploy

Post-Deployment: Update CORS Configuration

After deploying the backend, update the CORS configuration to allow requests from your frontend:

# backend/app/core/config.py
CORS_ORIGINS = [
    "http://localhost:3000",
    "https://your-app.vercel.app"  # Add your Vercel URL
]

Commit and push this change to trigger a redeployment.


API Documentation

The backend exposes a RESTful API with authentication and summarization endpoints. Full interactive documentation is available at /docs when running the backend.

Authentication Endpoints

POST /api/auth/login

Authenticates a user and returns a JWT token for subsequent API calls.

Request:

{
  "username": "demo",
  "password": "your_password"
}

Response:

{
  "access_token": "eyJhbGciOiJIUz...",
  "token_type": "bearer",
  "expires_in": 3600
}

POST /api/auth/register

Creates a new user account.

Request:

{
  "username": "newuser",
  "password": "your_secure_password"
}

Summarization Endpoints

POST /api/summary/summarize (Streaming)

Generates a summary with real-time streaming response.

curl -X POST http://localhost:8000/api/summary/summarize \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your long text here...",
    "strategy": "hierarchical",
    "compression_ratio": 0.20
  }'

Parameters:

| Parameter | Required | Description |
| --- | --- | --- |
| text | Yes | Text to summarize (100 to 300,000 characters) |
| strategy | No | simple, hierarchical, or detailed (default: hierarchical) |
| compression_ratio | No | 0.05 to 0.50 (default: 0.20) |

Response (SSE Stream):

data: {"type": "content", "content": "First chunk", "done": false}

data: {"type": "content", "content": " continues...", "done": false}

data: {"type": "complete", "done": true}
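A minimal client-side parse of that stream, matching the event shapes shown above (the transport/read loop and helper names are assumptions):

```python
import json

def parse_sse_events(raw: str) -> list[dict]:
    """Parse `data: {...}` lines from an SSE stream body into event dicts."""
    events = []
    for line in raw.splitlines():
        if line.startswith("data: "):
            events.append(json.loads(line[len("data: "):]))
    return events

def assemble_summary(events: list[dict]) -> str:
    """Concatenate content chunks until the completion event arrives."""
    parts = []
    for event in events:
        if event.get("type") == "content":
            parts.append(event["content"])
        elif event.get("done"):
            break
    return "".join(parts)
```

In the browser frontend this role is played by an SSE-capable fetch reader; the parsing logic is the same.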

POST /api/summary/summarize-sync (Non-Streaming)

Returns the complete summary in a single response. Use this endpoint when streaming is not required.


Testing

The application includes comprehensive test suites for both backend and frontend components.

Running Tests

Backend Tests:

cd backend
pytest --cov=app --cov-report=html

Run a specific test:

pytest tests/test_api.py::TestAuthEndpoints::test_login_with_valid_credentials

Frontend Tests:

cd frontend
npm test

Test Coverage

The test suites cover the following areas:

  • Authentication endpoints (login, registration, token validation)
  • All summarization strategies
  • Text processing utilities
  • JWT token generation and validation
  • Input validation and error handling

Security

This section details the security measures implemented in the application and recommendations for production environments.

Implemented Security Measures

Authentication and Authorization:

  • JWT authentication with configurable token expiration
  • Bcrypt password hashing with salt
  • Token-based API access control

Input Protection:

  • CORS protection limiting allowed origins
  • Pydantic-based input validation
  • Text length limits to prevent abuse
  • Prompt injection protection

Configuration Security:

  • Environment variable management for secrets
  • Separation of development and production configurations

Production Security Recommendations

For production deployments, implement the following additional measures:

Critical (Must Implement):

  • Generate a new JWT_SECRET using: python -c "import secrets; print(secrets.token_urlsafe(32))"
  • Create production users with strong passwords using scripts/init_db.py
  • Use HTTPS exclusively
  • Update CORS configuration with production URLs only

Recommended:

  • Implement rate limiting (consider slowapi library)
  • Add refresh token rotation
  • Implement API key rotation schedule
  • Set up logging and monitoring (APM tools)
  • Add request signing for sensitive operations
  • Build a user management interface

Creating Production Users

Never use demo credentials in production. Create new users with the following command:

cd backend
python scripts/init_db.py --username admin --password YOUR_SECURE_PASSWORD --email admin@yourcompany.com

Scaling Considerations

The application is designed to scale from small teams to enterprise deployments. This section outlines the current capacity and the path to higher scale.

Current Capacity

| Metric | Capacity |
| --- | --- |
| Concurrent users | 100 to 1,000 |
| Requests per second | 50 to 100 |
| Maximum text size | 300,000 characters (~75,000 tokens) |

Scaling Roadmap

Phase 1: 1,000 to 10,000 Users

  • Migrate from SQLite to PostgreSQL for reliable concurrent access
  • Enable Render auto-scaling for the backend
  • Implement per-user rate limiting

Phase 2: 10,000 to 100,000 Users

  • Add Celery for background job processing
  • Deploy a CDN for static assets
  • Configure database read replicas
  • Implement caching layer

Phase 3: 100,000+ Users

  • Migrate to microservices architecture
  • Implement message queue (RabbitMQ or Kafka)
  • Deploy on Kubernetes for orchestration
  • Configure multi-region deployment for global availability

Future Roadmap

Near-Term (Next Sprints)

  • Summary history with database persistence
  • File upload support (PDF, DOCX, TXT)
  • Export summaries to PDF and Markdown
  • Rate limiting implementation

Long-Term

  • Multi-language support
  • Custom prompts and templates
  • Batch processing interface
  • Summary sharing via public links
  • Mobile application
  • Voice input and output
  • Collaborative features
  • Analytics dashboard

Design Decisions

This section explains the rationale behind key architectural and technology choices.

Stateless Architecture

The application maintains no server-side session state. All user data is stored in the database (SQLite for development, PostgreSQL recommended for production). This design enables horizontal scaling—multiple backend instances can serve requests without coordination.

LangChain Integration

LangChain provides an abstraction layer over multiple LLM providers. This enables organizations to switch between Anthropic, OpenAI, or Gemini models by changing configuration rather than modifying code. This flexibility protects against vendor lock-in and allows optimization based on cost and performance requirements.
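The pattern behind this flexibility is a small factory keyed off configuration. The sketch below uses stand-in classes rather than real LangChain chat models, and the names are assumptions, not the project's actual LLMService:

```python
from typing import Callable, Protocol

class ChatModel(Protocol):
    def invoke(self, prompt: str) -> str: ...

class AnthropicStub:
    """Stand-in for e.g. langchain_anthropic.ChatAnthropic."""
    def invoke(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"

class OpenAIStub:
    """Stand-in for e.g. langchain_openai.ChatOpenAI."""
    def invoke(self, prompt: str) -> str:
        return f"[openai] {prompt}"

PROVIDERS: dict[str, Callable[[], ChatModel]] = {
    "anthropic": AnthropicStub,
    "openai": OpenAIStub,
}

def get_llm(provider: str) -> ChatModel:
    """Select the model from config; swapping providers changes no call sites."""
    try:
        return PROVIDERS[provider]()
    except KeyError:
        raise ValueError(f"Unknown LLM provider: {provider}")
```

Because every provider exposes the same `invoke` surface, the summarization code never branches on which vendor is configured.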

Streaming Responses

Server-Sent Events deliver summary content to users as it is generated. For long documents, this means users see results within seconds rather than waiting 30-60 seconds for completion. This approach significantly improves perceived performance and user satisfaction.
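On the wire, each generated chunk is wrapped as an SSE `data:` frame. A sketch of the server-side framing, following the event shapes in the API examples of this README (the FastAPI wiring is shown only as a comment):

```python
import json
from typing import Iterable, Iterator

def sse_frames(chunks: Iterable[str]) -> Iterator[str]:
    """Wrap generated chunks as SSE `data:` frames, ending with a completion event."""
    for chunk in chunks:
        yield "data: " + json.dumps({"type": "content", "content": chunk, "done": False}) + "\n\n"
    yield "data: " + json.dumps({"type": "complete", "done": True}) + "\n\n"

# In FastAPI this generator would be returned as, roughly:
#   StreamingResponse(sse_frames(llm_stream), media_type="text/event-stream")
```

The blank line after each frame is part of the SSE protocol: it marks the end of one event so the client can dispatch it immediately.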

JWT Authentication

JSON Web Tokens provide stateless authentication suitable for scalable APIs. Tokens are self-contained—the backend can validate them without database lookups. This eliminates the need for session storage and simplifies horizontal scaling.


Assumptions and Constraints

| Constraint | Details |
| --- | --- |
| Text size | Maximum 300,000 characters (~75,000 tokens), aligned with Claude model limits |
| Compression range | 5% to 50% of original text, configurable per request |
| User storage | SQLite for development; PostgreSQL recommended for production |
| Rate limiting | Not implemented; required for production deployment |
| Monitoring | Basic logging only; APM tools recommended for production |
| Security | Prompt injection protection via input validation |
