AI-powered text summarization with real-time streaming, built with FastAPI and Next.js
Smart Summary App is an enterprise-ready solution that transforms lengthy documents into concise, actionable summaries using artificial intelligence. The application addresses a common business challenge: information overload. Teams across organizations spend significant time reading and processing large volumes of text—reports, articles, research papers, and documentation.
This solution enables employees to paste any text and receive an intelligent summary within seconds. The AI processes the content in real-time, showing results as they are generated rather than requiring users to wait for completion. This streaming approach provides immediate feedback and improves productivity.
Key Business Benefits:
- Reduce time spent reading lengthy documents by up to 80%
- Maintain consistency in how information is summarized across teams
- Scale document processing without increasing headcount
- Authenticated, token-based access ensures only authorized users can reach the system
- Flexible deployment options support cloud or on-premise infrastructure
The application is designed to integrate with existing enterprise workflows and can be customized to meet specific organizational needs. It supports multiple AI providers, allowing organizations to choose the model that best fits their requirements and budget.
- Technical Requirements
- Features
- Architecture
- Summarization Strategies
- Compression Ratio
- Technology Stack
- Project Structure
- Getting Started
- Deployment
- API Documentation
- Testing
- Security
- Scaling Considerations
- Future Roadmap
- Design Decisions
- Assumptions and Constraints
The application fulfills the following technical specifications:
| Requirement | Implementation |
|---|---|
| Frontend | React with Next.js 14 (App Router) |
| Backend | FastAPI with Python 3.11+ |
| LLM Integration | LangChain with Anthropic Claude (switchable to OpenAI/Gemini) |
| Streaming | Server-Sent Events (SSE) for progressive summary generation |
| Deployment | Configured for Vercel (frontend) and Render (backend) |
Demo Credentials (Testing Only):
- Username: `demo`
- Password: As configured in environment variables
Important: For production environments, create users with strong passwords. Demo credentials should never be used in production.
The application provides two categories of functionality: core features essential for summarization and advanced features that enhance the user experience.
Real-time Streaming enables users to see the summary being generated token by token, providing immediate feedback rather than waiting for the entire response.
Multiple Summarization Strategies allow users to choose between simple, hierarchical, or detailed approaches depending on their needs.
JWT Authentication secures all API endpoints with industry-standard token-based authentication.
LLM Flexibility allows organizations to switch between Anthropic, OpenAI, or Gemini models without modifying application code.
Responsive Interface works seamlessly across desktop and mobile devices using Tailwind CSS.
Docker Support provides full containerization for consistent development and deployment environments.
- Progress indicators during summarization
- Adjustable compression ratio (5% to 50% of original text)
- Real-time character and word counting
- One-click copy to clipboard
- Input validation and prompt injection protection
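The length limits and prompt-injection checks above can be sketched in plain Python. This is an illustrative stand-in (the backend uses Pydantic models per the stack table); the limits mirror the documented 100–300,000 character range, while the injection patterns are hypothetical examples, not the actual rules.

```python
# Illustrative input checks: length limits plus a simple prompt-injection
# heuristic. The patterns below are example assumptions, not the app's real list.
import re

MIN_CHARS, MAX_CHARS = 100, 300_000
SUSPICIOUS_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"system prompt",
]

def validate_input(text: str) -> list[str]:
    """Return a list of validation errors (an empty list means the text is accepted)."""
    errors = []
    if not MIN_CHARS <= len(text) <= MAX_CHARS:
        errors.append(f"text must be {MIN_CHARS}-{MAX_CHARS} characters")
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, text, re.IGNORECASE):
            errors.append(f"possible prompt injection: {pattern!r}")
    return errors
```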
The application follows a modern three-tier architecture separating the user interface, business logic, and AI processing layers. This separation enables independent scaling and maintenance of each component.
```
┌────────────────────────────────────────────────────┐
│            Next.js Frontend (Vercel)               │
│  - React Components (TypeScript)                   │
│  - SSE Client for Streaming                        │
│  - JWT Token Management                            │
└─────────────────┬──────────────────────────────────┘
                  │ HTTPS + SSE
┌─────────────────▼──────────────────────────────────┐
│           FastAPI Backend (Render.com)             │
│  ┌──────────────────────────────────────────────┐  │
│  │  API Routes (JWT Protected)                  │  │
│  │  - /api/auth/*    - Login/Register           │  │
│  │  - /api/summary/* - Streaming/Sync           │  │
│  └──────────────────┬───────────────────────────┘  │
│  ┌──────────────────▼───────────────────────────┐  │
│  │  Services Layer                              │  │
│  │  - SummarizerService (3 strategies)          │  │
│  │  - LLMService (LangChain abstraction)        │  │
│  │  - TextProcessor (chunking, cleaning)        │  │
│  └──────────────────┬───────────────────────────┘  │
│  ┌──────────────────▼───────────────────────────┐  │
│  │  LangChain + LLM APIs                        │  │
│  │  - Anthropic Claude 4.5 Sonnet               │  │
│  └──────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────┘
```
Technical Details:
- The frontend communicates with the backend over HTTPS using Server-Sent Events for streaming responses
- All API routes are protected by JWT authentication except for login and registration
- The Services Layer implements business logic independently from the API layer
- LangChain provides an abstraction layer that enables switching between AI providers without code changes
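The provider-switching point can be sketched as a config-driven factory. The `StubModel` class and registry below are assumptions standing in for LangChain's chat-model classes; the idea being illustrated is that the provider is a configuration value, not code.

```python
# Sketch of config-driven LLM provider selection. StubModel stands in for the
# real LangChain chat-model classes; only the factory pattern is the point here.
from dataclasses import dataclass
from typing import Callable

@dataclass
class StubModel:
    provider: str
    model: str

PROVIDERS: dict[str, Callable[[str], StubModel]] = {
    "anthropic": lambda m: StubModel("anthropic", m),
    "openai":    lambda m: StubModel("openai", m),
    "gemini":    lambda m: StubModel("gemini", m),
}

def make_llm(provider: str, model: str) -> StubModel:
    """Pick the model backend from configuration rather than code."""
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    return PROVIDERS[provider](model)
```

Switching from Anthropic to OpenAI then means changing two environment variables, with no changes to the services layer.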
Users can choose from three summarization approaches depending on the length and complexity of their source material. Each strategy is optimized for different use cases.
Best for quick summaries of shorter documents. The system sends the entire text to the AI model in a single request and returns the summary.
| Attribute | Value |
|---|---|
| Recommended text length | Under 50,000 characters |
| Speed | Fast (single LLM call) |
| Use cases | Quick summaries, simple documents |
Best for most business documents. The system breaks the text into semantic chunks, summarizes each chunk in parallel, then combines the results into a coherent final summary.
| Attribute | Value |
|---|---|
| Recommended text length | 50,000 to 300,000 characters |
| Method | Semantic chunking, parallel summarization, combination |
| Use cases | Articles, reports, documentation |
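The chunking step of the hierarchical strategy can be sketched as sentence-boundary splitting with a size cap. The real `TextProcessor` presumably does more sophisticated semantic chunking; the regex and chunk size below are illustrative assumptions.

```python
# Simplified chunking sketch: split at sentence boundaries so no chunk exceeds
# max_chars. Each chunk can then be summarized in parallel and the partial
# summaries combined, as the hierarchical strategy describes.
import re

def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```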
Best for comprehensive analysis where preserving key details is critical. The system first extracts important sentences using the TextRank algorithm, then generates an abstractive summary from those extracted sentences.
| Attribute | Value |
|---|---|
| Recommended text length | Any size |
| Method | Extractive sentence extraction plus abstractive LLM summary |
| Use cases | Research papers, detailed analysis |
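As a rough illustration of the extractive step, the sketch below scores sentences by average word frequency and keeps the top fraction in original order. This is a deliberately simplified stand-in for TextRank, which instead ranks a sentence-similarity graph.

```python
# Frequency-based extractive scoring: a simplified stand-in for TextRank.
# Sentences whose words occur often across the document score higher.
import re
from collections import Counter

def extract_key_sentences(text: str, keep_ratio: float = 0.3) -> list[str]:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"\w+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    keep = max(1, int(len(sentences) * keep_ratio))
    top = sorted(sentences, key=score, reverse=True)[:keep]
    # Preserve the original sentence order for readability.
    return [s for s in sentences if s in top]
```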
The compression ratio determines the target length of the summary as a percentage of the original text. Users can adjust this setting based on their needs.
| Ratio | Description | Use Case |
|---|---|---|
| 5% | Ultra-brief | Executive summaries, key points only |
| 15% | Brief | Main ideas and conclusions |
| 20% | Balanced (default) | General-purpose summaries |
| 30% | Moderate detail | Comprehensive overviews |
| 50% | Comprehensive | Detailed summaries preserving nuance |
Note: Actual compression may vary based on content complexity and the selected strategy.
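The mapping from ratio to target length is simple arithmetic; a minimal sketch (the actual service may compute targets differently, as the note above cautions):

```python
# Convert a compression ratio into a target summary length in characters,
# enforcing the documented 5%-50% range.
def target_length(original_chars: int, ratio: float) -> int:
    if not 0.05 <= ratio <= 0.50:
        raise ValueError("ratio must be between 0.05 and 0.50")
    return int(original_chars * ratio)
```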
This section details the technologies used in each layer of the application.
| Layer | Technology | Version | Purpose |
|---|---|---|---|
| Frontend | Next.js | 14+ | App Router, server-side rendering, streaming |
| UI Framework | Tailwind CSS | Latest | Responsive styling |
| Backend | FastAPI | 0.109+ | High-performance async API |
| LLM Integration | LangChain | Latest | Provider abstraction |
| AI Models | Anthropic Claude | 4.5 Sonnet | Text summarization |
| Authentication | python-jose | Latest | JWT token management |
| Password Security | bcrypt | Latest | Password hashing |
| Testing | Pytest, Jest | Latest | Unit and integration tests |
| Containerization | Docker | Latest | Development and deployment |
The repository is organized into separate frontend and backend directories, each with its own configuration and deployment settings.
```
smart-summary/
├── backend/
│   ├── app/
│   │   ├── api/routes/              # API endpoints
│   │   │   ├── auth.py              # Login/register
│   │   │   └── summary.py           # Summarization endpoints
│   │   ├── services/                # Business logic
│   │   │   ├── llm_service.py       # LangChain wrapper
│   │   │   ├── summarizer.py        # Strategies
│   │   │   └── text_processor.py    # Text utilities
│   │   ├── core/                    # Configuration
│   │   │   ├── config.py            # Settings
│   │   │   └── security.py          # JWT & passwords
│   │   ├── models/
│   │   │   └── schemas.py           # Pydantic models
│   │   └── main.py                  # FastAPI app
│   ├── tests/
│   │   ├── test_api.py              # Endpoint tests
│   │   └── test_services.py         # Service tests
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── render.yaml                  # Deployment config
│   └── .env.example
├── frontend/
│   ├── app/
│   │   ├── page.tsx                 # Main page
│   │   ├── layout.tsx               # Root layout
│   │   └── globals.css              # Global styles
│   ├── components/
│   │   ├── SummaryForm.tsx          # Input form
│   │   ├── SummaryDisplay.tsx       # Output display
│   │   └── AuthModal.tsx            # Login/register
│   ├── lib/
│   │   └── api.ts                   # API client
│   ├── Dockerfile
│   ├── vercel.json                  # Deployment config
│   └── .env.example
└── docker-compose.yml               # Local development
```
This section provides instructions for setting up the application locally. For a faster setup experience, refer to the Quick Start Guide.
Before beginning, ensure the following software is installed:
- Node.js 20 or later
- Python 3.11 or later
- Docker and Docker Compose
- An Anthropic API key (obtain from console.anthropic.com)
Docker provides the simplest path to running the application locally. Follow these steps:
Step 1: Clone the repository

```bash
git clone https://github.com/ebertolo/smart-summary.git
cd smart-summary
```

Step 2: Configure the backend

```bash
cp backend/.env.example backend/.env
```

Edit backend/.env and configure the following values:

- `ANTHROPIC_API_KEY`: Your Anthropic API key
- `JWT_SECRET`: A secure random string for token signing
- `DEMO_USER_PASSWORD`: Password for the demo user account

Step 3: Configure the frontend

```bash
cp frontend/.env.example frontend/.env.local
```

The default value `NEXT_PUBLIC_API_URL=http://localhost:8000` is correct for local development.

Step 4: Start the application

```bash
docker-compose up --build
```

Step 5: Access the application
- Frontend: http://localhost:3000
- Backend API Documentation: http://localhost:8000/docs
Log in with username `demo` and the password configured in your `.env` file.
Important: The .env files must be created before running Docker. The application will not start without proper configuration.
For development work where you need to modify code and see changes immediately, you may prefer running the services directly.
Backend Setup:
```bash
cd backend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env and add ANTHROPIC_API_KEY
python scripts/init_db.py  # Creates demo user
uvicorn app.main:app --reload
```

Backend API documentation: http://localhost:8000/docs
Frontend Setup:
```bash
cd frontend
npm install
cp .env.example .env.local
# Verify NEXT_PUBLIC_API_URL=http://localhost:8000
npm run dev
```

Frontend: http://localhost:3000
For detailed backend setup instructions, see backend/QUICKSTART.md.
This section covers deploying the application to production environments.
Before deploying to production, complete the following security tasks:
- Change `JWT_SECRET` to a cryptographically secure random value
- Create production users with strong passwords (never use demo credentials)
- Update `CORS_ORIGINS` with your production frontend URL
- Set `PYTHON_ENV=production` in environment variables
- Review and rotate all API keys
- Consider migrating from SQLite to PostgreSQL for production workloads
Option 1: Vercel Dashboard
- Navigate to vercel.com and sign in
- Import your GitHub repository
- Set the root directory to `frontend`
- Add environment variable: `NEXT_PUBLIC_API_URL` with your backend URL
- Deploy

Option 2: Vercel CLI

```bash
cd frontend
npm install -g vercel
vercel --prod
```

Option 1: Render Dashboard
- Navigate to render.com and sign in
- Create a new Web Service and connect your GitHub repository
- Set the root directory to `backend`
- Render will automatically detect the `render.yaml` configuration
- Add environment variable: `ANTHROPIC_API_KEY`
- Deploy
Post-Deployment: Update CORS Configuration
After deploying the backend, update the CORS configuration to allow requests from your frontend:
```python
# backend/app/core/config.py
CORS_ORIGINS = [
    "http://localhost:3000",
    "https://your-app.vercel.app",  # Add your Vercel URL
]
```

Commit and push this change to trigger a redeployment.
The backend exposes a RESTful API with authentication and summarization endpoints. Full interactive documentation is available at /docs when running the backend.
POST /api/auth/login
Authenticates a user and returns a JWT token for subsequent API calls.
Request:
```json
{
  "username": "demo",
  "password": "your_password"
}
```

Response:

```json
{
  "access_token": "eyJhbGciOiJIUz...",
  "token_type": "bearer",
  "expires_in": 3600
}
```

POST /api/auth/register
Creates a new user account.
Request:
```json
{
  "username": "newuser",
  "password": "your_secure_password"
}
```

POST /api/summary/summarize (Streaming)
Generates a summary with real-time streaming response.
```bash
curl -X POST http://localhost:8000/api/summary/summarize \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your long text here...",
    "strategy": "hierarchical",
    "compression_ratio": 0.20
  }'
```

Parameters:
| Parameter | Required | Description |
|---|---|---|
| text | Yes | Text to summarize (100 to 300,000 characters) |
| strategy | No | simple, hierarchical, or detailed (default: hierarchical) |
| compression_ratio | No | 0.05 to 0.50 (default: 0.20) |
Response (SSE Stream):
```
data: {"type": "content", "content": "First chunk", "done": false}
data: {"type": "content", "content": " continues...", "done": false}
data: {"type": "complete", "done": true}
```
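A client can reconstruct the summary from this stream by parsing each `data:` line as JSON and concatenating `content` events until `done` is reported. A minimal, transport-agnostic sketch (the actual frontend uses an SSE client in TypeScript; the event shapes below match the example stream):

```python
# Accumulate a summary from SSE lines shaped like the example stream above.
import json

def accumulate_summary(sse_lines: list[str]) -> str:
    summary = []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # ignore comments and keep-alive lines
        event = json.loads(line[len("data:"):].strip())
        if event.get("type") == "content":
            summary.append(event["content"])
        if event.get("done"):
            break
    return "".join(summary)
```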
POST /api/summary/summarize-sync (Non-Streaming)
Returns the complete summary in a single response. Use this endpoint when streaming is not required.
The application includes comprehensive test suites for both backend and frontend components.
Backend Tests:
```bash
cd backend
pytest --cov=app --cov-report=html
```

Run a specific test:

```bash
pytest tests/test_api.py::TestAuthEndpoints::test_login_with_valid_credentials
```

Frontend Tests:

```bash
cd frontend
npm test
```

The test suites cover the following areas:
- Authentication endpoints (login, registration, token validation)
- All summarization strategies
- Text processing utilities
- JWT token generation and validation
- Input validation and error handling
This section details the security measures implemented in the application and recommendations for production environments.
Authentication and Authorization:
- JWT authentication with configurable token expiration
- Bcrypt password hashing with salt
- Token-based API access control
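For illustration, here is what HS256 signing and verification look like using only the standard library. The application delegates this to python-jose, which additionally handles claims such as expiration; this sketch is educational only.

```python
# Educational HS256 JWT sketch: header.payload.signature, each base64url-encoded.
# Not a replacement for python-jose (no claims validation or expiry handling).
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(claims: dict, secret: str) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(sig)}"

def verify_jwt(token: str, secret: str) -> bool:
    header, payload, sig = token.rsplit(".", 2)
    expected = hmac.new(secret.encode(), f"{header}.{payload}".encode(),
                        hashlib.sha256).digest()
    # Constant-time comparison guards against timing attacks.
    return hmac.compare_digest(b64url(expected), sig)
```

Because the signature covers the header and payload, the backend can validate any token with just the shared secret, with no database lookup, which is the stateless property discussed in Design Decisions.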
Input Protection:
- CORS protection limiting allowed origins
- Pydantic-based input validation
- Text length limits to prevent abuse
- Prompt injection protection
Configuration Security:
- Environment variable management for secrets
- Separation of development and production configurations
For production deployments, implement the following additional measures:
Critical (Must Implement):
- Generate a new `JWT_SECRET` using: `python -c "import secrets; print(secrets.token_urlsafe(32))"`
- Create production users with strong passwords using `scripts/init_db.py`
- Use HTTPS exclusively
- Update CORS configuration with production URLs only
Recommended:
- Implement rate limiting (consider slowapi library)
- Add refresh token rotation
- Implement API key rotation schedule
- Set up logging and monitoring (APM tools)
- Add request signing for sensitive operations
- Build a user management interface
Never use demo credentials in production. Create new users with the following command:
```bash
cd backend
python scripts/init_db.py --username admin --password YOUR_SECURE_PASSWORD --email admin@yourcompany.com
```

The application is designed to scale from small teams to enterprise deployments. This section outlines the current capacity and the path to higher scale.
| Metric | Capacity |
|---|---|
| Concurrent users | 100 to 1,000 |
| Requests per second | 50 to 100 |
| Maximum text size | 300,000 characters (~75,000 tokens) |
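The ~75,000-token figure follows from a rough heuristic of about four characters per token for English text:

```python
# Back-of-envelope token estimate for the 300,000-character cap.
max_chars = 300_000
chars_per_token = 4  # rough average for English text; varies by content
est_tokens = max_chars // chars_per_token
print(est_tokens)  # 75000
```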
Phase 1: 1,000 to 10,000 Users
- Migrate from SQLite to PostgreSQL for reliable concurrent access
- Enable Render auto-scaling for the backend
- Implement per-user rate limiting
Phase 2: 10,000 to 100,000 Users
- Add Celery for background job processing
- Deploy a CDN for static assets
- Configure database read replicas
- Implement caching layer
Phase 3: 100,000+ Users
- Migrate to microservices architecture
- Implement message queue (RabbitMQ or Kafka)
- Deploy on Kubernetes for orchestration
- Configure multi-region deployment for global availability
- Summary history with database persistence
- File upload support (PDF, DOCX, TXT)
- Export summaries to PDF and Markdown
- Rate limiting implementation
- Multi-language support
- Custom prompts and templates
- Batch processing interface
- Summary sharing via public links
- Mobile application
- Voice input and output
- Collaborative features
- Analytics dashboard
This section explains the rationale behind key architectural and technology choices.
The application maintains no server-side session state. All user data is stored in the database (SQLite for development, PostgreSQL recommended for production). This design enables horizontal scaling—multiple backend instances can serve requests without coordination.
LangChain provides an abstraction layer over multiple LLM providers. This enables organizations to switch between Anthropic, OpenAI, or Gemini models by changing configuration rather than modifying code. This flexibility protects against vendor lock-in and allows optimization based on cost and performance requirements.
Server-Sent Events deliver summary content to users as it is generated. For long documents, this means users see results within seconds rather than waiting 30-60 seconds for completion. This approach significantly improves perceived performance and user satisfaction.
JSON Web Tokens provide stateless authentication suitable for scalable APIs. Tokens are self-contained—the backend can validate them without database lookups. This eliminates the need for session storage and simplifies horizontal scaling.
| Constraint | Details |
|---|---|
| Text size | Maximum 300,000 characters (~75,000 tokens), aligned with Claude model limits |
| Compression range | 5% to 50% of original text, configurable per request |
| User storage | SQLite for development; PostgreSQL recommended for production |
| Rate limiting | Not implemented; required for production deployment |
| Monitoring | Basic logging only; APM tools recommended for production |
| Security | Prompt injection protection via input validation |