A sophisticated AI chatbot system that provides intelligent responses about blog content, handles website monitoring, and manages user interactions through a modern web interface.
- Overview
- Features
- Installation
- Configuration
- Usage
- API Reference
- Architecture
- Deployment
- Troubleshooting
- Contributing
- License
vMeNext is a comprehensive AI-powered chatbot system designed to serve as an intelligent interface for blog content and website management. Built with modern Python technologies, it combines the power of OpenAI's GPT models with automated web scraping, monitoring, and user engagement features.
- Intelligent Conversations: Powered by OpenAI's latest GPT models for natural, context-aware responses
- Blog Content Integration: Automatic scraping, processing, and summarization of blog posts
- Website Monitoring: Continuous availability checking with real-time alerts
- Document Processing: Support for multiple file formats (PDF, DOCX, TXT, MD)
- User Engagement: Automated email notifications and contact management
- Analytics Dashboard: Website uptime statistics with visualizations
- Context-Aware Responses: Maintains conversation history and context
- Tool Integration: Can execute functions like sending emails and fetching data
- Customizable Personality: Configurable system prompts for different personas
- Streaming Responses: Real-time response generation for better UX
- Automatic Scraping: Uses Playwright for robust web scraping
- Content Summarization: AI-powered summarization of blog posts
- Multi-format Support: Handles various blog layouts and structures
- Pagination Handling: Automatically follows pagination links
- Continuous Monitoring: 24/7 website availability checking
- Email Alerts: Instant notifications when issues are detected
- Response Time Tracking: Monitors and logs response times
- Historical Data: Maintains logs with configurable retention periods
- Uptime Visualization: Interactive charts showing website availability
- Performance Metrics: Response time analysis and trends
- Data Export: JSON-based logging for external analysis
- SMTP Integration: Uses SMTP2GO for reliable email delivery
- User Notifications: Automated contact form handling
- Admin Alerts: System status and user interaction notifications
- Python 3.10 or higher (the Gradio 5.x SDK referenced in the Deployment section does not support older Python versions)
- OpenAI API key
- SMTP2GO account (for email functionality)
- Modern web browser
git clone https://github.com/yourusername/vMeNext.git
cd vMeNext
pip install -r requirements.txt
playwright install chromium
Note: The application automatically installs Playwright Chromium on startup, but you can also install it manually for faster startup times.
- Copy the environment template:
cp .env.example .env
- Edit .env with your configuration (see the Configuration section)
- Validate your setup:
python check_env.py
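The internals of check_env.py are not shown in this README. As a rough sketch, assuming the script simply checks that required variables are present and non-empty, the core of such a validator might look like the following. The function name `missing_vars` and the choice of which variables count as required are illustrative assumptions.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Illustrative subset of the variables from the Configuration section;
# which ones are truly required is an assumption.
REQUIRED_VARS = ("OPENAI_API_KEY", "SMTP2GO_API_KEY", "MONITOR_URL")


def missing_vars(
    required: Iterable[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

A script built on this could print the returned names and exit with a non-zero status whenever the list is not empty.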
python main.py
The application will start a Gradio web interface accessible at http://localhost:7860
The application requires comprehensive environment configuration. All variables are validated by check_env.py.
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4 # or gpt-3.5-turbo
SMTP2GO_API_KEY=your_smtp2go_api_key
ALERT_EMAIL_TO=[email protected]
ALERT_EMAIL_FROM=[email protected]
SMTP2GO_API_URL=https://api.smtp2go.com/v3/email/send
EMAIL_TIMEOUT=30
MONITOR_URL=https://yourwebsite.com
ALLOWED_SCRAPE_DOMAIN=yourdomain.com
HTTP_TIMEOUT=10
USER_AGENT=Mozilla/5.0 (compatible; vMeNext/1.0)
LOGS_DIR=logs
DOCUMENTS_DIR=about_me
AVAILABILITY_LOG_FILE=logs/availability_log.json
CHAT_MEMORY_FILE=memory/chat_memory.json
BLOG_SUMMARY_FILE=about_me/blog_posts_summary.md
AVAILABILITY_PLOT_FILE=logs/availability_plot.png
BLOG_CREATOR_NAME=Your Name
BLOG_BASE_URL=https://yourblog.com
MAX_SCRAPE_PAGES=10
MAX_SCRAPE_POSTS=50
PLAYWRIGHT_TIMEOUT=30000
TIMEZONE=UTC
LOG_RETENTION_DAYS=30
GRADIO_TITLE=AI Blog Assistant
GRADIO_WELCOME_MESSAGE=Welcome! How can I help you today?
GRADIO_INPUT_LABEL=Your message
GRADIO_BUTTON_TEXT=Send
HF_TOKEN=hf_your_access_token_here
- Start the application:
python main.py
- Access the web interface at http://localhost:7860
- Begin chatting with the AI assistant about blog content, cybersecurity, or any configured topics
The chatbot supports several special admin commands:
- read all blog posts: Scrapes and summarizes all blog posts
- display stats: Shows website availability statistics
- reload context: Reloads document context from files
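How the chatbot recognizes these commands is not documented here. One simple approach, sketched below under the assumption that a command must match the whole message, is a lookup table keyed by the normalized phrase. `make_dispatcher` and the placeholder handlers are hypothetical names, not part of the project.

```python
from typing import Callable, Dict, Optional


def make_dispatcher(
    handlers: Dict[str, Callable[[], str]]
) -> Callable[[str], Optional[str]]:
    """Build a function that runs the handler for an exact command phrase."""
    # Normalize case and whitespace once, so lookups are forgiving.
    normalized = {" ".join(k.lower().split()): v for k, v in handlers.items()}

    def dispatch(user_input: str) -> Optional[str]:
        key = " ".join(user_input.lower().split())
        handler = normalized.get(key)
        # None signals "not a command": treat the input as ordinary chat.
        return handler() if handler else None

    return dispatch


# Placeholder handlers standing in for the real scraping/stats/reload logic.
dispatch = make_dispatcher({
    "read all blog posts": lambda: "scrape started",
    "display stats": lambda: "stats shown",
    "reload context": lambda: "context reloaded",
})
```

Returning None for unknown input lets the caller fall through to the normal AI response path.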
Place documents in the about_me/ directory:
- PDF files: Automatically processed and indexed
- DOCX files: Microsoft Word documents supported
- TXT files: Plain text files
- MD files: Markdown documents
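As an illustration of the directory-based loading described above, the sketch below reads only the formats the standard library can handle (TXT and MD). PDF and DOCX need a third-party parser, which is deliberately left out. The function name `load_text_documents` is an assumption, not the project's actual loader.

```python
from pathlib import Path
from typing import Dict

TEXT_SUFFIXES = {".txt", ".md"}  # readable with the standard library alone


def load_text_documents(directory: str) -> Dict[str, str]:
    """Read every TXT/MD file in `directory`, keyed by file name."""
    docs: Dict[str, str] = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in TEXT_SUFFIXES:
            docs[path.name] = path.read_text(encoding="utf-8", errors="replace")
        # PDF and DOCX files are skipped here; a fuller loader would pass
        # them to a dedicated parsing library.
    return docs
```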
The monitoring system runs automatically in the background:
- Checks website availability every 60 seconds
- Logs all results with timestamps
- Sends email alerts for downtime
- Generates uptime statistics and visualizations
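A single availability check of the kind listed above can be sketched with the standard library. This is not the project's availability_checker.py: the record fields (`timestamp`, `status`, `up`, `response_time_s`) and the rule that any status below 400 counts as "up" are assumptions.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Callable, Dict


def http_status(url: str, timeout: float) -> int:
    """Fetch `url` and return the HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": "vMeNext/1.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx responses still carry a status code


def check_once(
    url: str,
    timeout: float = 10.0,
    fetch: Callable[[str, float], int] = http_status,
) -> Dict[str, object]:
    """Run one availability check and return a log record."""
    started = time.monotonic()
    try:
        status = fetch(url, timeout)
        up = 200 <= status < 400
    except OSError:  # DNS failure, refused connection, timeout, ...
        status, up = None, False
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "url": url,
        "status": status,
        "up": up,
        "response_time_s": round(time.monotonic() - started, 3),
    }
```

A background thread could call `check_once` in a loop, wait 60 seconds between iterations, append each record to the availability log, and send an alert when `up` is false. The `fetch` parameter exists so the check can be tested without network access.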
Main chat function that processes user input and generates AI responses.
Parameters:
- user_input: User's message
- chat_id: Optional session identifier
Returns:
- Response text
- Table HTML (if applicable)
- Graph HTML (if applicable)
- Final chat ID
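The function's real name and signature are not given in this section. The sketch below only illustrates the documented shape: two inputs and four return values. `chat` and `ChatResult` are hypothetical names, and the echo reply stands in for the model call.

```python
import uuid
from typing import NamedTuple, Optional


class ChatResult(NamedTuple):
    response_text: str
    table_html: Optional[str]  # None when no table applies
    graph_html: Optional[str]  # None when no graph applies
    chat_id: str


def chat(user_input: str, chat_id: Optional[str] = None) -> ChatResult:
    """Placeholder chat function; a real one would call the language model."""
    final_id = chat_id or uuid.uuid4().hex  # start a new session if none given
    reply = f"(echo) {user_input}"  # stand-in for the model's answer
    return ChatResult(reply, None, None, final_id)
```

Because `ChatResult` is a tuple, callers can unpack it directly: `text, table, graph, chat_id = chat("Hello")`.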
Scrapes blog posts and generates AI summaries.
Parameters:
- base_url: Blog URL to scrape
- max_pages: Maximum pages to crawl
- max_posts: Maximum posts to process
Returns:
- Number of posts successfully summarized
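To show how the page and post limits could interact with pagination, here is a minimal sketch of the crawl loop alone. Page fetching is passed in as a function, so no browser is involved. `collect_post_urls`, `PageFetcher`, and the `allowed_domain` check (modelled on the ALLOWED_SCRAPE_DOMAIN setting) are assumptions rather than the project's actual code.

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urlparse

# A page fetcher returns (post URLs on the page, URL of the next page or None).
PageFetcher = Callable[[str], Tuple[List[str], Optional[str]]]


def collect_post_urls(
    base_url: str,
    fetch_page: PageFetcher,
    max_pages: int = 10,
    max_posts: int = 50,
    allowed_domain: Optional[str] = None,
) -> List[str]:
    """Follow pagination from `base_url`, honouring page and post limits."""
    posts: List[str] = []
    seen_pages = set()
    page_url: Optional[str] = base_url
    while page_url and len(seen_pages) < max_pages and len(posts) < max_posts:
        if page_url in seen_pages:
            break  # pagination points back to a page already visited
        host = urlparse(page_url).hostname or ""
        if allowed_domain and not (
            host == allowed_domain or host.endswith("." + allowed_domain)
        ):
            break  # never leave the allowed domain
        seen_pages.add(page_url)
        links, page_url = fetch_page(page_url)
        for link in links:
            if link not in posts and len(posts) < max_posts:
                posts.append(link)
    return posts
```

The documented function returns a count of summarized posts. With this sketch, that count would come from summarizing each collected URL and counting the successes.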
Continuously monitors website availability in background thread.
Sends admin notification when user provides contact information.
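A notification like this could be sent through the SMTP2GO HTTP endpoint named in the Configuration section. The sketch below separates building the request body from sending it. The JSON field names (`api_key`, `to`, `sender`, `subject`, `text_body`) are assumed from SMTP2GO's v3 API and should be checked against their documentation. The function names and the subject line are hypothetical.

```python
import json
import urllib.request
from typing import Dict

API_URL = "https://api.smtp2go.com/v3/email/send"  # from the Configuration section


def build_contact_alert(
    api_key: str, sender: str, recipient: str, user_email: str, note: str = ""
) -> Dict[str, object]:
    """Assemble the JSON body for an admin alert about a new contact."""
    # Field names are an assumption about SMTP2GO's send endpoint; verify them.
    return {
        "api_key": api_key,
        "to": [recipient],
        "sender": sender,
        "subject": "New contact from chatbot",
        "text_body": f"A user left contact details: {user_email}\n{note}".strip(),
    }


def send(payload: Dict[str, object], timeout: float = 30.0) -> int:
    """POST the payload and return the HTTP status code."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the payload builder separate from the network call makes the message content easy to test without sending real email.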
vMeNext/
├── main.py # Application entry point and Gradio interface
├── check_env.py # Environment validation
├── requirements.txt # Python dependencies
├── packages.txt # System dependencies (libnss3 for Playwright)
├── src/
│ ├── chatbot.py # Core AI conversation logic
│ ├── memory.py # Chat session persistence
│ ├── availability_checker.py # Website monitoring
│ ├── email.py # Email notification system
│ ├── blog_scraper.py # Blog content extraction
│ ├── document_loader.py # Document processing
│ ├── stats.py # Analytics and visualizations
│ └── utils.py # Utility functions
├── prompts/
│ └── system_prompt.txt # AI system prompt template
├── about_me/ # Document storage directory
├── logs/ # Log files and visualizations
└── memory/ # Chat session storage
- Gradio Interface: Modern web UI with real-time chat
- OpenAI Integration: GPT model integration with tool calling
- Playwright Scraping: Robust web scraping with browser automation
- Email System: SMTP2GO integration for notifications
- Monitoring System: Background thread for continuous monitoring
- Document Processing: Multi-format document loading and indexing
vMeNext can be deployed on multiple platforms. Choose the option that best fits your needs:
This application can be deployed to Hugging Face Spaces using the Gradio CLI. Follow the steps below:
- Ensure your about_me/ folder contains personalized documents (e.g., your résumé, portfolio summary)
- Remove any pre-existing README.md files inside your project directory if created by previous deployments
- Ensure all required files are in your project directory:
  - main.py (application entry point)
  - requirements.txt (Python dependencies)
  - packages.txt (system dependencies)
  - .env file with all required environment variables
- Create a Hugging Face account:
  - Visit huggingface.co and sign up or log in
- Generate an Access Token:
  - Click your avatar in the top right → "Access Tokens"
  - Click "Create New Token", name it something like gradio-deploy, and give it WRITE permissions
  - Copy the generated token
- Add the token to your .env file:
  HF_TOKEN=hf_...
- Deploy with Gradio CLI. Run from your project directory:
  uv run gradio deploy
  If Hugging Face doesn't detect your token, use:
  uv run dotenv -f ../.env run -- uv run gradio deploy
- During deployment, you'll be prompted to enter:
  - Space name: e.g., vme or vmenext
  - Script path: main.py
  - Hardware type: cpu-basic (free) or upgrade for better performance
  - Secrets: Add secrets such as OPENAI_API_KEY and SMTP2GO_API_KEY
  - Skip GitHub Actions unless you're automating CI/CD
- Access your deployed application:
  - Your application will be available at https://huggingface.co/spaces/yourusername/your-space-name
  - The deployment process will automatically handle system dependencies from packages.txt
If you prefer using the web interface instead of CLI:
- Go to huggingface.co/spaces and sign in
- Click "Create new Space"
- Fill in the space details:
  - Space name: vmenext (or your preferred name)
  - License: Choose appropriate license (e.g., MIT)
  - SDK: Gradio
  - Hardware: CPU basic (free) or upgrade for better performance
  - Visibility: Public or Private
- Connect your GitHub repository and set:
  - App file: main.py
  - SDK: gradio
  - SDK version: 5.38.0 (or latest)
- Configure Environment Variables in Settings → Variables
- Deploy: Your Space will automatically build and deploy
Gradio Cloud also provides hosting for Gradio applications.
- Gradio Account: Sign up at gradio.app
- GitHub Repository: Push your code to a GitHub repository
- Environment Variables: Configure all required environment variables in Gradio Cloud
- Prepare your repository:
  - Ensure all files are committed to your GitHub repository
  - The main.py file should be in the root directory
  - Include requirements.txt and packages.txt files
- Deploy to Gradio Cloud:
  - Go to gradio.app and sign in
  - Click "Create" → "New Space"
  - Connect your GitHub repository
  - Set the following configuration:
    - App File: main.py
    - SDK: gradio
    - SDK Version: 5.38.0 (or latest)
- Configure Environment Variables:
  - In your Gradio Space settings, add all required environment variables
  - Go to Settings → Variables and add each variable from the Configuration section
- System Dependencies:
  - The packages.txt file includes system dependencies (libnss3) required for Playwright
  - Gradio Cloud will automatically install these dependencies
# Local development run (for auto-reload, Gradio's reload mode can be started with: gradio main.py)
python main.py
The application will be accessible at http://localhost:7860
| Feature | Hugging Face Spaces | Gradio Cloud |
|---|---|---|
| Free Tier | ✅ CPU basic | ✅ Available |
| Performance | ⭐⭐⭐ Excellent | ⭐⭐ Good |
| Community | ⭐⭐⭐ Large ML community | ⭐⭐ Gradio-focused |
| Custom Domains | ❌ No | ✅ Yes |
| Private Spaces | ✅ Yes | ✅ Yes |
| Hardware Upgrades | ✅ CPU/GPU options | ✅ Available |
- Security: Use environment variables for all sensitive data
- Monitoring: Set up log rotation for the logs directory
- Backup: Regularly backup the memory and logs directories
- Updates: Keep dependencies updated for security patches
- Rate Limits: Be aware of API rate limits for OpenAI and email services
- Resource Usage: Monitor memory and CPU usage, especially for web scraping
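For the rate-limit point above, a common mitigation is to retry failed calls with exponential backoff and jitter. The helper below is a generic sketch, not something the project is known to include. `with_backoff` and its parameters are illustrative, and in practice `retry_on` would be narrowed to the specific rate-limit exception of the client library in use.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_backoff(
    call: Callable[[], T],
    retries: int = 4,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying failed attempts with exponential backoff and jitter."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads retries
    raise AssertionError("unreachable unless retries is negative")
```

The `sleep` parameter defaults to `time.sleep` and can be replaced in tests so that no real waiting occurs.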
# Check environment configuration
python check_env.py
Solution: Ensure all required variables are set in the .env file
# Reinstall browsers
playwright install chromium
Solution: Ensure Playwright browsers are properly installed
- Verify SMTP2GO API key and configuration
- Check network connectivity
- Review email timeout settings
- Verify MONITOR_URL is accessible
- Check HTTP_TIMEOUT settings
- Review logs in logs/availability_log.json
- Ensure memory/ directory exists and is writable
- Check CHAT_MEMORY_FILE path configuration
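Both checks above can be made unnecessary at the code level by creating the directory on demand and writing the file atomically. The following is a sketch of that idea, not the project's memory.py. The function names and the JSON layout of the sessions are assumptions.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def save_memory(path: str, sessions: Dict[str, Any]) -> None:
    """Write chat sessions as JSON, creating the directory if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so a crash cannot leave a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(sessions, handle, ensure_ascii=False, indent=2)
    os.replace(tmp_name, target)  # atomic rename over the old file


def load_memory(path: str) -> Dict[str, Any]:
    """Load saved sessions, or return an empty dict if none exist yet."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}
```

If the directory is not writable, `save_memory` raises a `PermissionError`, which is more informative than a silently missing file.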
- Build Failures: Check that all dependencies in requirements.txt are compatible
- Environment Variables: Ensure all required variables are set in Space settings
- Memory Issues: Consider upgrading to a higher hardware tier if running out of memory
- Timeout Issues: Increase timeout values for web scraping operations
- Playwright Issues: Verify that packages.txt includes all required system dependencies
Enable debug logging by setting:
DEBUG=true
- Availability logs: logs/availability_log.json
- Chat memory: memory/chat_memory.json
- Application logs: Check console output
We welcome contributions! Please follow these guidelines:
- Fork the repository
- Create a feature branch:
git checkout -b feature/your-feature-name
- Make your changes
- Test thoroughly
- Submit a pull request
- Follow PEP 8 guidelines
- Use type hints for function parameters and returns
- Add docstrings for all functions and classes
- Keep functions focused and modular
- Test all new features thoroughly
- Ensure environment validation passes
- Test with various document formats
- Verify email functionality
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for providing the GPT models
- Gradio for the excellent web interface framework
- Playwright for robust web scraping capabilities
- SMTP2GO for reliable email delivery
Made with ❤️ for the cybersecurity community
