tele-notify

License: AGPL v3 Python Telethon

A Python script that monitors specified Telegram chats and forwards messages matching your custom filters to another chat. It also supports OCR on attached images and avoids forwarding duplicate messages.

⚠️ Disclaimer: This project is a custom-built solution for a specific problem I encountered, designed solely to meet my personal needs. It is not intended for high-volume use or scenarios that might approach API rate limits. Features outside my requirements have not been implemented, so you may need to adapt or modify the code to fit your own use case.

✨ Features

  • Keyword-based filtering: Matches at least one keyword from each of three keyword categories.
  • Image text recognition: Downloads attached images and extracts text using OCR (pytesseract).
  • Duplicate prevention: Checks your last 10 forwarded messages before sending a new one.
  • Customizable filters: Store your keywords and Telegram API keys in a JSON file.
  • Multi-chat monitoring: Watch multiple Telegram chats at once.
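
As a rough illustration of the image text recognition feature, the sketch below joins a message's body with text read from an attached image. It assumes pytesseract and Pillow are installed together with the Tesseract English and Arabic language data. The names `ocr_image` and `collect_text` are hypothetical and are not taken from main.py.

```python
from typing import Callable, Optional


def ocr_image(image_path: str, languages: str = "eng+ara") -> str:
    """Read text from an image file with Tesseract (needs pytesseract + Pillow)."""
    # Imported lazily so the rest of the module loads without the OCR stack.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=languages)


def collect_text(
    body: Optional[str],
    image_path: Optional[str] = None,
    ocr: Callable[[str], str] = ocr_image,
) -> str:
    """Join the message body and any OCR text into one searchable string."""
    parts = [body or ""]
    if image_path is not None:
        parts.append(ocr(image_path))
    # Drop empty pieces and trim surrounding whitespace.
    return "\n".join(part.strip() for part in parts if part and part.strip())
```

Passing the OCR function as a parameter keeps the keyword logic testable without Tesseract installed.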

🎯 Version Guide

On October 7th, 2025, the project reached a divergence point: I had to choose between two paths. One was specificity, at the expense of ease of repurposing; the other was generality, at the expense of reliability for my current use case.

In the end, I chose both. I wanted the strengths of each approach, so now this project provides two distinct versions, each tailored to different needs:

v2 (Current) - Specialized Job Filter

Best for: Filtering job postings with intelligent level detection

  • Two-stage filtering with inference logic
  • Entry-level vs. mid-level classification
  • Experience, certification, and responsibility pattern matching
  • Optimized for English/Arabic job market terminology
  • Trade-off: Highly tailored to my own use case. Adapting it to a different job search takes only a few adjustments, but repurposing it for anything other than job filtering requires significant modification.
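
The two-stage idea can be sketched roughly as follows. This is a simplified illustration, not the code in main.py: the keyword lists, the regular expressions, and the function name `matches_level` are all hypothetical, and the real patterns are likely far more extensive.

```python
import re
from typing import Optional

# Hypothetical stage 1 keywords, one list per target level.
EXPLICIT_KEYWORDS = {
    "entry": ["intern", "trainee", "fresh graduate"],
    "mid": ["mid-level", "intermediate"],
}

# Hypothetical stage 2 patterns, one per requirement type.
REQUIREMENT_PATTERNS = {
    "experience": re.compile(r"\b\d+\+?\s*years?\b", re.I),
    "certification": re.compile(r"\bcertifi(ed|cation)\b", re.I),
    "responsibility": re.compile(r"\b(lead|manag|supervis)\w*", re.I),
}

# Requirement types that rule a post out at stage 2, per target level.
BLOCKERS = {
    "entry": ("experience", "certification"),
    "mid": ("experience", "certification", "responsibility"),
}


def matches_level(text: str, target: str) -> Optional[str]:
    """Return the stage that matched ('stage1' or 'stage2'), or None."""
    lowered = text.lower()
    # Stage 1: an explicit level keyword is enough on its own.
    if any(keyword in lowered for keyword in EXPLICIT_KEYWORDS[target]):
        return "stage1"
    # Stage 2: infer a match from the absence of blocking requirements.
    if not any(REQUIREMENT_PATTERNS[name].search(text) for name in BLOCKERS[target]):
        return "stage2"
    return None
```

Returning the stage name rather than a plain boolean makes it easy to log which stage produced a match.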

v1 (Legacy) - General-ish Message Filter

Best for: Simple keyword-based filtering for any content type

  • Straightforward AND logic (level + role + location)
  • Easy to repurpose for different domains (e.g., real estate, events, products)
  • Minimal configuration required
  • Trade-off: Less intelligent; may miss nuanced matches
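
A minimal sketch of the AND logic, assuming keywords are grouped by category in a dictionary. The function name `matches_all_categories` and the sample keywords are hypothetical; the actual structure in main_simple.py may differ.

```python
from typing import Dict, List


def matches_all_categories(text: str, keywords: Dict[str, List[str]]) -> bool:
    """True when the text contains at least one keyword from every category."""
    lowered = text.casefold()
    return all(
        any(keyword.casefold() in lowered for keyword in category_keywords)
        for category_keywords in keywords.values()
    )


# Hypothetical keyword lists for the job-posting use case.
JOB_KEYWORDS = {
    "level": ["intern", "junior"],
    "role": ["electrical engineer"],
    "location": ["basrah", "البصرة"],
}
```

Repurposing for another domain then amounts to swapping the categories, for example property type, price range, and district for real estate listings.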

📁 Files:

  • main.py - Current specialized version (v2)
  • main_simple.py - Original general-purpose version (v1)

💡 Which should you use?

  • Filtering job postings specifically? → Use v2
  • Need a simple keyword filter for other content? → Use v1
  • Want to build something custom? → Start with v1 as a template

🔧 Customization Difficulty

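| Feature                        | v1 (Simple) | v2 (Specialized) |
|--------------------------------|-------------|------------------|
| Add new keywords               | Easy        | Easy             |
| Change filter logic            | Moderate    | Complex          |
| Repurpose for different domain | Moderate    | Very Complex     |
| Add new languages              | Moderate    | Challenging      |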

📦 Requirements

  • Python 3.8+
  • Telegram API credentials (API ID and API Hash; see the Telethon documentation for how to obtain them)
  • tesseract-ocr for OCR functionality

📚 Dependencies

Python libraries used:

telethon
pillow
pytesseract
scikit-learn
cryptg
tenacity
Beautiful Soup
lxml

⚙️ Installation

  1. Clone the repository:

    git clone https://github.com/5wHN28Dg/tele-notify.git
    cd tele-notify
  2. Create a virtual environment:

    python3 -m venv venv
    source venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Install Tesseract OCR:

    • Ubuntu/Debian:

      sudo apt install tesseract-ocr tesseract-ocr-eng tesseract-ocr-ara
    • Windows: Download installer

    • macOS:

      brew install tesseract tesseract-lang
    • Note: If you encounter any issues or difficulties with Tesseract installation, refer to the official documentation or community forums.

📱 Android Users: lost in dependency hell (coming soon!).

🛠 Configuration

  1. Fill in your API ID and API Hash, then run this code to list your chats with their names and IDs:
    from telethon import TelegramClient
    
    # Replace both placeholders with your own credentials.
    api_id = YOUR_API_ID
    api_hash = 'YOUR_API_HASH'
    
    client = TelegramClient('session_name', api_id, api_hash)
    
    async def main():
        # Print every dialog (chat, group, or channel) as "<id>: <title>".
        async for dialog in client.iter_dialogs():
            print('{:>14}: {}'.format(dialog.id, dialog.title))
    
    # Entering the client context connects and, on first run, prompts for login.
    with client:
        client.loop.run_until_complete(main())
  2. Open the config.json file in the project directory and fill in:
  • Your API ID and API Hash.
  • The IDs of the chats you want to watch.
  • The ID of the chat you want to forward messages to.
  • The keywords you want to filter messages by.

Note: Do not edit recent_messages; the script updates it itself.
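
To catch a half-filled config before the script starts, a small check along these lines may help. The key names below are purely hypothetical placeholders (only recent_messages is named in this README), so compare them with the config.json shipped in the repository and adjust `REQUIRED_KEYS` accordingly.

```python
import json
from pathlib import Path

# Hypothetical key names; replace them with the ones used in the real config.json.
REQUIRED_KEYS = ("api_id", "api_hash", "source_chats", "target_chat", "keywords")


def load_config(path: str = "config.json") -> dict:
    """Load the JSON config and fail early if an expected key is absent."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("config is missing: " + ", ".join(missing))
    return config
```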

🚀 Usage

Run the script:

python main.py

The script will:

  1. Ask you to log in as a user by entering your phone number and login code.
  2. Start watching the specified Telegram chats.
  3. Process any unread messages, then keep watching for new ones. For each message, it will:
    • Extract text from the message body and image (if present).
    • Check for required keywords.
    • Skip if it’s a duplicate of one of your last 10 messages.
    • Forward it to your target chat.
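
The duplicate check in the steps above can be sketched as a fixed-size history of forwarded texts. This is an assumption-laden illustration: the class name `RecentMessages` and the 0.9 threshold are hypothetical, and the comparison uses `difflib` from the standard library. The project lists scikit-learn among its dependencies, so the real comparison may be implemented differently.

```python
from collections import deque
from difflib import SequenceMatcher


class RecentMessages:
    """Remember the last few forwarded texts and flag near-duplicates."""

    def __init__(self, size: int = 10, threshold: float = 0.9):
        self._recent = deque(maxlen=size)  # oldest entries fall off automatically
        self._threshold = threshold

    @staticmethod
    def _normalise(text: str) -> str:
        # Ignore case and whitespace differences between reposts.
        return " ".join(text.casefold().split())

    def is_duplicate(self, text: str) -> bool:
        candidate = self._normalise(text)
        return any(
            SequenceMatcher(None, candidate, seen).ratio() >= self._threshold
            for seen in self._recent
        )

    def remember(self, text: str) -> None:
        self._recent.append(self._normalise(text))
```

A caller would check `is_duplicate` before forwarding and call `remember` only after the forward succeeds.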

📝 To Do List:

  • Fix race conditions when updating recent_messages and writing to config.json.
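
One possible direction for this item, sketched under the assumption that all writers live in the same asyncio process: serialize writes with a lock and replace the file atomically. The class name `ConfigWriter` is hypothetical and nothing here reflects the current code.

```python
import asyncio
import json
import os
import tempfile


class ConfigWriter:
    """Serialize writes to a JSON file and swap it in atomically."""

    def __init__(self, path: str = "config.json"):
        self._path = os.path.abspath(path)
        self._lock = None  # created lazily, inside the running event loop

    async def save(self, config: dict) -> None:
        if self._lock is None:
            self._lock = asyncio.Lock()
        async with self._lock:  # one writer at a time within this process
            directory = os.path.dirname(self._path)
            fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
            try:
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    json.dump(config, tmp, ensure_ascii=False, indent=2)
                # Readers see either the old file or the new one, never a partial write.
                os.replace(tmp_path, self._path)
            except BaseException:
                os.unlink(tmp_path)
                raise
```

The temporary file is created in the same directory as the target so that `os.replace` stays on one filesystem.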

Documentation & Support

  • Add a FAQ section in the wiki with a table of contents.

Message Processing

  • Improve regex matching to detect messages formatted like: #Basrah www.example.com/electrical-engineering-intern/.
  • Determine whether account bans reported by telethon.client.updates are caused by the script (highly unlikely, as none of the reported chat IDs appear in the dialogs list obtained beforehand).
  • Rethink & test job level identification logic for posts without clear level markers.
    • ✅ Implemented two-stage filtering: explicit keywords (stage 1) + inference-based detection (stage 2)
    • ✅ Entry-level: Matches if no experience/certification requirements found
    • ✅ Mid-level: Matches if no experience/certification/responsibility requirements found
    • ✅ Ambiguous messages forwarded to personal chat for manual review
  • Fall back to sharing the message link for channels that have message forwarding disabled.
    • Share the message link with a brief summary (job title, location).
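
For the hashtag-plus-URL case listed above, one option is to normalise the text before keyword matching rather than extend the match patterns themselves. The sketch below takes that alternative route. The function name `searchable_text` and both regular expressions are hypothetical and deliberately simple.

```python
import re

HASHTAG = re.compile(r"#(\w+)")
# A URL with or without a scheme, followed by a path; only the path is captured.
URL_PATH = re.compile(r"(?:https?://)?(?:www\.)?[\w-]+(?:\.[\w-]+)+(/[^\s?#]*)", re.I)


def searchable_text(message: str) -> str:
    """Append hashtag words and URL slug words to the message, lower-cased."""
    extras = HASHTAG.findall(message)
    for path in URL_PATH.findall(message):
        # "/electrical-engineering-intern/" becomes "electrical engineering intern"
        extras.append(re.sub(r"[-_/]+", " ", path).strip())
    return " ".join([message, *extras]).lower()
```

With this in place, a plain substring search for "electrical engineering intern" or "basrah" would find the example message.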

Reliability & Error Handling

  • Review and improve the retry mechanism.
  • Set up crash notifications (email, webhook, or other) and autostart upon system boot. 🔄
  • Add logging for pattern match stages (which stage matched, which patterns triggered) for debugging? Maybe, we will see.
  • Switch from requests to aiohttp for fully asynchronous operation.
  • Reduce false positives. 🔄
  • Add web scraping for ambiguous job posts.

CLI & User Experience

  • Create a modern CLI with real-time statistics instead of plain logs:

    • Show progress bar for unread message processing.
    • Display processed message counts per chat and overall (over a time period).
    • Display forwarded message counts per chat and overall (over a time period).
    • Show breakdown of matches by stage (stage 1 vs. stage 2 inference)?
    • Display count of ambiguous messages forwarded for manual review.
    • Highlight important events (account bans, connection issues, etc.).

Code Quality

  • Analyze the codebase for a possible second refactoring.

📜 License

This project is licensed under the AGPL v3.
