A Bright Data plugin for the Dify platform. It contains all of Bright Data's web scraping, unlocking, and dataset tools.


Idanvilenski/BrightData_Dify_Plugin


Bright Data Web Scraper Plugin for Dify

A comprehensive web scraping and data extraction plugin powered by Bright Data's enterprise-grade infrastructure with intelligent auto-detection. Supports 50+ platforms including Amazon, LinkedIn, Instagram, YouTube, and more.

[Banner image: example workflow]

🚀 Quick Start

1. Download the Plugin

Download the latest plugin package: brightdata_plugin.difypkg

2. Install in Dify

  1. Go to Dify.ai → Plugins → Add Plugin
  2. Choose Add from Local File
  3. Upload the brightdata_plugin.difypkg file

3. Setup Bright Data Account

  1. Visit Bright Data and create an account
  2. Navigate to your account settings to find your API key
  3. Copy the API key for the next step

4. Create Your First Workflow

  1. Go to Dify Studio → Workflow

  2. Add one of the Bright Data Web Scraper tools:

    • Structured Data Feeds - Extract structured data from 20+ platforms
    • Scrape As Markdown - Convert any webpage to clean markdown
    • Search Engine - Get search results from Google, Bing, Yandex
  3. Enter your Bright Data API key when prompted

  4. Connect an LLM node to process and summarize the scraped data

📋 Available Tools

🔍 Structured Data Feeds

Extract structured data from popular platforms:

  • E-commerce: Amazon, eBay, Walmart, Best Buy, Etsy, Zara
  • Social Media: Instagram, Facebook, TikTok, YouTube, X (Twitter)
  • Professional: LinkedIn profiles, companies, jobs
  • Business: Crunchbase, ZoomInfo
  • Maps & Reviews: Google Maps, booking sites
  • News: Reuters and other news sources

📄 Scrape As Markdown

Convert any webpage into clean, readable Markdown, well suited to:

  • Content analysis
  • Documentation extraction
  • Article processing

🔎 Search Engine

Get search results from major search engines:

  • Google
  • Bing
  • Yandex

💡 Example Workflow

Sample use case (see the workflow in the banner image): extract Amazon product information and create a summary.

  1. START → Input: Product URL
  2. STRUCTURED DATA FEEDS → Extract product details
  3. LLM → Summarize into easy-to-read text
  4. END → Output: Clean product summary

Important tips:

  • Reference the output of the previous stage in every stage of the workflow
  • Set a high character limit on input fields (for the URL input field, choose the "short paragraph" variable option)
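The data flow of this workflow can be sketched in plain Python to show how each stage consumes the previous stage's output. This is only an offline illustration, not how Dify executes nodes: `run_workflow`, `fake_extract`, and `fake_summarize` are hypothetical names, and the stand-in functions return made-up data instead of calling Bright Data or an LLM.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WorkflowResult:
    url: str
    product: dict
    summary: str


def run_workflow(
    url: str,
    extract: Callable[[str], dict],
    summarize: Callable[[dict], str],
) -> WorkflowResult:
    """Chain the stages so each one reads the previous stage's output."""
    product = extract(url)        # STRUCTURED DATA FEEDS stage
    summary = summarize(product)  # LLM stage consumes the extraction output
    return WorkflowResult(url=url, product=product, summary=summary)  # END


# Hypothetical stand-ins so the sketch runs without any network access.
def fake_extract(url: str) -> dict:
    return {"url": url, "title": "Example product", "price": "19.99"}


def fake_summarize(product: dict) -> str:
    return f"{product['title']} costs {product['price']}."


if __name__ == "__main__":
    result = run_workflow(
        "https://www.amazon.com/dp/EXAMPLE", fake_extract, fake_summarize
    )
    print(result.summary)
```

Passing the stages in as callables mirrors the first tip: the summarizer never sees the URL directly, only what the extraction stage produced.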

🛠️ Development

Building from Source

# Clone the repository
git clone https://github.com/idanvilenski/BrightData_Dify_Plugin.git
cd BrightData_Dify_Plugin

# Package the plugin
dify plugin package ./brightdata_plugin

# Sign the plugin (optional)
dify signature sign brightdata_plugin.difypkg -p your_key_pair.private.pem
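If the build needs to be scripted, the two commands above can be assembled as argument lists before being handed to something like `subprocess.run`. The sketch below only builds and prints the commands; `build_commands` is a hypothetical helper, and it assumes the package file is named after the plugin directory, as the signing command above suggests.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def build_commands(plugin_dir: str, private_key: Optional[str] = None) -> List[List[str]]:
    """Return argv lists for packaging and, optionally, signing the plugin."""
    # Assumption: packaging ./brightdata_plugin yields brightdata_plugin.difypkg.
    package_name = Path(plugin_dir).name + ".difypkg"
    commands = [["dify", "plugin", "package", plugin_dir]]
    if private_key is not None:
        # Signing is optional, so it is only added when a key path is given.
        commands.append(
            ["dify", "signature", "sign", package_name, "-p", private_key]
        )
    return commands


if __name__ == "__main__":
    for argv in build_commands("./brightdata_plugin", "your_key_pair.private.pem"):
        print(shlex.join(argv))
```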

Requirements

  • Python 3.11+
  • Dify Plugin SDK
  • Bright Data API access
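A quick way to check the first two requirements before building is a small preflight script. This is a sketch under assumptions: the function names are hypothetical, and `dify_plugin` is assumed to be the import name of the Dify Plugin SDK, so adjust it if your installation differs.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 11)        # from the requirements list
SDK_MODULE = "dify_plugin"  # assumed import name of the Dify Plugin SDK


def python_is_supported(version=None) -> bool:
    """Compare a (major, minor, ...) version against the minimum."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor >= MIN_PYTHON


def sdk_is_installed(module_name: str = SDK_MODULE) -> bool:
    """Report whether the module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python 3.11+:", python_is_supported())
    print("SDK importable:", sdk_is_installed())
```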

📦 Installation Options

Option 1: Local Upload (Recommended)

Download and upload the .difypkg file directly

Option 2: GitHub Integration

Install directly from the GitHub repository URL in Dify

Option 3: Dify Marketplace (Coming Soon)

The plugin will be available in the official Dify Marketplace

🔧 Configuration

API Key Setup

  1. Get your API key from Bright Data Dashboard
  2. The key should start with your zone credentials
  3. Enter the key in any Bright Data tool configuration

Supported Parameters

  • URL: Target website URL
  • Search Query: Search terms for search engines
  • Engine: Choose search engine (Google/Bing/Yandex)
  • Response Format: JSON, Markdown, or Raw data
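To make the allowed values concrete, the parameters above can be modeled as a small validated record. This is a minimal sketch, not the plugin's actual schema: the field names, the lowercase spellings, and the defaults (`google`, `json`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

ENGINES = {"google", "bing", "yandex"}
FORMATS = {"json", "markdown", "raw"}


@dataclass(frozen=True)
class ToolParams:
    url: Optional[str] = None           # target website URL
    search_query: Optional[str] = None  # search terms for search engines
    engine: str = "google"              # assumed default
    response_format: str = "json"       # assumed default

    def __post_init__(self) -> None:
        if not self.url and not self.search_query:
            raise ValueError("provide a url or a search_query")
        if self.url:
            parsed = urlparse(self.url)
            # Require an absolute http(s) URL with a host part.
            if parsed.scheme not in {"http", "https"} or not parsed.netloc:
                raise ValueError(f"not an absolute http(s) URL: {self.url!r}")
        if self.engine.lower() not in ENGINES:
            raise ValueError(f"unsupported engine: {self.engine!r}")
        if self.response_format.lower() not in FORMATS:
            raise ValueError(f"unsupported format: {self.response_format!r}")


if __name__ == "__main__":
    print(ToolParams(search_query="dify plugins", engine="Bing"))
```

Validating up front keeps a malformed URL or an unsupported engine name from reaching the scraping step, where the failure would be harder to trace.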

🎯 Use Cases

  • E-commerce Monitoring: Track product prices and availability
  • Lead Generation: Extract business information from LinkedIn
  • Content Research: Gather articles and news for analysis
  • Market Research: Monitor competitor websites and social media
  • SEO Analysis: Track search engine results and rankings

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

🏷️ Tags

web-scraping data-extraction automation api-integration dify bright-data


⭐ If this plugin helps you, please star the repository!
