@MatousMarik
Collaborator

Polished PR

Generated by `apify init`:

  • add .actor/actor.json
  • add .gitignore

Additionally:

  • add minimalistic Dockerfile
  • suppress Streamlit logs
  • update README

- Initialized by running the Apify CLI command `apify init`
- Updated .gitignore
The Dockerfile is an enhanced version of the Python template from https://github.com/apify/actor-templates/blob/master/templates/python-empty/.actor/Dockerfile

Main features:
- Python
- install Rust, which is needed to build minify-html (a dependency pulled in by the requirements)
- install the packages from requirements.txt
- update Playwright
- run Streamlit on the Apify port
- suppress Streamlit logs
- optimize the Dockerfile with a multi-stage build and a virtual environment
    - reduces the image size to 1/3 of the original
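
The "run Streamlit on the Apify port" and "suppress Streamlit logs" items can be sketched roughly as follows. This is only an illustration, not the command used in this PR: the script name `app.py`, the fallback port, and the chosen log level are assumptions. `ACTOR_WEB_SERVER_PORT` is the environment variable the Apify platform sets for the container's web server port.

```python
import os
import shlex


def build_streamlit_command(env, script="app.py"):
    """Build the argv list for launching Streamlit inside an Actor container.

    `script` is a hypothetical entry point; the real file name may differ.
    """
    # Port provided by the Apify platform; 8501 (Streamlit's default)
    # is an assumed fallback for local runs.
    port = env.get("ACTOR_WEB_SERVER_PORT", "8501")
    return [
        "streamlit", "run", script,
        "--server.port", port,
        "--server.address", "0.0.0.0",   # listen on all interfaces in the container
        "--server.headless", "true",     # do not try to open a browser
        "--logger.level", "error",       # assumed level for quieter logs
        "--browser.gatherUsageStats", "false",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where Streamlit is not installed.
    print(shlex.join(build_streamlit_command(os.environ)))
```

In a Dockerfile, the equivalent command would typically go into the final `CMD` instruction.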
@MatousMarik added the enhancement (New feature or request) label on Mar 13, 2025
@MatousMarik self-assigned this on Mar 13, 2025
@MatousMarik
Collaborator Author

MatousMarik commented Mar 13, 2025

Actorification – Deploying Web Scraping AI Agent on Apify 🚀

Hey Shubham,

We love what you’ve built with this Web Scraping AI Agent! Your work is truly impressive, and we believe Apify is a great place to run this app at scale. Apify’s platform makes it easy to deploy, manage, and scale web scraping tasks, and its Actor model, designed for serverless automation, fits this project’s goals well. (Learn more in the Actor Whitepaper.)

🔥 What’s in this PR?

This PR transforms the Web Scraping AI Agent into an Apify Actor, enabling seamless deployment on Apify’s cloud platform while keeping the local version intact.

🚀 How to Deploy on Apify?

Why Apify? If you’re looking for a scalable, serverless way to run your scraping tasks, Apify’s platform is built for exactly this. It provides managed cloud infrastructure, scheduling, and API integration, so you don’t have to manage servers.

You can now run this project as an Apify Actor in a few steps:

  1. Create an Apify account (if you don’t have one) – Sign up here.
  2. Fork this repository and push it to your GitHub account.
  3. Connect GitHub to Apify – In the Apify Console, go to Actors → Create New → Import from GitHub.
  4. Build and run the Actor – Apify handles the build, and you’ll get a URL to access the Streamlit app!

📖 Learn more about Actor Development in the Apify Docs.

🔧 What Changes Were Made?

✅ Added .actor/actor.json (Apify config).
✅ Created a minimal Dockerfile for deployment.
✅ Suppressed unnecessary Streamlit logs for cleaner output.
✅ Updated the README with Apify deployment instructions.
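
For readers unfamiliar with `.actor/actor.json`, here is a rough sketch of the kind of minimal config `apify init` generates, built as a Python dict and printed as JSON. The Actor name and version below are placeholders rather than the values in this PR, and the real file may contain additional fields.

```python
import json


def minimal_actor_config(name, version="0.1"):
    """Return a minimal Actor config as a dict (illustrative values only)."""
    return {
        "actorSpecification": 1,  # version of the actor.json specification
        "name": name,             # placeholder Actor name
        "version": version,       # MAJOR.MINOR version of the Actor
        "buildTag": "latest",     # tag applied to successful builds
    }


if __name__ == "__main__":
    # "web-scraping-ai-agent" is a hypothetical name used for illustration.
    print(json.dumps(minimal_actor_config("web-scraping-ai-agent"), indent=2))
```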


🎨 How It Looks After Deployment

Here's how the Apify Actor Console will look once the Web Scraping AI Agent is deployed and running:

1️⃣ Actor Information in Apify

[screenshot]

2️⃣ Actor Running in Apify Console

[screenshot]

3️⃣ Streamlit Web App

[screenshot]


🔗 Related PR: Alternative version of this PR


This PR makes it super easy for anyone to deploy, run, and scale this AI-powered web scraping tool on Apify with minimal effort. Looking forward to your thoughts! 🚀

@MatousMarik
Collaborator Author

@tomasjindra
What do you think about this new message? I won't create a new PR yet; it would look the same...
