Manivenkat3612/BeyondChats-Assignment-

Reddit Persona Extractor

Introduction

Reddit Persona Extractor generates a detailed persona profile for a Reddit user by analyzing their public comments and submissions. It uses large language models and semantic search to extract demographic and psychographic traits, and backs each insight with a direct quote and a link to its source.

Key Features

  • Flexible Output Modes:
    • --mode confident (default): Only well-cited traits (with source URLs)
    • --mode all: All extracted traits, even if not confidently linked to a source
  • Extracts demographic and psychographic traits from Reddit activity
  • Uses OpenRouter/OpenAI LLMs for insight extraction
  • Links each insight to the original Reddit source

Setup Instructions

1. Clone the Repository

git clone <repo-url>
cd BeyondChats-Assignment-

2. Install Python Dependencies

It is recommended to use a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

3. Configure API Keys

Create a .env file in the project root with your OpenRouter API key:

OPENROUTER_API_KEY=your_openrouter_api_key_here

You can get an API key from https://openrouter.ai/
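
As a quick check that the key is picked up, the sketch below reads OPENROUTER_API_KEY from the environment or from the .env file using only the standard library. It is an illustration, not the project's own loader (which may well use a package such as python-dotenv); the helper names load_env_file and get_openrouter_key are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openrouter_key(path=".env"):
    """Return the key from the environment, else from .env, else raise."""
    key = os.environ.get("OPENROUTER_API_KEY") or load_env_file(path).get(
        "OPENROUTER_API_KEY"
    )
    if not key:
        raise RuntimeError("OPENROUTER_API_KEY is not set; add it to .env")
    return key
```

Calling get_openrouter_key() from the project root should return the key, or raise a clear error if the .env file is missing or incomplete.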

4. Configure Reddit API (praw)

Create a praw.ini file in the project root with your Reddit API credentials. Example:

[DEFAULT]
client_id=YOUR_CLIENT_ID
client_secret=YOUR_CLIENT_SECRET
user_agent=YOUR_USER_AGENT

See https://praw.readthedocs.io/en/stable/getting_started/configuration.html for details.
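
Before running the tool, it can help to confirm that praw.ini contains all three settings shown above. The following is a minimal sketch using the standard-library configparser; the function name missing_praw_settings is hypothetical and is not part of PRAW or of this project.

```python
import configparser

REQUIRED_KEYS = ("client_id", "client_secret", "user_agent")


def missing_praw_settings(path="praw.ini", site="DEFAULT"):
    """Return the required keys that are absent or empty in the given section."""
    # Interpolation is disabled so a '%' in user_agent is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is ignored and leaves the parser empty
    if site == "DEFAULT":
        section = parser.defaults()
    elif parser.has_section(site):
        section = parser[site]
    else:
        section = {}
    return [key for key in REQUIRED_KEYS if not section.get(key, "").strip()]
```

An empty list means every required setting is present; any names returned are the ones still to be filled in.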

Usage

To generate a persona for a Reddit user, run:

python reddit_persona.py <reddit_username_or_profile_url>
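
Because the argument can be either a bare username or a profile URL, the script has to normalize it to a username at some point. How reddit_persona.py does this is not shown here; the sketch below is one plausible approach, with extract_username as a hypothetical helper and the accepted URL shapes (/user/<name> and /u/<name>) as assumptions.

```python
import re
from urllib.parse import urlparse

# Assumed character set for usernames: letters, digits, underscore, hyphen.
_NAME = r"[A-Za-z0-9_-]+"


def extract_username(value):
    """Return the username from a bare name, 'u/name', or a profile URL."""
    value = value.strip()
    if "reddit.com" in value:
        if "://" not in value:
            value = "https://" + value  # allow URLs pasted without a scheme
        path = urlparse(value).path
        match = re.match(rf"^/(?:user|u)/({_NAME})/?", path)
        if not match:
            raise ValueError(f"not a Reddit profile URL: {value!r}")
        return match.group(1)
    name = re.sub(r"^/?u/", "", value)  # drop an optional 'u/' prefix
    if not re.fullmatch(_NAME, name):
        raise ValueError(f"not a valid Reddit username: {value!r}")
    return name
```

With this helper, spez, u/spez, and https://www.reddit.com/user/spez/ would all resolve to the same username.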

Example

python reddit_persona.py spez

This writes the extracted persona to a file such as persona_spez.txt.

Output Modes

  • --mode confident (default): Only well-cited traits (with source URLs)
  • --mode all: All extracted traits, even if not confidently linked to a source

Example:

python reddit_persona.py spez --mode all
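
The two modes can be pictured as a filter applied to the extracted traits. The sketch below shows how such a flag might be declared with argparse and applied; it is an assumption about the design, not the project's actual code. In particular, the trait structure (a dict with an optional source_url key) and the function names are hypothetical.

```python
import argparse


def build_parser():
    """Command-line interface mirroring the documented usage."""
    parser = argparse.ArgumentParser(prog="reddit_persona.py")
    parser.add_argument("user", help="Reddit username or profile URL")
    parser.add_argument(
        "--mode",
        choices=["confident", "all"],
        default="confident",
        help="confident: only traits with a source URL; all: every trait",
    )
    return parser


def select_traits(traits, mode):
    """Keep every trait in 'all' mode, otherwise only those with a source URL."""
    if mode == "all":
        return list(traits)
    return [trait for trait in traits if trait.get("source_url")]
```

Under these assumptions, a trait without a source link is dropped in the default mode and kept only when --mode all is passed.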

Notes

  • Persona quality improves with the amount of public activity: the more comments and posts a user has, the more evidence there is for each trait.
  • If you see a warning about low insight quality, try a user with more Reddit activity.

Troubleshooting

  • Ensure your .env and praw.ini files are correctly set up.
  • Make sure your API keys are valid and have sufficient quota.
