
Facebook Campaign Comments Extractor

In the current digital advertising landscape, comments offer a valuable, often untapped, source of insights. The prevalent method for sentiment analysis of ad comments involves manual copying and pasting from Ads Manager, a process that is time-consuming, prone to errors, and unscalable for large campaigns.

To address this, I developed a Python script that automates comment extraction from Facebook ad campaigns using the Meta Marketing API (MAPI). This tool is designed to facilitate the analysis of user feedback and sentiment from these campaigns.

While the script currently focuses on extracting comments from Facebook ads, it can be readily adapted to support other social media platforms. My primary focus with this tool is comment extraction, rather than AI-driven sentiment analysis. Although I've included examples of how you might approach sentiment analysis, that aspect is left for your implementation.

Using the Web Interface (New!)

The tool now includes a user-friendly web interface that allows you to extract comments and download them as JSON or CSV without using the command line.

Setup

  1. Create a virtual environment and install dependencies:
python3 -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate
pip install -r requirements.txt

Running the Server

  1. Start the backend server:
python -m backend.app
  2. Open your browser and navigate to: http://localhost:5000

  3. Enter your Access Token and Campaign ID in the form.

  4. Select your desired output format (JSON or CSV).

  5. Click "Fetch Comments" and watch the progress in real-time.

  6. Download the result file when finished.


Using the Script (CLI)

Prerequisites

  • Python 3.7+
  • Access to Meta Marketing API (MAPI)
  • Valid MAPI access token
  • Campaign ID from your Facebook ad account

Installation

  1. Clone or download this repository

  2. Install required dependencies using the requirements file:

pip install -r requirements.txt

Alternatively, you can install dependencies individually:

pip install requests python-dotenv

  3. Create a .env file in the project directory:
touch .env

Configuration

Add the following environment variables to your .env file:

# Required: Your Meta Marketing API access token
MAPI_ACCESS_TOKEN=your_access_token_here

# Optional: MAPI cookies for additional authentication
MAPI_COOKIE="cookie_name1=value1; cookie_name2=value2"

Note: To obtain your MAPI access token, you need access to Meta's Business Manager with the appropriate permissions. See Meta's Marketing API documentation for details on generating an access token.
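
The .env file is read with python-dotenv (one of the installed dependencies). As a rough sketch, loading these variables presumably looks something like this; the exact code in get_comments.py may differ:

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the current working directory

MAPI_ACCESS_TOKEN = os.getenv("MAPI_ACCESS_TOKEN")
MAPI_COOKIE = os.getenv("MAPI_COOKIE", "")  # optional cookie string

if not MAPI_ACCESS_TOKEN:
    raise SystemExit("MAPI_ACCESS_TOKEN is missing from your .env file")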

Running the Script

Basic usage:

python get_comments.py <CAMPAIGN_ID>

With options:

# Enable debug mode
python get_comments.py <CAMPAIGN_ID> --debug 1

# Limit the number of ads to process
python get_comments.py <CAMPAIGN_ID> --max-ads 50

# Combine options
python get_comments.py <CAMPAIGN_ID> --debug 1 --max-ads 50

Parameters:

  • campaign_id (required): The Facebook campaign ID to extract comments from
  • --debug or -d (optional): Enable debug output for detailed logging
  • --max-ads (optional): Maximum number of ads to process from the campaign (default: 100)

Example:

python get_comments.py 1****************8 --max-ads 100
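
These options map to a standard argparse definition. The sketch below shows one plausible setup; the actual definitions in get_comments.py may differ slightly:

import argparse

parser = argparse.ArgumentParser(description="Extract comments from a Facebook ad campaign")
parser.add_argument("campaign_id", help="The Facebook campaign ID to extract comments from")
parser.add_argument("--debug", "-d", type=int, default=0, help="Enable debug output (1 to enable)")
parser.add_argument("--max-ads", type=int, default=100, help="Maximum number of ads to process")
args = parser.parse_args()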

Interactive DCO/PAC Ad Handling

When the script encounters a DCO (Dynamic Creative Optimization) or PAC (Placement Asset Customization) ad, it will pause and prompt you for input. This is because the script is not currently optimised for DCO/PAC comments; processing these ads can slow the run down significantly and may not produce the desired results.

What are DCO and PAC ads?

  • DCO (Dynamic Creative Optimization): Facebook automatically generates and tests combinations of provided assets (images, videos, text, CTAs) to find the best performing creative for each audience.
  • PAC (Placement Asset Customization): Allows advertisers to specify which creative asset should be used for each placement (e.g., Instagram Story, Facebook Feed, Messenger).

Interactive Prompt:

When a DCO/PAC ad is detected, you'll see:

⚠️  Ad 9****************2 is a DCO/PAC ad.
This script is not optimised for DCO/PAC which may slow down the process and not provide the desired results.
What would you like to do?
  1. Continue with this ad
  2. Skip this ad
  3. Continue with all DCO/PAC ads
  4. Skip all DCO/PAC ads
Enter your choice (1-4):

Options Explained:

  • Option 1 (Continue with this ad): Processes the current DCO/PAC ad and extracts its comments. You'll be prompted again if another DCO/PAC ad is encountered. Use case: selectively processing specific DCO/PAC ads (e.g., when you know this particular ad is important for your analysis).
  • Option 2 (Skip this ad): Skips the current DCO/PAC ad without extracting comments. You'll be prompted again for the next DCO/PAC ad. Use case: avoiding this specific ad while still reviewing other DCO/PAC ads individually.
  • Option 3 (Continue with all DCO/PAC ads): Processes the current ad and automatically processes all future DCO/PAC ads without prompting again. Use case: comprehensive data including all DCO/PAC ads, if you are willing to accept a longer processing time.
  • Option 4 (Skip all DCO/PAC ads): Skips the current ad and automatically skips all future DCO/PAC ads without prompting again. Recommended for most users. Use case: faster processing when you are okay with excluding DCO/PAC ads (which may have incomplete comment data anyway).

Recommendation:

For most use cases, Option 4 (Skip all DCO/PAC ads) is recommended because:

  • DCO/PAC ads can have hundreds of dynamic posts, making extraction very slow
  • There are known issues with comment extraction for these ad types
  • Regular (non-DCO/PAC) ads typically provide sufficient data for analysis

Example Session:

$ python get_comments.py 1****************8

⚠️  Ad 9****************2 is a DCO/PAC ad.
This script is not optimised for DCO/PAC which may slow down the process and not provide the desired results.
What would you like to do?
  1. Continue with this ad
  2. Skip this ad
  3. Continue with all DCO/PAC ads
  4. Skip all DCO/PAC ads
Enter your choice (1-4): 4

Skipping ad 9****************2 and all future DCO/PAC ads
Skipping DCO/PAC ad 9****************3
Skipping DCO/PAC ad 9****************4
...

In the final JSON output, skipped DCO/PAC ads will have the field "skip_ad": true.
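
Under the hood, the prompt flow boils down to remembering a "continue all" or "skip all" decision across ads. A simplified sketch of that pattern (the real script's variable and function names may differ):

def ask_dco_pac_action(ad_id, state):
    """Return True to process the DCO/PAC ad, False to skip it, remembering 'all' choices."""
    if state.get("continue_all"):
        return True
    if state.get("skip_all"):
        print(f"Skipping DCO/PAC ad {ad_id}")
        return False
    print(f"Ad {ad_id} is a DCO/PAC ad. What would you like to do?")
    choice = input("Enter your choice (1-4): ").strip()
    if choice == "3":
        state["continue_all"] = True
    elif choice == "4":
        state["skip_all"] = True
    return choice in ("1", "3")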

Output Format

The script generates a JSON file named campaign_<CAMPAIGN_ID>.json with the following structure:

{
    "campaign_id": "1****************8",
    "campaign_info": {
        "name": "Campaign Name",
        "created_time": "2024-01-15T10:30:00+0000",
        "impressions": "150000",
        "objective": "OUTCOME_SALES",
        "status": "ACTIVE",
        "start_time": "2024-01-15T00:00:00+0000",
        "stop_time": "2024-12-31T23:59:59+0000",
        "updated_time": "2024-10-30T12:00:00+0000"
    },
    "ads": [
        {
            "ad_id": "1****************6",
            "adcreative_id": "7****************2",
            "is_dynamic_ad": false,
            "is_dco_pac": false,
            "text": {
                "body": "Ad copy text here..."
            },
            "dynamic_posts_count": 0,
            "comments": [
                {
                    "message": "This is a comment",
                    "comment_count": 5,
                    "like_count": 4,
                    "is_dynamic": false,
                    "ignore_comment": false,
                    "ignore_reason": "",
                    "comment_score": 13
                }
            ]
        }
    ]
}

Key Fields:

  • campaign_info: Metadata about the campaign
  • ads: Array of ads with their comments
    • ad_id: Unique ad identifier
    • is_dynamic_ad: Whether the ad uses dynamic creative
    • is_dco_pac: Whether the ad uses Dynamic Creative Optimization or Placement Asset Customization
    • skip_ad: Whether the ad was skipped during processing
    • text: Ad creative text (body, descriptions, titles)
    • dynamic_posts_count: Number of dynamic posts for this ad
    • comments: Array of comments with metadata
      • message: Comment text
      • comment_count: Number of replies to this comment
      • like_count: Number of likes on the comment
      • comment_score: Calculated score (like_count × 2 + comment_count)
      • ignore_comment: Whether the comment should be ignored in analysis
      • ignore_reason: Reason for ignoring (e.g., "Comment is too short")
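
As a quick sanity check of the output, the sketch below loads the generated file and prints a per-ad summary using the fields listed above (the file name is a placeholder for your actual output file):

import json

with open("campaign_<CAMPAIGN_ID>.json") as f:  # placeholder file name
    data = json.load(f)

print(f"Campaign: {data['campaign_info'].get('name')} ({data['campaign_id']})")
for ad in data["ads"]:
    if ad.get("skip_ad"):
        print(f"  Ad {ad['ad_id']}: skipped (DCO/PAC)")
        continue
    usable = [c for c in ad.get("comments", []) if not c.get("ignore_comment")]
    print(f"  Ad {ad['ad_id']}: {len(usable)} usable comments")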

Under the Hood

Here is a high-level overview of how the script extracts the comments.

Extraction Process Overview

The script follows a multi-step process to extract comments from a Facebook ad campaign:

1. Authentication & Validation

validate_mapi_access_token()
  • Validates the MAPI access token by making a test API call
  • Ensures the token has proper permissions

2. Campaign Analysis

get_campaign_info() → get_campaign_objective() → is_campaign_dynamic_ad()
  • Fetches campaign metadata (name, dates, objective, status)
  • Determines the campaign objective (e.g., PRODUCT_CATALOG_SALES, CONVERSIONS)
  • Checks if the campaign uses dynamic ads based on:
    • Campaign objective
    • Presence of product catalog ID in promoted_object
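
For orientation, fetching this metadata is a single Graph API read on the campaign node. A hedged sketch (the API version and field list are illustrative, not necessarily what the script requests):

import os
import requests

campaign_id = "1234567890"  # placeholder campaign ID
fields = "name,objective,status,created_time,start_time,stop_time,updated_time,promoted_object"

resp = requests.get(
    f"https://graph.facebook.com/v21.0/{campaign_id}",
    params={"fields": fields, "access_token": os.getenv("MAPI_ACCESS_TOKEN")},
)
resp.raise_for_status()
campaign_info = resp.json()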

3. Adset Analysis (if not clearly dynamic)

get_adsets_data() → is_adset_dynamic_ad()
  • If campaign-level analysis is inconclusive, checks adset level
  • Retrieves top 20 adsets by impressions (last 365 days)
  • Examines adset promoted_object for dynamic ad indicators

4. Ad Retrieval

get_ads_data_from_campaing()
  • Fetches ads from the campaign using the Insights API
  • Filters ads by minimum impressions (default: 1000)
  • Sorts by impressions descending
  • Limits to max_ads parameter
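
The filtering, sorting, and capping described above is plain list handling once the Insights rows are in hand. A small sketch, assuming each row carries ad_id and impressions:

insights_rows = [  # assumed example shape of Insights API rows
    {"ad_id": "111", "impressions": "2500"},
    {"ad_id": "222", "impressions": "400"},
    {"ad_id": "333", "impressions": "98000"},
]

MIN_IMPRESSIONS = 1000  # default minimum impressions mentioned above
MAX_ADS = 100           # --max-ads default

ads = [row for row in insights_rows if int(row.get("impressions", 0)) >= MIN_IMPRESSIONS]
ads.sort(key=lambda row: int(row["impressions"]), reverse=True)
ads = ads[:MAX_ADS]
print([row["ad_id"] for row in ads])  # ['333', '111']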

5. Creative Extraction

get_ads_creatives()
  • Retrieves ad creative data in batches (25 ads per API call)
  • Extracts:
    • effective_object_story_id: Main post ID
    • effective_instagram_story_id: Main Instagram post ID
    • product_set_id: For catalog-based ads
    • object_story_spec: Template data for dynamic ads
    • asset_feed_spec: DCO/PAC configuration
    • creative_sourcing_spec: Product set associations
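
The batching itself is simple chunking of the ad ID list, 25 IDs per request. A sketch of the pattern (the actual request parameters are omitted):

BATCH_SIZE = 25

def chunked(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

ad_ids = ["1111", "2222", "3333"]  # example ad IDs from the previous step
for batch in chunked(ad_ids, BATCH_SIZE):
    # one Marketing API request per batch, asking for the creative fields listed above
    pass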

6. Dynamic Post Discovery

get_dynamic_posts_for_ad_creatives()
  • For each ad, determines if it's dynamic using is_ad_dynamic()
  • Checks for:
    • Product set ID
    • Template data in object_story_spec
    • Optimization type in asset_feed_spec (REGULAR, PLACEMENT, LANGUAGE, ASSET_CUSTOMIZATION)
    • Associated product set in creative_sourcing_spec
  • For DCO/PAC ads, prompts user to continue or skip
  • Fetches dynamic posts via pagination (up to MAX_DYNAMIC_POSTS_PER_AD = 10)
  • Enforces limit of MAX_DYNAMIC_POSTS_PER_CAMPAIGN = 50

Note that catalog-ad campaigns can have many more dynamic posts per ad, so you may want to increase MAX_DYNAMIC_POSTS_PER_AD and MAX_DYNAMIC_POSTS_PER_CAMPAIGN for those campaigns.

7. Comment Extraction

get_comments_for_ads() → get_ad_comments_from_posts()
  • For each ad, collects all post IDs:
    • Main post (effective_object_story_id)
    • Dynamic posts (if any)
  • Fetches comments for each post with pagination
  • Retrieves comment metadata:
    • Message text
    • Comment count (replies)
    • Like count
  • Tags comments from dynamic posts with is_dynamic: true
  • Enforces MAX_COMMENTS_PER_CAMPAIGN = 10000 limit
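
Comment fetching follows the standard Graph API cursor-pagination pattern. A hedged sketch of fetching one post's comments (field names follow the Graph API comments edge; limits are simplified):

import os
import requests

def fetch_post_comments(post_id, access_token, max_comments=10000):
    """Collect comments for one post, following paging.next until exhausted or capped."""
    url = f"https://graph.facebook.com/v21.0/{post_id}/comments"
    params = {"fields": "message,like_count,comment_count", "limit": 100,
              "access_token": access_token}
    comments = []
    while url and len(comments) < max_comments:
        resp = requests.get(url, params=params)
        resp.raise_for_status()
        payload = resp.json()
        comments.extend(payload.get("data", []))
        url = payload.get("paging", {}).get("next")  # full URL for the next page, if any
        params = {}  # the "next" URL already contains the query parameters
    return comments

# comments = fetch_post_comments("<POST_ID>", os.getenv("MAPI_ACCESS_TOKEN"))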

8. Comment Analysis

analyse_all_comments()
  • Calculates comment_score = (like_count × 2) + comment_count
  • Filters out short comments (<4 characters) unless they're only emojis
  • Sorts comments by score (highest engagement first)
  • Sets ignore flags for low-quality comments
  • Cleans up unnecessary fields
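
In code, this step amounts to roughly the following sketch; is_only_emoji here is a simple stand-in for the script's emoji check:

def is_only_emoji(text):
    """Rough stand-in for the script's emoji check: non-empty and no alphanumeric characters."""
    return bool(text.strip()) and not any(ch.isalnum() for ch in text)

def analyse_comments(comments):
    for c in comments:
        c["comment_score"] = (c.get("like_count", 0) * 2) + c.get("comment_count", 0)
        message = c.get("message", "").strip()
        if len(message) < 4 and not is_only_emoji(message):
            c["ignore_comment"] = True
            c["ignore_reason"] = "Comment is too short"
        else:
            c["ignore_comment"] = False
            c["ignore_reason"] = ""
    comments.sort(key=lambda c: c["comment_score"], reverse=True)
    return comments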

9. Report Generation & Save

generate_report() → save_data_to_file()
  • Generates console report with statistics
  • Saves complete data to JSON file

Dynamic Ads Detection

The script uses a multi-layered approach to detect dynamic ads:

  1. Campaign Level: Checks campaign objective and promoted_object
  2. Adset Level: Examines adset promoted_object for catalog fields
  3. Ad Level: Analyzes creative data for:
    • DCO (Dynamic Creative Optimization)
    • PAC (Placement Asset Customization)
    • DLO (Dynamic Language Optimization)
    • Catalog-based dynamic ads

Comment Filtering

Comments are filtered based on:

  • Length: Comments shorter than 4 characters are ignored (unless emoji-only)
  • Quality scoring: Comments are ranked by engagement (likes × 2 + replies)
  • Source tracking: Comments are tagged as dynamic or static

Developer Section

Code Structure

The script is organized into logical sections:

├── Configuration & Constants
│   ├── MAPI base URL and access token
│   ├── MAX_DYNAMIC_POSTS_PER_CAMPAIGN = 50
│   ├── MAX_DYNAMIC_POSTS_PER_AD = 10
│   └── MAX_COMMENTS_PER_CAMPAIGN = 10000
│
├── Utility Functions
│   ├── get_mapi_cookies() - Parse cookie string
│   ├── call_mapi_api() - HTTP request wrapper with retry logic
│   ├── debug_print() - Colored debug output
│   └── is_only_emoji() - Emoji detection
│
├── API Interaction Layer
│   ├── validate_mapi_access_token()
│   ├── get_campaign_info()
│   ├── get_campaign_objective()
│   ├── get_adsets_data()
│   └── get_insights_time_range()
│
├── Dynamic Ad Detection
│   ├── is_campaign_dynamic_ad()
│   ├── is_adset_dynamic_ad()
│   └── is_ad_dynamic()
│
├── Data Extraction
│   ├── get_ads_data_from_campaing()
│   ├── get_ads_creatives()
│   ├── get_creatives_from_response()
│   ├── get_dynamic_posts_for_ad_creatives()
│   └── get_ad_comments_from_posts()
│
├── Processing & Analysis
│   ├── get_comments_for_ads()
│   ├── analyse_all_comments()
│   └── get_ad_text()
│
└── Output & Reporting
    ├── generate_report()
    └── save_data_to_file()

Main Methods

call_mapi_api(url, params, with_access_token, max_retries, wait_seconds)

Core API wrapper with retry logic for rate limiting and server errors.

Features:

  • Automatic retry on 429, 500, 502, 503, 504 status codes
  • Cookie-based authentication support
  • Configurable retry attempts and wait time
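
A minimal sketch of such a retry wrapper (the real call_mapi_api also handles cookies and access-token injection):

import time
import requests

RETRYABLE_STATUS = {429, 500, 502, 503, 504}

def call_api_with_retry(url, params, max_retries=3, wait_seconds=5):
    """GET the URL, retrying on rate-limit and transient server errors."""
    for attempt in range(max_retries + 1):
        resp = requests.get(url, params=params)
        if resp.status_code not in RETRYABLE_STATUS:
            resp.raise_for_status()
            return resp.json()
        if attempt < max_retries:
            time.sleep(wait_seconds)  # back off before the next attempt
    resp.raise_for_status()  # out of retries: surface the last error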

get_dynamic_posts_for_ad_creatives(ad_creatives, campaign_likely_dynamic_ad)

Critical method for handling dynamic ads. Iterates through ads and fetches dynamic posts with pagination.

Flow:

  1. Check if ad is dynamic
  2. Prompt user for DCO/PAC ads (can slow down process)
  3. Paginate through dynamic posts
  4. Enforce per-ad and per-campaign limits
  5. Add dynamic posts to ad_creatives structure

get_ad_comments_from_posts(ad_id, posts, total_comments)

Fetches comments for a list of posts with pagination.

Features:

  • Handles multiple posts per ad
  • Pagination support with cursor-based paging
  • Tags comments from dynamic posts
  • Enforces MAX_COMMENTS_PER_CAMPAIGN limit

analyse_all_comments(ads_creatives)

Post-processes comments and cleans up data structure.

Operations:

  1. Calculates comment_score for ranking
  2. Filters short/low-quality comments
  3. Extracts ad text to separate field
  4. Removes unnecessary API response fields
  5. Sorts comments by engagement score

LLM Integration

This project mainly focuses on extracting the comments, so you can apply your own flavor of prompts to analyse them. I might add sentiment analysis code in the future, perhaps to handle aggregation when there are many comments, but for now that part is up to you to implement.

Converting to CSV for LLM Analysis

To make things a bit easier when you don't want to invest in implementing an LLM script to automatically analyse the comments, I have also created a script, comments_to_csv.py, that converts the JSON output to a simplified CSV format. It also prints a prompt that you can copy and paste into any AI platform you like; upload the CSV file alongside it to get an instant sentiment analysis.

Basic Usage

python3 comments_to_csv.py campaign_<CAMPAIGN_ID>.json

This will:

  1. Sort all comments by comment_score (highest engagement first)
  2. Create a CSV file with only essential fields (comment, comment_score, ad_id)
  3. Filter out ignored comments
  4. Print a suggested prompt for sentiment analysis
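
The conversion itself is small. Here is a sketch of roughly what comments_to_csv.py does (simplified; the --max-comments handling and printed prompt are omitted):

import csv
import json

with open("campaign_<CAMPAIGN_ID>.json") as f:  # placeholder file name
    data = json.load(f)

rows = []
for ad in data["ads"]:
    for c in ad.get("comments", []):
        if not c.get("ignore_comment"):
            rows.append({"comment": c["message"],
                         "comment_score": c["comment_score"],
                         "ad_id": ad["ad_id"]})

rows.sort(key=lambda r: r["comment_score"], reverse=True)

with open("campaign_<CAMPAIGN_ID>.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["comment", "comment_score", "ad_id"])
    writer.writeheader()
    writer.writerows(rows)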

Advanced Usage - Limiting Comments

If you have a large number of comments and want to focus on the most engaging ones, you can limit the number of comments extracted:

# Extract only the top 100 most engaging comments
python3 comments_to_csv.py campaign_<CAMPAIGN_ID>.json --max-comments 100

# Extract only the top 50 most engaging comments
python3 comments_to_csv.py campaign_<CAMPAIGN_ID>.json --max-comments 50

Parameters:

  • json_file (required): The JSON file generated by get_comments.py
  • --max-comments (optional): Maximum number of comments to extract, sorted by engagement score (highest first)

Why limit comments?

  • LLM token limits: Many LLMs have context limits that may not handle thousands of comments
  • Cost optimization: Reducing input tokens can lower API costs for paid LLMs
  • Focus on quality: High-engagement comments often contain more valuable insights
  • Faster analysis: Smaller datasets process quicker

The script outputs a ready-to-use prompt that you can copy and paste into ChatGPT, Claude, Meta AI, or any other LLM interface. Simply upload the generated CSV file along with the prompt.

Example:

$ python3 comments_to_csv.py campaign_1234567890.json --max-comments 100

Loading JSON file: campaign_1234567890.json
Extracting comments...
Found 215 non-ignored comments
Comments sorted by comment_score (highest first)
Limited to top 100 comments (from 215 total)
Writing to CSV: campaign_1234567890.csv

✅ CSV file created successfully: campaign_1234567890.csv

================================================================================
SUGGESTED PROMPT FOR LLM ANALYSIS
================================================================================

[Detailed prompt text will be displayed here]

================================================================================

📋 Copy the prompt above and paste it into ChatGPT or Meta AI
📎 Then upload the CSV file: campaign_1234567890.csv
================================================================================

Using Output for Sentiment Analysis

The JSON output from this script is perfectly structured for LLM-based sentiment analysis. Each comment includes engagement metrics that can help prioritize which comments to analyze first.

Example Code

Batch Analysis with Aggregation

Here is example code that analyzes the output file with a local Llama model running via Ollama.

import requests
import json
def analyze_campaign_sentiment(campaign_data):
    """Analyze all comments for a campaign using Ollama Llama model"""
    # Collect all comments with metadata
    all_comments = []
    for ad in campaign_data["ads"]:
        if "comments" not in ad or ad.get("skip_ad", False):
            continue
        for comment in ad["comments"]:
            if not comment["ignore_comment"]:
                all_comments.append({
                    "ad_id": ad["ad_id"],
                    "message": comment["message"],
                    "score": comment["comment_score"],
                    "is_dynamic": comment["is_dynamic"]
                })
    # Sort by engagement score
    all_comments.sort(key=lambda x: x["score"], reverse=True)
    # Analyze top 50 comments
    top_comments = all_comments[:50]
    comments_text = "\n".join([
        f"[Score: {c['score']}] {c['message']}"
        for c in top_comments
    ])
    prompt = f"""Analyze these {len(top_comments)} comments from a Facebook ad campaign.
Comments:
{comments_text}
Please provide:
1. Overall sentiment distribution (positive/negative/neutral percentages)
2. Top 5 recurring themes or topics
3. Key concerns or complaints
4. Positive feedback highlights
5. Actionable recommendations for the advertiser
Respond in JSON format."""
    # Call the local Ollama API (non-streaming so the reply arrives as a single JSON object)
    ollama_url = "http://localhost:11434/api/chat"  # Change if using remote Ollama
    payload = {
        "model": "llama3",  # Or "llama2", depending on your Ollama setup
        "messages": [
            {"role": "user", "content": prompt}
        ],
        "stream": False  # Without this, Ollama streams newline-delimited JSON chunks
    }
    response = requests.post(ollama_url, json=payload)
    response.raise_for_status()
    result = response.json()
    # Extract and parse the JSON response from the model
    try:
        # Ollama returns the model's reply in result['message']['content']
        return json.loads(result['message']['content'])
    except (KeyError, json.JSONDecodeError) as e:
        print("Failed to parse Ollama response:", e)
        print("Raw response:", result)
        return None
# Load the JSON file produced by get_comments.py, then run the analysis
with open("campaign_<CAMPAIGN_ID>.json") as f:
    campaign_data = json.load(f)

results = analyze_campaign_sentiment(campaign_data)
print(json.dumps(results, indent=2))

License

This tool is for demonstration/educational use. Ensure compliance with Meta's API terms of service and data privacy regulations when using this script.
