
Facebook Campaign Comments Extractor

In the current digital advertising landscape, comments offer a valuable, often untapped, source of insights. The prevalent method for sentiment analysis of ad comments involves manual copying and pasting from Ads Manager, a process that is time-consuming, prone to errors, and unscalable for large campaigns.

To address this, I developed a Python script that automates comment extraction from Facebook ad campaigns using the Meta Marketing API (MAPI). This tool is designed to facilitate the analysis of user feedback and sentiment from these campaigns.

While the script currently focuses on extracting comments from Facebook ads, it can be readily adapted to support other social media platforms. My primary focus with this tool is comment extraction, rather than AI-driven sentiment analysis. Although I've included examples of how you might approach sentiment analysis, that aspect is left for your implementation.

Using the Web Interface (New!)

The tool now includes a user-friendly web interface that allows you to extract comments and download them as JSON or CSV without using the command line.

Setup

  1. Create a virtual environment and install dependencies:
python3 -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate
pip install -r requirements.txt

Running the Server

  1. Start the backend server:
python -m backend.app
  2. Open your browser and navigate to: http://localhost:5000

  3. Enter your Access Token and Campaign ID in the form.

  4. Select your desired output format (JSON or CSV).

  5. Click "Fetch Comments" and watch the progress in real-time.

  6. Download the result file when finished.


Using the Script (CLI)

Prerequisites

  • Python 3.7+
  • Access to Meta Marketing API (MAPI)
  • Valid MAPI access token
  • Campaign ID from your Facebook ad account

Installation

  1. Clone or download this repository

  2. Install required dependencies using the requirements file:

pip install -r requirements.txt

Alternatively, you can install dependencies individually:

pip install requests python-dotenv

  3. Create a .env file in the project directory:
touch .env

Configuration

Add the following environment variables to your .env file:

# Required: Your Meta Marketing API access token
MAPI_ACCESS_TOKEN=your_access_token_here

# Optional: MAPI cookies for additional authentication
MAPI_COOKIE="cookie_name1=value1; cookie_name2=value2"

Note: To obtain your MAPI access token, you need access to Meta's Business Manager with the appropriate permissions. See Meta's Marketing API documentation for details on generating an access token.
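
The .env file is read with python-dotenv (one of the installed dependencies). As a rough sketch, loading these variables presumably looks something like this; the exact code in get_comments.py may differ:

import os
from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the current working directory

MAPI_ACCESS_TOKEN = os.getenv("MAPI_ACCESS_TOKEN")
MAPI_COOKIE = os.getenv("MAPI_COOKIE", "")  # optional cookie string

if not MAPI_ACCESS_TOKEN:
    raise SystemExit("MAPI_ACCESS_TOKEN is missing from your .env file")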

Running the Script

Basic usage:

python get_comments.py <CAMPAIGN_ID>

With options:

# Enable debug mode
python get_comments.py <CAMPAIGN_ID> --debug 1

# Limit the number of ads to process
python get_comments.py <CAMPAIGN_ID> --max-ads 50

# Combine options
python get_comments.py <CAMPAIGN_ID> --debug 1 --max-ads 50

Parameters:

  • campaign_id (required): The Facebook campaign ID to extract comments from
  • --debug or -d (optional): Enable debug output for detailed logging
  • --max-ads (optional): Maximum number of ads to process from the campaign (default: 100)

Example:

python get_comments.py 1****************8 --max-ads 100
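
These options map to a standard argparse definition. The sketch below shows one plausible setup; the actual definitions in get_comments.py may differ slightly:

import argparse

parser = argparse.ArgumentParser(description="Extract comments from a Facebook ad campaign")
parser.add_argument("campaign_id", help="The Facebook campaign ID to extract comments from")
parser.add_argument("--debug", "-d", type=int, default=0, help="Enable debug output (1 to enable)")
parser.add_argument("--max-ads", type=int, default=100, help="Maximum number of ads to process")
args = parser.parse_args()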

Interactive DCO/PAC Ad Handling

When the script encounters a DCO (Dynamic Creative Optimization) or PAC (Placement Asset Customization) ad, it will pause and prompt you for input. This is because the script is not currently optimised for DCO/PAC comments; processing these ads can slow the run down significantly and may not produce the desired results.

What are DCO and PAC ads?

  • DCO (Dynamic Creative Optimization): Facebook automatically generates and tests combinations of provided assets (images, videos, text, CTAs) to find the best performing creative for each audience.
  • PAC (Placement Asset Customization): Allows advertisers to specify which creative asset should be used for each placement (e.g., Instagram Story, Facebook Feed, Messenger).

Interactive Prompt:

When a DCO/PAC ad is detected, you'll see:

⚠️  Ad 9****************2 is a DCO/PAC ad.
This script is not optimised for DCO/PAC which may slow down the process and not provide the desired results.
What would you like to do?
  1. Continue with this ad
  2. Skip this ad
  3. Continue with all DCO/PAC ads
  4. Skip all DCO/PAC ads
Enter your choice (1-4):

Options Explained:

  • Option 1 (Continue with this ad): Processes the current DCO/PAC ad and extracts its comments. You'll be prompted again if another DCO/PAC ad is encountered. Use case: selectively processing specific DCO/PAC ads (e.g., when you know this particular ad is important for your analysis).
  • Option 2 (Skip this ad): Skips the current DCO/PAC ad without extracting comments. You'll be prompted again for the next DCO/PAC ad. Use case: avoiding this specific ad while still reviewing other DCO/PAC ads individually.
  • Option 3 (Continue with all DCO/PAC ads): Processes the current ad and automatically processes all future DCO/PAC ads without prompting again. Use case: comprehensive data including all DCO/PAC ads, if you are willing to accept a longer processing time.
  • Option 4 (Skip all DCO/PAC ads): Skips the current ad and automatically skips all future DCO/PAC ads without prompting again. Recommended for most users. Use case: faster processing when you are okay with excluding DCO/PAC ads (which may have incomplete comment data anyway).

Recommendation:

For most use cases, Option 4 (Skip all DCO/PAC ads) is recommended because:

  • DCO/PAC ads can have hundreds of dynamic posts, making extraction very slow
  • There are known issues with comment extraction for these ad types
  • Regular (non-DCO/PAC) ads typically provide sufficient data for analysis

Example Session:

$ python get_comments.py 1****************8

⚠️  Ad 9****************2 is a DCO/PAC ad.
This script is not optimised for DCO/PAC which may slow down the process and not provide the desired results.
What would you like to do?
  1. Continue with this ad
  2. Skip this ad
  3. Continue with all DCO/PAC ads
  4. Skip all DCO/PAC ads
Enter your choice (1-4): 4

Skipping ad 9****************2 and all future DCO/PAC ads
Skipping DCO/PAC ad 9****************3
Skipping DCO/PAC ad 9****************4
...

In the final JSON output, skipped DCO/PAC ads will have the field "skip_ad": true.
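
Under the hood, the prompt flow boils down to remembering a "continue all" or "skip all" decision across ads. A simplified sketch of that pattern (the real script's variable and function names may differ):

def ask_dco_pac_action(ad_id, state):
    """Return True to process the DCO/PAC ad, False to skip it, remembering 'all' choices."""
    if state.get("continue_all"):
        return True
    if state.get("skip_all"):
        print(f"Skipping DCO/PAC ad {ad_id}")
        return False
    print(f"Ad {ad_id} is a DCO/PAC ad. What would you like to do?")
    choice = input("Enter your choice (1-4): ").strip()
    if choice == "3":
        state["continue_all"] = True
    elif choice == "4":
        state["skip_all"] = True
    return choice in ("1", "3")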

Output Format

The script generates a JSON file named campaign_<CAMPAIGN_ID>.json with the following structure:

{
    "campaign_id": "1****************8",
    "campaign_info": {
        "name": "Campaign Name",
        "created_time": "2024-01-15T10:30:00+0000",
        "impressions": "150000",
        "objective": "OUTCOME_SALES",
        "status": "ACTIVE",
        "start_time": "2024-01-15T00:00:00+0000",
        "stop_time": "2024-12-31T23:59:59+0000",
        "updated_time": "2024-10-30T12:00:00+0000"
    },
    "ads": [
        {
            "ad_id": "1****************6",
            "adcreative_id": "7****************2",
            "is_dynamic_ad": false,
            "is_dco_pac": false,
            "text": {
                "body": "Ad copy text here..."
            },
            "dynamic_posts_count": 0,
            "comments": [
                {
                    "message": "This is a comment",
                    "comment_count": 5,
                    "like_count": 4,
                    "is_dynamic": false,
                    "ignore_comment": false,
                    "ignore_reason": "",
                    "comment_score": 13
                }
            ]
        }
    ]
}

Key Fields:

  • campaign_info: Metadata about the campaign
  • ads: Array of ads with their comments
    • ad_id: Unique ad identifier
    • is_dynamic_ad: Whether the ad uses dynamic creative
    • is_dco_pac: Whether the ad uses Dynamic Creative Optimization or Placement Asset Customization
    • skip_ad: Whether the ad was skipped during processing
    • text: Ad creative text (body, descriptions, titles)
    • dynamic_posts_count: Number of dynamic posts for this ad
    • comments: Array of comments with metadata
      • message: Comment text
      • comment_count: Number of replies to this comment
      • like_count: Number of likes on the comment
      • comment_score: Calculated score (like_count × 2 + comment_count)
      • ignore_comment: Whether the comment should be ignored in analysis
      • ignore_reason: Reason for ignoring (e.g., "Comment is too short")
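
As a quick sanity check of the output, the sketch below loads the generated file and prints a per-ad summary using the fields listed above (the file name is a placeholder for your actual output file):

import json

with open("campaign_<CAMPAIGN_ID>.json") as f:  # placeholder file name
    data = json.load(f)

print(f"Campaign: {data['campaign_info'].get('name')} ({data['campaign_id']})")
for ad in data["ads"]:
    if ad.get("skip_ad"):
        print(f"  Ad {ad['ad_id']}: skipped (DCO/PAC)")
        continue
    usable = [c for c in ad.get("comments", []) if not c.get("ignore_comment")]
    print(f"  Ad {ad['ad_id']}: {len(usable)} usable comments")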

Under the Hood

Here is a high-level overview of how the script extracts the comments.

Extraction Process Overview

The script follows a multi-step process to extract comments from a Facebook ad campaign:

1. Authentication & Validation

validate_mapi_access_token()
  • Validates the MAPI access token by making a test API call
  • Ensures the token has proper permissions

2. Campaign Analysis

get_campaign_info() → get_campaign_objective() → is_campaign_dynamic_ad()
  • Fetches campaign metadata (name, dates, objective, status)
  • Determines the campaign objective (e.g., PRODUCT_CATALOG_SALES, CONVERSIONS)
  • Checks if the campaign uses dynamic ads based on:
    • Campaign objective
    • Presence of product catalog ID in promoted_object
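
For orientation, fetching this metadata is a single Graph API read on the campaign node. A hedged sketch (the API version and field list are illustrative, not necessarily what the script requests):

import os
import requests

campaign_id = "1234567890"  # placeholder campaign ID
fields = "name,objective,status,created_time,start_time,stop_time,updated_time,promoted_object"

resp = requests.get(
    f"https://graph.facebook.com/v21.0/{campaign_id}",
    params={"fields": fields, "access_token": os.getenv("MAPI_ACCESS_TOKEN")},
)
resp.raise_for_status()
campaign_info = resp.json()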

3. Adset Analysis (if not clearly dynamic)

get_adsets_data() → is_adset_dynamic_ad()
  • If campaign-level analysis is inconclusive, checks adset level
  • Retrieves top 20 adsets by impressions (last 365 days)
  • Examines adset promoted_object for dynamic ad indicators

4. Ad Retrieval

get_ads_data_from_campaing()
  • Fetches ads from the campaign using the Insights API
  • Filters ads by minimum impressions (default: 1000)
  • Sorts by impressions descending
  • Limits to max_ads parameter
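
The filtering, sorting, and capping described above is plain list handling once the Insights rows are in hand. A small sketch, assuming each row carries ad_id and impressions:

insights_rows = [  # assumed example shape of Insights API rows
    {"ad_id": "111", "impressions": "2500"},
    {"ad_id": "222", "impressions": "400"},
    {"ad_id": "333", "impressions": "98000"},
]

MIN_IMPRESSIONS = 1000  # default minimum impressions mentioned above
MAX_ADS = 100           # --max-ads default

ads = [row for row in insights_rows if int(row.get("impressions", 0)) >= MIN_IMPRESSIONS]
ads.sort(key=lambda row: int(row["impressions"]), reverse=True)
ads = ads[:MAX_ADS]
print([row["ad_id"] for row in ads])  # ['333', '111']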

5. Creative Extraction

get_ads_creatives()
  • Retrieves ad creative data in batches (25 ads per API call)
  • Extracts:
    • effective_object_story_id: Main post ID
    • effective_instagram_story_id: Main Instagram post ID
    • product_set_id: For catalog-based ads
    • object_story_spec: Template data for dynamic ads
    • asset_feed_spec: DCO/PAC configuration
    • creative_sourcing_spec: Product set associations
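
The batching itself is simple chunking of the ad ID list, 25 IDs per request. A sketch of the pattern (the actual request parameters are omitted):

BATCH_SIZE = 25

def chunked(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

ad_ids = ["1111", "2222", "3333"]  # example ad IDs from the previous step
for batch in chunked(ad_ids, BATCH_SIZE):
    # one Marketing API request per batch, asking for the creative fields listed above
    pass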

6. Dynamic Post Discovery

get_dynamic_posts_for_ad_creatives()
  • For each ad, determines if it's dynamic using is_ad_dynamic()
  • Checks for:
    • Product set ID
    • Template data in object_story_spec
    • Optimization type in asset_feed_spec (REGULAR, PLACEMENT, LANGUAGE, ASSET_CUSTOMIZATION)
    • Associated product set in creative_sourcing_spec
  • For DCO/PAC ads, prompts user to continue or skip
  • Fetches dynamic posts via pagination (up to MAX_DYNAMIC_POSTS_PER_AD = 10)
  • Enforces limit of MAX_DYNAMIC_POSTS_PER_CAMPAIGN = 50

Note that catalog-ad campaigns can have many more dynamic posts per ad, so you may want to increase MAX_DYNAMIC_POSTS_PER_AD and MAX_DYNAMIC_POSTS_PER_CAMPAIGN for those campaigns.

7. Comment Extraction

get_comments_for_ads() → get_ad_comments_from_posts()
  • For each ad, collects all post IDs:
    • Main post (effective_object_story_id)
    • Dynamic posts (if any)
  • Fetches comments for each post with pagination
  • Retrieves comment metadata:
    • Message text
    • Comment count (replies)
    • Like count
  • Tags comments from dynamic posts with is_dynamic: true
  • Enforces MAX_COMMENTS_PER_CAMPAIGN = 10000 limit
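
Comment fetching follows the standard Graph API cursor-pagination pattern. A hedged sketch of fetching one post's comments (field names follow the Graph API comments edge; limits are simplified):

import os
import requests

def fetch_post_comments(post_id, access_token, max_comments=10000):
    """Collect comments for one post, following paging.next until exhausted or capped."""
    url = f"https://graph.facebook.com/v21.0/{post_id}/comments"
    params = {"fields": "message,like_count,comment_count", "limit": 100,
              "access_token": access_token}
    comments = []
    while url and len(comments) < max_comments:
        resp = requests.get(url, params=params)
        resp.raise_for_status()
        payload = resp.json()
        comments.extend(payload.get("data", []))
        url = payload.get("paging", {}).get("next")  # full URL for the next page, if any
        params = {}  # the "next" URL already contains the query parameters
    return comments

# comments = fetch_post_comments("<POST_ID>", os.getenv("MAPI_ACCESS_TOKEN"))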

8. Comment Analysis

analyse_all_comments()
  • Calculates comment_score = (like_count × 2) + comment_count
  • Filters out short comments (<4 characters) unless they're only emojis
  • Sorts comments by score (highest engagement first)
  • Sets ignore flags for low-quality comments
  • Cleans up unnecessary fields
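
In code, this step amounts to roughly the following sketch; is_only_emoji here is a simple stand-in for the script's emoji check:

def is_only_emoji(text):
    """Rough stand-in for the script's emoji check: non-empty and no alphanumeric characters."""
    return bool(text.strip()) and not any(ch.isalnum() for ch in text)

def analyse_comments(comments):
    for c in comments:
        c["comment_score"] = (c.get("like_count", 0) * 2) + c.get("comment_count", 0)
        message = c.get("message", "").strip()
        if len(message) < 4 and not is_only_emoji(message):
            c["ignore_comment"] = True
            c["ignore_reason"] = "Comment is too short"
        else:
            c["ignore_comment"] = False
            c["ignore_reason"] = ""
    comments.sort(key=lambda c: c["comment_score"], reverse=True)
    return comments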

9. Report Generation & Save

generate_report() → save_data_to_file()
  • Generates console report with statistics
  • Saves complete data to JSON file

Dynamic Ads Detection

The script uses a multi-layered approach to detect dynamic ads:

  1. Campaign Level: Checks campaign objective and promoted_object
  2. Adset Level: Examines adset promoted_object for catalog fields
  3. Ad Level: Analyzes creative data for:
    • DCO (Dynamic Creative Optimization)
    • PAC (Placement Asset Customization)
    • DLO (Dynamic Language Optimization)
    • Catalog-based dynamic ads

Comment Filtering

Comments are filtered based on:

  • Length: Comments shorter than 4 characters are ignored (unless emoji-only)
  • Quality scoring: Comments are ranked by engagement (likes × 2 + replies)
  • Source tracking: Comments are tagged as dynamic or static

Developer Section

Code Structure

The script is organized into logical sections:

├── Configuration & Constants
│   ├── MAPI base URL and access token
│   ├── MAX_DYNAMIC_POSTS_PER_CAMPAIGN = 50
│   ├── MAX_DYNAMIC_POSTS_PER_AD = 10
│   └── MAX_COMMENTS_PER_CAMPAIGN = 10000
│
├── Utility Functions
│   ├── get_mapi_cookies() - Parse cookie string
│   ├── call_mapi_api() - HTTP request wrapper with retry logic
│   ├── debug_print() - Colored debug output
│   └── is_only_emoji() - Emoji detection
│
├── API Interaction Layer
│   ├── validate_mapi_access_token()
│   ├── get_campaign_info()
│   ├── get_campaign_objective()
│   ├── get_adsets_data()
│   └── get_insights_time_range()
│
├── Dynamic Ad Detection
│   ├── is_campaign_dynamic_ad()
│   ├── is_adset_dynamic_ad()
│   └── is_ad_dynamic()
│
├── Data Extraction
│   ├── get_ads_data_from_campaing()
│   ├── get_ads_creatives()
│   ├── get_creatives_from_response()
│   ├── get_dynamic_posts_for_ad_creatives()
│   └── get_ad_comments_from_posts()
│
├── Processing & Analysis
│   ├── get_comments_for_ads()
│   ├── analyse_all_comments()
│   └── get_ad_text()
│
└── Output & Reporting
    ├── generate_report()
    └── save_data_to_file()

Main Methods

call_mapi_api(url, params, with_access_token, max_retries, wait_seconds)

Core API wrapper with retry logic for rate limiting and server errors.

Features:

  • Automatic retry on 429, 500, 502, 503, 504 status codes
  • Cookie-based authentication support
  • Configurable retry attempts and wait time
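
A minimal sketch of such a retry wrapper (the real call_mapi_api also handles cookies and access-token injection):

import time
import requests

RETRYABLE_STATUS = {429, 500, 502, 503, 504}

def call_api_with_retry(url, params, max_retries=3, wait_seconds=5):
    """GET the URL, retrying on rate-limit and transient server errors."""
    for attempt in range(max_retries + 1):
        resp = requests.get(url, params=params)
        if resp.status_code not in RETRYABLE_STATUS:
            resp.raise_for_status()
            return resp.json()
        if attempt < max_retries:
            time.sleep(wait_seconds)  # back off before the next attempt
    resp.raise_for_status()  # out of retries: surface the last error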

get_dynamic_posts_for_ad_creatives(ad_creatives, campaign_likely_dynamic_ad)

Critical method for handling dynamic ads. Iterates through ads and fetches dynamic posts with pagination.

Flow:

  1. Check if ad is dynamic
  2. Prompt user for DCO/PAC ads (can slow down process)
  3. Paginate through dynamic posts
  4. Enforce per-ad and per-campaign limits
  5. Add dynamic posts to ad_creatives structure

get_ad_comments_from_posts(ad_id, posts, total_comments)

Fetches comments for a list of posts with pagination.

Features:

  • Handles multiple posts per ad
  • Pagination support with cursor-based paging
  • Tags comments from dynamic posts
  • Enforces MAX_COMMENTS_PER_CAMPAIGN limit

analyse_all_comments(ads_creatives)

Post-processes comments and cleans up data structure.

Operations:

  1. Calculates comment_score for ranking
  2. Filters short/low-quality comments
  3. Extracts ad text to separate field
  4. Removes unnecessary API response fields
  5. Sorts comments by engagement score

LLM Integration

This project mainly focuses on extracting the comments, so you can apply your own flavor of prompts to analyse them. I might add sentiment analysis code in the future, perhaps to handle aggregation when there are many comments, but for now that part is up to you to implement.

Converting to CSV for LLM Analysis

To make things a bit easier when you don't want to invest in implementing an LLM script to automatically analyse the comments, I have also created a script, comments_to_csv.py, that converts the JSON output to a simplified CSV format. It also prints a prompt that you can copy and paste into any AI platform you like; upload the CSV file alongside it to get an instant sentiment analysis.

Basic Usage

python3 comments_to_csv.py campaign_<CAMPAIGN_ID>.json

This will:

  1. Sort all comments by comment_score (highest engagement first)
  2. Create a CSV file with only essential fields (comment, comment_score, ad_id)
  3. Filter out ignored comments
  4. Print a suggested prompt for sentiment analysis
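
The conversion itself is small. Here is a sketch of roughly what comments_to_csv.py does (simplified; the --max-comments handling and printed prompt are omitted):

import csv
import json

with open("campaign_<CAMPAIGN_ID>.json") as f:  # placeholder file name
    data = json.load(f)

rows = []
for ad in data["ads"]:
    for c in ad.get("comments", []):
        if not c.get("ignore_comment"):
            rows.append({"comment": c["message"],
                         "comment_score": c["comment_score"],
                         "ad_id": ad["ad_id"]})

rows.sort(key=lambda r: r["comment_score"], reverse=True)

with open("campaign_<CAMPAIGN_ID>.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["comment", "comment_score", "ad_id"])
    writer.writeheader()
    writer.writerows(rows)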

Advanced Usage - Limiting Comments

If you have a large number of comments and want to focus on the most engaging ones, you can limit the number of comments extracted:

# Extract only the top 100 most engaging comments
python3 comments_to_csv.py campaign_<CAMPAIGN_ID>.json --max-comments 100

# Extract only the top 50 most engaging comments
python3 comments_to_csv.py campaign_<CAMPAIGN_ID>.json --max-comments 50

Parameters:

  • json_file (required): The JSON file generated by get_comments.py
  • --max-comments (optional): Maximum number of comments to extract, sorted by engagement score (highest first)

Why limit comments?

  • LLM token limits: Many LLMs have context limits that may not handle thousands of comments
  • Cost optimization: Reducing input tokens can lower API costs for paid LLMs
  • Focus on quality: High-engagement comments often contain more valuable insights
  • Faster analysis: Smaller datasets process quicker

The script outputs a ready-to-use prompt that you can copy and paste into ChatGPT, Claude, Meta AI, or any other LLM interface. Simply upload the generated CSV file along with the prompt.

Example:

$ python3 comments_to_csv.py campaign_1234567890.json --max-comments 100

Loading JSON file: campaign_1234567890.json
Extracting comments...
Found 215 non-ignored comments
Comments sorted by comment_score (highest first)
Limited to top 100 comments (from 215 total)
Writing to CSV: campaign_1234567890.csv

✅ CSV file created successfully: campaign_1234567890.csv

================================================================================
SUGGESTED PROMPT FOR LLM ANALYSIS
================================================================================

[Detailed prompt text will be displayed here]

================================================================================

📋 Copy the prompt above and paste it into ChatGPT or Meta AI
📎 Then upload the CSV file: campaign_1234567890.csv
================================================================================

Using Output for Sentiment Analysis

The JSON output from this script is perfectly structured for LLM-based sentiment analysis. Each comment includes engagement metrics that can help prioritize which comments to analyze first.

Example Code

Batch Analysis with Aggregation

Here is example code that analyzes the output file with a local Llama model running via Ollama.

import requests
import json
def analyze_campaign_sentiment(campaign_data):
    """Analyze all comments for a campaign using Ollama Llama model"""
    # Collect all comments with metadata
    all_comments = []
    for ad in campaign_data["ads"]:
        if "comments" not in ad or ad.get("skip_ad", False):
            continue
        for comment in ad["comments"]:
            if not comment["ignore_comment"]:
                all_comments.append({
                    "ad_id": ad["ad_id"],
                    "message": comment["message"],
                    "score": comment["comment_score"],
                    "is_dynamic": comment["is_dynamic"]
                })
    # Sort by engagement score
    all_comments.sort(key=lambda x: x["score"], reverse=True)
    # Analyze top 50 comments
    top_comments = all_comments[:50]
    comments_text = "\n".join([
        f"[Score: {c['score']}] {c['message']}"
        for c in top_comments
    ])
    prompt = f"""Analyze these {len(top_comments)} comments from a Facebook ad campaign.
Comments:
{comments_text}
Please provide:
1. Overall sentiment distribution (positive/negative/neutral percentages)
2. Top 5 recurring themes or topics
3. Key concerns or complaints
4. Positive feedback highlights
5. Actionable recommendations for the advertiser
Respond in JSON format."""
    # Call the local Ollama API (non-streaming so the reply arrives as a single JSON object)
    ollama_url = "http://localhost:11434/api/chat"  # Change if using remote Ollama
    payload = {
        "model": "llama3",  # Or "llama2", depending on your Ollama setup
        "messages": [
            {"role": "user", "content": prompt}
        ],
        "stream": False  # Without this, Ollama streams newline-delimited JSON chunks
    }
    response = requests.post(ollama_url, json=payload)
    response.raise_for_status()
    result = response.json()
    # Extract and parse the JSON response from the model
    try:
        # Ollama returns the model's reply in result['message']['content']
        return json.loads(result['message']['content'])
    except (KeyError, json.JSONDecodeError) as e:
        print("Failed to parse Ollama response:", e)
        print("Raw response:", result)
        return None
# Load the JSON file produced by get_comments.py, then run the analysis
with open("campaign_<CAMPAIGN_ID>.json") as f:
    campaign_data = json.load(f)

results = analyze_campaign_sentiment(campaign_data)
print(json.dumps(results, indent=2))

License

This tool is for demonstration/educational use. Ensure compliance with Meta's API terms of service and data privacy regulations when using this script.
