199-biotechnologies/claude-skill-seo-geo-optimizer

SEO/GEO/AEO Optimizer

Analysis and optimization toolkit for content visibility across search engines (SEO), AI platforms (GEO - Generative Engine Optimization), answer engines (AEO - Answer Engine Optimization), voice assistants, and social media.

Author: Boris Djordjevic, 199 Biotechnologies
Status: Phase 2 Complete (Content Implementation)
License: MIT
Repository: github.com/199-biotechnologies/claude-skill-seo-geo-optimizer


Purpose

This skill audits and optimizes content for discoverability across:

  • Search engines: Google, Bing, Brave Search (traditional SEO)
  • AI platforms: ChatGPT (Search, Atlas browser), Perplexity, Claude, Gemini, Brave Leo (GEO - Generative Engine Optimization)
  • Answer engines: Google AI Overviews (Gemini 3, default Jan 2026), Google AI Mode (separate citation surface), Bing Copilot, featured snippets (AEO - Answer Engine Optimization)
  • Voice assistants: Google Assistant, Siri, Alexa
  • Social media: Facebook, X (Twitter), LinkedIn, WhatsApp, Instagram, Bluesky, Threads, Mastodon

Supports HTML, Markdown, and React/JSX files.


What Shifted (Nov 2025 → May 2026)

  • Google AI Overviews swapped to Gemini 3 on 27 Jan 2026. ~42% of previously cited domains were replaced; ~88% of AI Overview answers now cite 3+ sources (SE Ranking).
  • Google AI Mode is a separate citation surface from AI Overviews — URL overlap ~14%. It uses query fan-out, so each H2 needs to stand alone as an answerable sub-query.
  • Google retired FAQ rich results in Search on 7 May 2026. FAQ schema still earns its place for non-Google engines and as a structural signal for AI platforms.
  • ChatGPT's 7 May 2026 update gave referral links more prominence. Similarweb measured a step-change in traffic and ~60% of it now lands on brand homepages — make the homepage answer "what does this site do" cleanly.
  • schema.org v30.0 (19 Mar 2026) added Credential and Error; deprecated Attorney in favour of LegalService.
  • Schema-as-a-GEO-lever has been demoted. An Ahrefs test of 1,885 pages found only a ~2.4% AI Mode citation lift from adding JSON-LD, which is within noise. Keep schema for entity / rich-result purposes; pair it with original data.
  • llms.txt is now a Lighthouse 13.3 "Agentic Browsing" audit, but no major AI crawler consumes it in production (as of May 2026). Useful for agent-facing developer docs; not a citation lever.

See reference/statistics-2026.md for sources and the full review pass.
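
As a rough illustration of the "each H2 needs to stand alone" point above, the sketch below splits a Markdown page into H2 sections and flags any section whose body opens with a word that usually leans on earlier context. It is not one of this skill's scripts; the function names and the opener list are assumptions made for the example.

```python
import re
from typing import List, Tuple

# Openers that usually depend on earlier context (illustrative, not exhaustive).
DANGLING_OPENERS = ("this ", "that ", "it ", "these ", "those ", "as mentioned", "as above")


def split_h2_sections(markdown: str) -> List[Tuple[str, str]]:
    """Return (heading, body) pairs for every '## ' section in a Markdown string."""
    parts = re.split(r"^##[ \t]+(.+)$", markdown, flags=re.MULTILINE)
    # parts looks like [preamble, heading1, body1, heading2, body2, ...]
    return [(parts[i].strip(), parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]


def flag_dependent_sections(markdown: str) -> List[str]:
    """List H2 headings whose section is empty or opens with a context-dependent word."""
    flagged = []
    for heading, body in split_h2_sections(markdown):
        if not body or body.lower().startswith(DANGLING_OPENERS):
            flagged.append(heading)
    return flagged
```

A flagged heading is only a prompt for a human review: a section can open with "This" and still be self-contained.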


Installation

Clone to Claude skills directory:

cd ~/.claude/skills/
git clone https://github.com/199-biotechnologies/claude-skill-seo-geo-optimizer.git

Requirements: Python 3.7+, no external dependencies (stdlib only).

Verify installation:

python ~/.claude/skills/claude-skill-seo-geo-optimizer/scripts/analyze_content.py --help

Usage

Phase 1: Analysis

Audit existing content, generate recommendations:

# Comprehensive audit with all reports
python scripts/audit_report.py ~/project/page.html --format all

# Analyze specific aspects
python scripts/analyze_content.py ~/project/page.html
python scripts/metadata_validator.py ~/project/page.html
python scripts/keyword_analyzer.py ~/project/page.html
python scripts/entity_extractor.py ~/project/page.html

Phase 2: Implementation

Apply optimizations automatically:

# Full optimization pipeline for specific platform
python scripts/auto_implementer.py page.html perplexity

# Individual optimization steps
python scripts/content_optimizer.py page.html
python scripts/platform_optimizer.py page.html chatgpt
python scripts/voice_optimizer.py page.html
python scripts/freshness_monitor.py page.html
python scripts/citation_enhancer.py page.html

Keyword Clustering

Group related keywords into semantic clusters for topical authority:

# Analyze with keyword clusters (default)
python scripts/keyword_analyzer.py ~/project/page.html

# Fast mode without clustering
python scripts/keyword_analyzer.py ~/project/page.html --no-clusters

Output includes:

  • Semantic clusters grouped by TF-IDF similarity
  • Topic identification for each cluster
  • Recommendations for pillar content strategy

IndexNow Instant Indexing

Submit URLs directly to IndexNow-enabled search engines such as Bing for near-immediate crawling and indexing (critical for GEO/AEO):

# Generate IndexNow key
python scripts/indexnow_submit.py --generate-key --output ./public

# Submit single URL
python scripts/indexnow_submit.py https://example.com/new-page --key YOUR_KEY

# Batch submit from file
python scripts/indexnow_submit.py --batch urls.txt --key YOUR_KEY

Benefits for GEO:

  • Bing index feeds AI platforms (ChatGPT, Perplexity, Claude)
  • Fresh content indexed in minutes vs weeks
  • Fresh content (<30 days old) is reliably preferred by Perplexity and the post-Gemini-3 AI Overviews
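
For readers curious about what a submission contains, here is a minimal sketch of an IndexNow request body. The endpoint and JSON field names follow the public IndexNow protocol as commonly documented; how indexnow_submit.py builds its request is not shown in this README, so the function names are hypothetical, and hosting the key file at the site root is an assumption.

```python
import json
import re
import urllib.request
from typing import List
from urllib.parse import urlparse

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_payload(urls: List[str], key: str) -> dict:
    """Build the JSON body for a batch submission (all URLs must share one host)."""
    if not re.fullmatch(r"[A-Za-z0-9-]{8,128}", key):
        raise ValueError("IndexNow keys are 8-128 characters: letters, digits, dashes")
    hosts = {urlparse(u).netloc for u in urls}
    if len(hosts) != 1:
        raise ValueError("all URLs in one submission must share a single host")
    host = hosts.pop()
    return {
        "host": host,
        "key": key,
        # Assumption: the key file is served from the site root as <key>.txt.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }


def submit(payload: dict) -> int:
    """POST the payload and return the HTTP status code (requires network access)."""
    request = urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Building the payload is kept separate from sending it so the validation can be checked offline.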

Generate Schema Markup

# FAQ schema (highest AI citation probability)
python scripts/schema_generator.py faq \
  --question "What is longevity medicine?" \
  --answer "Longevity medicine optimizes biomarkers like LDL <70 mg/dL to reduce cardiovascular risk by 30-40%."

# Article schema with E-E-A-T signals
python scripts/schema_generator.py article \
  --title "Understanding Biomarkers" \
  --author "Dr. Sarah Johnson" \
  --credentials "MD, PhD" \
  --date "2025-01-15"
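
For orientation, the sketch below shows the general shape of FAQPage JSON-LD using schema.org's FAQPage, Question, and Answer types. It is a minimal illustration, not the output of schema_generator.py; the helper name faq_jsonld is hypothetical.

```python
import json
from typing import Iterable, Tuple


def faq_jsonld(pairs: Iterable[Tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage document from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


if __name__ == "__main__":
    doc = faq_jsonld([("What is longevity medicine?", "A short, self-contained answer.")])
    # Embed the printed JSON in a <script type="application/ld+json"> tag.
    print(json.dumps(doc, indent=2))
```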

Architecture

Phase 1: Analysis (Complete)

Six Python scripts analyze content and generate recommendations:

Script                  Purpose                                  Output
analyze_content.py      Extract metadata, schema, structure      JSON analysis
metadata_validator.py   Validate meta tags, OG, Twitter          Validation report
keyword_analyzer.py     Extract keywords (5 types)               Keyword list
entity_extractor.py     Extract persons, orgs, places            Entity map
schema_generator.py     Generate JSON-LD schemas                 Schema files
audit_report.py         Orchestrate analysis, generate reports   JSON, MD, HTML

Phase 2: Implementation (Complete)

Seven Python scripts implement optimizations:

Script                  Purpose                               Impact
content_optimizer.py    Meta descriptions, FAQ, data tables   Structure optimization
platform_optimizer.py   ChatGPT, Perplexity, Claude, Gemini   Platform-specific
voice_optimizer.py      Speakable schema, featured snippets   Voice search ready
freshness_monitor.py    Content age tracking                  Strong Perplexity / AI-Mode preference when <30 days
citation_enhancer.py    Statistics, quotation opportunities   +41% (statistics) and +28% (quotations) impact
auto_implementer.py     Full pipeline orchestration           Complete automation
indexnow_submit.py      Instant search engine indexing        Minutes vs weeks
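
To make the content-age idea concrete, here is a small sketch that classifies a page by the age of its last modification. The 30-day threshold comes from this README and the 90-day one from the minimum refresh interval mentioned for Perplexity; the labels and function name are assumptions, and freshness_monitor.py may work differently.

```python
from datetime import date

FRESH_DAYS = 30  # "<30 days" threshold used throughout this README
STALE_DAYS = 90  # minimum refresh interval mentioned for Perplexity


def freshness_status(last_modified: date, today: date) -> str:
    """Classify content age as 'fresh', 'aging', or 'stale' (hypothetical labels)."""
    age_days = (today - last_modified).days
    if age_days < 0:
        raise ValueError("last_modified is in the future")
    if age_days < FRESH_DAYS:
        return "fresh"
    if age_days <= STALE_DAYS:
        return "aging"
    return "stale"
```

Passing today explicitly, rather than calling date.today() inside the function, keeps the check deterministic and easy to test.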

Phase 3: Advanced Features (Planned)

  • Competitive analysis (SERP top 10 comparison)
  • Automated monitoring (SQLite time-series tracking)
  • A/B testing framework (statistical significance)
  • Analytics integration (Google Search Console, Plausible)

Features

Content Analysis

File Types:

  • HTML (.html)
  • Markdown (.md, .mdx)
  • React/JSX (.jsx, .tsx)

Extraction:

  • Meta tags (title, description, keywords)
  • Open Graph (og:title, og:description, og:image)
  • Twitter Cards (twitter:card, twitter:title)
  • JSON-LD schema markup
  • Content structure (headings, word count, author)
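
The extraction step can be pictured with Python's standard html.parser module. The sketch below collects the title plus meta, Open Graph, and Twitter Card tags into one dictionary; it is an assumption-laden illustration (the class and function names are hypothetical), not the code in analyze_content.py.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect <title> text and <meta> name/property -> content pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph uses property="og:..."; standard and Twitter tags use name="...".
            key = attributes.get("property") or attributes.get("name")
            if key and attributes.get("content") is not None:
                self.meta[key.lower()] = attributes["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html: str) -> dict:
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "meta": parser.meta}
```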

Keyword Analysis

Five keyword types extracted:

  1. Primary: Main topic (H1, meta title, URL, first 100 words)
  2. Semantic: Related terms (H2/H3, body)
  3. LSI: Co-occurring terms (natural language)
  4. Long-tail: 3-8 word phrases (FAQ, H3)
  5. Question: Who/what/where/when/why/how (FAQ schema)

Keyword Clustering: TF-IDF + cosine similarity groups keywords into semantic clusters for topical authority and pillar content strategy.
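
A minimal stdlib-only sketch of TF-IDF plus cosine-similarity clustering follows. It treats each keyword phrase as a tiny document and uses a greedy, seed-based grouping with an arbitrary 0.3 threshold; these choices are assumptions for illustration, and keyword_analyzer.py may weight or group terms differently.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(phrases: List[str]) -> List[Dict[str, float]]:
    """Turn each phrase into a sparse TF-IDF vector (smoothed IDF)."""
    docs = [p.lower().split() for p in phrases]
    doc_freq = Counter(term for doc in docs for term in set(doc))
    n = len(docs)
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_keywords(phrases: List[str], threshold: float = 0.3) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose seed phrase is similar enough."""
    vectors = tfidf_vectors(phrases)
    clusters: List[List[int]] = []  # each cluster is a list of indices; index 0 is the seed
    for i, vector in enumerate(vectors):
        for members in clusters:
            if cosine(vector, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return [[phrases[i] for i in members] for members in clusters]
```

With an input such as ["longevity medicine", "longevity clinic", "schema markup", "faq schema"], phrases that share a term end up grouped together, which is the behavior a pillar-content plan would build on.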

Schema Generation

JSON-LD schemas with validation:

  • FAQPage: Highest AI citation probability
  • Article: E-E-A-T signals (credentials, dates)
  • HowTo: Voice search optimized (ISO 8601 durations)
  • BreadcrumbList: Site hierarchy
  • Organization/LocalBusiness: Entity recognition
  • Person: Author profiles with credentials
  • Speakable: Voice assistant optimization
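
The ISO 8601 durations mentioned for HowTo can be sketched as below. The helper converts whole minutes into the PT…H…M form and places it in a schema.org HowTo document; the function names are hypothetical and the field selection is a minimal assumption, not the generator's actual output.

```python
from typing import List


def iso8601_duration(minutes: int) -> str:
    """Format a whole number of minutes as an ISO 8601 duration, e.g. 90 -> 'PT1H30M'."""
    if minutes < 0:
        raise ValueError("duration cannot be negative")
    hours, mins = divmod(minutes, 60)
    result = "PT"
    if hours:
        result += f"{hours}H"
    if mins or not hours:
        result += f"{mins}M"
    return result


def howto_jsonld(name: str, steps: List[str], total_minutes: int) -> dict:
    """Build a minimal schema.org HowTo document with numbered steps."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "totalTime": iso8601_duration(total_minutes),
        "step": [
            {"@type": "HowToStep", "position": index, "text": text}
            for index, text in enumerate(steps, start=1)
        ],
    }
```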

Platform Optimization

ChatGPT (40-60% of LLM traffic):

  • Authority and credentials (+40% citation boost)
  • 1500-2500 words comprehensive coverage
  • Primary source citations (PubMed, arXiv)
  • Answer-first structure

Perplexity (Freshness-focused):

  • Content updated within 30 days (Perplexity / AI-Mode preference)
  • Inline citations with [1], [2] format
  • H2→H3→bullets structure (40% more citations)
  • Update frequency: every 2-3 days (aggressive) or at least every 90 days (minimum)

Claude (Accuracy-focused):

  • Primary sources only (91.2% attribution accuracy)
  • 5-8 citations with publisher and year
  • Transparent methodology
  • Acknowledged limitations

Gemini (Community-focused):

  • Google Business Profile integration
  • User reviews and testimonials
  • Local citations (NAP consistency)
  • Traditional authority signals

Grokipedia (xAI, launched Oct 2025):

  • RAG-based citations (20-30% better factual consistency)
  • Transparent version history and licensing
  • Primary source attribution (publisher + year)
  • Wikipedia-derived content requires CC-BY-SA attribution

Voice Search

  • Featured snippet optimization (30-40 words)
  • Speakable schema (20-30 second segments)
  • FAQ schema (natural language questions)
  • Question keyword extraction
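
Two of the checks in this list are simple enough to sketch directly: spotting question-style headings and testing whether an answer falls in the 30-40 word snippet range. The function names are hypothetical and the logic is a simplified assumption about what voice_optimizer.py looks for.

```python
import re

# Matches headings that start with a question word (who/what/where/when/why/how).
QUESTION_RE = re.compile(r"^(who|what|where|when|why|how)\b", re.IGNORECASE)


def is_question_heading(heading: str) -> bool:
    return bool(QUESTION_RE.match(heading.strip()))


def snippet_length_ok(answer: str, low: int = 30, high: int = 40) -> bool:
    """True when the answer's whitespace-separated word count is within [low, high]."""
    return low <= len(answer.split()) <= high
```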

Social Media Previews

  • Open Graph tags (Facebook, LinkedIn, WhatsApp, Instagram Stories)
  • Twitter Cards (summary, summary_large_image)
  • Image specifications (1200×630px optimal, 1080×1920px for Instagram Stories)
  • Instagram: Limited OG support (Stories only), bio link optimization, 85%+ mobile users
  • iMessage optimization

Output

Report Location

~/Documents/SEO_Audit_[YYYY-MM-DD]_[HHMM]/
├── audit_report.json          # Structured data
├── audit_report.md            # Markdown report
├── audit_report.html          # Visual dashboard
└── generated_schemas/         # JSON-LD files
    ├── faq_schema.json
    ├── article_schema.json
    └── howto_schema.json
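
The timestamped folder name above can be produced with a strftime-style format. This is a sketch of the naming pattern only; the helper name and the default base directory are assumptions.

```python
from datetime import datetime
from pathlib import Path


def report_dir(now: datetime, base: Path = Path.home() / "Documents") -> Path:
    """Return the report folder path in the SEO_Audit_[YYYY-MM-DD]_[HHMM] pattern."""
    return base / f"SEO_Audit_{now:%Y-%m-%d}_{now:%H%M}"
```

For example, a run at 09:05 on 7 May 2026 would map to a folder named SEO_Audit_2026-05-07_0905.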

Report Structure

  1. Executive Summary: Overall score (0-100), top issues, top wins
  2. Metadata Analysis: Meta tags, Open Graph, Twitter Cards, schema
  3. Content Structure: Headings, word count, TL;DR, FAQ, author
  4. Keyword Analysis: Primary, semantic, LSI, long-tail, question
  5. Platform Optimization: ChatGPT, Perplexity, Claude, Gemini recommendations
  6. Action Items: Prioritized by criticality (critical, high, medium, low)
  7. Generated Assets: Copy-paste ready schema markup
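
Ordering action items by criticality amounts to a sort on a fixed rank. The sketch below assumes each item is a dictionary with a "severity" field; that shape is hypothetical, since the README does not show the report's internal data structure.

```python
from typing import Dict, List

# Lower rank sorts first; the four levels are the ones named in the report structure.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def prioritize(items: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return items ordered critical -> low; ties keep their original order."""
    return sorted(items, key=lambda item: SEVERITY_RANK[item["severity"]])
```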

Reference Documentation

Located in reference/ directory:

  • citation-optimization-guide.md: AI citation strategies (+35-40% boost)
  • entity-seo-guide.md: Knowledge Graph optimization
  • platform-strategies.md: Platform-specific tactics
  • voice-search-guide.md: Voice assistant optimization (29-word answers)
  • social-preview-guide.md: Open Graph, Twitter Cards (1200×630px)
  • schema-library.md: Complete JSON-LD reference

Templates in templates/ directory:

  • meta-tags-template.html: Complete meta tag set
  • Schema templates: FAQ, Article, HowTo, Breadcrumb, Organization, Person

Industry examples in examples/ directory:

  • medical-clinic/: Healthcare optimization (15/100 → 92/100)
  • consulting-firm/: B2B entity SEO (22/100 → 89/100)
  • saas-landing-page/: LLMO (large language model optimization) (18/100 → 94/100)

Configuration

Optimization Profiles

Located in profiles/ directory:

chatgpt_authority.json:

  • Target: ChatGPT citation optimization
  • Focus: Authority, credentials, depth
  • Word count: 1500-2500
  • Expected: +40% citation probability

perplexity_fresh.json:

  • Target: Perplexity visibility
  • Focus: Freshness, inline citations, passage-level extractability
  • Update: Every 2-3 days

multi_platform.json:

  • Target: Balanced optimization
  • Platforms: ChatGPT, Perplexity, Claude, Gemini (Grokipedia optional)
  • Word count: 1200-2000
  • Update: Monthly

voice_optimized.json:

  • Target: Voice assistants
  • Focus: Featured snippets, FAQ
  • Answer length: 30-40 words
  • Schema: Speakable, FAQPage, HowTo

Performance

Script Execution Times

  • analyze_content.py: <1 second
  • metadata_validator.py: <1 second
  • keyword_analyzer.py: <2 seconds (with clustering)
  • entity_extractor.py: <1 second
  • schema_generator.py: <1 second
  • audit_report.py: 3-5 seconds
  • auto_implementer.py: 10-15 seconds (full pipeline)
  • indexnow_submit.py: 1-3 seconds (network dependent)

All scripts: Python stdlib only, no external dependencies, offline operation (except IndexNow, which requires network access).


Development Status

Phase 1: Core Functionality (Complete)

  • Content analysis and extraction
  • Schema markup generation
  • Metadata validation
  • Keyword and entity extraction
  • Multi-format report generation

Phase 2: AI Optimization (Complete)

  • Content rewriting for AI citation
  • Platform-specific optimization (ChatGPT, Perplexity, Claude, Gemini)
  • Voice search enhancement
  • Freshness monitoring
  • Citation enhancement (+41% statistics, +28% quotations)
  • Auto-implementation pipeline

Phase 3: Advanced Features (Planned)

  • Competitive analysis (SERP top 10 comparison)
  • Automated monitoring (SQLite time-series database)
  • Change detection and alerting (email, webhook, Slack)
  • A/B testing framework (statistical significance testing)
  • Analytics integration (Google Search Console API, Plausible)
  • Unified dashboard (multi-source data visualization)

Phase 4: Ecosystem Integration (Planned)

  • Seamless workflow with minimalist-website-mvp skill
  • Automatic PDF report generation via generating-pdf skill
  • Export to SEO tools (Ahrefs, SEMrush format)
  • API mode for CI/CD pipelines
  • MCP server for persistent multi-site monitoring

Research Foundation

The Princeton / Georgia Tech "Generative Engine Optimization" tactics (statistics addition, named-authority quotation, fluency, citing sources, authoritative phrasing) have held up across the 2026 follow-up work — notably AgenticGEO (arXiv) and the "Citation Selection vs Absorption" study. Specific percentage lifts from the original paper are kept off this README on purpose: the 2026 studies couldn't reproduce them, even though the direction of each tactic still holds.

What the 2026 studies did add:

  • Position bias is strong — material in the first ~30% of a page picks up a disproportionate share of AI citations (iPullRank).
  • Perplexity selects at the passage / sub-document level, not the page level (366k-citation arXiv study, July 2025).
  • Schema-as-a-GEO-lever was overstated — Ahrefs measured ~+2.4% AI Mode lift from adding JSON-LD across 1,885 pages.
  • Brave Search visibility correlates with Claude citation more closely than Google ranking does.
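
The position-bias finding suggests a quick check: does a key statement appear within roughly the first 30% of the page text? The sketch below measures position by character offset, which is a simplifying assumption (the cited study may measure position differently), and the function names are hypothetical.

```python
from typing import Optional


def relative_position(text: str, phrase: str) -> Optional[float]:
    """Return where the phrase first appears as a fraction of the text length, or None."""
    index = text.lower().find(phrase.lower())
    if index < 0 or not text:
        return None
    return index / len(text)


def in_lead_zone(text: str, phrase: str, cutoff: float = 0.30) -> bool:
    """True when the phrase first appears within the leading `cutoff` share of the text."""
    position = relative_position(text, phrase)
    return position is not None and position <= cutoff
```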

For sourced numbers and what changed in the last six months, see reference/statistics-2026.md.

Voice search:

  • Featured snippets: 40.7% of voice answers

Platform citation patterns:

  • ChatGPT: Wikipedia (1.3M citations), G2 (196K), Forbes (181K)
  • Perplexity: Update frequency critical (passage-level retrieval prefers fresh sections)
  • Claude: 91.2% correct source attribution (Q2 2025)
  • AI Overviews: 13.14% of queries (March 2025), up from 6.49% (January 2025)

Contributing

This project follows semantic versioning (major.minor.patch).

Report issues: GitHub Issues


License

MIT License - See LICENSE file for complete terms.

Copyright (c) 2025 Boris Djordjevic, 199 Biotechnologies

About

Production-ready Claude skill for comprehensive SEO/GEO optimization. Analyzes content for traditional search engines + AI platforms (ChatGPT, Perplexity, Claude, Gemini). Includes entity extraction, schema generation, and multi-format audit reports.
