AndreiDrang/image_sitemap

🗺️ image_sitemap



Image & Website Sitemap Generator - SEO Tool for Better Visibility

Sitemap Images is a Python tool that generates a specialized XML sitemap file, which lets you submit image URLs to search engines such as Google, Bing, and Yahoo. Listing images in a sitemap can improve their visibility in image search, which in turn can bring more traffic and engagement to your website. To make sure search engines can discover your sitemap, add the following line to your robots.txt file:

Sitemap: https://example.com/sitemap-images.xml

By including image links in your sitemap and referencing it in your robots.txt file, you can enhance your website's SEO and make it easier for users to find your content.
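
The robots.txt step above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `add_sitemap_reference` and the file location are assumptions for illustration and are not part of image_sitemap.

```python
from pathlib import Path


def add_sitemap_reference(robots_path: Path, sitemap_url: str) -> bool:
    """Append a Sitemap directive to robots.txt unless it is already present.

    Hypothetical helper, not part of image_sitemap.
    Returns True if the file was modified.
    """
    directive = f"Sitemap: {sitemap_url}"
    existing = robots_path.read_text(encoding="utf-8") if robots_path.exists() else ""
    # Compare whole lines so a longer URL sharing the same prefix is not a match.
    if any(line.strip() == directive for line in existing.splitlines()):
        return False
    # Make sure the directive starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    robots_path.write_text(f"{existing}{separator}{directive}\n", encoding="utf-8")
    return True
```

Calling `add_sitemap_reference(Path("robots.txt"), "https://example.com/sitemap-images.xml")` twice leaves a single directive in the file, so the helper is safe to run after every sitemap regeneration.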

For a description of the standard, see Google's image sitemaps documentation.

📦 Features

  • Supports both website and image sitemap generation
  • Easy integration with existing Python projects
  • Helps improve visibility in search engine results
  • Boosts image search performance
  • Subdomain filtering with exclusion support
  • Configurable crawling depth and query parameters

✍️ Examples

  1. Set the website URL and crawling depth, then run the script:
    import asyncio
    
    from image_sitemap import Sitemap
    from image_sitemap.instruments.config import Config
      
    images_config = Config(
        max_depth=3,
        accept_subdomains=True,
        excluded_subdomains={"blog", "api", "staging"},  # Exclude specific subdomains
        is_query_enabled=False,
        file_name="sitemap_images.xml",
        header={
            "User-Agent": "ImageSitemap Crawler",
            "Accept": "text/html",
        },
    )
    sitemap_config = Config(
        max_depth=3,
        accept_subdomains=True,
        excluded_subdomains={"blog", "api", "staging"},  # Exclude specific subdomains
        is_query_enabled=False,
        file_name="sitemap.xml",
        header={
            "User-Agent": "ImageSitemap Crawler",
            "Accept": "text/html",
        },
    )
    
    asyncio.run(Sitemap(config=images_config).run_images_sitemap(url="https://rucaptcha.com/"))
    asyncio.run(Sitemap(config=sitemap_config).run_sitemap(url="https://rucaptcha.com/"))
  2. Get the image sitemap data in the output file:
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
        <url>
            <loc>https://rucaptcha.com/proxy/residential-proxies</loc>
            <image:image>
                <image:loc>https://rucaptcha.com/dist/web/assets/rotating-residential-proxies-NEVfEVLW.svg</image:loc>
            </image:image>
        </url>
    </urlset>
    Or a plain website sitemap file:
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
       xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
       <url>
           <loc>https://rucaptcha.com/</loc>
       </url>
       <url>
           <loc>https://rucaptcha.com/h</loc>
       </url>
    </urlset>
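
To check the generated image sitemap, the file can be read back with the standard library. This is a rough sketch, assuming the output follows the two namespaces shown in the example above; `read_image_sitemap` is a hypothetical helper name, not an image_sitemap function.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the example output above.
NAMESPACES = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}


def read_image_sitemap(xml_text):
    """Map each page URL to the list of image URLs declared under it."""
    root = ET.fromstring(xml_text)
    pages = {}
    for url in root.findall("sm:url", NAMESPACES):
        page = url.findtext("sm:loc", default="", namespaces=NAMESPACES).strip()
        # A <url> entry may carry zero or more <image:image> children.
        pages[page] = [
            (loc.text or "").strip()
            for loc in url.findall("image:image/image:loc", NAMESPACES)
        ]
    return pages
```

For a file on disk, something like `read_image_sitemap(Path("sitemap_images.xml").read_text(encoding="utf-8"))` should work, with `Path` imported from `pathlib`.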

🔧 Configuration Options

The Config class provides various options to customize sitemap generation:

Subdomain Control

  • accept_subdomains (bool): Enable/disable subdomain crawling (default: True)
  • excluded_subdomains (Set[str]): Set of subdomain names to exclude from parsing (default: set())
# Example: include all subdomains except blog, api, staging, and dev
config = Config(
    accept_subdomains=True,
    excluded_subdomains={"blog", "api", "staging", "dev"}
)

# This will include:
# - example.com
# - www.example.com  
# - shop.example.com
# But exclude:
# - blog.example.com
# - api.example.com
# - staging.example.com
# - dev.example.com
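
The include/exclude behaviour listed above can be pictured with a small stand-alone check. This is only a sketch of one plausible rule, not the library's actual implementation: the function `is_host_allowed` is hypothetical, and it assumes that a host is excluded when any of its subdomain labels appears in `excluded_subdomains`.

```python
from urllib.parse import urlsplit


def is_host_allowed(url, base_domain, accept_subdomains=True, excluded_subdomains=frozenset()):
    """Decide whether a URL's host should be crawled (hypothetical rule)."""
    host = (urlsplit(url).hostname or "").lower()
    if host == base_domain:
        return True  # the root domain itself is always in scope
    if not host.endswith("." + base_domain):
        return False  # a different site altogether
    if not accept_subdomains:
        return False
    # Everything left of the base domain, e.g. "blog" or "eu.shop".
    labels = host[: -len(base_domain) - 1].split(".")
    # Assumption: one excluded label anywhere in the prefix rejects the host.
    return not any(label in excluded_subdomains for label in labels)
```

With `excluded_subdomains={"blog", "api", "staging", "dev"}`, this check accepts `https://shop.example.com/` and rejects `https://blog.example.com/`, which matches the lists above.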

Other Options

  • max_depth (int): Maximum crawling depth (default: 1)
  • is_query_enabled (bool): Include URLs with query parameters (default: True)
  • file_name (str): Output sitemap filename (default: "sitemap_images.xml")
  • exclude_file_links (bool): Filter out file links from sitemap (default: True)
  • header (dict): Custom HTTP headers for requests
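
To illustrate what `is_query_enabled` and `exclude_file_links` are meant to control, here is a stand-alone sketch of a link filter. It is not the library's code: the function `keep_link` and the extension list are hypothetical, and it assumes that disabling queries drops such URLs rather than stripping the query string.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical list of extensions treated as "file links" for this sketch.
FILE_EXTENSIONS = {".pdf", ".zip", ".doc", ".docx", ".xls", ".xlsx"}


def keep_link(url, is_query_enabled=True, exclude_file_links=True):
    """Return True if the URL should stay in the sitemap (hypothetical rule)."""
    parts = urlsplit(url)
    # Assumption: with queries disabled, URLs carrying a query string are dropped.
    if not is_query_enabled and parts.query:
        return False
    # Treat a path ending in a known document/archive extension as a file link.
    suffix = PurePosixPath(parts.path).suffix.lower()
    if exclude_file_links and suffix in FILE_EXTENSIONS:
        return False
    return True
```

A crawler could apply such a filter to every discovered link before following it, for example `links = [u for u in links if keep_link(u, is_query_enabled=False)]`.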

More usage examples are available in the examples file in the repository.
