This document describes how Algolia search indexing is configured for the atmos.tools documentation site.
The atmos.tools documentation uses Algolia DocSearch for search functionality. Search indexing is managed via the Algolia Crawler, which is triggered automatically on every deployment.
```text
┌─────────────────────┐      ┌──────────────────────┐      ┌─────────────────┐
│  GitHub Actions     │────▶│  Algolia Crawler     │────▶│  Algolia Index  │
│  (website-deploy)   │      │  (cloud-hosted)      │      │  (atmos.tools)  │
└─────────────────────┘      └──────────────────────┘      └─────────────────┘
```
- **GitHub Actions Workflow** (`.github/workflows/website-deploy-prod.yml`)
  - Triggers the Algolia Crawler after deploying the website to S3.
  - Uses the official `algolia/algoliasearch-crawler-github-actions` action.
- **Algolia Crawler** (dashboard.algolia.com)
  - Cloud-hosted crawler that fetches and indexes the documentation.
  - Uses the official Docusaurus v3 template for optimal indexing.
  - Runs on-demand (CI-triggered) and weekly (scheduled backup).
- **Docusaurus Frontend** (`website/docusaurus.config.js`)
  - Integrates with Algolia via the DocSearch plugin.
  - Supports DocSearch v4 with Ask AI integration.
The following secrets must be configured in the GitHub repository:
| Secret | Description | Source |
|---|---|---|
| `ALGOLIA_CRAWLER_USER_ID` | Crawler authentication user ID | Algolia Crawler > Account Settings |
| `ALGOLIA_CRAWLER_API_KEY` | Crawler authentication API key | Algolia Crawler > Account Settings |
| `ALGOLIA_API_KEY` | API key for writing to the index | Algolia Dashboard > API Keys |
The `ALGOLIA_API_KEY` must have the following ACL permissions:

| Permission | Purpose |
|---|---|
| `addObject` | Add records to the index |
| `editSettings` | Modify index settings |
| `deleteIndex` | Delete/recreate the index during a full reindex |
| `browse` | Optional; needed for partial updates |
To create or edit an API key with these permissions:
- Go to Algolia Dashboard → Settings → API Keys → All API Keys.
- Click New API Key (or edit an existing key).
- Under ACL, select the permissions listed above.
- Under Indices, restrict to `atmos.tools` or use `*` for all indices.
- Save and copy the key to GitHub secrets.
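A key with the same ACLs can also be created programmatically through Algolia's key-management API (`POST /1/keys`, authenticated with the application's Admin API key). A sketch of the request body; the description label is hypothetical, and the index restriction mirrors the steps above:

```json
{
  "description": "atmos.tools crawler write key",
  "acl": ["addObject", "editSettings", "deleteIndex", "browse"],
  "indexes": ["atmos.tools"]
}
```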
The crawler is configured in the Algolia dashboard (dashboard.algolia.com) with these settings:
- Template: Docusaurus v3
- Start URL: `https://atmos.tools/`
- Sitemap URL: `https://atmos.tools/sitemap.xml`
- Index Name: `atmos.tools`
- Schedule: Weekly (as backup to CI-triggered crawls)
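In the crawler dashboard these settings surface as a JavaScript config. A rough sketch of the shape the Docusaurus v3 template produces, not a copy of the live config; the CSS selectors shown are illustrative:

```js
new Crawler({
  appId: '32YOERUX83',
  startUrls: ['https://atmos.tools/'],
  sitemaps: ['https://atmos.tools/sitemap.xml'],
  schedule: 'every 1 week',
  actions: [
    {
      indexName: 'atmos.tools',
      pathsToMatch: ['https://atmos.tools/**'],
      // The Docusaurus template maps page headings onto DocSearch record levels.
      recordExtractor: ({ helpers }) =>
        helpers.docsearch({
          recordProps: {
            lvl0: { selectors: 'nav .menu__link--active', defaultValue: 'Documentation' },
            lvl1: 'header h1, article h1',
            lvl2: 'article h2',
            lvl3: 'article h3',
            content: 'article p, article li',
          },
        }),
    },
  ],
});
```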
The Docusaurus frontend is configured in `website/docusaurus.config.js`:
```js
algolia: {
  appId: process.env.ALGOLIA_APP_ID || '32YOERUX83',
  apiKey: process.env.ALGOLIA_SEARCH_API_KEY || '557985309adf0e4df9dcf3cb29c61928',
  indexName: process.env.ALGOLIA_INDEX_NAME || 'atmos.tools',
  contextualSearch: false,
  askAi: {
    assistantId: process.env.ALGOLIA_ASKAI_ASSISTANT_ID || 'xzgtsIXZSf7V',
    // ... additional Ask AI config
  }
}
```

Crawls are automatically triggered via the `algolia/algoliasearch-crawler-github-actions` GitHub Action on:
- Push to `main` branch
- Release published
- Manual workflow dispatch

This is the primary indexing method. The GitHub Action is configured in `.github/workflows/website-deploy-prod.yml`.
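Inside that workflow, the crawl step looks roughly like the following sketch. The step name, version pin, and crawler name are illustrative; the input names follow the action's documented interface, and the secrets map to the table above:

```yaml
- name: Trigger Algolia crawl
  uses: algolia/algoliasearch-crawler-github-actions@v1
  with:
    crawler-user-id: ${{ secrets.ALGOLIA_CRAWLER_USER_ID }}
    crawler-api-key: ${{ secrets.ALGOLIA_CRAWLER_API_KEY }}
    algolia-app-id: ${{ secrets.ALGOLIA_APP_ID }}
    algolia-api-key: ${{ secrets.ALGOLIA_API_KEY }}
    crawler-name: atmos-tools
    site-url: https://atmos.tools/
```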
For debugging or one-off reindexing:
- Log into dashboard.algolia.com.
- Navigate to the Crawler section.
- Select the `atmos.tools` crawler.
- Click "Start crawl" or "Restart crawling".
For scripting or automation outside of GitHub Actions:
```shell
curl -X POST "https://crawler.algolia.com/api/1/crawlers/{CRAWLER_ID}/reindex" \
  -H "Authorization: Basic $(echo -n '{USER_ID}:{API_KEY}' | base64)"
```

If search is not returning the expected results, work through these checks:

- Check index status: Log into the Algolia dashboard and verify the index has records.
- Check crawler logs: Review the crawler run logs for errors.
- Verify sitemap: Ensure `https://atmos.tools/sitemap.xml` is accessible and complete.
- Test selectors: Use the URL Tester in the crawler dashboard to verify content extraction.
If the CI-triggered crawl fails:

- Verify secrets: Ensure all required GitHub secrets are configured.
- Check credentials: Verify the Crawler User ID and API Key are correct.
- Review action logs: Check the GitHub Actions logs for specific error messages.
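When debugging credential issues, it helps to know what the crawler API actually receives: the `Authorization: Basic` header in the curl example above is just `base64("USER_ID:API_KEY")`. A self-contained sketch with placeholder values:

```shell
# Placeholder credentials -- substitute the real Crawler User ID and API key.
USER_ID="example-user-id"
API_KEY="example-api-key"

# HTTP Basic auth: base64-encode "user:key" with no trailing newline.
TOKEN=$(printf '%s:%s' "$USER_ID" "$API_KEY" | base64)
echo "Authorization: Basic $TOKEN"
```

Decoding the token (`base64 -d`) should yield exactly `user:key`; a stray newline or shell-quoting issue here is a common cause of 401 responses.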
If the index has significantly fewer records than expected:
- Check sitemap: Verify all pages are included in the sitemap.
- Review crawler config: Ensure the start URL and sitemap URL are correct.
- Check for blocked pages: Review robots.txt for any blocked paths.
- Verify page linking: Ensure all pages are linked from the sitemap or other indexed pages.
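A quick way to gauge sitemap coverage is to count `<loc>` entries; against the live site that would be `curl -sf https://atmos.tools/sitemap.xml | grep -c '<loc>'`. The sketch below runs the same check on an inline two-URL sample so it is self-contained:

```shell
# Write a small sample sitemap (a stand-in for the live sitemap.xml).
cat > /tmp/sitemap-sample.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://atmos.tools/</loc></url>
  <url><loc>https://atmos.tools/quick-start/</loc></url>
</urlset>
EOF

# Each indexable page contributes one <loc> entry.
grep -c '<loc>' /tmp/sitemap-sample.xml   # → 2
```

If this count is far below the number of records you expect in the index, the gap is upstream of the crawler.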
January 2026: Migrated from the deprecated `algolia/docsearch-scraper` Docker image to the official Algolia Crawler with GitHub Actions integration. The legacy scraper was deprecated in February 2022 and was causing indexing failures.