
feat: Add robots.txt for search engine indexing#1943

Merged
aknysh merged 2 commits into main from osterman/robots-algolia-verif
Jan 8, 2026

Conversation

@osterman osterman commented Jan 7, 2026

what

  • Adds robots.txt to website/static/ for search engine indexing
  • Includes Algolia crawler verification token
  • Explicitly allows all crawlers with User-agent: * and Allow: /
  • References sitemap for efficient crawler discovery
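Putting the bullets above together, the file plausibly looks like the sketch below. Only the token, the allow-all directives, and the sitemap reference are stated in this PR; the Algolia comment-line format and the ordering are assumptions.

```text
# Algolia crawler verification (token from the PR description;
# the comment format shown here is an assumption)
# Algolia-Crawler-Verif: 10F61B92D9EB1214

# Explicitly allow all crawlers to index everything
User-agent: *
Allow: /

# Point crawlers at the production sitemap
Sitemap: https://atmos.tools/sitemap.xml
```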

why

  • Improves search engine discoverability of the Atmos documentation
  • Enables Algolia crawler verification for site search functionality
  • Maximizes indexing potential by explicitly allowing all crawlers

references

  • Algolia crawler verification: 10F61B92D9EB1214

Summary by CodeRabbit

  • Chores
    • Added web crawler configuration to improve site indexing and point crawlers to the sitemap.
    • Updated deployment configuration to make the site base URL configurable via an environment variable, enabling PR-specific preview hosts during builds.


…erification

- Adds robots.txt to website/static/ for maximum search engine indexing
- Includes Algolia crawler verification token
- Explicitly allows all crawlers with User-agent: * and Allow: /
- References sitemap for efficient crawler discovery

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@osterman osterman requested a review from a team as a code owner January 7, 2026 21:43
@github-actions github-actions bot added the size/xs Extra small size PR label Jan 7, 2026

github-actions bot commented Jan 7, 2026

Dependency Review

✅ No vulnerabilities or license issues found.

Scanned Files

None


codecov bot commented Jan 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.01%. Comparing base (5be12e5) to head (169138e).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1943      +/-   ##
==========================================
+ Coverage   73.99%   74.01%   +0.01%     
==========================================
  Files         769      769              
  Lines       69288    69288              
==========================================
+ Hits        51273    51281       +8     
+ Misses      14604    14597       -7     
+ Partials     3411     3410       -1     
Flag      | Coverage Δ
unittests | 74.01% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.
see 2 files with indirect coverage changes



coderabbitai bot commented Jan 7, 2026

📝 Walkthrough


Adds website/static/robots.txt; makes the site base URL configurable via a new DEPLOYMENT_HOST env var and updates website/docusaurus.config.js to use it; injects DEPLOYMENT_HOST into the website preview build workflow step.

Changes

Cohort / File(s) — Summary

  • Static configuration — website/static/robots.txt: New file allowing all crawlers and declaring the sitemap location.
  • CI / Build workflow — .github/workflows/website-preview-build.yml: Adds a DEPLOYMENT_HOST env var to the "Install Dependencies and Build Website" step; set to pr-<PR_NUMBER>.atmos-docs.ue2.dev.plat.cloudposse.org when a PR number exists, otherwise empty.
  • Site configuration — website/docusaurus.config.js: Adds DEPLOYMENT_HOST (default atmos.tools) and changes the exported url from the fixed https://atmos.tools to https://${DEPLOYMENT_HOST}.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Suggested reviewers

  • johncblandii
  • jamengual
  • aknysh
🚥 Pre-merge checks | ✅ 3 passed

  • Description Check — ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The title clearly describes the main change (adding a robots.txt file for search engine indexing) and aligns with the primary objective and the new file added.
  • Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate; docstring coverage check skipped.



📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d944390 and 169138e.

📒 Files selected for processing (2)
  • .github/workflows/website-preview-build.yml
  • website/docusaurus.config.js
🚧 Files skipped from review as they are similar to previous changes (2)
  • website/docusaurus.config.js
  • .github/workflows/website-preview-build.yml
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Acceptance Tests (macos)
  • GitHub Check: Acceptance Tests (linux)
  • GitHub Check: Acceptance Tests (windows)
  • GitHub Check: Summary



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @website/static/robots.txt:
- Around line 1-8: robots.txt currently hardcodes
"https://atmos.tools/sitemap.xml", which breaks non-production deployments;
change robots.txt to produce an environment-aware sitemap URL by using the same
DEPLOYMENT_HOST env var used in the reindex script (or fallback to the
production host) so the Sitemap line points to `${DEPLOYMENT_HOST}/sitemap.xml`
in non-prod builds, or alternatively serve a template/SSR robots.txt that
injects DEPLOYMENT_HOST at runtime; update any build or deploy step that writes
website/static/robots.txt to use DEPLOYMENT_HOST accordingly and ensure the
fallback behavior for missing env var.
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 5be12e5 and f6d182b.

📒 Files selected for processing (1)
  • website/static/robots.txt
🧰 Additional context used
📓 Path-based instructions (1)
website/**

📄 CodeRabbit inference engine (.cursor/rules/atmos-rules.mdc)

website/**: Update website documentation in the website/ directory when adding new features, ensure consistency between CLI help text and website documentation, and follow the website's documentation structure and style
Keep website code in the website/ directory, follow the existing website architecture and style, and test website changes locally before committing
Keep CLI documentation and website documentation in sync and document new features on the website with examples and use cases

Files:

  • website/static/robots.txt
⏰ Context from checks skipped due to timeout. (3)
  • GitHub Check: Acceptance Tests (linux)
  • GitHub Check: Acceptance Tests (windows)
  • GitHub Check: Summary


mergify bot commented Jan 7, 2026

Important

Cloud Posse Engineering Team Review Required

This pull request modifies files that require Cloud Posse's review. Please be patient, and a core maintainer will review your changes.

To expedite this process, reach out to us on Slack in the #pr-reviews channel.

@mergify mergify bot added the needs-cloudposse Needs Cloud Posse assistance label Jan 7, 2026
@osterman osterman added the no-release Do not create a new release (wait for additional code changes) label Jan 7, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/website-preview-build.yml:
- Line 50: The DEPLOYMENT_HOST environment variable is being set unconditionally
using github.event.pull_request.number which is undefined on workflow_dispatch
runs; change the assignment so it is conditional and empty when there is no
pull_request number (so docusaurus.config.js can fall back). Replace the
existing DEPLOYMENT_HOST line with a conditional expression that yields an empty
string when github.event.pull_request.number is missing, for example using the
GitHub Actions expression: DEPLOYMENT_HOST: ${{ github.event.pull_request.number
&& format('pr-{0}.atmos-docs.ue2.dev.plat.cloudposse.org',
github.event.pull_request.number) }}, ensuring the variable is empty for manual
runs and populated only for PR-triggered runs.
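Expressed as a workflow snippet, the suggested conditional looks like this sketch. The expression is the one quoted in the suggestion; the step layout and build command are illustrative, not the actual workflow file.

```yaml
- name: Install Dependencies and Build Website
  env:
    # Evaluates to the PR preview host on pull_request runs, and to an
    # empty string on workflow_dispatch (no PR number), letting
    # docusaurus.config.js fall back to atmos.tools.
    DEPLOYMENT_HOST: ${{ github.event.pull_request.number && format('pr-{0}.atmos-docs.ue2.dev.plat.cloudposse.org', github.event.pull_request.number) }}
  run: npm ci && npm run build # build command is illustrative
```

GitHub Actions expressions short-circuit: `a && b` yields `b` when `a` is truthy, and `a` (here empty) otherwise, which is what makes the single-line conditional work.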
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between f6d182b and d944390.

📒 Files selected for processing (2)
  • .github/workflows/website-preview-build.yml
  • website/docusaurus.config.js
🧰 Additional context used
📓 Path-based instructions (2)
.github/workflows/*.{yml,yaml}

📄 CodeRabbit inference engine (.cursor/rules/atmos-rules.mdc)

Configure CI to run unit tests, integration tests, golangci-lint, and coverage reporting on all pull requests

Files:

  • .github/workflows/website-preview-build.yml
website/**

📄 CodeRabbit inference engine (.cursor/rules/atmos-rules.mdc): same website/** guidance as quoted earlier.

Files:

  • website/docusaurus.config.js
🧠 Learnings (2)
📓 Common learnings
Learnt from: osterman
Repo: cloudposse/atmos PR: 1686
File: docs/prd/tool-dependencies-integration.md:58-64
Timestamp: 2025-12-13T06:07:37.766Z
Learning: cloudposse/atmos: For PRD docs (docs/prd/*.md), markdownlint issues like MD040/MD010/MD034 can be handled in a separate documentation cleanup commit and should not block the current PR.
📚 Learning: 2025-09-30T00:36:22.219Z
Learnt from: aknysh
Repo: cloudposse/atmos PR: 0
File: :0-0
Timestamp: 2025-09-30T00:36:22.219Z
Learning: In the Atmos website project using docusaurus-plugin-llms, the postbuild script intentionally copies llms.txt and llms-full.txt from build/ to static/ (reverse of typical Docusaurus flow). This is necessary because: (1) the plugin hardcodes output to build/ directory, (2) files must be in static/ for deployment and dev mode access, (3) the plugin doesn't support configuring output directory. The files are source-controlled in static/ and regenerated on each build.

Applied to files:

  • website/docusaurus.config.js
⏰ Context from checks skipped due to timeout. (2)
  • GitHub Check: Acceptance Tests (windows)
  • GitHub Check: Summary
🔇 Additional comments (1)
website/docusaurus.config.js (1)

16-16: DEPLOYMENT_HOST fallback handles environment-aware URLs correctly.

The DEPLOYMENT_HOST with fallback to atmos.tools is solid, and the dynamic URL at line 22 will be picked up by Docusaurus's built-in sitemap generation. Docusaurus 2 generates the sitemap using the url field from the config, so the dynamic URL will automatically be used without requiring explicit sitemap plugin configuration.

- Use DEPLOYMENT_HOST env var in docusaurus.config.js url setting
- Fallback to atmos.tools (production) when DEPLOYMENT_HOST is not set
- Docusaurus automatically generates sitemap with correct URLs
- robots.txt uses production sitemap URL (standard for static files)

This follows Docusaurus conventions rather than custom post-build scripts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@aknysh aknysh merged commit 3f9227d into main Jan 8, 2026
57 checks passed
@aknysh aknysh deleted the osterman/robots-algolia-verif branch January 8, 2026 00:00
@mergify mergify bot removed the needs-cloudposse Needs Cloud Posse assistance label Jan 8, 2026
@github-actions

These changes were released in v1.204.0-rc.3.


Labels

no-release Do not create a new release (wait for additional code changes) size/s Small size PR
