Feature/llm accessibility #15654
Closed
gabinfay wants to merge 6 commits into ethereum:dev from gabinfay:feature/llm-accessibility
This commit adds the llms.txt and llms-full.txt files to the public directory. To ensure these files are included in the production build, the following lines must be removed from the outputFileTracingExcludes array in next.config.js:
- 'public/**/*.txt'
- 'public/content'
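As a rough illustration of the config change described above, the sketch below filters the two named patterns out of an excludes list. The helper name `keepLlmsFiles` and the other patterns in the sample array are hypothetical; the real contents and shape of `outputFileTracingExcludes` in this repository's next.config.js are not shown in this PR and may differ.

```javascript
// Sketch only. The two patterns below are the ones named in the commit
// message; everything else in this example is a hypothetical placeholder.
const PATTERNS_TO_REMOVE = ["public/**/*.txt", "public/content"];

// Returns a copy of the excludes list without the patterns that would
// keep llms.txt and llms-full.txt out of the traced build output.
function keepLlmsFiles(excludes) {
  return excludes.filter((pattern) => !PATTERNS_TO_REMOVE.includes(pattern));
}

// Hypothetical "before" state of the array.
const before = [
  ".git/**",
  "public/**/*.png",
  "public/**/*.txt",
  "public/content",
];

const after = keepLlmsFiles(before);
console.log(after);
```

In practice the two lines would simply be deleted by hand from next.config.js; the helper only makes the before/after difference explicit.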
✅ Deploy Preview for ethereumorg ready!
pettinarip (Member) reviewed on Jun 11, 2025 and left a comment:
@gabinfay thanks for the PR! I haven't analyzed this in depth yet, but it looks pretty good.
I'm curious about how you generated it. It would be great if we could establish a process to keep it updated, since the site content changes frequently. We could perhaps add it to the weekly release process.
- Add scripts/llms/ directory with 3 core scripts:
  - generate_all.js: Combined generation script (eliminates 6 separate scripts)
  - test_llms_validation.js: Unit test suite (21 tests, 100% coverage)
  - validate_urls_static.js: Static URL validation (no server required)
- Add GitHub Actions workflow (.github/workflows/validate-llms.yml):
  - Triggers on content changes in public/content/ or .md files
  - Runs generation + validation pipeline
  - Posts PR comments with validation results
  - Uploads artifacts for review
- Add npm scripts to package.json:
  - llms:generate, llms:test, llms:test:static, llms:validate, llms:ci
- Generate production-ready LLMS files:
  - public/llms.txt: 32KB URL directory (262 content URLs)
  - public/llms-full.txt: 1.05MB full content (150k+ words)
- Comprehensive validation coverage:
  - 21 unit tests covering structure, content, URLs, consistency
  - Static validation of 253 URLs (100% success rate)
  - Content quality standards (proper categorization, fresh timestamps)

This enables AI systems to easily access Ethereum.org content while ensuring quality through automated CI/CD validation. All tests pass with 100% success rate.
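The actual generate_all.js is not shown in this conversation. As a minimal sketch of what a generation step for a URL directory might look like, the function below renders a title, a summary, and categorized link entries. The output layout (a heading, a quoted summary, then `- [title](url): description` entries per section) is assumed from the public llms.txt proposal, and `buildLlmsTxt` and the sample data are hypothetical.

```javascript
// Sketch only: not the PR's generate_all.js. Builds llms.txt-style text
// with a title, a one-line summary, and one section per category.
function buildLlmsTxt({ title, summary, sections }) {
  const lines = [`# ${title}`, "", `> ${summary}`, ""];
  for (const section of sections) {
    lines.push(`## ${section.name}`, "");
    for (const page of section.pages) {
      // The description is optional; omit the ": ..." suffix when absent.
      const suffix = page.description ? `: ${page.description}` : "";
      lines.push(`- [${page.title}](${page.url})${suffix}`);
    }
    lines.push("");
  }
  return lines.join("\n");
}

// Hypothetical input; a real script would derive this from the content tree.
const sample = buildLlmsTxt({
  title: "ethereum.org",
  summary: "Hypothetical one-line summary of the site.",
  sections: [
    {
      name: "Learn",
      pages: [
        {
          title: "What is Ethereum?",
          url: "https://ethereum.org/what-is-ethereum/",
          description: "Beginner introduction",
        },
      ],
    },
  ],
});

console.log(sample);
```

Keeping the rendering in a pure function like this makes it straightforward to unit-test the file structure without touching the file system.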
- Generated llms.txt (32KB, 262 URLs) and llms-full.txt (1.05MB, 151k words)
- Implemented 21 comprehensive tests with 100% pass rate
- Added CI/CD automation with smart content change detection
- Created static URL validation with 253/253 URLs validated
- Removed unnecessary tempFile generation for cleaner implementation
- Fixed path mapping issues for reliable validation
- Added npm scripts for easy development workflow
- Comprehensive documentation and error handling

Files ready for production deployment with full automation.
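Static URL validation, as described above, checks listed URLs against files on disk rather than a running server. The sketch below shows one way this could work, under the assumption that a page at `/<path>/` is backed by `public/content/<path>/index.md`. That mapping, and the names `urlToContentPath` and `validateUrlsStatic`, are illustrative and not taken from validate_urls_static.js.

```javascript
// Sketch only. Assumption: each content URL maps to
// <contentRoot>/<url path>/index.md on disk.
function urlToContentPath(url, contentRoot = "public/content") {
  const { pathname } = new URL(url);
  const segments = pathname.split("/").filter(Boolean);
  return [contentRoot, ...segments, "index.md"].join("/");
}

// Checks every URL against a caller-supplied existence predicate
// (for example fs.existsSync) and reports the ones with no backing file.
function validateUrlsStatic(urls, fileExists) {
  const missing = [];
  for (const url of urls) {
    const file = urlToContentPath(url);
    if (!fileExists(file)) missing.push({ url, file });
  }
  return { checked: urls.length, missing };
}

// Usage with an in-memory stand-in for the file system (hypothetical paths).
const knownFiles = new Set(["public/content/developers/docs/index.md"]);
const report = validateUrlsStatic(
  ["https://ethereum.org/developers/docs/", "https://ethereum.org/not-a-page/"],
  (file) => knownFiles.has(file)
);
console.log(report);
```

Passing the existence check in as a function keeps the validator free of file-system access, so it can run in unit tests and, with `fs.existsSync`, against a real checkout.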
@pettinarip with some of the recent changes, any suggestions on how to proceed here?

Title:
feat(content): Add LLM-specific content manifest files

Description
This pull request introduces two new text files, llms.txt and llms-full.txt, to the /public directory. These files provide a comprehensive, crawlable list of the site's content, formatted for consumption by Large Language Models (LLMs) to improve their understanding and indexing of the site's resources.

Key Changes:
- Added llms.txt with a curated list of primary English-language pages.
- Added llms-full.txt with a more exhaustive, automatically generated list of all content.
- Fixed links in llms.txt that were pointing to incorrect or non-existent pages.
- Links in llms.txt were verified by running the local development server and using a script to confirm that each URL returns a 200 OK status.
- Removed non-English entries from llms.txt to narrow the scope of this initial implementation and focus on the core English content.

Related Issue
This pull request addresses a new feature request to enhance the site's content accessibility for AI agents and LLMs. No specific issue is linked, but this work lays the foundation for better machine-readable content discovery on ethereum.org.
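
The verification script mentioned under Key Changes is not included in this description. A minimal sketch of such a check might look like the following, assuming entries use `[title](url)` links, that the dev server listens on http://localhost:3000, and a Node.js version with a global `fetch`. The function names are hypothetical.

```javascript
// Sketch only: not the script used in this PR.

// Pulls every http(s) link target out of "[title](url)" entries.
function extractUrls(llmsTxt) {
  const matches = llmsTxt.matchAll(/\]\((https?:\/\/[^)\s]+)\)/g);
  return Array.from(matches, (m) => m[1]);
}

// Rewrites a production URL to point at the local dev server.
// Assumption: the server runs at http://localhost:3000.
function toLocalUrl(url, base = "http://localhost:3000") {
  const { pathname } = new URL(url);
  return new URL(pathname, base).href;
}

// Requests each URL and collects the ones that do not answer 200.
// fetchImpl is injectable so the check can be tested without a server.
async function checkUrls(urls, fetchImpl = fetch) {
  const failures = [];
  for (const url of urls) {
    try {
      const res = await fetchImpl(url);
      if (res.status !== 200) failures.push({ url, status: res.status });
    } catch (err) {
      // Network errors (server down, bad host) are reported as failures too.
      failures.push({ url, status: null, error: String(err) });
    }
  }
  return failures;
}

// Example wiring (needs a running dev server, so it is left commented out):
// const text = require("node:fs").readFileSync("public/llms.txt", "utf8");
// checkUrls(extractUrls(text).map((u) => toLocalUrl(u))).then(console.log);
```

Requests are issued one at a time to avoid overloading a local dev server; a batched version would be faster but is left out to keep the sketch short.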