Add tagging support for documents and sources #51
- Update document compiler to include tags in output
- Update document parser to handle tags in JSON configuration
- Add tags to JSON schema
- Update documentation with tag usage examples
Walkthrough
The changes introduce several new GitHub issue templates for bug reports, feature requests, and support questions. Additionally, the configuration and JSON schema now support a new "tags" property for documents and sources. The document management system, including the Document class and compiler, has been enhanced to handle tags. The renderers for both Markdown and plain text have been updated to conditionally format text blocks based on tags. Furthermore, multiple source implementations and the source interface have been modified to accept, manage, and serialize tags.
Changes
Sequence Diagram(s)
sequenceDiagram
participant D as Document
participant DC as DocumentCompiler
participant CB as ContentBuilder
participant R as Renderer
D->>DC: Create Document (with tags)
DC->>D: Check hasTags() & getTags()
DC->>CB: Add a new TextBlock with tag details
CB->>R: Render content using tag-aware logic
R-->>CB: Return formatted content
sequenceDiagram
participant DS as Data Source
participant P as Parser (DocumentsParserPlugin)
participant S as Source Object
DS->>P: Provide source data (including tags)
P->>S: Instantiate Source with parsed tags
S->>S: Process tags (getTags / hasTags)
S-->>P: Return serialized source data with tags
Actionable comments posted: 0
🧹 Nitpick comments (5)
src/Lib/Content/Renderer/MarkdownRenderer.php (1)
47-49: Tag rendering implementation looks good, but consider Markdown-native approach.
The implementation correctly handles text blocks with tags by wrapping them in HTML-like tags. While this works, consider whether a more Markdown-native approach might be preferable, such as using blockquotes or custom Markdown syntax instead of HTML tags.

    - if ($block->hasTag()) {
    -     $content = \sprintf("<%s>\n%s\n</%s>\n", $block->getTag(), $content, $block->getTag());
    - }
    + if ($block->hasTag()) {
    +     // Option 1: Using Markdown blockquotes with tag as header
    +     $content = \sprintf("**%s**\n\n> %s\n\n", $block->getTag(), \str_replace("\n", "\n> ", $content));
    +
    +     // Option 2: Using Markdown code blocks with tag as language
    +     // $content = \sprintf("```%s\n%s\n```\n", $block->getTag(), $content);
    + }

src/Source/BaseSource.php (1)
31-39: Consider optimizing hasTags() method
While the implementation is correct, calling getTags() inside hasTags() is less efficient as it creates a new array and reindexes for each check.

      public function hasTags(): bool
      {
    -     return !empty($this->getTags());
    +     return !empty($this->tags);
      }

src/Document/Document.php (1)
91-123: Tag management methods are comprehensive.
The implementation includes all necessary methods for tag management: addTag for adding tags, getTags for retrieving tags (with an option to include source tags), and hasTags for checking if tags exist.
However, there's a potential performance concern in the hasTags method that calls getTags(), which creates and merges arrays unnecessarily when simply checking existence.
Consider optimizing the hasTags method to avoid unnecessary array operations:

    - public function hasTags(): bool
    - {
    -     return !empty($this->getTags());
    - }
    + public function hasTags(): bool
    + {
    +     if (!empty($this->tags)) {
    +         return true;
    +     }
    +
    +     // Only check sources if document tags are empty
    +     foreach ($this->sources as $source) {
    +         if ($source->hasTags()) {
    +             return true;
    +         }
    +     }
    +
    +     return false;
    + }

src/Source/GitDiff/CommitDiffSource.php (2)
27-27: Consider adding validation for tags
While the PHPDoc specifies array<non-empty-string> for tags, there's no validation in the constructor to ensure this constraint is met.
Consider adding validation similar to how other parameters are validated, perhaps something like:

    if (isset($data['tags'])) {
        foreach ($data['tags'] as $tag) {
            // Compare against '' rather than using empty(), which would also reject the tag "0"
            if (!\is_string($tag) || $tag === '') {
                throw new \RuntimeException('Tags must be non-empty strings');
            }
        }
    }

Also applies to: 39-41
96-102: Consider adding a getter method for tags
For consistency with other properties that have getter methods, consider adding a getter method for tags.

    + /**
    +  * Get the tags for this source
    +  *
    +  * @return array<non-empty-string>
    +  */
    + public function getTags(): array
    + {
    +     // Return the property; calling $this->getTags() here would recurse indefinitely
    +     return $this->tags;
    + }

📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
- .github/ISSUE_TEMPLATE/1_Bug_report.md (1 hunks)
- .github/ISSUE_TEMPLATE/2_Feature_request.md (1 hunks)
- .github/ISSUE_TEMPLATE/3_Support_question.md (1 hunks)
- README.md (8 hunks)
- json-schema.json (2 hunks)
- src/Document/Compiler/DocumentCompiler.php (3 hunks)
- src/Document/Document.php (2 hunks)
- src/Document/DocumentsParserPlugin.php (2 hunks)
- src/Lib/Content/ContentBuilder.php (1 hunks)
- src/Lib/Content/Renderer/MarkdownRenderer.php (1 hunks)
- src/Lib/Content/Renderer/PlainTextRenderer.php (1 hunks)
- src/Source/BaseSource.php (2 hunks)
- src/Source/File/FileSource.php (4 hunks)
- src/Source/GitDiff/CommitDiffSource.php (4 hunks)
- src/Source/Github/GithubSource.php (4 hunks)
- src/Source/SourceWithModifiers.php (1 hunks)
- src/Source/Text/TextSource.php (2 hunks)
- src/Source/Text/TextSourceFetcher.php (2 hunks)
- src/Source/Url/UrlSource.php (3 hunks)
- src/SourceInterface.php (1 hunks)
🧰 Additional context used
🧬 Code Definitions (8)
src/Lib/Content/ContentBuilder.php (10)
- src/Lib/Content/Block/CodeBlock.php (1): render (45:48)
- src/Lib/Content/Block/AbstractBlock.php (1): render (24:24)
- src/Lib/Content/Block/CommentBlock.php (1): render (17:20)
- src/Lib/Content/Block/BlockInterface.php (1): render (17:17)
- src/Lib/Content/Block/DescriptionBlock.php (1): render (17:20)
- src/Lib/Content/Block/SeparatorBlock.php (1): render (38:41)
- src/Lib/Content/Block/TreeViewBlock.php (1): render (17:20)
- src/Lib/Content/Block/TitleBlock.php (1): render (36:39)
- src/Lib/Content/Block/TextBlock.php (1): render (22:25)
- src/Lib/Content/ContentBlock.php (1): render (47:50)
src/Lib/Content/Renderer/MarkdownRenderer.php (1)
- src/Lib/Content/Block/TextBlock.php (2): hasTag (32:35), getTag (27:30)
src/SourceInterface.php (2)
- src/Source/BaseSource.php (2): getTags (31:34), hasTags (36:39)
- src/Document/Document.php (2): getTags (106:115), hasTags (120:123)
src/Document/Compiler/DocumentCompiler.php (4)
- src/Document/Document.php (2): hasTags (120:123), getTags (106:115)
- src/Source/BaseSource.php (2): hasTags (36:39), getTags (31:34)
- src/SourceInterface.php (2): hasTags (28:28), getTags (23:23)
- src/Lib/Content/ContentBuilder.php (2): addBlock (68:72), addText (90:93)
src/Lib/Content/Renderer/PlainTextRenderer.php (1)
- src/Lib/Content/Block/TextBlock.php (2): hasTag (32:35), getTag (27:30)
src/Source/File/FileSource.php (5)
- src/Source/GitDiff/CommitDiffSource.php (1): __construct (29:43)
- src/Source/Github/GithubSource.php (1): __construct (27:43)
- src/Source/BaseSource.php (1): __construct (16:19)
- src/Source/SourceWithModifiers.php (1): __construct (17:23)
- src/Source/File/FileSourceFetcher.php (1): __construct (27:32)
src/Source/Github/GithubSource.php (4)
- src/Source/GitDiff/CommitDiffSource.php (1): __construct (29:43)
- src/Source/File/FileSource.php (1): __construct (31:47)
- src/Source/BaseSource.php (1): __construct (16:19)
- src/Source/SourceWithModifiers.php (1): __construct (17:23)
src/Document/DocumentsParserPlugin.php (1)
- src/Document/Document.php (1): Document (10:146)
🪛 LanguageTool
.github/ISSUE_TEMPLATE/1_Bug_report.md
[style] ~20-~20: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 415 characters long)
Context: ...-----------| | Library Version | x.y.z | | PHP Version | x.y.z | | Operating System | Linux ...
(EN_EXCESSIVE_EXCLAMATION)
🔇 Additional comments (51)
.github/ISSUE_TEMPLATE/1_Bug_report.md (4)
1-7: YAML Header Structure is Clear and Consistent
The frontmatter is well-formatted with clear metadata entries (name, about, labels). This provides a standardized starting point for users submitting bug reports.
8-11: "Description" Section is Appropriately Documented
The description section includes a placeholder comment that encourages users to provide a clear and concise explanation of the problem. This should streamline user input without causing confusion.
12-15: "How To Reproduce" Section Provides Clear Guidance
This segment clearly instructs users to supply code or configuration snippets to reproduce the issue. The inline comment serves as a helpful prompt.
16-24: "Additional Info" Section with a Markdown Table is Concise and Informative
The table layout effectively organizes key information such as library version, PHP version, and operating system. The inline comments remind users to update placeholder values, which is ideal for maintaining up-to-date bug reports.
.github/ISSUE_TEMPLATE/3_Support_question.md (2)
1-6: YAML Header for Support Questions is Well Defined
The header correctly names the template ("❓ Question") and provides a succinct description along with an appropriate label. This ensures that support questions are immediately identifiable.
8-10: Inline Prompt in Comment Block is Sufficient
The inline comment effectively instructs users to describe their question, offering minimal yet clear guidance without being intrusive.
.github/ISSUE_TEMPLATE/2_Feature_request.md (3)
1-6: Feature Request Template Header is Correctly Specified
The YAML header clearly identifies the file as a feature request template ("🚀 Feature Request") with appropriate metadata. This aids in automating issue categorization.
7-10: "Description" Section Provides Clear Guidance
The "Description" section contains a helpful placeholder comment prompting users to detail the new feature, which encourages clarity and thoroughness in submissions.
11-14: "Example" Section Enhances Clarity for Submissions
Including a section for examples with inline instructions to show code, YAML, or configuration changes is beneficial. It guides the user in providing a compelling demonstration of the feature request.
src/Lib/Content/ContentBuilder.php (1)
154-154: Code simplification looks good!
The direct return of the render result eliminates an unnecessary variable assignment while maintaining the same functionality. This is a good refactoring for code conciseness.
src/SourceInterface.php (1)
18-29: Well-structured tag interface methods.
The addition of getTags() and hasTags() methods properly implements the tagging functionality described in the PR objectives. The PHPDoc comments are clear, and the array return type with the non-empty-string generic type provides good type safety.
src/Source/SourceWithModifiers.php (4)
7-7: Good addition of Modifier import.
The added import for the Modifier class is used in the PHPDoc for better type documentation.
13-24: Well-structured constructor with proper parameter documentation.
The constructor properly initializes the class with description, tags, and modifiers, correctly passing values to the parent constructor. The PHPDoc annotations clearly define the expected parameter types.
31-31: Good simplification of modifier application.
The code now directly applies modifiers without checking if they exist, which is safe because the constructor ensures $this->modifiers is always set (even if empty).
36-42: Proper implementation of jsonSerialize method.
The jsonSerialize method correctly extends the parent implementation to include the modifiers property, ensuring complete serialization of the object's state.
src/Source/Text/TextSourceFetcher.php (2)
8-8: New import added for TextBlock class
The TextBlock class import has been added to support the new tagging functionality.
62-62: Improved content handling with TextBlock
The code now uses a single addBlock call with TextBlock instead of multiple addText calls, which is a cleaner approach. The TextBlock constructor properly includes both the processed content and the source tag.
json-schema.json (2)
66-72: Document tagging schema properly defined
The document tagging schema has been correctly implemented as an array of strings with a clear description. This aligns well with the PR objective of supporting document tagging.
99-105: Source tagging schema properly defined
The source tagging schema mirrors the document tagging implementation, maintaining consistency in the API design. This allows for proper validation of source tags in the configuration.
src/Document/Compiler/DocumentCompiler.php (3)
11-11: Added TextBlock import
The TextBlock class import has been added to support the new tag rendering functionality.
95-100: Well-implemented document tags rendering
The document tags handling is well implemented with proper checks, logging, and formatting. The 'DOCUMENT_TAGS' identifier will help distinguish these tags in the rendered output.
135-135: Simplified content addition
The content addition has been simplified to just use addText without chaining a separator. This aligns with the TextBlock approach used elsewhere.
src/Document/DocumentsParserPlugin.php (3)
60-64: Document tags parsing implementation
The document tags parsing logic is well-implemented, using \array_map(\strval(...), $docData['tags']) to ensure all tags are properly converted to strings, which prevents type-related issues.
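The string conversion described in this comment can be sketched as follows. This is a minimal illustration, not the project's parser: the function name parseTags is hypothetical, and the guard for a missing or non-array "tags" key is an assumption about how the input might be handled.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the PR): normalise the raw "tags"
 * value taken from a decoded JSON document configuration.
 *
 * @param array<string, mixed> $docData
 * @return list<string>
 */
function parseTags(array $docData): array
{
    // Assumption: a missing or non-array "tags" key means "no tags".
    if (!isset($docData['tags']) || !\is_array($docData['tags'])) {
        return [];
    }

    // First-class callable syntax (PHP 8.1+): cast every entry to a string.
    // array_values() drops any custom keys so the result is a plain list.
    return \array_map(\strval(...), \array_values($docData['tags']));
}
```

Calling parseTags(['tags' => ['api', 1]]) yields the strings 'api' and '1'. Note that strval() only makes sense for scalar entries; a nested array would need to be rejected separately.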
71-71: Tags parameter added to Document::create
The parsed tags are correctly passed to the Document::create method, enabling document-level tagging.
101-109: Source tags parsing implementation
The source tags parsing follows the same pattern used for document tags, ensuring consistency. The tags are correctly added back to the source data array before source creation.
src/Lib/Content/Renderer/PlainTextRenderer.php (1)
33-42: Well-structured enhancement for tag support
This implementation properly handles tag-based rendering in plain text format. The empty content check prevents adding blank lines to the output, and the tag wrapping adds clear visual distinction to tagged content with HTML-like tags.
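A rough sketch of the behaviour described here, under stated assumptions: renderTextBlock is a hypothetical free function standing in for the renderer method, the wrapping format is borrowed from the sprintf call quoted in the Markdown renderer comment, and the trailing blank line for untagged text is a guess rather than the project's actual output.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the tag-aware text rendering step;
 * not the project's PlainTextRenderer.
 */
function renderTextBlock(string $content, string $tag = ''): string
{
    // Empty content produces no output, so no stray blank lines appear.
    if ($content === '') {
        return '';
    }

    // Untagged text passes through (trailing blank line is an assumption).
    if ($tag === '') {
        return $content . "\n\n";
    }

    // Tagged text is wrapped in an HTML-like element named after the tag.
    return \sprintf("<%s>\n%s\n</%s>\n", $tag, $content, $tag);
}
```

With these assumptions, renderTextBlock('hello', 'NOTE') produces the content on its own line between the opening and closing NOTE elements.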
src/Source/BaseSource.php (2)
18-18: Good addition of tags property
The readonly array with default empty value is appropriate for storing tags.
48-54: Clean implementation of jsonSerialize
Good use of array_filter with a static anonymous function to remove null, empty string, and empty array values.
src/Source/Text/TextSource.php (3)
16-18: Good PHPDoc and constructor parameter updates
The type constraints (non-empty-string) are well defined, and the constructor maintains backward compatibility while adding the new tags functionality.
Also applies to: 23-25
38-38: Proper fromArray handling for tags
The implementation correctly handles tags from the input array with a sensible default.
46-49: Clean jsonSerialize implementation
Good use of the spread operator to include parent's serialized data, and consistent filtering logic.
src/Source/Url/UrlSource.php (3)
18-19: Good PHPDoc and constructor parameter updates
The type constraints and parameter addition are well implemented, maintaining backward compatibility.
Also applies to: 24-26
39-39: Proper fromArray handling for tags
The implementation correctly extracts tags from the input array with a sensible default.
68-71: Clean jsonSerialize implementation
Good use of the spread operator for parent's serialized data and consistent filtering approach.
README.md (4)
271-275: Documentation clearly illustrates tag usage for documents.
The addition of tags to the document properties is well-documented with a clear example showing how to categorize API documentation with relevant tags like "api", "documentation", and "v1".
285-288: Source tagging is clearly documented in both examples and reference tables.
The implementation consistently shows how to apply tags to file sources, and the Document Properties table is appropriately updated to include the new tags parameter with its description.
Also applies to: 311-311
289-294: Good example of URL source with tags.
This example clearly demonstrates how to tag URL sources with descriptive labels like "reference" and "external", helping users understand the practical application of tags for external resources.
358-358: Consistent documentation across all source types.
The tags parameter has been systematically added to all source type parameter tables with consistent descriptions, ensuring users have a clear understanding of how to use tags regardless of source type.
Also applies to: 639-639, 804-804, 925-925, 948-948
src/Source/File/FileSource.php (3)
29-30: Constructor properly updated for tag support.
The constructor has been updated to include support for tags while maintaining backward compatibility through default parameter values. The parent constructor is correctly called with the new parameters using named arguments for clarity.
Also applies to: 43-47
104-105: Tags properly handled in factory method.
The fromArray method correctly extracts tags from the input array, maintaining consistency with other parameters by using the null coalescing operator for default values.
246-247: Improved serialization with parent class delegation.
Using ...parent::jsonSerialize() simplifies the code by leveraging inheritance and ensures that any future properties added to the parent class will automatically be included in the serialized output.
src/Source/Github/GithubSource.php (3)
25-26: Constructor properly implements tag support.
The constructor has been updated with tag support in a manner consistent with other source classes, using named arguments when calling the parent constructor for clarity.
Also applies to: 39-43
91-92: Factory method correctly handles tags.
The fromArray method properly passes tags to the constructor, maintaining consistency with the implementation in other source classes.
204-205: Improved serialization with parent class delegation.
Using ...parent::jsonSerialize() simplifies the serialization code and ensures consistency across all source classes for common properties like description, tags, and modifiers.
src/Document/Document.php (4)
20-21: Constructor properly updated for tag support.
The constructor has been updated with clear PHPDoc for the tags parameter and a private property to store the tags.
Also applies to: 27-28
35-43: Factory method enhanced with tag support.
The create static factory method has been updated to support tags, maintaining a consistent approach with the constructor and preserving backward compatibility.
Also applies to: 49-50
125-134: Restored getSources method is appropriately implemented.
The getSources method has been re-added with proper return type documentation and implementation to return an indexed array of source objects.
142-144: Serialization properly includes tags.
The jsonSerialize method now includes tags in the output and uses accessor methods instead of direct property access, which is a good practice for encapsulation.
src/Source/GitDiff/CommitDiffSource.php (3)
27-27: Constructor updated to support tags
The constructor has been modified to include support for tags with proper PHPDoc annotation. The $modifiers parameter has changed from public readonly to a regular parameter, and the parent constructor is now called with named parameters including the new tags parameter.
While these changes correctly implement the tagging feature, note the visibility change for $modifiers which may affect code directly accessing this property.
Also applies to: 39-41, 42-42
92-92: Tags support added to fromArray factory method
The fromArray method now correctly handles the tags parameter from the input data with a sensible default of an empty array when not provided.
211-214: jsonSerialize method enhanced to include tags
The jsonSerialize method has been improved by:
- Including parent class serialization data (which includes tags) using the spread operator
- Adding a filter to remove null, empty string, and empty array values for cleaner output
This is a good approach to ensure tags are properly serialized while keeping the JSON output clean.
Also applies to: 222-222
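The two serialization techniques named in these comments (spreading the parent's data and filtering out empty values) might look roughly like the sketch below. The class names SketchBaseSource and SketchCommitDiffSource, the 'example' type value, and the exact field set are all hypothetical; the real BaseSource and CommitDiffSource carry more properties than shown here.

```php
<?php

declare(strict_types=1);

// Simplified, hypothetical classes for illustration only.
abstract class SketchBaseSource implements \JsonSerializable
{
    /** @param array<string> $tags */
    public function __construct(
        protected readonly string $description = '',
        protected readonly array $tags = [],
    ) {}

    public function jsonSerialize(): array
    {
        // Drop null, empty-string, and empty-array values from the output.
        return \array_filter(
            ['description' => $this->description, 'tags' => $this->tags],
            static fn(mixed $value): bool => $value !== null && $value !== '' && $value !== [],
        );
    }
}

final class SketchCommitDiffSource extends SketchBaseSource
{
    /** @param array<string> $tags */
    public function __construct(
        private readonly string $commit = 'staged',
        string $description = '',
        array $tags = [],
    ) {
        // Named arguments keep the parent call readable.
        parent::__construct(description: $description, tags: $tags);
    }

    public function jsonSerialize(): array
    {
        return \array_filter(
            [
                'type' => 'example', // placeholder value, not a real source type
                // Unpacking a string-keyed array requires PHP 8.1+.
                ...parent::jsonSerialize(),
                'commit' => $this->commit,
            ],
            static fn(mixed $value): bool => $value !== null && $value !== '' && $value !== [],
        );
    }
}
```

In this sketch, a source created with only a commit and tags serializes without a "description" key, because the parent's filter removes the empty string before the child spreads the result.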
This PR adds support for tagging both documents and sources in the context generator. Tags provide a flexible way to categorize and organize context elements, which enhances the utility of the generated documentation.
Features
Documentation
Added a new section to the README explaining the tagging feature with examples for both configuration and programmatic usage.
Using Tags in Configuration
Document-Level Tags

    {
      "documents": [
        {
          "description": "API Documentation",
          "outputPath": "docs/api.md",
          "tags": ["api", "documentation", "v1"],
          "sources": [
            // sources here
          ]
        }
      ]
    }

Source-Level Tags
Similarly, you can add tags to individual sources:

    {
      "documents": [
        {
          "description": "API Documentation",
          "outputPath": "docs/api.md",
          "sources": [
            {
              "type": "file",
              "description": "API Controllers",
              "sourcePaths": ["src/Controller"],
              "filePattern": "*.php",
              "tags": ["controllers", "php"]
            },
            {
              "type": "url",
              "description": "API Reference",
              "urls": ["https://api.example.com/docs"],
              "tags": ["reference", "external"]
            }
          ]
        }
      ]
    }

Summary by CodeRabbit
New Features
Documentation
Improvements