Move llmAsJury folder to patterns directory #21

Merged
merged 4 commits into main on Jun 11, 2025

Conversation

joel13samuel
Contributor

@joel13samuel joel13samuel commented Jun 9, 2025

Summary by CodeRabbit

  • New Features

    • Introduced a multi-agent system for generating and evaluating blog posts using AI models, including ContentWriter and Jury agents.
    • ContentWriter agent generates structured blog posts, with optional evaluation by the Jury agent.
    • Jury agent evaluates content across multiple criteria using OpenAI and Anthropic models, providing detailed feedback and consensus scoring.
  • Documentation

    • Added comprehensive README with setup, usage, and architecture details.
    • Included guidelines and API reference documents for building and configuring agents.
  • Chores

    • Added configuration files for project setup, formatting, linting, TypeScript, and environment management.
    • Added .gitignore to exclude unnecessary files from version control.


coderabbitai bot commented Jun 9, 2025

Walkthrough

This update introduces a new Agentuity-based multi-agent system called "LLM as Jury System." The changes add configuration, documentation, and source files for two AI agents: ContentWriter and Jury. The project includes setup files for development and deployment, TypeScript and linting configurations, and detailed markdown documentation for agent implementation and SDK usage.

Changes

| File(s) | Change Summary |
| --- | --- |
| .editorconfig, .gitignore, biome.json, tsconfig.json, package.json | Added project configuration, formatting, linting, and dependency management files. |
| agentuity.yaml | Introduced Agentuity project configuration, specifying agents, dev/deploy settings, and bundler options. |
| README.md | Added project overview, architecture, setup, usage instructions, and support information. |
| .cursor/rules/agent.mdc, .cursor/rules/agentuity.mdc, .cursor/rules/sdk.mdc | Added markdown documentation for agent coding guidelines, configuration, and SDK API reference. |
| index.ts | Added entry script to initialize and run the Agentuity project with environment checks and error handling. |
| src/agents/ContentWriter/index.ts | Implemented ContentWriter agent for generating blog posts and optionally handing off to the Jury agent for evaluation. |
| src/agents/Jury/index.ts | Implemented Jury agent for multi-model evaluation of text content, aggregating results from OpenAI and Anthropic. |
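The environment checks described for index.ts might look like the following — a hedged sketch under assumed variable names, not the actual file's contents:

```typescript
// Hypothetical sketch of the startup validation performed by the entry script;
// the key names and messages here are assumptions, not the real index.ts.
const REQUIRED_KEYS = ['OPENAI_API_KEY', 'ANTHROPIC_API_KEY'];

function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_KEYS.filter((name) => !env[name]);
}

const missing = missingEnv(process.env as Record<string, string | undefined>);
if (missing.length > 0) {
  console.error(`Missing environment variables: ${missing.join(', ')}`);
  // The real entry point also prints a helpful suggestion and exits non-zero.
}
```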

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ContentWriter
    participant Jury

    User->>ContentWriter: Submit blog topic (optional: evaluate flag)
    ContentWriter->>ContentWriter: Generate blog post with GPT-4o Mini
    alt Evaluate flag is false
        ContentWriter-->>User: Return blog post text
    else Evaluate flag is true or omitted
        ContentWriter->>Jury: Handoff blog post for evaluation
        Jury->>Jury: Evaluate content with multiple AI models (GPT-4o Mini, GPT-4, Claude)
        Jury-->>ContentWriter: Return evaluation report
        ContentWriter-->>User: Return evaluation report
    end
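The alt/else branch in the diagram can be sketched in plain TypeScript (`generatePost` and `runJury` are hypothetical stand-ins for the two agents, not the Agentuity SDK's handoff API):

```typescript
// Hypothetical stand-ins for the two agents; the real project wires these
// together through the Agentuity SDK's handoff mechanism.
type Evaluation = { score: number; feedback: string };

function generatePost(topic: string): string {
  return `# ${topic}\n\nA generated blog post about ${topic}.`;
}

function runJury(post: string): Evaluation {
  // Placeholder: the real Jury agent aggregates scores from several models.
  return { score: post.length > 0 ? 8 : 0, feedback: 'Well structured.' };
}

// Mirrors the diagram: skip the Jury only when evaluate is explicitly false.
function handleRequest(topic: string, evaluate = true): string | Evaluation {
  const post = generatePost(topic);
  return evaluate ? runJury(post) : post;
}
```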

Suggested reviewers

  • rblalock

Poem

In the meadow of code, two agents awake,
One pens a blog post for readers’ sake.
The other, a jury, with models so wise,
Judges each paragraph, offers a prize.
Together they dance, AI in review—
A system of rabbits, making something new!
🐇✨


@coderabbitai coderabbitai bot requested a review from rblalock June 9, 2025 14:24

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🔭 Outside diff range comments (1)
patterns/llmAsJury/.agentuity-crash-1749478389.json (1)

1-20: ⚠️ Potential issue

Remove crash report from version control
Committing autogenerated crash dumps can clutter the repo and expose sensitive details. Exclude .agentuity-crash-*.json from source control and add an appropriate pattern to .gitignore.

🧹 Nitpick comments (10)
patterns/llmAsJury/.editorconfig (1)

6-12: Consider adding file-type specific overrides.
You might want to tailor settings for TypeScript, JSON, Markdown, etc., for example using spaces for JSON and ensuring final newlines in Markdown:

+[*.{ts,tsx}]
+indent_style = space
+indent_size = 2
+
+[*.json]
+indent_style = space
+indent_size = 2
+
+[*.md]
+trim_trailing_whitespace = false
+insert_final_newline = true
patterns/llmAsJury/.gitignore (2)

16-16: Broaden report file ignore
report.[0-9]_.[0-9]_.[0-9]_.[0-9]_.json is overly specific and the underscores likely aren’t intended literal characters. Use a wildcard to catch all report JSON files:

- report.[0-9]_.[0-9]_.[0-9]_.[0-9]_.json
+ report*.json

36-37: Expand Agentuity ignore rules

  • Use a trailing slash to clearly denote the directory: .agentuity/
  • Add crash files generated by Agentuity at the repo root: .agentuity-crash-*.json
  • Also ignore the Agentuity cursor folder: .cursor/
-.agentuity
+.agentuity/
+.agentuity-crash-*.json
+.cursor/
patterns/llmAsJury/package.json (1)

4-4: Consider updating main entry point to match module field.

There's an inconsistency between the main field (index.js) and module field (index.ts). Consider aligning them or clarifying the intended entry point.

-  "main": "index.js",
+  "main": "index.ts",
patterns/llmAsJury/README.md (3)

41-50: Specify language for fenced code block
The CLI usage snippet is fenced without a language. Adding bash (or shell) will enable syntax highlighting and improve readability.


80-87: Specify language for fenced code block
The project structure listing is a code block without a language. Use bash or text to clarify formatting.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

80-80: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


66-68: Refine informal tone
The description for Claude (“Pretty cool model I can't lie”) is overly casual. Consider revising to maintain a professional documentation style.

patterns/llmAsJury/.cursor/rules/agent.mdc (1)

28-32: Fix grammatical error in description
The sentence is missing a connector. Update accordingly:

-The AgentRequest interface provides a set of helper methods and public variables which can be used for working with data has been passed to the Agent.
+The AgentRequest interface provides a set of helper methods and public variables for working with data that has been passed to the Agent.
patterns/llmAsJury/src/agents/ContentWriter/index.ts (1)

11-11: Fix the grammatical error in the welcome message.

There's a typo with double periods at the end of the sentence.

Apply this diff to fix the typo:

-Enter in a topic and I will generate a blog post about it, along with a score from the jury!.`,
+Enter in a topic and I will generate a blog post about it, along with a score from the jury!`,
patterns/llmAsJury/src/agents/Jury/index.ts (1)

8-13: Consider using the context logger instead of console.error.

For consistency with the rest of the codebase, consider using the context logger if available during initialization.

Since the Anthropic client initialization happens at the module level before the agent context is available, the current approach is acceptable. However, you could log this error inside the agent function for better observability:

export default async function JuryAgent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext
) {
+  if (!anthropicClient) {
+    ctx.logger.warn('Anthropic client not available - Claude evaluation will be skipped');
+  }
  try {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 260cd00 and f391c21.

⛔ Files ignored due to path filters (1)
  • patterns/llmAsJury/bun.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • patterns/llmAsJury/.agentuity-crash-1749478389.json (1 hunks)
  • patterns/llmAsJury/.cursor/rules/agent.mdc (1 hunks)
  • patterns/llmAsJury/.cursor/rules/agentuity.mdc (1 hunks)
  • patterns/llmAsJury/.cursor/rules/sdk.mdc (1 hunks)
  • patterns/llmAsJury/.editorconfig (1 hunks)
  • patterns/llmAsJury/.gitignore (1 hunks)
  • patterns/llmAsJury/README.md (1 hunks)
  • patterns/llmAsJury/agentuity.yaml (1 hunks)
  • patterns/llmAsJury/biome.json (1 hunks)
  • patterns/llmAsJury/index.ts (1 hunks)
  • patterns/llmAsJury/package.json (1 hunks)
  • patterns/llmAsJury/src/agents/ContentWriter/index.ts (1 hunks)
  • patterns/llmAsJury/src/agents/Jury/index.ts (1 hunks)
  • patterns/llmAsJury/tsconfig.json (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
patterns/llmAsJury/src/agents/ContentWriter/index.ts (1)
patterns/llmAsJury/src/agents/Jury/index.ts (1)
  • welcome (27-38)
patterns/llmAsJury/src/agents/Jury/index.ts (1)
patterns/llmAsJury/src/agents/ContentWriter/index.ts (1)
  • welcome (5-14)
🪛 LanguageTool
patterns/llmAsJury/README.md

[misspelling] ~62-~62: This word is normally spelled as one.
Context: ...adings - Strong conclusions ### Jury A multi-model evaluation system that provides balance...

(EN_COMPOUNDS_MULTI_MODEL)


[misspelling] ~65-~65: Possible spelling mistake found.
Context: ...essment using: Default Models: - GPT-4o Mini: Precise and thorough evaluator - **G...

(EN_MULTITOKEN_SPELLING_TWO)

🪛 markdownlint-cli2 (0.17.2)
patterns/llmAsJury/README.md

80-80: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

🔇 Additional comments (11)
patterns/llmAsJury/.editorconfig (2)

1-4: Confirm root scope behavior.
Setting root = true will stop EditorConfig from walking up to parent directories for additional configs. If you intend this to be a standalone config for only this subfolder, keep it. Otherwise, consider removing it or setting root = false to inherit global rules.


6-12: Basic EditorConfig settings look solid.
Indent style, charset, line endings, and whitespace rules align well with a TypeScript project in this directory.

patterns/llmAsJury/.cursor/rules/sdk.mdc (1)

1-98: Excellent SDK documentation structure.

The documentation is comprehensive and well-organized, covering all major SDK interfaces with clear method descriptions and organized sections for different functionalities.

patterns/llmAsJury/.cursor/rules/agentuity.mdc (1)

1-10: LGTM - Clear and appropriate configuration.

The configuration appropriately sets guidelines for the Agentuity YAML file and clearly communicates that it shouldn't be edited, which is sensible for platform-managed configuration files.

patterns/llmAsJury/biome.json (1)

1-27: LGTM - Well-configured Biome setup.

The configuration follows good practices with sensible formatting rules (2-space indentation, single quotes, semicolons) and appropriately excludes the generated .agentuity directory.

patterns/llmAsJury/package.json (1)

29-37: LGTM - Appropriate dependencies for AI agent project.

The dependencies are well-chosen for a multi-agent AI system, including the Agentuity SDK, AI provider SDKs (OpenAI, Anthropic), and supporting libraries like Zod for validation.

patterns/llmAsJury/index.ts (1)

1-39: Excellent entry point implementation with comprehensive error handling.

The script provides thorough environment validation, clear error messages with helpful suggestions, and proper error handling. The runtime environment detection and appropriate guidance for different scenarios is particularly well done.

patterns/llmAsJury/tsconfig.json (1)

1-28: TypeScript configuration looks solid
The strict options and module settings align well with the project’s needs. No changes required.

patterns/llmAsJury/src/agents/ContentWriter/index.ts (2)

60-67: LGTM! The handoff implementation is well-structured.

The handoff correctly uses an object with the agent name and properly structures the data payload with content type and metadata. This aligns well with the Jury agent's expectations.


48-48:

❓ Verification inconclusive

Verify the req.get() method signature and behavior.

The code uses req.get('evaluate', true) with a default value. Please verify that this method signature is correct according to the Agentuity SDK documentation.

Let me check if the req.get() method supports a default value parameter:


🏁 Script executed:

#!/bin/bash
# Description: Search for the AgentRequest interface definition and req.get method usage

# Find the AgentRequest interface definition
ast-grep --pattern 'interface AgentRequest {
  $$$
}'

# Search for other uses of req.get with default values
rg -A 2 'req\.get\(' --type ts

Length of output: 726


🌐 Web query:

In the Agentuity SDK documentation, what is the method signature for `AgentRequest.get`, and does it support a default value parameter?

💡 Result:

I couldn't locate specific documentation for the AgentRequest.get method within the Agentuity SDK. It's possible that this method is part of a proprietary or less-documented API. To obtain accurate information about its method signature and support for default value parameters, I recommend the following steps:

  1. Consult Official Documentation: Review the official Agentuity SDK documentation or API reference guides. These resources typically provide detailed information about method signatures, parameters, and usage examples.

  2. Examine Source Code: If the SDK is open-source or if you have access to its source code, inspect the implementation of the AgentRequest class and its get method. This will reveal the method signature and any default parameters it may support.

  3. Reach Out to Support: Contact Agentuity's support team or community forums. They can offer guidance, answer specific questions, and provide insights into the SDK's features and functionalities.

By following these steps, you should be able to gather comprehensive information about the AgentRequest.get method and its capabilities within the Agentuity SDK.


Verify req.get('evaluate', true) Supports a Default Value

I wasn’t able to find an AgentRequest.get definition in our codebase or public docs that accepts a fallback parameter. Please double-check the Agentuity SDK’s type definitions or source to confirm:

  • Open the SDK’s AgentRequest interface/class in node_modules/@agentuity/sdk (or wherever it’s installed) and inspect the get method signature.
  • Ensure it declares a second defaultValue?: T argument.
  • If it doesn’t, consider using a nullish-coalescing fallback instead:
    const evaluateFlag = req.get('evaluate') as boolean ?? true;
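As a standalone illustration of why `??` is the right fallback for a boolean flag, consider this minimal stand-in (the `get` function here is hypothetical, not the Agentuity SDK method):

```typescript
// A minimal stand-in for a request object whose get() may return undefined;
// this `get` is hypothetical, not the SDK's AgentRequest.get.
const params = new Map<string, unknown>([['evaluate', false]]);
const get = (key: string) => params.get(key);

// ?? falls back only on null/undefined, so an explicit `false` survives:
const withNullish = (get('evaluate') as boolean | undefined) ?? true; // false
// || coerces any falsy value, silently flipping the flag back to true:
const withOr = (get('evaluate') as boolean | undefined) || true; // true
// A missing key still picks up the default with ??:
const whenMissing = (get('unset') as boolean | undefined) ?? true; // true
```

This is why the suggested fallback uses nullish coalescing rather than a plain `||` default.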
patterns/llmAsJury/src/agents/Jury/index.ts (1)

158-170: LGTM! Excellent defensive programming for Claude response parsing.

The nested error handling and careful type checking for Claude's response structure demonstrates robust defensive coding practices. This prevents runtime errors from unexpected API response formats.

@@ -0,0 +1,19 @@
{
Member

Delete this @joel13samuel

<br />
</div>

# 🤖 LLM as Jury System
Member

Needs deploy button.

@coderabbitai coderabbitai bot requested a review from rblalock June 11, 2025 14:40

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
patterns/llmAsJury/README.md (2)

64-71: Refine casual tone and phrasing.
The Jury section is informative, but the Claude description is overly informal. Consider:

- **Claude**: Pretty cool model I can't lie
+ **Claude**: Versatile evaluator known for nuanced language understanding

Also, adjust “provides balance” to “provides a balanced assessment” for grammatical consistency.

🧰 Tools
🪛 LanguageTool

[misspelling] ~65-~65: This word is normally spelled as one.
Context: ...adings - Strong conclusions ### Jury A multi-model evaluation system that provides balance...

(EN_COMPOUNDS_MULTI_MODEL)


[misspelling] ~68-~68: Possible spelling mistake found.
Context: ...essment using: Default Models: - GPT-4o Mini: Precise and thorough evaluator - **G...

(EN_MULTITOKEN_SPELLING_TWO)


82-90: Specify language for project-structure code block.
The fenced code block should declare a language (e.g., bash or text) to satisfy markdown linters:

- ```
+ ```bash
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

83-83: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 41b5f20 and c38cecd.

📒 Files selected for processing (1)
  • patterns/llmAsJury/README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
patterns/llmAsJury/README.md

[misspelling] ~65-~65: This word is normally spelled as one.
Context: ...adings - Strong conclusions ### Jury A multi-model evaluation system that provides balance...

(EN_COMPOUNDS_MULTI_MODEL)


[misspelling] ~68-~68: Possible spelling mistake found.
Context: ...essment using: Default Models: - GPT-4o Mini: Precise and thorough evaluator - **G...

(EN_MULTITOKEN_SPELLING_TWO)

🪛 markdownlint-cli2 (0.17.2)
patterns/llmAsJury/README.md

83-83: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

🔇 Additional comments (12)
patterns/llmAsJury/README.md (12)

1-6: Skip: HTML header block is well-formed.
The centered logo and tagline effectively introduce the project.


8-11: Solid introduction with deploy button.
The “Deploy with Agentuity” badge at the top makes it easy for users to get started immediately.


12-25: Clear overview and workflow.
The “Overview” and “How It Works” sections succinctly explain the multi-agent system architecture and flow.


26-35: Concise Quick Start.
Prerequisites and setup steps are straightforward and actionable.


38-54: Comprehensive usage examples.
Both DevMode UI and CLI invocation snippets are well-documented and easy to follow.


55-63: Detailed ContentWriter agent description.
The breakdown of output structure (titles, intros, subheadings, conclusions) matches expected behavior.


72-80: Optional extensibility note.
Instructions for adding other models (Grok, Llama, Mistral) are clear and helpful.


91-107: Development and deployment commands.
The DevMode, agent creation, and deployment sections clearly outline the CLI commands.


109-113: Skip: Environment variable commands are correct.
The examples for setting and securing variables are explicit and accurate.


115-119: Good external documentation links.
Linking to the Agentuity SDK docs aids users needing deeper reference material.


120-124: Support channels are well-defined.
Providing both documentation and community links covers typical user support needs.


125-128: License section is correctly placed.
The reference to the LICENSE file is clear and complete.

@joel13samuel joel13samuel merged commit 05512ab into main Jun 11, 2025
1 check passed
2 participants