Configurable QA agent #116

Open
lalaliat wants to merge 2 commits into agentscope-ai:main from lalaliat:main

Conversation

@lalaliat (Collaborator) commented Feb 2, 2026

Supports user-customizable QA agents, including:

  • name
  • system_prompt
  • tools
  • model
  • collection_name
  • file

@lalaliat lalaliat requested review from a team and xieyxclack February 2, 2026 07:20

cla-assistant bot commented Feb 2, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


料料 does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@gemini-code-assist

Summary of Changes

Hello @lalaliat, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the agent framework by introducing a specialized QAAgent for question answering, complete with RAG capabilities and GitHub tool integration. It also refactors the base agent class with a new factory method for more flexible agent creation and provides a command-line utility for simplified agent setup and interaction. These changes aim to make agents more customizable and powerful for diverse use cases.

Highlights

  • Introduction of QAAgent: A new QAAgent class is added, specializing in question answering with Retrieval-Augmented Generation (RAG) capabilities, leveraging Qdrant for knowledge storage.
  • Simplified Agent Creation: The AliasAgentBase now features an asynchronous create class method, streamlining the instantiation of agents with configurable models, system prompts, tools, and long-term memory.
  • Command-Line Agent Utility: A new create_agent.py script provides a command-line interface for easily creating and interacting with both AliasAgentBase and QAAgent instances, supporting various customizations.
  • RAG and GitHub Tool Integration: The QAAgent is designed to process user-provided files into a Qdrant knowledge base for RAG and integrates GitHub MCP tools for enhanced functionality.
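
The asynchronous create class method mentioned in the highlights follows a common factory pattern: setup that must be awaited cannot live in `__init__`, so it moves into an async classmethod. A minimal sketch of that pattern, assuming nothing about the real AliasAgentBase; the class `SketchAgent`, its fields, and its defaults are hypothetical:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SketchAgent:
    """Hypothetical stand-in for an agent class; not the real AliasAgentBase."""

    name: str
    system_prompt: str
    tools: list = field(default_factory=list)

    @classmethod
    async def create(cls, name, system_prompt="You are a helpful assistant.", tools=None):
        # Anything that must be awaited (for example, connecting to a tool
        # server) would go here; a plain __init__ cannot await.
        await asyncio.sleep(0)
        return cls(name=name, system_prompt=system_prompt, tools=list(tools or []))


# Callers obtain a ready-to-use instance from a single awaited call.
agent = asyncio.run(SketchAgent.create("qa", tools=["retrieve_knowledge"]))
```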

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new customizable QAAgent with RAG capabilities, along with a factory method in AliasAgentBase and a command-line script for agent creation. The changes are extensive and well-structured. My review focuses on improving code clarity, robustness, and maintainability. I've pointed out a duplicate import, an unused import, and suggested simplifications for complex logic in a few places. I've also highlighted an issue with unsafe environment variable access that could lead to a crash and recommended using the logger for consistent error handling instead of printing to stdout. The provided rule regarding assert for LLM responses did not apply to any of the comments.

Comment on lines +285 to +300
        try:
            knowledge = SimpleKnowledge(
                embedding_store=QdrantStore(
                    location=None,
                    client_kwargs={
                        "host": QDRANT_HOST,  # Qdrant server address
                        "port": QDRANT_PORT,  # Qdrant server port
                    },
                    collection_name=collection_name,
                    dimensions=1024,  # The dimension of the embedding vectors
                ),
                embedding_model=DashScopeTextEmbedding(
                    api_key=os.environ["DASHSCOPE_API_KEY"],
                    model_name="text-embedding-v4",
                ),
            )

high

Directly accessing os.environ["DASHSCOPE_API_KEY"] on line 297 will raise a KeyError and crash the application if the environment variable is not set. This should be handled more gracefully by checking for the variable's existence first and logging an informative error, similar to how GITHUB_TOKEN is handled elsewhere in the file.

        try:
            dashscope_api_key = os.getenv("DASHSCOPE_API_KEY")
            if not dashscope_api_key:
                logger.error(
                    "Missing DASHSCOPE_API_KEY; RAG tool 'retrieve_knowledge' cannot be used. "
                    "Please export DASHSCOPE_API_KEY in your environment.",
                )
                return

            knowledge = SimpleKnowledge(
                embedding_store=QdrantStore(
                    location=None,
                    client_kwargs={
                        "host": QDRANT_HOST,  # Qdrant server address
                        "port": QDRANT_PORT,  # Qdrant server port
                    },
                    collection_name=collection_name,
                    dimensions=1024,  # The dimension of the embedding vectors
                ),
                embedding_model=DashScopeTextEmbedding(
                    api_key=dashscope_api_key,
                    model_name="text-embedding-v4",
                ),
            )
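
The same check can be factored into a small helper so that every required variable is handled the same way. A minimal sketch, assuming the standard library `logging` module instead of the project's logger; `read_required_env` and the logger name are hypothetical and do not appear in the pull request:

```python
import logging
import os
from typing import Optional

logger = logging.getLogger("qa_agent_sketch")  # hypothetical logger name


def read_required_env(name: str) -> Optional[str]:
    """Return the variable's value, or None after logging an error if it is unset or empty."""
    value = os.getenv(name)  # returns None instead of raising KeyError
    if not value:
        logger.error("Missing %s; please export it in your environment.", name)
        return None
    return value
```

A caller would then write `api_key = read_required_env("DASHSCOPE_API_KEY")` and return early when the result is `None`, rather than letting a `KeyError` propagate.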

    DataScienceAgent,
    init_ds_toolkit,
)
from alias.agent.agents._qa_agent import QAAgent

medium

QAAgent is imported twice in this file (also on line 3). This redundant import should be removed to improve code clarity.

"""
import hashlib
import os
import re

medium

The re module is imported but not used within this file. It should be removed to keep the code clean.

Comment on lines +127 to +140
                # Resolve (files to process, collection_name) for initial load
                if file is None and collection_name is None:
                    files_to_process = [DEFAULT_RAG_FILE_PATH]
                    init_collection = DEFAULT_COLLECTION_NAME
                elif file is not None and collection_name is None:
                    files_to_process = file
                    init_collection = DEFAULT_COLLECTION_NAME
                elif file is None and collection_name is not None:
                    files_to_process = [DEFAULT_RAG_FILE_PATH]
                    init_collection = collection_name
                else:
                    files_to_process = file
                    init_collection = collection_name
                await cls._process_files(files_to_process, init_collection)

medium

The conditional logic for determining files_to_process and init_collection is unnecessarily complex. It can be simplified for better readability and maintainability.

                # Resolve (files to process, collection_name) for initial load
                files_to_process = file if file is not None else [DEFAULT_RAG_FILE_PATH]
                init_collection = collection_name if collection_name is not None else DEFAULT_COLLECTION_NAME
                await cls._process_files(files_to_process, init_collection)
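
Because the two inputs are resolved independently, the four-branch form and the two-line form should agree on every combination of `None` and non-`None` arguments. A minimal sketch that checks this exhaustively; the two constant values are placeholders, and both function names are hypothetical wrappers around the logic quoted in this comment:

```python
from itertools import product

DEFAULT_RAG_FILE_PATH = "default.md"  # placeholder value, not the project's
DEFAULT_COLLECTION_NAME = "default"   # placeholder value, not the project's


def resolve_branching(file, collection_name):
    """Four-branch version, mirroring the original diff."""
    if file is None and collection_name is None:
        return [DEFAULT_RAG_FILE_PATH], DEFAULT_COLLECTION_NAME
    elif file is not None and collection_name is None:
        return file, DEFAULT_COLLECTION_NAME
    elif file is None and collection_name is not None:
        return [DEFAULT_RAG_FILE_PATH], collection_name
    else:
        return file, collection_name


def resolve_simplified(file, collection_name):
    """Two-line version, mirroring the suggested change."""
    files = file if file is not None else [DEFAULT_RAG_FILE_PATH]
    collection = collection_name if collection_name is not None else DEFAULT_COLLECTION_NAME
    return files, collection


# Every combination of (None / given) for both arguments.
cases = list(product([None, ["a.md"]], [None, "docs"]))
all_equal = all(resolve_branching(f, c) == resolve_simplified(f, c) for f, c in cases)
```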

            )
            logger.info(f"Registered retrieve_knowledge tool with collection '{collection_name}'")
        except Exception as e:
            print(traceback.format_exc())

medium

Using print(traceback.format_exc()) writes to standard output, which is not ideal for application logging. It's better to use the configured logger (e.g., logger.error()) for consistent log handling. A similar issue exists on line 354.

            logger.error(traceback.format_exc())
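
For comparison, the standard library's `logging` module can attach the traceback itself through `Logger.exception`, so no manual `traceback.format_exc()` call is needed. A minimal sketch, assuming the stdlib logger instead of the project's; `register_tool`, the logger name, and the error message are hypothetical:

```python
import io
import logging

# Route log output to an in-memory stream so the result can be inspected.
log_stream = io.StringIO()
logger = logging.getLogger("rag_setup_sketch")
logger.setLevel(logging.ERROR)
logger.addHandler(logging.StreamHandler(log_stream))
logger.propagate = False


def register_tool():
    """Hypothetical setup step that fails."""
    raise RuntimeError("Qdrant unreachable")


try:
    register_tool()
except Exception:
    # Logs at ERROR level and appends the active traceback automatically.
    logger.exception("Failed to register retrieve_knowledge tool")

captured = log_stream.getvalue()
```

The message and the traceback then travel through the same handlers, levels, and formatting as every other log record.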

prompt_text = resolve_system_prompt(system_prompt)
tools_list = normalize_tools(tools)
agent_kind = normalize_agent_type(agent_type)
file_list = normalize_file_list(file) if file is not None else None

medium

The normalize_file_list function already handles None input by returning an empty list. The conditional if file is not None is therefore redundant and can be removed to simplify the code.

Suggested change
- file_list = normalize_file_list(file) if file is not None else None
+ file_list = normalize_file_list(file)
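
Dropping the guard is only behaviour-preserving if later code treats an empty list the same as `None`; code that tests `file is None`, as the initial-load logic quoted earlier in this review does, would take a different branch. A minimal sketch of the difference, assuming a hypothetical `normalize_file_list` that behaves as this comment describes (the real implementation is not shown in the pull request page):

```python
def normalize_file_list(file):
    """Hypothetical version: None -> [], str -> [str], list -> list without empty entries."""
    if file is None:
        return []
    if isinstance(file, str):
        return [file] if file else []
    return [p for p in file if p]


def with_guard(file):
    # Current code: None is passed through unchanged.
    return normalize_file_list(file) if file is not None else None


def without_guard(file):
    # Suggested code: None becomes an empty list.
    return normalize_file_list(file)
```

For any non-`None` input the two agree; only the `None` case changes, from `None` to `[]`.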

Comment on lines +160 to +163
except Exception as e:
    import traceback
    print(f"Error: {e}")
    traceback.print_exc()

medium

Using print() and traceback.print_exc() for error handling writes directly to standard output. It's better to use a logger for consistent error reporting, which allows for better control over log levels, formatting, and destinations. You can add from loguru import logger at the top of the file.

Suggested change
- except Exception as e:
-     import traceback
-     print(f"Error: {e}")
-     traceback.print_exc()
+ except Exception as e:
+     import traceback
+     from loguru import logger
+     logger.error(f"Error: {e}\n{traceback.format_exc()}")

Comment on lines +235 to +241
file_arg = args.file
if file_arg is not None and isinstance(file_arg, list) and len(file_arg) == 0:
    file_arg = None
if file_arg is not None and isinstance(file_arg, list) and len(file_arg) == 1:
    file_arg = file_arg[0] if file_arg[0] else None
if file_arg is not None and isinstance(file_arg, list):
    file_arg = [p for p in file_arg if p]

medium

The logic to normalize the --file argument is complex and can be simplified to more cleanly handle both space-separated and comma-separated file paths. The current implementation has multiple checks and reassignments which are hard to follow.

Suggested change
- file_arg = args.file
- if file_arg is not None and isinstance(file_arg, list) and len(file_arg) == 0:
-     file_arg = None
- if file_arg is not None and isinstance(file_arg, list) and len(file_arg) == 1:
-     file_arg = file_arg[0] if file_arg[0] else None
- if file_arg is not None and isinstance(file_arg, list):
-     file_arg = [p for p in file_arg if p]
+ file_arg = args.file
+ if file_arg:
+     processed_files = []
+     for path_or_paths in file_arg:
+         processed_files.extend([p.strip() for p in path_or_paths.split(',') if p.strip()])
+     file_arg = processed_files
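
The flattening step can be exercised end to end with `argparse`. A minimal sketch, assuming `--file` is declared with `nargs="*"` (the actual declaration is not shown here); `split_file_args` is a hypothetical helper. Unlike the suggested change, it maps an empty result back to `None`, which is what the original code did for an empty list:

```python
import argparse


def split_file_args(values):
    """Flatten space- and comma-separated paths; return None when nothing remains."""
    if not values:
        return None
    paths = []
    for chunk in values:
        # Each CLI token may itself hold several comma-separated paths.
        paths.extend(p.strip() for p in chunk.split(",") if p.strip())
    return paths or None


parser = argparse.ArgumentParser()
parser.add_argument("--file", nargs="*", default=None)  # assumed declaration

args = parser.parse_args(["--file", "a.md,b.md", "c.md"])
files = split_file_args(args.file)
```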

@xieyxclack changed the title from "create_agent" to "Configurable QA agent" Feb 2, 2026