These are the agents used to power the IBM Granite Playground.
Implemented using beeai-framework, the agents can be exposed with the A2A (or the deprecated ACP) protocol:
- basic chat with Granite (no external sources)
- chat with thinking (no external sources, the LLM will perform additional reasoning steps)
- chat plus search (uses an external search source: DuckDuckGo, Google, or Tavily)
- deep research (uses external search with additional planning and recursion)
- Python (version range specified in individual pyproject.toml files)
- UV package manager: https://docs.astral.sh/uv/
- An A2A client, e.g. Agent Stack (recommended)
- Access to an LLM, e.g. served locally via Ollama
This guide will get you started with the out-of-the-box agent configuration. The defaults may not be optimal for your desired use case, so we recommend reviewing the configuration after you have successfully run the agents with them. The configuration options can be overridden with environment variables (including via a .env file).
You can use Agent Stack with a variety of LLM services. If you don't have access to one, you can run an LLM locally using Ollama on your machine. You need to have Ollama downloaded and running with the Granite 4 model. Follow these steps:
- Go to the Ollama website (https://ollama.com) and download the installer for your system
- Start the Ollama server
- Pull the ibm/granite4 and nomic-embed-text models:

```shell
ollama pull ibm/granite4:latest
ollama pull nomic-embed-text:latest
```
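As an optional sanity check (not part of the steps above), you can ask the Ollama server which models it has pulled. This assumes Ollama is running on its default port:

```python
# Optional check: list the models the local Ollama server has available.
# Assumes the default Ollama address http://localhost:11434.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = [m["name"] for m in json.load(resp)["models"]]

# Expect to see ibm/granite4:latest and nomic-embed-text:latest listed.
print(models)
```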
An A2A client is required to use the agents. The agents are designed to work with the Agent Stack Platform and take advantage of several A2A extensions offered by the platform. Follow these steps:
- Refer to the Agent Stack quick start guide to download, install, and run the platform
- If you didn't opt to start the platform when installing, you can run it like this:

```shell
agentstack platform start
```

- Wait for the platform to fully start - it takes a few minutes the first time
- Configure LLM provider(s) - connect to your local Ollama server or add credentials for other LLM services you have access to
There is an agentstack-cli reference you can use for further commands, but the above is sufficient to get started.
Important
You must configure access to an LLM and an Embedding model
Select which agent you would like to run and start the agent:
```shell
# pick one of these
uv --directory a2a run -m a2a_agents.agents.agent
uv --directory a2a run -m a2a_agents.agents.chat.agent_chat
uv --directory a2a run -m a2a_agents.agents.chat.agent_search
uv --directory a2a run -m a2a_agents.agents.chat.agent_research
```

After starting the agent, you will see lots of log output. If you're running the Agent Stack Platform, the agent will register itself with the platform and you will see the following log message that indicates success:
```
INFO | agentstack_sdk | Agent registered successfully
```
You can now interact with the agents via the Agent Stack Platform user interface in your web browser. Run the following to start your web browser at the appropriate page:
```shell
agentstack ui
```

The UI will start in your web browser. Select the ☰ hamburger menu (top left) and click on the Granite agent that you are running. Once selected, you can type your prompt into the input box to run the agent.
Tip
If you did not configure them during Agent Stack installation, the first time you start the Agent Stack Platform UI you will need to select an LLM back end and an embedding model.
Run the ACP agent
We do not recommend using the ACP version of the agents since the ACP protocol has been deprecated. Access to the ACP agents is via direct HTTP connection since these agents will not work with the Agent Stack Platform.
Instructions for running and connecting to the ACP agents are available below:
```shell
uv --directory acp run -m acp_agent.agent
```

Use the agent via an HTTP POST request:
```shell
curl -X POST \
  --url http://localhost:8000/runs \
  -H 'content-type: application/json' \
  -d '{
    "agent_name": "granite-chat",
    "input": [
      {
        "role": "user",
        "parts": [
          {
            "content": "Hello"
          }
        ]
      }
    ],
    "mode": "stream"
  }'
```

The core library is designed to use Granite models that can be served from a variety of back ends. To configure the library, ensure environment variables are in place when running the code (this can be done via a .env file). The full configuration options are documented in the granite_core config.py file, where you will find a brief description of each option, the data type it expects, potential limitations on values, and a default value.
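The contents of config.py are not reproduced here; purely as an illustration of the pattern it describes (option names read from environment variables or a .env file, with types and defaults), a config module along these lines could be written with pydantic-settings (an assumption for this sketch, not a statement about the repository's actual implementation):

```python
# Illustrative sketch only: the real options live in granite_core's config.py.
# pydantic-settings is assumed here purely to demonstrate env-var/.env-driven
# configuration with types, secrets, and defaults.
from pydantic import SecretStr
from pydantic_settings import BaseSettings, SettingsConfigDict


class CoreConfig(BaseSettings):
    # Field names match env vars case-insensitively, e.g. LLM_PROVIDER.
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    llm_provider: str = "ollama"                      # LLM_PROVIDER
    llm_model: str = "ibm/granite4"                   # LLM_MODEL
    ollama_base_url: str = "http://localhost:11434"   # OLLAMA_BASE_URL
    embeddings_provider: str = "ollama"               # EMBEDDINGS_PROVIDER
    embeddings_model: str = "nomic-embed-text"        # EMBEDDINGS_MODEL
    retriever: str = "duckduckgo"                     # RETRIEVER
    tavily_api_key: SecretStr | None = None           # TAVILY_API_KEY (secret)
    log_level: str = "INFO"                           # LOG_LEVEL


config = CoreConfig()  # reads the process environment, then .env
```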
The agents are configured in a similar way to the core, via environment variables (which can also be set via a .env file). The configurations are in the relevant config.py files for each agent. The agents will start without any additional configuration by adopting default values, such as using Granite models served via a local Ollama and search provided by a simple DuckDuckGo implementation. This is sufficient for initial experimental usage. However, you are encouraged to explore the options to achieve better performance for your use case.
The following table illustrates some of the main options:

| Option | Default | Notes |
|---|---|---|
| USE_AGENTSTACK_LLM | True | Automatically configures LLM and Embedding model access via Agent Stack |
| OLLAMA_BASE_URL | http://localhost:11434 | Update this if running Ollama on a non-standard port or alternate host |
| LLM_PROVIDER | ollama | Alternate providers are watsonx or openai. |
| LLM_MODEL | ibm/granite4 | Update to the ID required by the LLM_PROVIDER. Granite 3 also supported. |
| EMBEDDINGS_PROVIDER | ollama | Alternate providers are watsonx or openai. |
| EMBEDDINGS_MODEL | nomic-embed-text | Use an appropriate long-context embedding model from your provider. |
| RETRIEVER | duckduckgo | Alternate retrievers are google and tavily. Used in search/research. |
| LOG_LEVEL | INFO | You can get more verbose logs by setting to DEBUG. |
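For example, a .env file that keeps most defaults but points at a remote Ollama host and turns up logging might look like this (the host value is illustrative):

```
OLLAMA_BASE_URL=http://my-ollama-host:11434
LOG_LEVEL=DEBUG
```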
Retriever options don't have default values but must be set when configuring an alternate retriever:
| Option | Type | Notes |
|---|---|---|
| GOOGLE_API_KEY | Secret | The API key used to access Google search |
| GOOGLE_CX_KEY | Secret | The CX key used to access Google search |
| TAVILY_API_KEY | Secret | The API key used to access Tavily search |
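For example, switching search to Tavily means setting the retriever and its key together (the key value is a placeholder):

```
RETRIEVER=tavily
TAVILY_API_KEY=<your-tavily-api-key>
```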
LLM provider options don't have default values but must be set when configuring an alternate LLM provider:
| Option | Type | Notes |
|---|---|---|
| WATSONX_API_BASE | URL | Watsonx URL e.g. https://us-south.ml.cloud.ibm.com |
| WATSONX_REGION | Str | Watsonx Region e.g. us-south |
| WATSONX_PROJECT_ID | Secret | Required if setting LLM_PROVIDER to watsonx |
| WATSONX_API_KEY | Secret | Required if setting LLM_PROVIDER to watsonx |
| LLM_API_BASE | URL | OpenAI base URL for access to the LLM |
| LLM_API_KEY | Secret | Required if the LLM_API_BASE is authenticated |
| EMBEDDINGS_OPENAI_API_BASE | URL | OpenAI base URL for access to the embedding model |
| EMBEDDINGS_OPENAI_API_KEY | Secret | Required if EMBEDDINGS_OPENAI_API_BASE is authenticated |
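For example, a sketch of a watsonx configuration (all values are placeholders; use the region, project, and model ID from your own watsonx account):

```
LLM_PROVIDER=watsonx
LLM_MODEL=<granite-model-id-on-watsonx>
WATSONX_API_BASE=https://us-south.ml.cloud.ibm.com
WATSONX_REGION=us-south
WATSONX_PROJECT_ID=<your-watsonx-project-id>
WATSONX_API_KEY=<your-watsonx-api-key>
EMBEDDINGS_PROVIDER=watsonx
```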
Note
The embeddings provider will use the same watsonx credentials as the LLM if configured to use watsonx.
For development work on the agents, you must install pre-commit and the pre-commit hooks prior to modifying the code:
```shell
pre-commit install
```

All pre-commit hooks must be run and pass before code is accepted into the repository.
Unit tests verify that individual components of the codebase work as expected, helping catch bugs early and ensuring long-term reliability.
Unit tests run on the default agent configuration. To run unit tests you need to have Ollama installed and the ibm/granite4 model pulled.
```shell
uv --directory granite_core run pytest
```

Guidelines
- Place all core library test files under granite_core/tests/.
- Name test files as test_*.py.
- Use clear, isolated test cases with minimal dependencies.
- Run tests regularly before commits to maintain code quality.
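For instance, a minimal test file in the style described above might look like this (the normalize_prompt function is hypothetical and defined inline so the sketch is self-contained; in a real test it would be imported from the module under test):

```python
# granite_core/tests/test_prompt_utils.py
# Hypothetical example following the guidelines above; normalize_prompt is
# illustrative only, not a real function in this repository.
import pytest


def normalize_prompt(text: str) -> str:
    """Stand-in for a real unit under test."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("prompt must not be empty")
    return cleaned


def test_normalize_prompt_strips_whitespace():
    assert normalize_prompt("  Hello  ") == "Hello"


def test_normalize_prompt_rejects_empty_input():
    with pytest.raises(ValueError):
        normalize_prompt("   ")
```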
To build and run the agents as a container with Podman:

```shell
podman build -t granite-playground-agents:latest -f a2a/Dockerfiles/agent/Dockerfile .
podman run --env-file .env --name granite-playground-agents -p 8000:8000 --rm granite-playground-agents:latest
```