
Conversation

@majamassarini (Member) commented Jul 8, 2025

My feedback on the experience:

Cons:

  • Complexity, especially compared with Goose.
  • I couldn't experiment with the containers. To run the llama server in a container you need to build your own server distribution; I tried multiple existing images, but none had everything I needed.
  • The configuration is ready to use the MCP Jira server, but you need to change this line:
    -        jira_agent = Agent(client, **jira_config)
    +        jira_agent = Agent(client, tools=["mcp::jira"], **jira_config)

I can see requests going from Llama Stack to the MCP Jira server (with a 200 status code). However, after that the Llama Stack server stops with the following error, as if it were invoking the tool without the needed parameters. I can get a bit further by specifying the tool binding explicitly, Agent(client, tools=[{"name": "mcp::jira/jira_get_issue", "args": {"issue_key": issue}}], **jira_config), but then I hit another error because the request carries a session id parameter that the MCP server does not expect:

litellm.exceptions.BadRequestError: litellm.BadRequestError: VertexAIException BadRequestError - b'{\n  "error": {\n    "code": 400,\n    "message": "* GenerateContentRequest.tools[0].function_declarations[15].parameters.properties[fields].items: missing field.\\n* GenerateContentRequest.tools[0].function_declarations[15].parameters.properties[issue_ids_or_keys].items: missing field.\\n",\n    "status": "INVALID_ARGUMENT"\n  }\n}\n'   

Pros:

  • The feedback from Llama Stack is much more understandable/readable than the one from Goose. Also, while experimenting with Llama Stack, I never experienced hanging issues (as with Goose) or performance problems.
  • Multiple solutions are already available out of the box. For example, you can use the built-in Llama Stack session management instead of a custom Redis queue mechanism (see the sketch after this list). The session-based-container-workflow directory contains an example generated with Cursor but not yet tested (since the llama-stack image does not exist); I generated that code to showcase the pre-built session management in Llama Stack compared to a queue implemented with Redis.
  • Most of the configuration is done declaratively (YAML config file). The workflow_runner.py Python file is needed only for chaining the agents in the preferred workflow style; see https://github.com/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Agent_Workflows.ipynb.
  • Even though the server approach is complex to set up, it has its advantages (a llama-stack agent can be used without the server, but you would lose these advantages):
    • Configuration: The YAML server configuration is quite simple; without the server, you need to set up the providers through Python code.
    • Resource management: Without the server, the agents are responsible for managing memory, models, and connections to external services.
    • Scalability: The server can distribute load across multiple instances.
    • State management: When using the server, the sessions are persistent across multiple requests.
    • Development: The server supports config changes without restarts, provides built-in metrics and logging, and handles multiple users/developers.
    • Feature limitations: Without the server, there is no routing between providers, built-in safety guards, automatic failover between models, or complex workflow orchestration.
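
A minimal sketch of that session-based usage, assuming the llama-stack-client Python API and reusing the names from the snippets above (jira_config, issue); the model name and server URL below are placeholder assumptions, not values from this PR:

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed default server address

# jira_config mirrors the kwargs loaded from the YAML server config in workflow_runner.py;
# the model name here is a placeholder.
jira_config = {"model": "gemini-2.5-pro", "instructions": "You summarize Jira issues."}
jira_agent = Agent(client, tools=["mcp::jira"], **jira_config)

# Sessions are persisted by the server, replacing a custom Redis queue.
issue = "RHEL-78418"  # default issue from the Makefile
session_id = jira_agent.create_session(f"rebase-{issue}")

# Turns created with the same session_id share state across requests.
turn = jira_agent.create_turn(
    session_id=session_id,
    messages=[{"role": "user", "content": f"Summarize Jira issue {issue}"}],
    stream=False,
)
print(turn.output_message.content)

Chaining a second agent then amounts to feeding turn.output_message.content into that agent's create_turn call, as shown in the Agent Workflows notebook linked above.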

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a basic configuration for llama-stack, including a Makefile, Python scripts, and a container-based workflow. Several critical and high-severity issues were identified, primarily related to environment variable mismatches, dependency management, missing dependencies, and incomplete container setup. Addressing these issues will improve the code's correctness and robustness.

python -m venv ~/venv/$(VENV)
@echo "📦 Installing Llama Stack..."
. ~/venv/$(VENV)/bin/activate && pip install --upgrade pip
. ~/venv/$(VENV)/bin/activate && pip install llama-stack

critical

Installing llama-stack here conflicts with the requirements.txt file, creating a redundant installation that will be overridden. Remove this line to rely solely on requirements.txt for dependency management.

# ========================================
# Gemini Configuration
# ========================================
GOOGLE_API_TOKEN=your_google_api_token

critical

The environment variable GOOGLE_API_TOKEN does not match the expected GOOGLE_API_KEY in llama_stack_config.yaml (line 17). Rename this variable to ensure correct authentication.

GOOGLE_API_KEY=your_google_api_token

jsonschema-specifications==2025.4.1
litellm==1.73.6.post1
llama_cpp_python==0.3.9
-e git+ssh://git@github.com/meta-llama/llama-stack.git@4d0d2d685f9ea9948a0a7ff6da376501291d45c7#egg=llama_stack

critical

The llama-stack installation uses git+ssh, requiring users to configure SSH keys for GitHub. Use git+https for better portability.

-e git+https://github.com/meta-llama/llama-stack.git@4d0d2d685f9ea9948a0a7ff6da376501291d45c7#egg=llama_stack

ISSUE ?= RHEL-78418
VENV ?= ai-workflows

.PHONY: bootstrap mcp-server llama-server workflow-runner run-all clean clean-all

high

The .PHONY declaration includes a clean-all target, but the target itself is not implemented in the Makefile. This will cause an error for users trying to run make clean-all.

Please either implement the clean-all target (a possible sketch follows) or remove it from the .PHONY list and the documentation to avoid confusion.
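
A possible sketch of such a target, assuming the venv path used by bootstrap and an existing clean target (recipe lines must be tab-indented):

clean-all: clean
	@echo "🧹 Removing virtual environment $(VENV)..."
	rm -rf ~/venv/$(VENV)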

Comment on lines +64 to +115
instructions: |
You are an agent for updating a CentOS package to the provided version. You will prepare the content of the update
and then create a commit with the changes. Follow exactly these steps:

A couple of rules that you must follow and useful information for you:
* All packages are in separate Git repositories under the Gitlab project {{ git_url }}
* You can find the package at {{ git_url }}/{{ package }}
* The Git user name is {{ git_user }}
* The Git user's email address is {{ git_email }}
* Use {{ gitlab_user }} as the GitLab user.
* Work only in a temporary directory that you can create with the mktemp tool.
* To create forks and open merge requests, always use GitLab's `glab` CLI tool.
* You can find packaging guidelines at https://docs.fedoraproject.org/en-US/packaging-guidelines/
* You can find the RPM packaging guide at https://rpm-packaging-guide.github.io/.
* Do not run the `centpkg new-sources` command for now (testing purposes), just write down the commands you would run.

1. Find the location of the {{ package }} package at {{ git_url }}. Always use the {{ dist_git_branch }} branch.

2. Check whether the {{ package }} has already been updated to version {{ version }}, i.e., compare the current version with the provided version.
* The current version of the package can be found in the 'Version' field of the RPM .spec file.
* If there is nothing to update, print a message and exit. Otherwise follow the instructions below.
* Do not clone any repository for detecting the version in .spec file.

3. Create a local Git repository by following these steps:
* Check if the fork already exists for {{ gitlab_user }} as {{ gitlab_user }}/{{ package }} and if not, create a fork of the {{ package }} package using the glab tool.
* Clone the fork using git and HTTPS into the temp directory.

4. Update the {{ package }} to the newer version:
* Create a new Git branch named `automated-package-update-{{ version }}`.
* Update the local package by:
* Updating the 'Version' and 'Release' fields in the .spec file as needed (or corresponding macros), following packaging documentation.
* Make sure the format of the .spec file remains the same.
* Updating macros related to update (e.g., 'commit') if present and necessary; examine the file's history to see how updates are typically done.
* You might need to check some information in the upstream repository, e.g. the commit SHA of the new version.
* Creating a changelog entry, referencing the Jira issue as "Resolves: <jira_issue>" for each issue in {{ jira_issues }}.
* Downloading sources using `spectool -g -S {{ package }}.spec` (you might need to copy local sources, e.g. if the .spec file loads some macros from them, to a directory where spectool expects them).
* Uploading the new sources using `centpkg --release {{ dist_git_branch }} new-sources`.
* IMPORTANT: Only performing changes relevant to the version update: Do not rename variables, comment out existing lines, or alter if-else branches in the .spec file.

5. Verify and adjust the changes:
* Use `rpmlint` to validate your .spec file changes and fix any new errors it identifies.
* Generate the SRPM using `rpmbuild -bs` (ensure your .spec file and source files are correctly copied to the build environment as required by the command).

6. Commit the changes:
* The title of the Git commit should be in the format "[DO NOT MERGE: AI EXPERIMENTS] Update to version {{ version }}"
* Include the reference to Jira as "Resolves: <jira_issue>" for each issue in {{ jira_issues }}.
* Commit just the specfile change.

7. Open a merge request:
* Authenticate using `glab`
* Push the commit to the fork.
* Open a merge request against the upstream repository of the {{ package }} in {{ git_url }} with the previously created commit.

high

The rebase_package agent instructions rely on command-line tools (mktemp, glab, git, centpkg, spectool, rpmlint, rpmbuild) that are not guaranteed to be present in the execution environment. Document these dependencies in README.md and add installation steps to the bootstrap target or Dockerfile, for example:
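
A possible starting point, assuming a Fedora-based image (the base image tag and package names are assumptions to verify; rpmdevtools provides spectool, and coreutils in the base image provides mktemp):

FROM registry.fedoraproject.org/fedora:42
# Install the CLI tools the rebase_package agent instructions invoke
RUN dnf install -y git glab centpkg rpmlint rpmdevtools rpm-build && dnf clean all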

services:
  # Llama Stack server with persistent storage
  llama-stack:
    image: llama-stack:latest

high

The service references llama-stack:latest, but no provided file builds this image. Include instructions or a Dockerfile to build the llama-stack service image, for example:
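
A minimal sketch of a build stanza, assuming a Containerfile is added to the repository (the file name below is hypothetical):

services:
  llama-stack:
    build:
      context: .
      dockerfile: Containerfile.llama-stack  # hypothetical file name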

Comment on lines +145 to +148
if "ConnectionError" in str(type(e)):
print("❌ Cannot connect to Llama Stack. Run 'make start-local-stack' first.")
else:
print(f"❌ Error: {e}")

high

Checking the exception type via str(type(e)) is brittle. Catch httpx.ConnectError in a separate except block for more robust error handling, for example:
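
A minimal sketch of the suggested handling, assuming the Llama Stack client surfaces httpx connection errors; run_workflow() is a hypothetical stand-in for the existing logic:

import httpx

def run_workflow():
    # placeholder for the existing workflow logic in workflow_runner.py
    ...

try:
    run_workflow()
except httpx.ConnectError:
    print("❌ Cannot connect to Llama Stack. Run 'make start-local-stack' first.")
except Exception as e:
    print(f"❌ Error: {e}")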

. ~/venv/$(VENV)/bin/activate && pip install -r requirements.txt; \
fi
@echo "✅ Virtual environment $(VENV) is ready!"
@echo "💡 You can now run: make mcp-server && make llama-server && make agent"

medium

The help message suggests running make agent, but there is no agent target defined. Correct this to workflow-runner to align with the available targets.

	@echo "💡 You can now run: make mcp-server && make llama-server && make workflow-runner"

Comment on lines +16 to +17
volumes:
  valkey_data:

medium

The volume valkey_data is defined but not used by any service. Remove this unused volume definition to avoid confusion.

Comment on lines +146 to +147
'session_id': jira_session_id if 'jira_session_id' in locals() else None,
'agent_id': jira_agent.agent_id if 'jira_agent' in locals() else None,

medium

Instead of checking if 'variable' in locals(), initialize jira_session_id = None and jira_agent = None before the try block for clearer intent, for example:
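
A minimal sketch of the suggested pattern, reusing the names from the excerpt above (client and jira_config are assumed to be defined earlier; the session name is a placeholder):

jira_session_id = None
jira_agent = None
try:
    jira_agent = Agent(client, **jira_config)
    jira_session_id = jira_agent.create_session("jira-workflow")  # placeholder session name
    # ... run the workflow turns ...
except Exception as e:
    state = {
        'session_id': jira_session_id,
        'agent_id': jira_agent.agent_id if jira_agent else None,
    }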

@majamassarini majamassarini force-pushed the research-llama-stack branch from e2f4563 to 1ae6de3 on July 8, 2025 13:23
@majamassarini majamassarini changed the title llama-stack basic configuration → [Prototype] llama-stack basic configuration on Jul 8, 2025
@majamassarini majamassarini changed the title [Prototype] llama-stack basic configuration → [Prototype] workflows using Llama Stack on Jul 8, 2025
@majamassarini majamassarini force-pushed the research-llama-stack branch 3 times, most recently from 1105beb to e522c9e on July 9, 2025 08:02
@majamassarini majamassarini force-pushed the research-llama-stack branch from e522c9e to f97c2cb on July 9, 2025 08:02

@vrothberg (Collaborator) left a comment

Thanks for working on that, @majamassarini !

The amount of code is quite surprising to me.

@majamassarini (Member, Author) commented Jul 9, 2025

Thanks for working on that, @majamassarini !

The amount of code is quite surprising to me.

I probably should have created a separate commit for the code inside session-based-container-workflow 😅
The only code really needed is in llama-stack/workflow_runner.py.

Compared to ADK and BeeAI, I think it is a bit less. I tried to explain why in my last PRO point: since there is a server with a YAML configuration, most of the setup is done there. The code in the file I shared is needed only for defining how the agents interact and for error handling (plus loading the prompts from the config file - when Cursor suggested putting the prompts inside the server config it was hallucinating, but I still think it was a nice idea, which is why there is a method for loading them from the server config file 😅).
