
Conversation

@majamassarini (Member) commented Jul 8, 2025

My feedback on the experience:

Cons:

  • Complexity, especially compared with Goose.
  • I couldn't experiment with the containers. To run the llama server in a container you need to build your own server distribution; I tried multiple existing images, but none had everything I needed.
  • The configuration is ready to use the MCP Jira server, but you need to change this line:
    -        jira_agent = Agent(client, **jira_config)
    +        jira_agent = Agent(client, tools=["mcp::jira"], **jira_config)

I can see requests going from Llama Stack to the MCP Jira server (with a 200 status code). However, after that the Llama Stack server stops with the following error, as if it were invoking the tool without the needed parameters. I can get a bit further by specifying the tool binding explicitly, Agent(client, tools=[{"name": "mcp::jira/jira_get_issue", "args": {"issue_key": issue}}], **jira_config), but then I hit another error because the request carries a session id parameter that the MCP server does not expect:

litellm.exceptions.BadRequestError: litellm.BadRequestError: VertexAIException BadRequestError - b'{\n  "error": {\n    "code": 400,\n    "message": "* GenerateContentRequest.tools[0].function_declarations[15].parameters.properties[fields].items: missing field.\\n* GenerateContentRequest.tools[0].function_declarations[15].parameters.properties[issue_ids_or_keys].items: missing field.\\n",\n    "status": "INVALID_ARGUMENT"\n  }\n}\n'   

Pros:

  • The feedback from Llama Stack is much more understandable/readable than the one from Goose. Also, while experimenting with Llama Stack, I never experienced hanging issues (as with Goose) or performance problems.
  • Multiple solutions are already available out of the box. For example, you can use the built-in Llama Stack session management instead of a custom Redis queue mechanism (see the sketch after this list). The session-based-container-workflow directory contains an example generated with Cursor but not yet tested (since the llama-stack image does not exist); I generated that code to showcase the pre-built session management in Llama Stack compared to a queue implemented with Redis.
  • Most of the configuration is done declaratively (YAML config file). The workflow_runner.py Python file is needed only for chaining the agents in the preferred workflow style; see https://github.com/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Agent_Workflows.ipynb.
  • Even though the server approach is complex to set up, it has its advantages (a llama-stack agent can be used without the server, but you would lose these advantages):
    • Configuration: The YAML server configuration is quite simple; without the server, you need to set up the providers through Python code.
    • Resource management: Without the server, the agents are responsible for managing memory, models, and connections to external services.
    • Scalability: The server can distribute load across multiple instances.
    • State management: When using the server, the sessions are persistent across multiple requests.
    • Development: The server supports config changes without restarts, provides built-in metrics and logging, and handles multiple users/developers.
    • Feature limitations: Without the server, there is no routing between providers, built-in safety guards, automatic failover between models, or complex workflow orchestration.
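
A minimal sketch of that session-based usage, assuming the llama-stack-client Python API and reusing the names from the snippets above (jira_config, issue); the model name and server URL below are placeholder assumptions, not values from this PR:

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed default server address

# jira_config mirrors the kwargs loaded from the YAML server config in workflow_runner.py;
# the model name here is a placeholder.
jira_config = {"model": "gemini-2.5-pro", "instructions": "You summarize Jira issues."}
jira_agent = Agent(client, tools=["mcp::jira"], **jira_config)

# Sessions are persisted by the server, replacing a custom Redis queue.
issue = "RHEL-78418"  # default issue from the Makefile
session_id = jira_agent.create_session(f"rebase-{issue}")

# Turns created with the same session_id share state across requests.
turn = jira_agent.create_turn(
    session_id=session_id,
    messages=[{"role": "user", "content": f"Summarize Jira issue {issue}"}],
    stream=False,
)
print(turn.output_message.content)

Chaining a second agent then amounts to feeding turn.output_message.content into that agent's create_turn call, as shown in the Agent Workflows notebook linked above.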

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a basic configuration for llama-stack, including a Makefile, Python scripts, and a container-based workflow. Several critical and high-severity issues were identified, primarily related to environment variable mismatches, dependency management, missing dependencies, and incomplete container setup. Addressing these issues will improve the code's correctness and robustness.

python -m venv ~/venv/$(VENV)
@echo "📦 Installing Llama Stack..."
. ~/venv/$(VENV)/bin/activate && pip install --upgrade pip
. ~/venv/$(VENV)/bin/activate && pip install llama-stack

critical

Installing llama-stack here conflicts with the requirements.txt file, creating a redundant installation that will be overridden. Remove this line to rely solely on requirements.txt for dependency management.

# ========================================
# Gemini Configuration
# ========================================
GOOGLE_API_TOKEN=your_google_api_token

critical

The environment variable GOOGLE_API_TOKEN does not match the expected GOOGLE_API_KEY in llama_stack_config.yaml (line 17). Rename this variable to ensure correct authentication.

GOOGLE_API_KEY=your_google_api_token

jsonschema-specifications==2025.4.1
litellm==1.73.6.post1
llama_cpp_python==0.3.9
-e git+ssh://git@github.com/meta-llama/llama-stack.git@4d0d2d685f9ea9948a0a7ff6da376501291d45c7#egg=llama_stack

critical

The llama-stack installation uses git+ssh, requiring users to configure SSH keys for GitHub. Use git+https for better portability.

-e git+https://github.com/meta-llama/llama-stack.git@4d0d2d685f9ea9948a0a7ff6da376501291d45c7#egg=llama_stack

ISSUE ?= RHEL-78418
VENV ?= ai-workflows

.PHONY: bootstrap mcp-server llama-server workflow-runner run-all clean clean-all

high

The .PHONY declaration includes a clean-all target, but the target itself is not implemented in the Makefile. This will cause an error for users trying to run make clean-all.

Please either implement the clean-all target (a possible sketch follows) or remove it from the .PHONY list and the documentation to avoid confusion.
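
A possible sketch of such a target, assuming the venv path used by bootstrap and an existing clean target (recipe lines must be tab-indented):

clean-all: clean
	@echo "🧹 Removing virtual environment $(VENV)..."
	rm -rf ~/venv/$(VENV)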

Comment on lines +64 to +115
instructions: |
You are an agent for updating a CentOS package to the provided version. You will prepare the content of the update
and then create a commit with the changes. Follow exactly these steps:

A couple of rules that you must follow and useful information for you:
* All packages are in separate Git repositories under the Gitlab project {{ git_url }}
* You can find the package at {{ git_url }}/{{ package }}
* The Git user name is {{ git_user }}
* The Git user's email address is {{ git_email }}
* Use {{ gitlab_user }} as the GitLab user.
* Work only in a temporary directory that you can create with the mktemp tool.
* To create forks and open merge requests, always use GitLab's `glab` CLI tool.
* You can find packaging guidelines at https://docs.fedoraproject.org/en-US/packaging-guidelines/
* You can find the RPM packaging guide at https://rpm-packaging-guide.github.io/.
* Do not run the `centpkg new-sources` command for now (testing purposes), just write down the commands you would run.

1. Find the location of the {{ package }} package at {{ git_url }}. Always use the {{ dist_git_branch }} branch.

2. Check whether the {{ package }} has already been updated to version {{ version }}, i.e., compare the current version with the provided version.
* The current version of the package can be found in the 'Version' field of the RPM .spec file.
* If there is nothing to update, print a message and exit. Otherwise follow the instructions below.
* Do not clone any repository for detecting the version in .spec file.

3. Create a local Git repository by following these steps:
* Check if the fork already exists for {{ gitlab_user }} as {{ gitlab_user }}/{{ package }} and if not, create a fork of the {{ package }} package using the glab tool.
* Clone the fork using git and HTTPS into the temp directory.

4. Update the {{ package }} to the newer version:
* Create a new Git branch named `automated-package-update-{{ version }}`.
* Update the local package by:
* Updating the 'Version' and 'Release' fields in the .spec file as needed (or corresponding macros), following packaging documentation.
* Make sure the format of the .spec file remains the same.
* Updating macros related to update (e.g., 'commit') if present and necessary; examine the file's history to see how updates are typically done.
* You might need to check some information in the upstream repository, e.g. the commit SHA of the new version.
* Creating a changelog entry, referencing the Jira issue as "Resolves: <jira_issue>" for each issue in {{ jira_issues }}.
* Downloading sources using `spectool -g -S {{ package }}.spec` (you might need to copy local sources, e.g. if the .spec file loads some macros from them, to a directory where spectool expects them).
* Uploading the new sources using `centpkg --release {{ dist_git_branch }} new-sources`.
* IMPORTANT: Only performing changes relevant to the version update: Do not rename variables, comment out existing lines, or alter if-else branches in the .spec file.

5. Verify and adjust the changes:
* Use `rpmlint` to validate your .spec file changes and fix any new errors it identifies.
* Generate the SRPM using `rpmbuild -bs` (ensure your .spec file and source files are correctly copied to the build environment as required by the command).

6. Commit the changes:
* The title of the Git commit should be in the format "[DO NOT MERGE: AI EXPERIMENTS] Update to version {{ version }}"
* Include the reference to Jira as "Resolves: <jira_issue>" for each issue in {{ jira_issues }}.
* Commit just the specfile change.

7. Open a merge request:
* Authenticate using `glab`
* Push the commit to the fork.
* Open a merge request against the upstream repository of the {{ package }} in {{ git_url }} with the previously created commit.

high

The rebase_package agent instructions rely on command-line tools (mktemp, glab, git, centpkg, spectool, rpmlint, rpmbuild) that are not guaranteed to be present in the execution environment. Document these dependencies in README.md and add installation steps to the bootstrap target or Dockerfile, for example:
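
A possible starting point, assuming a Fedora-based image (the base image tag and package names are assumptions to verify; rpmdevtools provides spectool, and coreutils in the base image provides mktemp):

FROM registry.fedoraproject.org/fedora:42
# Install the CLI tools the rebase_package agent instructions invoke
RUN dnf install -y git glab centpkg rpmlint rpmdevtools rpm-build && dnf clean all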

services:
  # Llama Stack server with persistent storage
  llama-stack:
    image: llama-stack:latest

high

The service references llama-stack:latest, but no provided file builds this image. Include instructions or a Dockerfile to build the llama-stack service image, for example:
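
A minimal sketch of a build stanza, assuming a Containerfile is added to the repository (the file name below is hypothetical):

services:
  llama-stack:
    build:
      context: .
      dockerfile: Containerfile.llama-stack  # hypothetical file name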

Comment on lines +145 to +148
if "ConnectionError" in str(type(e)):
print("❌ Cannot connect to Llama Stack. Run 'make start-local-stack' first.")
else:
print(f"❌ Error: {e}")

high

Checking the exception type via str(type(e)) is brittle. Catch httpx.ConnectError in a separate except block for more robust error handling, for example:
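
A minimal sketch of the suggested handling, assuming the Llama Stack client surfaces httpx connection errors; run_workflow() is a hypothetical stand-in for the existing logic:

import httpx

def run_workflow():
    # placeholder for the existing workflow logic in workflow_runner.py
    ...

try:
    run_workflow()
except httpx.ConnectError:
    print("❌ Cannot connect to Llama Stack. Run 'make start-local-stack' first.")
except Exception as e:
    print(f"❌ Error: {e}")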

. ~/venv/$(VENV)/bin/activate && pip install -r requirements.txt; \
fi
@echo "✅ Virtual environment $(VENV) is ready!"
@echo "💡 You can now run: make mcp-server && make llama-server && make agent"

medium

The help message suggests running make agent, but there is no agent target defined. Correct this to workflow-runner to align with the available targets.

	@echo "💡 You can now run: make mcp-server && make llama-server && make workflow-runner"

Comment on lines +16 to +17
volumes:
  valkey_data:

medium

The volume valkey_data is defined but not used by any service. Remove this unused volume definition to avoid confusion.

Comment on lines +146 to +147
'session_id': jira_session_id if 'jira_session_id' in locals() else None,
'agent_id': jira_agent.agent_id if 'jira_agent' in locals() else None,

medium

Instead of checking if 'variable' in locals(), initialize jira_session_id = None and jira_agent = None before the try block for clearer intent, for example:
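
A minimal sketch of the suggested pattern, reusing the names from the excerpt above (client and jira_config are assumed to be defined earlier; the session name is a placeholder):

jira_session_id = None
jira_agent = None
try:
    jira_agent = Agent(client, **jira_config)
    jira_session_id = jira_agent.create_session("jira-workflow")  # placeholder session name
    # ... run the workflow turns ...
except Exception as e:
    state = {
        'session_id': jira_session_id,
        'agent_id': jira_agent.agent_id if jira_agent else None,
    }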

@majamassarini majamassarini force-pushed the research-llama-stack branch from e2f4563 to 1ae6de3 on July 8, 2025 13:23
@majamassarini majamassarini changed the title llama-stack basic configuration → [Prototype] llama-stack basic configuration on Jul 8, 2025
@majamassarini majamassarini changed the title [Prototype] llama-stack basic configuration → [Prototype] workflows using Llama Stack on Jul 8, 2025
@majamassarini majamassarini force-pushed the research-llama-stack branch 3 times, most recently from 1105beb to e522c9e on July 9, 2025 08:02
@majamassarini majamassarini force-pushed the research-llama-stack branch from e522c9e to f97c2cb on July 9, 2025 08:02

@vrothberg (Collaborator) left a comment

Thanks for working on that, @majamassarini !

The amount of code is quite surprising to me.

@majamassarini (Member, Author) commented Jul 9, 2025

Thanks for working on that, @majamassarini !

The amount of code is quite surprising to me.

I probably should have created a separate commit for the code inside session-based-container-workflow 😅
The only code really needed is in llama-stack/workflow_runner.py.

Compared to ADK and BeeAI, I think it is a bit less. I tried to explain why in my last PRO point: since there is a server with a YAML configuration, most of the setup is done there. The code in the file I shared is needed only for defining how the agents interact and for error handling (plus loading the prompts from the config file - when Cursor suggested putting the prompts inside the server config it was hallucinating, but I still think it was a nice idea, which is why there is a method for loading them from the server config file 😅).
