Minimal agent with no framework: only the OpenAI Python client and an Action/Observation loop with tools. Requires the OpenAI Responses API — works with OpenAI or any endpoint that supports the Responses API.
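As a rough illustration of the Action/Observation loop mentioned above, here is a minimal sketch. It is not the agent's actual code: `run_agent`, `scripted_model`, and `get_price` are hypothetical names, the price is a made-up placeholder, and the model is replaced by a scripted stand-in so the example runs without network access. In the real agent, that step would be a Responses API call.

```python
import json


def run_agent(call_model, tools, question, max_steps=5):
    """Loop until the model returns a final message instead of a tool call."""
    history = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        step = call_model(history)
        if step["type"] == "message":
            return step["content"]
        # Action: the model asked for a tool; run it with the given arguments.
        args = json.loads(step["arguments"])
        result = tools[step["name"]](**args)
        # Observation: append the tool result so the next model call can see it.
        history.append({"role": "tool", "name": step["name"], "content": str(result)})
    raise RuntimeError("no final answer within max_steps")


def get_price(product):
    # Hypothetical tool backed by canned placeholder data.
    return {"Lenovo Laptop": 899}.get(product, "unknown")


def scripted_model(history):
    # Stand-in for the model call (assumption: the real agent calls the Responses API here).
    if history[-1]["role"] == "user":
        return {
            "type": "function_call",
            "name": "get_price",
            "arguments": json.dumps({"product": "Lenovo Laptop"}),
        }
    return {"type": "message", "content": f"The price is {history[-1]['content']} USD."}


answer = run_agent(scripted_model, {"get_price": get_price}, "How much does a Lenovo Laptop cost?")
print(answer)
```

The `max_steps` cap is a design choice in this sketch to keep a misbehaving model from looping forever.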
- uv — Python package manager
- Podman or Docker — for local container builds (Option A)
- oc — for OpenShift deployment
- Helm — for deploying to Kubernetes/OpenShift
- GNU Make and a bash-compatible shell — on Windows, use WSL (recommended) or Git Bash
Note: This agent depends on the OpenAI Responses API, so the endpoint you use must implement it. It does not use Ollama or Llama Stack for local model serving.
make init creates a .env file from .env.example. Set your environment variables in the .env file.
cd agents/vanilla_python/openai_responses_agent
make init
Tracing is optional. If MLflow tracing is required, enable it by uncommenting and setting the following environment variables in the .env file.
MLFLOW_TRACKING_URI="http://localhost:5000"
MLFLOW_EXPERIMENT_NAME="openai-responses-agent"
MLFLOW_HTTP_REQUEST_TIMEOUT=2
MLFLOW_HTTP_REQUEST_MAX_RETRIES=0
Then start the MLflow server in a separate terminal:
# Start the MLflow server
uv run --extra tracing mlflow server --port 5000
When MLFLOW_TRACKING_URI is set, make run-app and make run-cli will automatically install the tracing dependency.
To enable tracing and logging with MLflow on your OpenShift cluster, add the following environment variables to your .env file:
MLFLOW_TRACKING_URI="https://<openshift-dashboard-url>/mlflow"
MLFLOW_TRACKING_TOKEN="<your-openshift-token>"
MLFLOW_EXPERIMENT_NAME="openai-responses-agent"
MLFLOW_TRACKING_INSECURE_TLS="true"
MLFLOW_WORKSPACE="default"
Notes:
- MLFLOW_TRACKING_URI - URL of your MLflow server. For local development, use http://localhost:5000. If using MLflow on an OpenShift cluster, replace <openshift-dashboard-url> with your cluster's data science gateway URL.
- MLFLOW_TRACKING_TOKEN - Required for OpenShift only. Your OpenShift authentication token, obtained from the OpenShift console.
- MLFLOW_EXPERIMENT_NAME - A descriptive name for your experiment (e.g., "OpenAI Responses Demo").
- MLFLOW_TRACKING_INSECURE_TLS - Required for OpenShift only. Set to "true" if your cluster does not use trusted certificates.
- MLFLOW_WORKSPACE - Required for OpenShift only. Project name.
- Tracing is optional; if you do not set MLFLOW_TRACKING_URI, the application will run without MLflow logging.
- If MLFLOW_TRACKING_URI is set, the application will attempt to connect to the MLflow server at startup. If the server is unreachable, the application will log a warning and continue running without tracing.
- You can control how long the application waits for the MLflow server by setting MLFLOW_HEALTH_CHECK_TIMEOUT (in seconds, default: 5).
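The startup behaviour described in the notes above (try to reach MLflow, warn and continue if it is unreachable) could look roughly like the following sketch. This is not the application's actual code: the function name `mlflow_reachable` is hypothetical, and probing the server's `/health` path is an assumption about how the check is done.

```python
import logging
import urllib.request

logger = logging.getLogger("tracing")


def mlflow_reachable(env):
    """Return True only if MLFLOW_TRACKING_URI is set and the server answers in time."""
    uri = env.get("MLFLOW_TRACKING_URI")
    if not uri:
        # Tracing is optional: no URI means run without MLflow logging.
        return False
    timeout = float(env.get("MLFLOW_HEALTH_CHECK_TIMEOUT", "5"))
    try:
        # Assumption: the server exposes a /health path that returns HTTP 200.
        with urllib.request.urlopen(uri.rstrip("/") + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError) as exc:
        # URLError and socket timeouts are OSError subclasses; ValueError covers malformed URLs.
        logger.warning("MLflow unreachable at %s (%s); continuing without tracing", uri, exc)
        return False


print(mlflow_reachable({}))  # no URI configured, so tracing stays off
```

In practice the function would be called with `os.environ`, and the result would decide whether tracing is set up at all.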
The following command removes any existing .venv, creates a new one, and installs the dependencies.
make env
Keep this terminal open: the app needs to keep running. You should see output indicating the app started on http://localhost:8000.
cd agents/vanilla_python/openai_responses_agent
make run-app # fails if the port is already in use and prints the steps to take
For terminal-based testing without a browser:
cd agents/vanilla_python/openai_responses_agent
make run-cli
This launches an interactive prompt where you can pick predefined questions or type your own. Tool calls and results are displayed inline with colored output.
cd agents/vanilla_python/openai_responses_agent
make init
Edit .env with your model endpoint and container image:
API_KEY = your-openai-api-key
BASE_URL = https://api.openai.com/v1
MODEL_ID = gpt-4o-mini
CONTAINER_IMAGE = quay.io/your-username/openai-responses-agent:latest
Notes:
- API_KEY - your OpenAI API key
- BASE_URL - should end with /v1
- MODEL_ID - model identifier available on your endpoint
- CONTAINER_IMAGE - full image path where the agent container will be pushed and pulled from. The image is built locally, pushed to this registry, and then deployed to OpenShift.
  Format: <registry>/<namespace>/<image-name>:<tag>
  Examples:
  - Quay.io: quay.io/your-username/openai-responses-agent:latest
  - Docker Hub: docker.io/your-username/openai-responses-agent:latest
  - GHCR: ghcr.io/your-org/openai-responses-agent:latest
Note: OpenShift must be able to pull the container image. Make the image public, or configure an image pull secret for private registries.
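To catch configuration mistakes before building or deploying, the rules in the notes above can be checked with a few lines of Python. This is only a sketch under stated assumptions: `split_image` and `check_settings` are hypothetical helpers, not part of the project, and they enforce just what the notes state (the four variables are set, BASE_URL ends with /v1, and CONTAINER_IMAGE follows the registry/namespace/image-name:tag format).

```python
def split_image(ref):
    """Split '<registry>/<namespace>/<image-name>:<tag>' into its parts."""
    path, sep, tag = ref.rpartition(":")
    # A '/' after the last ':' means that colon was a registry port, not a tag separator.
    if not sep or not tag or "/" in tag:
        raise ValueError(f"missing tag in image reference: {ref!r}")
    parts = path.split("/")
    if len(parts) < 3 or not all(parts):
        raise ValueError(f"expected <registry>/<namespace>/<image-name>: {ref!r}")
    return {
        "registry": parts[0],
        "namespace": "/".join(parts[1:-1]),
        "image": parts[-1],
        "tag": tag,
    }


def check_settings(env):
    """Validate the .env values described in the notes and return the parsed image."""
    required = ("API_KEY", "BASE_URL", "MODEL_ID", "CONTAINER_IMAGE")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    if not env["BASE_URL"].rstrip("/").endswith("/v1"):
        raise ValueError("BASE_URL should end with /v1")
    return split_image(env["CONTAINER_IMAGE"])


settings = {
    "API_KEY": "your-openai-api-key",
    "BASE_URL": "https://api.openai.com/v1",
    "MODEL_ID": "gpt-4o-mini",
    "CONTAINER_IMAGE": "quay.io/your-username/openai-responses-agent:latest",
}
image = check_settings(settings)
print(image["registry"], image["namespace"], image["image"], image["tag"])
```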
Log in to the OpenShift cluster with oc:
oc login -u "login" -p "password" https://super-link-to-cluster:111
Log in to your container registry, for example with Docker:
docker login -u='login' -p='password' quay.io
Requires Podman (or Docker) and a registry account (e.g., Quay.io).
make build # builds the image locally
make push # pushes to the registry specified in CONTAINER_IMAGE
No Podman, Docker, or registry account needed, just the oc CLI.
make build-openshift
After the build completes, set CONTAINER_IMAGE in your .env to the internal registry URL printed after the build.
make dry-run # preview rendered Helm manifests (secrets redacted)
make deploy
After deploying, the application may take about a minute to become available while the pod starts up.
The route URL is printed after make deploy. You can also retrieve it manually:
oc get route openai-responses-agent -o jsonpath='{.spec.host}'
make undeploy
See OpenShift Deployment for more details.
make test
Non-streaming:
curl -X POST http://localhost:8000/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "How much does a Lenovo Laptop cost and what are the reviews?"}], "stream": false}'
Streaming:
curl -sN -X POST http://localhost:8000/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "How much does a Lenovo Laptop cost and what are the reviews?"}], "stream": true}'
Pretty Printed Stream:
curl -sN -X POST http://localhost:8000/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "How much does a Lenovo Laptop cost and what are the reviews?"}], "stream": true}' |
jq -R -r -j 'scan("^data:(.*)") | .[0] | select(. != " [DONE]") | fromjson.choices[0].delta.content // empty'
curl http://localhost:8000/health
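For readers who would rather consume the stream from Python than from jq, the pretty-printing filter above can be approximated as follows. This is a sketch, not project code: `stream_text` is a hypothetical helper, the sample lines are invented, and the chunk shape (`choices[0].delta.content`) is taken from the jq expression. Unlike the jq filter, it stops reading at the `[DONE]` marker rather than only skipping it.

```python
import json


def stream_text(lines):
    """Concatenate delta.content from the 'data:' lines of a streamed response."""
    parts = []
    for line in lines:
        if not line.startswith("data:"):
            continue  # ignore blank separator lines and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = choices[0].get("delta", {}).get("content")
        if content:
            parts.append(content)
    return "".join(parts)


# Invented sample lines in the shape the jq filter expects.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
text = stream_text(sample)
print(text)
```

Against a running app, the same function could be fed the decoded lines of the HTTP response body instead of the sample list.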