185 changes: 174 additions & 11 deletions docs/docs/services.md
@@ -279,31 +279,194 @@ archi create [...] --services chatbot,redmine-mailer

## Mattermost Interface

Reads posts from a Mattermost forum and posts draft responses to a specified channel.
Connects Archi to a Mattermost channel. Supports two operating modes:

### Configuration
- **Webhook mode** — Mattermost pushes outgoing webhooks to Archi (recommended)
- **Polling mode** — Archi polls a channel periodically via the Mattermost API

**Default port:** `5000`

### Setup

#### Secrets

```bash
# Required for webhook mode
MATTERMOST_WEBHOOK=https://mattermost.example.com/hooks/... # Incoming webhook URL
MATTERMOST_OUTGOING_TOKEN=... # Outgoing webhook token for request validation

# Required for polling mode only
MATTERMOST_PAK=... # Personal Access Token for the bot account
MATTERMOST_CHANNEL_ID_READ=... # Channel to read posts from
MATTERMOST_CHANNEL_ID_WRITE=... # Channel to post responses to

# Required for SSO auth (db mode)
SSO_CLIENT_ID=...
SSO_CLIENT_SECRET=...
BYOK_ENCRYPTION_KEY=... # Used to encrypt stored refresh tokens
PG_PASSWORD=...
```
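
The outgoing token exists so the service can reject requests that did not come from Mattermost. A minimal sketch of that check follows, assuming the webhook payload carries the token in a `token` field; the function name `is_valid_outgoing_token` is hypothetical and is not taken from the Archi code base.

```python
import hmac
from typing import Mapping, Optional


def is_valid_outgoing_token(
    payload: Mapping[str, object], expected_token: Optional[str]
) -> bool:
    """Return True only if the payload carries the expected outgoing token.

    Hypothetical helper: `expected_token` would be the value of the
    MATTERMOST_OUTGOING_TOKEN secret.
    """
    received = payload.get("token")
    # Reject when no token is configured or the request has no string token.
    if not expected_token or not isinstance(received, str):
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(
        received.encode("utf-8"), expected_token.encode("utf-8")
    )
```

Comparing bytes with `hmac.compare_digest` rather than `==` is a common precaution against timing attacks on shared secrets.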

#### Basic Configuration

```yaml
services:
  mattermost:
    update_time: 60
    update_time: 60  # polling interval in seconds (polling mode only)
    port: 5000
    external_port: 5000
```

### Secrets
#### Running

```bash
MATTERMOST_WEBHOOK=...
MATTERMOST_PAK=...
MATTERMOST_CHANNEL_ID_READ=...
MATTERMOST_CHANNEL_ID_WRITE=...
archi create [...] --services chatbot,mattermost
```

### Running
---

```bash
archi create [...] --services chatbot,mattermost
### Authentication

By default, authentication is disabled and the bot responds to all users. Two authentication modes are available.

#### Mode 1: Config (Static Allowlist)

Roles are assigned to Mattermost users via a static map in the config. No SSO or database required.

```yaml
services:
  mattermost:
    auth:
      enabled: true
      token_store: config
      default_role: mattermost-restricted  # role for users not in user_roles
      user_roles:
        jsmith: [archi-expert]  # Mattermost username → list of roles
        ahmedmu: [archi-admins]
        someuser: [archi-expert, base-user]
```

- Users in `user_roles` get the specified roles.
- Users not in `user_roles` get `default_role`.
- If `default_role` is not defined in `auth_roles`, those users have no permissions and are denied.
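
The lookup rules above can be sketched in a few lines of Python. This is an illustration only: `resolve_roles` is a hypothetical name, and the real service may resolve roles differently.

```python
from typing import Dict, List, Optional


def resolve_roles(
    username: str,
    user_roles: Dict[str, List[str]],
    default_role: Optional[str],
) -> List[str]:
    """Return the roles for a Mattermost username under config mode.

    Hypothetical helper mirroring the documented behaviour: listed users
    get their configured roles, everyone else gets `default_role`.
    """
    roles = user_roles.get(username)
    if roles:
        return list(roles)
    # Users without an entry fall back to the default role, if one is set.
    return [default_role] if default_role else []


# Values copied from the example config above.
USER_ROLES = {
    "jsmith": ["archi-expert"],
    "ahmedmu": ["archi-admins"],
    "someuser": ["archi-expert", "base-user"],
}
```

Whether the returned roles grant anything still depends on how they are defined in `auth_roles`; a default role with no definition there yields no permissions.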

#### Mode 2: DB / SSO (Recommended)

Roles come from the CERN SSO JWT. On a user's first message, the bot sends them a login link. After they authenticate, their roles are stored in the database and reused on subsequent messages; no re-login is required until the session expires.

```yaml
services:
  mattermost:
    auth:
      enabled: true
      token_store: db
      session_lifetime_days: 30  # full re-login required after this period
      roles_refresh_hours: 24  # silent background role refresh interval
      login_base_url: "https://your-mattermost-service-host:5000"
      sso:
        server_metadata_url: "https://auth.cern.ch/auth/realms/cern/.well-known/openid-configuration"
        token_endpoint: "https://auth.cern.ch/auth/realms/cern/protocol/openid-connect/token"
```

**SSO registration requirement:** The callback URL `<login_base_url>/mattermost-auth/callback` must be registered as a valid redirect URI in your SSO client (Keycloak / CERN Auth).

**Login flow:**

```
1. User sends message to bot (no token stored)
2. Bot replies: "Please login: https://<host>:5000/mattermost-auth?state=<user_id>&username=<username>"
3. User clicks link → redirected to CERN SSO
4. After SSO login → redirected to /mattermost-auth/callback
5. Roles extracted from JWT, stored in mattermost_tokens table
6. User sees success page, closes tab, returns to Mattermost
7. Future messages use stored roles (silent refresh every 24h)
```
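
As an illustration of steps 2 and 4, the login link and the callback URL can both be derived from `login_base_url`. The sketch below assumes the paths and query parameters shown in the flow; the function names are hypothetical.

```python
from urllib.parse import urlencode


def build_login_url(login_base_url: str, user_id: str, username: str) -> str:
    """Build the link the bot sends to a user who has no stored token.

    Hypothetical helper: `state` carries the Mattermost user ID, as in
    step 2 of the flow above.
    """
    query = urlencode({"state": user_id, "username": username})
    return f"{login_base_url.rstrip('/')}/mattermost-auth?{query}"


def build_callback_url(login_base_url: str) -> str:
    """Return the redirect URI that must be registered with the SSO client."""
    return f"{login_base_url.rstrip('/')}/mattermost-auth/callback"
```

Using `urlencode` keeps usernames containing reserved characters from corrupting the query string.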

**Session lifecycle:**

| Event | Behaviour |
|-------|-----------|
| First message | Login link sent |
| Token valid, roles fresh | Respond normally |
| Roles stale (`> roles_refresh_hours`) | Silent refresh via stored refresh token |
| Session expired (`> session_lifetime_days`) | Login link sent again |
| Admin invalidates token | Login link sent on next message |
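
The table reads as a small decision function. The sketch below is one way to express it, assuming session age is measured from a stored start timestamp and that an invalidated token looks like a missing record; both are assumptions, and `next_action` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_action(
    session_started_at: Optional[datetime],
    roles_refreshed_at: Optional[datetime],
    now: datetime,
    session_lifetime_days: int = 30,
    roles_refresh_hours: int = 24,
) -> str:
    """Decide how to handle an incoming message for one user.

    Returns "send_login_link", "refresh_roles", or "respond".
    """
    # No stored record: first message, or an admin invalidated the token.
    if session_started_at is None or roles_refreshed_at is None:
        return "send_login_link"
    # Session older than the configured lifetime: full re-login required.
    if now - session_started_at > timedelta(days=session_lifetime_days):
        return "send_login_link"
    # Roles older than the refresh interval: refresh silently.
    if now - roles_refreshed_at > timedelta(hours=roles_refresh_hours):
        return "refresh_roles"
    return "respond"


NOW = datetime(2025, 1, 31, tzinfo=timezone.utc)  # fixed example timestamp
```

The expiry check runs before the staleness check so that an expired session is never silently refreshed.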

---

### Role-Based Access Control

Mattermost auth integrates with the same RBAC system used by the chat app. Roles are defined under `services.chat_app.auth.auth_roles`.

#### Restricting Access

To allow only users with a specific role (e.g. `archi-expert` and above), add the `mattermost:access` permission to those roles and **not** to `base-user`:

```yaml
services:
  chat_app:
    auth:
      auth_roles:
        roles:
          base-user:
            permissions:
              - chat:query
              - chat:history
              # no mattermost:access here

          archi-expert:
            inherits: [base-user]
            permissions:
              - mattermost:access  # grants access to the Mattermost bot
              - documents:view
              - config:view
              # ...

          archi-admins:
            permissions:
              - "*"  # wildcard includes mattermost:access

        permissions:
          mattermost:access:
            description: "Access the Mattermost bot"
            category: "mattermost"
```

- `base-user` only → denied with "you don't have permission" message
- `archi-expert` → allowed (has `mattermost:access`)
- `archi-admins` → allowed (wildcard)
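
The three outcomes above follow from role inheritance plus the `*` wildcard. The sketch below is a simplified stand-in for the RBAC registry, not its actual implementation; `effective_permissions` and `has_permission` are hypothetical names, and the role data is copied from the example config.

```python
from typing import Dict, Iterable, Optional, Set

ROLES: Dict[str, dict] = {
    "base-user": {"permissions": ["chat:query", "chat:history"]},
    "archi-expert": {
        "inherits": ["base-user"],
        "permissions": ["mattermost:access", "documents:view", "config:view"],
    },
    "archi-admins": {"permissions": ["*"]},
}


def effective_permissions(
    role: str, roles: Dict[str, dict], _seen: Optional[Set[str]] = None
) -> Set[str]:
    """Collect a role's own permissions plus everything it inherits."""
    seen = set() if _seen is None else _seen
    # Unknown roles grant nothing; `seen` guards against inheritance cycles.
    if role in seen or role not in roles:
        return set()
    seen.add(role)
    spec = roles[role]
    perms = set(spec.get("permissions", []))
    for parent in spec.get("inherits", []):
        perms |= effective_permissions(parent, roles, seen)
    return perms


def has_permission(
    user_roles: Iterable[str], permission: str, roles: Dict[str, dict] = ROLES
) -> bool:
    """Return True if any of the user's roles grants the permission."""
    for role in user_roles:
        perms = effective_permissions(role, roles)
        if "*" in perms or permission in perms:
            return True
    return False
```

For example, `has_permission(["base-user"], "mattermost:access")` is false, while the same call with `["archi-expert"]` or `["archi-admins"]` is true.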

#### Tool-Level Permissions

Tool permissions work the same as in the chat app. Add permissions like `tools:http_get` to roles that should be able to use specific agent tools. The Mattermost user context is propagated through the full call stack so tool checks apply correctly.

```yaml
archi-expert:
  permissions:
    - mattermost:access
    - tools:http_get  # allow HTTP GET tool for this role
```

#### Database

A `mattermost_tokens` table is required when using `token_store: db`. It is created automatically by `init.sql` on first deploy. For existing deployments, run the migration manually:

```sql
CREATE TABLE IF NOT EXISTS mattermost_tokens (
    mattermost_user_id VARCHAR(255) PRIMARY KEY,
    mattermost_username VARCHAR(255),
    email VARCHAR(255),
    roles JSONB NOT NULL DEFAULT '[]',
    refresh_token BYTEA,
    token_expires_at TIMESTAMPTZ,
    roles_refreshed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```

Refresh tokens are encrypted at rest using `pgp_sym_encrypt` (requires `BYOK_ENCRYPTION_KEY`).

---

## Grafana Monitoring
25 changes: 24 additions & 1 deletion src/archi/pipelines/agents/tools/base.py
@@ -35,7 +35,30 @@ def check_tool_permission(required_permission: str) -> tuple[bool, Optional[str]
    try:
        from flask import session, has_request_context
        from src.utils.rbac.registry import get_registry


        # Check Mattermost context first: covers webhook mode (Flask context, no session)
        # and polling mode (no Flask context). ContextVar is set by mattermost_user_context().
        try:
            from src.utils.rbac.mattermost_context import get_mattermost_context
            mm_ctx = get_mattermost_context()
            if mm_ctx is not None:
                registry = get_registry()
                if registry.has_permission(mm_ctx.roles, required_permission):
                    logger.debug(
                        f"Mattermost user @{mm_ctx.username} granted '{required_permission}'"
                    )
                    return True, None
                logger.info(
                    f"Mattermost user @{mm_ctx.username} denied '{required_permission}' "
                    f"(roles: {mm_ctx.roles})"
                )
                return False, (
                    f"Permission denied for @{mm_ctx.username}: "
                    f"requires '{required_permission}'."
                )
        except Exception as mm_exc:
            logger.debug(f"Mattermost context check skipped: {mm_exc}")

        # If we're not in a request context, allow the tool (for testing/CLI usage)
        if not has_request_context():
            logger.debug("No request context, allowing tool access")
50 changes: 50 additions & 0 deletions src/archi/pipelines/agents/tools/local_files.py
@@ -2,6 +2,7 @@

import os
import re
import time
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Tuple
@@ -31,6 +32,8 @@ def __init__(
        port: int = 7871,
        external_port: Optional[int] = None,
        timeout: float = 30.0,
        retry_attempts: int = 3,
        retry_backoff_seconds: float = 1.0,
        api_token: Optional[str] = None,
    ):
        host_mode_flag = self._resolve_host_mode(host_mode)
@@ -42,10 +45,57 @@ def __init__(
        final_port = external_port if host_mode_flag and external_port else port
        self.base_url = f"http://{host}:{final_port}"
        self.timeout = timeout
        self.retry_attempts = max(int(retry_attempts), 1)
        self.retry_backoff_seconds = max(float(retry_backoff_seconds), 0.0)
        self._headers: Dict[str, str] = {}
        if api_token:
            self._headers["Authorization"] = f"Bearer {api_token}"

    def _get(self, path: str, *, params: Optional[Dict[str, object]] = None) -> requests.Response:
        last_exc: Optional[Exception] = None
        for attempt in range(1, self.retry_attempts + 1):
            try:
                resp = requests.get(
                    f"{self.base_url}{path}",
                    params=params,
                    headers=self._headers,
                    timeout=self.timeout,
                )
            except (requests.Timeout, requests.ConnectionError) as exc:
                last_exc = exc
                if attempt >= self.retry_attempts:
                    raise
                sleep_s = self.retry_backoff_seconds * (2 ** (attempt - 1))
                logger.warning(
                    "Catalog request failed (%s/%s) %s: %s; retrying in %.2fs",
                    attempt,
                    self.retry_attempts,
                    path,
                    exc,
                    sleep_s,
                )
                time.sleep(sleep_s)
                continue

            if resp.status_code >= 500 and attempt < self.retry_attempts:
                sleep_s = self.retry_backoff_seconds * (2 ** (attempt - 1))
                logger.warning(
                    "Catalog request got HTTP %s (%s/%s) %s; retrying in %.2fs",
                    resp.status_code,
                    attempt,
                    self.retry_attempts,
                    path,
                    sleep_s,
                )
                time.sleep(sleep_s)
                continue

            return resp

        if last_exc is not None:
            raise last_exc
        raise RuntimeError(f"Catalog request exhausted retries for {path}")

    @classmethod
    def from_deployment_config(cls, config: Optional[Dict[str, object]]) -> "RemoteCatalogClient":
        """Create a client using the standard archi deployment config structure."""
32 changes: 26 additions & 6 deletions src/bin/service_mattermost.py
@@ -2,14 +2,21 @@
import multiprocessing as mp
import os
import time
from threading import Thread

from src.interfaces import mattermost
from src.utils.env import read_secret
from src.utils.logging import setup_logging
from src.utils.postgres_service_factory import PostgresServiceFactory

# set basicConfig for logging
setup_logging()

def run_polling(mattermost_agent, update_time):
    while True:
        mattermost_agent.process_posts()
        time.sleep(update_time)

def main():
    # set openai
    os.environ['OPENAI_API_KEY'] = read_secret("OPENAI_API_KEY")
@@ -18,13 +25,26 @@ def main():

    time.sleep(30)  # temporary hack to prevent mattermost from starting at the same time as other services; eventually replace this with more robust solution

    print("Initializing Mattermost Service")
    mattermost_agent = mattermost.Mattermost()
    update_time = int(mattermost_agent.mattermost_config["update_time"])
    # Initialize Postgres config service (required before any get_full_config() call)
    factory = PostgresServiceFactory.from_env(password_override=read_secret("PG_PASSWORD"))
    PostgresServiceFactory.set_instance(factory)

    while True:
        mattermost_agent.process_posts()
        time.sleep(update_time)
    # Start webhook server first: its __init__ initializes the config service via MattermostAIWrapper
    print("Initializing Mattermost webhook server")
    webhook_server = mattermost.MattermostWebhookServer()

    # Start polling loop in background thread if PAK is available (config service now ready)
    pak = read_secret("MATTERMOST_PAK")
    if pak:
        print("Initializing Mattermost polling service")
        mattermost_agent = mattermost.Mattermost()
        update_time = int(mattermost_agent.mattermost_config.get("update_time", 60))
        polling_thread = Thread(target=run_polling, args=(mattermost_agent, update_time), daemon=True)
        polling_thread.start()
    else:
        print("MATTERMOST_PAK not set; skipping polling mode")

    webhook_server.run(host='0.0.0.0', port=webhook_server.port)

if __name__ == "__main__":
    mp.set_start_method("spawn", force=True)
7 changes: 5 additions & 2 deletions src/cli/service_registry.py
@@ -136,8 +136,11 @@ def _register_default_services(self):
            name='mattermost',
            description='Integration service for Mattermost channels',
            category='integration',
            required_secrets=['MATTERMOST_WEBHOOK', 'MATTERMOST_CHANNEL_ID_READ',
                              'MATTERMOST_CHANNEL_ID_WRITE', 'MATTERMOST_PAK']
            requires_volume=True,
            required_secrets=['MATTERMOST_WEBHOOK',
                              # 'MATTERMOST_CHANNEL_ID_READ',
                              # 'MATTERMOST_CHANNEL_ID_WRITE', 'MATTERMOST_PAK'
                              'MATTERMOST_OUTGOING_TOKEN']
        ))

        self.register(ServiceDefinition(