Most clients are likely to use the official Python or TypeScript SDKs, which track the spec closely. We should add some end-to-end tests to ensure that an example application (a test 'host') can connect and list/use tools, especially when custom headers are involved.
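A minimal sketch of what such a test 'host' could look like, using the official MCP Python SDK (`mcp` on PyPI) with its SSE transport. The endpoint URL and header values below are placeholders, not part of this repo:

```python
import asyncio


async def list_tools_with_headers(url: str, headers: dict[str, str]) -> list[str]:
    """Connect to an MCP server over SSE, passing custom headers, and list its tools."""
    # Imports are local so the module can be inspected without the SDK installed.
    from mcp import ClientSession
    from mcp.client.sse import sse_client

    async with sse_client(url, headers=headers) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


# Example invocation (placeholder URL and header):
# asyncio.run(list_tools_with_headers("http://localhost:8000/sse", {"X-Custom-Header": "e2e"}))
```

An e2e test would assert that the returned tool names match what the server is expected to expose, and that requests without the required headers fail cleanly.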
We could also use this setup for evaluation with LangEval (related to LangWatch): define a set of standard questions, run them against multiple models with MCP tools enabled, and assert that we obtain the correct answers within X tokens or API calls.
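The evaluation loop could be sketched with a small stdlib-only harness; the LangEval integration would replace the stub model and check predicates below, which are purely illustrative:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    question: str
    check: Callable[[str], bool]  # does the answer contain what we expect?
    max_calls: int                # budget: "correct within X API calls"


def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> dict:
    """Run each question against a model callable; record pass/fail and call counts."""
    results = {}
    for case in cases:
        calls, passed, answer = 0, False, ""
        while calls < case.max_calls and not passed:
            answer = model(case.question)  # in practice: an LLM with MCP tools enabled
            calls += 1
            passed = case.check(answer)
        results[case.question] = {"passed": passed, "calls": calls}
    return results


# Stub standing in for a real model; real runs would call each model under test.
def stub_model(question: str) -> str:
    return "team-backend is on call" if "on call" in question else "unknown"


cases = [
    EvalCase("who is on call for team X?", lambda a: "on call" in a, max_calls=3),
]
report = run_eval(stub_model, cases)
```

Asserting on both correctness and call count keeps the budget ("within X tokens or API calls") as a first-class part of each test case.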
TODO:
- Require contributor approval before running actions on PRs
- Create a new cloud instance for tests
- Create a `tests` subdirectory
- Create a Python package in there
- Add langeval
- Add tests
- Run in CI
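For the "Run in CI" item, a hypothetical GitHub Actions workflow (names, paths, and the secret name are placeholders); the "require contributor approval" item is a repository setting rather than workflow config:

```yaml
name: e2e-tests
on:
  pull_request:  # pair with the repo setting requiring approval for outside contributors
jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install test package
        run: pip install -e ./tests
      - name: Run e2e tests
        run: pytest tests
        env:
          GRAFANA_API_KEY: ${{ secrets.GRAFANA_API_KEY }}  # placeholder secret name
```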
Workflows to test:
- [docker-compose] what are the most recent log lines from Grafana?
- [docker-compose] what dashboard should I look at to see container CPU?
- [cloud] who is on call for team X?
- [cloud] what incidents are active?
- (add more)
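The workflows above can be expressed as data for a test runner to parametrize over, so docker-compose-only and cloud-only runs pick up the right subset. The prompts come straight from this list; any assertion predicates would be added per case:

```python
# (environment, prompt) pairs, taken from the workflow list in this issue.
WORKFLOWS = [
    ("docker-compose", "what are the most recent log lines from Grafana?"),
    ("docker-compose", "what dashboard should I look at to see container CPU?"),
    ("cloud", "who is on call for team X?"),
    ("cloud", "what incidents are active?"),
]


def workflows_for(environment: str) -> list[str]:
    """Select the prompts that apply to a given test environment."""
    return [prompt for env, prompt in WORKFLOWS if env == environment]
```

With pytest this could feed `@pytest.mark.parametrize`, keeping new workflows a one-line addition to the list.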