Commit fc8e161

Merge pull request #2103 from tenstorrent/rc-v0.9.0
Release v0.9.0
2 parents 72d1512 + 31e2e67 commit fc8e161

321 files changed: 41024 additions & 11453 deletions

.cursor/rules/coding-standards.mdc

Lines changed: 1 addition & 0 deletions
@@ -11,3 +11,4 @@ alwaysApply: true
 - Minimize new dependencies that are not in the Python 3.8 standard library and not already used in the project.
 - Do not provide verbose explainations, writting code that correctly meets the requirements is more important, UNLESS there is a bug found in the code. If there is a bug it should be explained and then a fix to the bug proposed.
 - Use clear and professional names for all variables, functions, classes, and files.
+- Python string formatting uses F-string style (f"{}") not C printf style ("%s")
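
The formatting rule added in this hunk can be illustrated with a short comparison. This is a minimal sketch: the variable names and the message text are hypothetical and are not taken from the repository.

```python
# Hypothetical values, used only to illustrate the formatting rule.
model_name = "example-model"
latency_ms = 12.3456

# C printf style, which the new rule discourages.
printf_style = "model=%s latency=%.2fms" % (model_name, latency_ms)

# F-string style, which the new rule prefers.
fstring_style = f"model={model_name} latency={latency_ms:.2f}ms"

print(fstring_style)
```

Both forms produce the same string; the f-string keeps each value next to the place where it is rendered.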

.cursor/rules/project-files.mdc

Lines changed: 1 addition & 1 deletion
@@ -199,7 +199,7 @@
 │ ├── scripts
 │ │ └── download_sdxl_weights.py
 │ ├── security
-│ │ └── api_key_cheker.py
+│ │ └── api_key_checker.py
 │ ├── static
 │ │ ├── data
 │ │ │ ├── audio_test.json

.cursor/rules/testing.mdc

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ globs:
 alwaysApply: true
 ---
 
-- source .venv/bin/activate to run pytest locally. IMPORTANT If that environment does not exist you must SKIP trying to run the tests. Do not try to install the python venv.
+- source .pre-commit/bin/activate to run pytest locally. IMPORTANT If that environment does not exist you must SKIP trying to run the tests. Do not try to install the python venv.
 - Ensure new functionality has test coverage that is meaningful and tests full components with minimal mocking only of external calls.
 - Make sure to run tests via `pytest` to check for breaking changes.
 - If you find a bug in code that is being tested, call it out, don't change tests to go around it.

.cursorignore

Lines changed: 1 addition & 0 deletions
@@ -15,3 +15,4 @@ dist/
 persistent_volume/
 workflow_logs/
 model_specs_output.json
+docs/model_support/**

.github/CODEOWNERS

Lines changed: 41 additions & 1 deletion
@@ -2,5 +2,45 @@
 # the repo. Unless a later match takes precedence. This will be
 # requested for review when someone opens a pull request.
 * @tenstorrent/tt-inference-server-codeowners
-workflows/run_reports.py @tenstorrent/tt-inference-server-codeowners @acvejicTT @mjeremicTT @mdobrosavljevicTT @vmaksimovicTT
+
+# tt-media-server
+tt-media-server/ @idjuricTT @ztorlakTT @fivanovicTT @ljovanovicTT @dmadicTT @knovokmetTT @vpetrovicTT
 tt-media-server/tt_model_runners/forge_runners/ @vmilosevic
+
+# tt-vllm-plugin
+tt-vllm-plugin/ @idjuricTT @ztorlakTT @fivanovicTT @ljovanovicTT @dmadicTT @bgoelTT @knovokmetTT @vpetrovicTT
+
+# vllm-tt-metal
+vllm-tt-metal-llama3/ @tstescoTT @bgoelTT @stisiTT
+
+# docker entrypoint
+docker-entrypoint.sh @tstescoTT @bgoelTT @idjuricTT @acvejicTT
+
+# workflows
+run.py @tstescoTT @bgoelTT @stisiTT @tenstorrent/tt-inference-server-codeowners
+workflows/ @tstescoTT @tenstorrent/tt-inference-server-codeowners
+workflows/model_spec.py @tstescoTT @bgoelTT @idjuricTT @tenstorrent/tt-inference-server-codeowners
+workflows/run_reports.py @acvejicTT @mjeremicTT @mdobrosavljevicTT @vmaksimovicTT @tenstorrent/tt-inference-server-codeowners
+
+# benchmarking
+benchmarking/ @tstescoTT @bgoelTT @stisiTT @gtobarTT @arobergeTT @ssanjayTT @tenstorrent/tt-inference-server-codeowners
+
+# evals
+evals/ @tstescoTT @bgoelTT @stisiTT @gtobarTT @arobergeTT @ssanjayTT @tenstorrent/tt-inference-server-codeowners
+
+# utils
+utils/ @tenstorrent/tt-inference-server-codeowners
+
+# tests
+tests/ @tenstorrent/tt-inference-server-codeowners
+
+# scripts
+scripts/* @tstescoTT @tenstorrent/tt-inference-server-codeowners
+scripts/release/* @vmaksimovicTT @tstescoTT @bgoelTT
+
+# github actions
+.github/workflows/ @tstescoTT @bgoelTT @idjuricTT @acvejicTT @vmaksimovicTT @mjeremicTT @tenstorrent/tt-inference-server-codeowners
+
+# docs
+docs/ @tenstorrent/tt-inference-server-codeowners
+README.md @tstescoTT @bgoelTT @idjuricTT
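
The CODEOWNERS comment in this hunk notes that a later match takes precedence, which is why the specific `workflows/model_spec.py` and `scripts/release/*` rules come after the broader ones. The sketch below illustrates that ordering with a few rules copied from the hunk. It is a simplified model, not GitHub's matcher: the helper names `pattern_matches` and `owners_for` are hypothetical, the queried paths other than `workflows/model_spec.py` are made up, and every pattern is assumed to be anchored at the repository root, whereas real CODEOWNERS patterns follow gitignore-style rules.

```python
from typing import List, Tuple

# A few rules copied from the hunk, kept in file order.
RULES: List[Tuple[str, List[str]]] = [
    ("*", ["@tenstorrent/tt-inference-server-codeowners"]),
    ("workflows/", ["@tstescoTT", "@tenstorrent/tt-inference-server-codeowners"]),
    (
        "workflows/model_spec.py",
        ["@tstescoTT", "@bgoelTT", "@idjuricTT", "@tenstorrent/tt-inference-server-codeowners"],
    ),
    ("scripts/*", ["@tstescoTT", "@tenstorrent/tt-inference-server-codeowners"]),
    ("scripts/release/*", ["@vmaksimovicTT", "@tstescoTT", "@bgoelTT"]),
]


def pattern_matches(pattern: str, path: str) -> bool:
    """Simplified matcher (assumption: patterns are anchored at the repo root)."""
    if pattern == "*":
        return True
    if pattern.endswith("/"):
        # Directory rule: everything below the directory.
        return path.startswith(pattern)
    if pattern.endswith("/*"):
        # Single-level wildcard: direct children only, no nested paths.
        directory = pattern[:-1]
        return path.startswith(directory) and "/" not in path[len(directory):]
    return path == pattern


def owners_for(path: str) -> List[str]:
    """Return the owners of the last matching rule."""
    owners: List[str] = []
    for pattern, rule_owners in RULES:
        if pattern_matches(pattern, path):
            owners = rule_owners  # a later match overrides an earlier one
    return owners


print(owners_for("workflows/model_spec.py"))
print(owners_for("scripts/release/example.py"))  # hypothetical path
```

Under this model, a rule placed before a broader one would be overridden by it, so the order of entries in the file matters as much as the patterns themselves.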

.github/ISSUE_TEMPLATE/model-readiness.yml

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
-name: Add a Model to Model Readiness
-description: Add support for a new model in the Model Readiness suite.
+name: Add Model Readiness Support for a New Model
+description: Add support for a new model in the Model Readiness test suite.
 title: "[Model Readiness Support]: "
 labels: ["model_readiness_support"]
 projects: ["tenstorrent/130"]

.github/workflows/test-gate.yml

Lines changed: 201 additions & 9 deletions
@@ -12,6 +12,7 @@ jobs:
 runs-on: ubuntu-latest
 outputs:
 should-test: ${{ steps.filter.outputs.code }}
+cpp-server: ${{ steps.filter.outputs.cpp_server }}
 steps:
 - uses: actions/checkout@v4
 - uses: dorny/paths-filter@v3
@@ -23,6 +24,8 @@ jobs:
 - '**/requirements*.txt'
 - 'pyproject.toml'
 - '.github/workflows/*.yml'
+cpp_server:
+- 'tt-media-server/cpp_server/**'
 
 lint:
 needs: detect-changes
@@ -38,7 +41,7 @@ jobs:
 python-version: "3.10"
 
 - name: Install ruff
-run: pip install ruff
+run: pip install ruff==0.15.0
 
 - name: Run ruff linter
 run: ruff check .
@@ -434,13 +437,16 @@ jobs:
 pytest performance_tests/test_llm_streaming.py -sv 2>&1 | tee test_output.txt
 TEST_EXIT_CODE=${PIPESTATUS[0]}
 
-# Parse CI report metrics
-TOKENS=$(grep 'tokens_received=' test_output.txt | cut -d= -f2 || echo "N/A")
-TOTAL_TIME=$(grep 'total_time_ms=' test_output.txt | cut -d= -f2 || echo "N/A")
-MEAN_INTERVAL=$(grep 'mean_interval_ms=' test_output.txt | cut -d= -f2 || echo "N/A")
-THROUGHPUT=$(grep 'throughput_tps=' test_output.txt | cut -d= -f2 || echo "N/A")
-OVERHEAD=$(grep 'overhead_ms=' test_output.txt | cut -d= -f2 || echo "N/A")
-THRESHOLD=$(grep 'threshold_ms=' test_output.txt | cut -d= -f2 || echo "N/A")
+# Extract only the CI report section (between markers) to avoid matching source code
+sed -n '/::CI_REPORT_START::/,/::CI_REPORT_END::/p' test_output.txt > ci_report.txt
+
+# Parse CI report metrics from the extracted section
+TOKENS=$(grep '^tokens_received=' ci_report.txt | cut -d= -f2 || echo "N/A")
+TOTAL_TIME=$(grep '^total_time_ms=' ci_report.txt | cut -d= -f2 || echo "N/A")
+MEAN_INTERVAL=$(grep '^mean_interval_ms=' ci_report.txt | cut -d= -f2 || echo "N/A")
+THROUGHPUT=$(grep '^throughput_tps=' ci_report.txt | cut -d= -f2 || echo "N/A")
+OVERHEAD=$(grep '^overhead_ms=' ci_report.txt | cut -d= -f2 || echo "N/A")
+THRESHOLD=$(grep '^threshold_ms=' ci_report.txt | cut -d= -f2 || echo "N/A")
 
 # Determine status
 if [ $TEST_EXIT_CODE -eq 0 ]; then
@@ -463,8 +469,194 @@ jobs:
 echo "| **Overhead/Token** | ${OVERHEAD}ms (threshold: ${THRESHOLD}ms) |"
 } >> $GITHUB_STEP_SUMMARY
 
-exit $TEST_EXIT_CODE
+# Store exit code for later
+echo "TEST_EXIT_CODE=$TEST_EXIT_CODE" >> $GITHUB_ENV
+
+- name: Show server logs
+if: always()
+run: |
+echo "=== Server Logs ==="
+if [ -f performance_tests/server.log ]; then
+cat performance_tests/server.log
+else
+echo "No server.log file found"
+fi
+
+- name: Upload server logs
+if: always()
+uses: actions/upload-artifact@v4
+with:
+name: llm-streaming-server-logs
+path: tt-media-server/performance_tests/server.log
+retention-days: 1
+if-no-files-found: warn
+
+- name: Check test result
+run: exit ${{ env.TEST_EXIT_CODE }}
+
+cpp-server-ttnn-build:
+needs: detect-changes
+if: ${{ needs.detect-changes.outputs.cpp-server == 'true' }}
+name: C++ Server TTNN Build
+runs-on: ubuntu-latest
+permissions:
+contents: read
+env:
+PYTHONPATH: ${{ github.workspace }}/tt-media-server
+defaults:
+run:
+working-directory: tt-media-server
+
+steps:
+- name: Checkout repository
+uses: actions/checkout@v4
+
+- name: Install C++ build dependencies
+run: |
+sudo apt-get update -qq
+sudo apt-get install -y -qq cmake g++ pkg-config \
+libjsoncpp-dev uuid-dev zlib1g-dev
+
+- name: Clone Drogon for build
+run: |
+mkdir -p cpp_server/deps
+git clone --depth 1 --branch v1.9.8 https://github.com/drogonframework/drogon.git cpp_server/deps/drogon
+cd cpp_server/deps/drogon
+git submodule update --init
+cd ../../..
+
+- name: Build C++ server
+run: |
+cd cpp_server
+./build.sh --ttnn
+
+cpp-server-llm-streaming:
+needs: detect-changes
+if: ${{ needs.detect-changes.outputs.cpp-server == 'true' }}
+name: C++ Server LLM Streaming Performance Test
+runs-on: ubuntu-latest
+permissions:
+contents: read
+env:
+PYTHONPATH: ${{ github.workspace }}/tt-media-server
+defaults:
+run:
+working-directory: tt-media-server
+
+steps:
+- name: Checkout repository
+uses: actions/checkout@v4
+
+- name: Install C++ build dependencies
+run: |
+sudo apt-get update -qq
+sudo apt-get install -y -qq cmake g++ pkg-config \
+libjsoncpp-dev uuid-dev zlib1g-dev
+
+- name: Clone Drogon for build
+run: |
+mkdir -p cpp_server/deps
+git clone --depth 1 --branch v1.9.8 https://github.com/drogonframework/drogon.git cpp_server/deps/drogon
+cd cpp_server/deps/drogon
+git submodule update --init
+cd ../../..
+
+- name: Build C++ server
+run: |
+cd cpp_server
+./build.sh --test
+cd ..
+
+- name: Run C++ unit tests
+run: |
+cd cpp_server/build
+ctest --output-on-failure
+cd ../..
+
+- name: Start C++ server (test runner mode)
+run: |
+cd cpp_server/build
+export TT_RUNNER_TYPE=llm_test
+export TEST_RUNNER_FREQUENCY_MS=1
+./tt_media_server_cpp -p 8000 > ../../cpp_server.log 2>&1 &
+echo $! > ../../cpp_server.pid
+cd ../..
+for i in $(seq 1 30); do
+curl -sf http://127.0.0.1:8000/health | grep -q '"status"' && break
+sleep 1
+done
+curl -sf http://127.0.0.1:8000/health || exit 1
+
+- name: Set up Python
+uses: actions/setup-python@v5
+with:
+python-version: "3.10"
+cache: 'pip'
+
+- name: Install Python test dependencies
+run: |
+pip install --upgrade pip
+pip install pytest pytest-asyncio aiohttp requests
+
+- name: Run LLM Streaming test against C++ server
+env:
+EXTERNAL_LLM_SERVER: "1"
+SERVER_BASE_URL: "http://127.0.0.1:8000"
+TEST_RUNNER_FREQUENCY_MS: "1"
+run: |
+pytest performance_tests/test_llm_streaming.py -sv 2>&1 | tee test_output.txt
+TEST_EXIT_CODE=${PIPESTATUS[0]}
+
+# Extract CI report and write job summary (same as llm-streaming-performance)
+sed -n '/::CI_REPORT_START::/,/::CI_REPORT_END::/p' test_output.txt > ci_report.txt
+TOKENS=$(grep '^tokens_received=' ci_report.txt | cut -d= -f2 || echo "N/A")
+TOTAL_TIME=$(grep '^total_time_ms=' ci_report.txt | cut -d= -f2 || echo "N/A")
+MEAN_INTERVAL=$(grep '^mean_interval_ms=' ci_report.txt | cut -d= -f2 || echo "N/A")
+THROUGHPUT=$(grep '^throughput_tps=' ci_report.txt | cut -d= -f2 || echo "N/A")
+OVERHEAD=$(grep '^overhead_ms=' ci_report.txt | cut -d= -f2 || echo "N/A")
+THRESHOLD=$(grep '^threshold_ms=' ci_report.txt | cut -d= -f2 || echo "N/A")
+if [ $TEST_EXIT_CODE -eq 0 ]; then STATUS="✅ PASSED"; else STATUS="❌ FAILED"; fi
+{
+echo "## 🚀 C++ Server LLM Streaming Test"
+echo ""
+echo "| Metric | Value |"
+echo "|--------|-------|"
+echo "| **Status** | $STATUS |"
+echo "| **Tokens Received** | $TOKENS |"
+echo "| **Total Time** | ${TOTAL_TIME}ms |"
+echo "| **Mean Interval** | ${MEAN_INTERVAL}ms |"
+echo "| **Throughput** | ${THROUGHPUT} tokens/s |"
+echo "| **Overhead/Token** | ${OVERHEAD}ms (threshold: ${THRESHOLD}ms) |"
+} >> $GITHUB_STEP_SUMMARY
+
+echo "TEST_EXIT_CODE=$TEST_EXIT_CODE" >> $GITHUB_ENV
+
+- name: Stop C++ server
+if: always()
+run: |
+[ -f cpp_server.pid ] && kill $(cat cpp_server.pid) 2>/dev/null || true
+
+- name: Show server logs
+if: always()
+run: |
+echo "=== C++ Server Logs ==="
+if [ -f cpp_server.log ]; then
+cat cpp_server.log
+else
+echo "No cpp_server.log file found"
+fi
+
+- name: Upload server logs
+if: always()
+uses: actions/upload-artifact@v4
+with:
+name: cpp-server-llm-streaming-logs
+path: tt-media-server/cpp_server.log
+retention-days: 1
+if-no-files-found: warn
 
+- name: Check test result
+run: exit ${{ env.TEST_EXIT_CODE }}
 
 forge-runner-changes:
 name: Detect Forge Runner Changes
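
Both streaming jobs in this file now read metrics only from the section between `::CI_REPORT_START::` and `::CI_REPORT_END::`, and anchor each key at the start of a line. The same idea can be sketched in Python as below. This is an illustration under stated assumptions, not the project's implementation: the function name `parse_ci_report` and the sample output are hypothetical, and only the marker strings and metric keys are taken from the workflow. One point that may be worth checking in the shell version: in `grep ... | cut ... || echo "N/A"`, bash reports the exit status of `cut` unless `pipefail` is set, so a missing metric can yield an empty string instead of `N/A`. The sketch sets the `N/A` default explicitly.

```python
from typing import Dict

# Marker strings and metric keys as they appear in the workflow.
REPORT_START = "::CI_REPORT_START::"
REPORT_END = "::CI_REPORT_END::"
METRIC_KEYS = (
    "tokens_received",
    "total_time_ms",
    "mean_interval_ms",
    "throughput_tps",
    "overhead_ms",
    "threshold_ms",
)


def parse_ci_report(output: str) -> Dict[str, str]:
    """Collect key=value metrics found between the report markers."""
    metrics = {key: "N/A" for key in METRIC_KEYS}  # explicit fallback
    inside_report = False
    for line in output.splitlines():
        if REPORT_START in line:
            inside_report = True
            continue
        if REPORT_END in line:
            inside_report = False
            continue
        if not inside_report:
            continue  # ignore anything outside the marked section
        key, separator, value = line.partition("=")
        # The key must start the line exactly, like grep's '^key=' anchor.
        if separator and key in metrics:
            metrics[key] = value
    return metrics


# Hypothetical test output: the first line mimics echoed source code that
# the unanchored, whole-file grep in the old version could have matched.
sample_output = "\n".join(
    [
        'print(f"tokens_received={count}")',
        REPORT_START,
        "tokens_received=128",
        "throughput_tps=42.5",
        REPORT_END,
    ]
)
report = parse_ci_report(sample_output)
print(report)
```

Unlike `cut -d= -f2`, `str.partition` keeps everything after the first `=`, so a value that itself contains `=` would not be truncated.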

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -47,6 +47,7 @@ db.sqlite3
 scripts/examples/example_data
 
 # unignore
+!**/CMakeLists.txt
 !requirements*.txt
 !workflows/model_performance_reference.json
 !tests/server_tests/test_config.json
@@ -64,3 +65,7 @@ tt-media-server/huggingface
 
 # ignore downloaded datsets
 tests/server_tests/datasets/*
+
+# ignore generated model support docs
+# use 'git add -f docs/model_support/**' to commit updates
+docs/model_support/**

.pre-commit-config.yaml

Lines changed: 3 additions & 3 deletions
@@ -4,21 +4,21 @@ repos:
 # Run the linter.
 - id: ruff
 name: ruff
-entry: bash -c 'source .venv/bin/activate && ruff check "$@"' --
+entry: bash -c 'source .pre-commit/bin/activate && ruff check "$@"' --
 language: system
 types: [python]
 # Run the formatter.
 - id: ruff-format
 name: ruff-format
-entry: bash -c 'source .venv/bin/activate && ruff format "$@"' --
+entry: bash -c 'source .pre-commit/bin/activate && ruff format "$@"' --
 language: system
 types: [python]
 
 - repo: local
 hooks:
 - id: pytest
 name: pytest
-entry: bash -c 'source .venv/bin/activate && pytest'
+entry: bash -c 'source .pre-commit/bin/activate && pytest'
 language: system
 pass_filenames: false
 always_run: true
