Fix Ollama test script: resolve port conflicts and add dynamic CUDA version detection #1323

tokk-nv · 2025-08-24T07:45:59Z

Problem

The Ollama test script failed on hosts where Ollama was also installed natively. The failures had two causes:

Port conflicts between the containerized Ollama and the host-installed Ollama
Hardcoded port assignments that didn't handle occupied ports

Side Issues

Ollama PID detection (the lookup was also matching the grep process itself)
Missing dynamic CUDA version detection

Solution

This PR improves the Ollama test script with:

Dynamic port detection: Automatically finds available ports starting from 11435
Port conflict resolution: Handles cases where multiple ports are occupied
Dynamic CUDA version detection: Automatically detects CUDA version from nvidia-smi/nvcc
Improved error handling: Better process verification and cleanup
Robust process management: Fixed PID extraction and verification
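
As an illustration of the CUDA version detection described above, here is a minimal sketch. It is not the code from this PR: the function names `parse_cuda_version` and `detect_cuda_version` are hypothetical, and it assumes the version appears as `CUDA Version: X.Y` in the `nvidia-smi` header or as `release X.Y` in the `nvcc --version` output.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read tool output on stdin and print the first
# "MAJOR.MINOR" that follows "CUDA Version: " (nvidia-smi header) or
# "release " (nvcc --version). Prints nothing if neither is found.
parse_cuda_version() {
    sed -nE 's/.*(CUDA Version: |release )([0-9]+\.[0-9]+).*/\2/p' | head -n 1
}

# Hypothetical helper: try nvidia-smi first, then fall back to nvcc.
# Returns non-zero when no version could be detected.
detect_cuda_version() {
    local version=""
    if command -v nvidia-smi >/dev/null 2>&1; then
        version=$(nvidia-smi 2>/dev/null | parse_cuda_version)
    fi
    if [ -z "$version" ] && command -v nvcc >/dev/null 2>&1; then
        version=$(nvcc --version 2>/dev/null | parse_cuda_version)
    fi
    [ -n "$version" ] && echo "$version"
}
```

A caller might use it as `CUDA_VERSION=$(detect_cuda_version) || echo "CUDA version not detected" >&2`, so that a missing toolchain is reported instead of leaving the variable undefined.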

Testing

✅ Tested on systems with native Ollama installation
  - On both Thor (the native installer failed there but left the service behind) and Orin
✅ Verified port conflict resolution works correctly
✅ Confirmed CUDA version detection functions properly
✅ Validated container-safe operation

Impact

Fixes test failures on systems with native Ollama
Makes the test script more robust and portable
Improves CI/CD reliability across different environments

…ersion detection
- Fix process management: use proper PID extraction (column 2 instead of 1)
- Add dynamic port detection: automatically find available ports starting from 11435
- Add dynamic CUDA version detection: automatically detect CUDA version from nvidia-smi/nvcc
- Remove systemctl dependencies: make script container-safe
- Improve error handling: better process verification and cleanup
- Fix CUDA library selection: use detected version instead of undefined variable
- Add robust port conflict resolution: handles cases where multiple ports are occupied

Copilot

Pull Request Overview

This PR fixes critical issues in the Ollama test script that were causing failures when Ollama was already installed on the host system. The changes implement dynamic port detection to avoid conflicts and add automatic CUDA version detection.

Implements dynamic port allocation starting from 11435 to avoid conflicts with existing Ollama instances
Adds automatic CUDA version detection from nvidia-smi/nvcc instead of relying on hardcoded values
Improves process management with better PID detection and cleanup procedures


Copilot · 2025-08-24T15:52:14Z

packages/llm/ollama/test.sh

+        if timeout 1 bash -c "echo >/dev/tcp/127.0.0.1/$port" 2>/dev/null; then
+            # Port is in use, try next
+            port=$((port + 1))
+        else
+            # Port is available
+            echo $port
+            return 0


The port availability check is incorrect. This bash redirection will succeed when the port is in use (connection established), but the logic treats success as 'port in use'. The condition should be inverted - if the connection succeeds, the port is occupied.

Suggested change

-        if timeout 1 bash -c "echo >/dev/tcp/127.0.0.1/$port" 2>/dev/null; then
-            # Port is in use, try next
-            port=$((port + 1))
-        else
-            # Port is available
-            echo $port
-            return 0
+        if ! timeout 1 bash -c "echo >/dev/tcp/127.0.0.1/$port" 2>/dev/null; then
+            # Port is available
+            echo $port
+            return 0
+        else
+            # Port is in use, try next
+            port=$((port + 1))
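
For context, a complete helper built around this probe might look like the sketch below. It is an illustration rather than the code from the PR: the function name `find_free_port`, its two arguments, and the retry cap are assumptions. A successful connect to `/dev/tcp/127.0.0.1/$port` means something is already listening, so the sketch treats success as "occupied" and a refused connection or a timeout as "free", whichever way the `if` is written.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the first port at or above $1 (default 11435)
# that refuses a TCP connection on 127.0.0.1. Gives up after $2 attempts
# (default 100) and returns non-zero.
find_free_port() {
    local port="${1:-11435}"
    local max_tries="${2:-100}"
    local i=0
    while [ "$i" -lt "$max_tries" ]; do
        # The connect succeeds only if a listener already owns the port.
        if timeout 1 bash -c "echo >/dev/tcp/127.0.0.1/$port" 2>/dev/null; then
            port=$((port + 1))   # occupied, try the next one
        else
            echo "$port"         # nothing answered, treat as available
            return 0
        fi
        i=$((i + 1))
    done
    echo "no free port found" >&2
    return 1
}
```

A caller could run `PORT=$(find_free_port 11435) || exit 1` and then start the server with `OLLAMA_HOST=127.0.0.1:$PORT`. The probe is inherently racy: another process can bind the port between the check and the server start, so the caller should still verify that the server came up.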

Copilot · 2025-08-24T15:52:15Z

packages/llm/ollama/test.sh

+OLLAMA_PID=$(ps -ef | grep 'ollama serve' | grep -v grep | awk '{ print $2 }')
+
+if [ -n "$OLLAMA_PID" ]; then
+    echo "Stopping existing ollama process: $OLLAMA_PID"
+    kill $OLLAMA_PID


This will only capture the first PID if multiple ollama processes are running. Consider using pgrep -f 'ollama serve' for more reliable process detection, or handle multiple PIDs appropriately.

@tokk-nv what do you think? Sounds more efficient and clean, no?
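
A minimal sketch of the `pgrep`-based approach suggested above, assuming `pgrep` (from procps) is available in the container. The helper names `find_ollama_pids` and `stop_pids` are hypothetical and do not come from the PR. `pgrep` never reports itself, so no `grep -v grep` filter is needed, and the loop handles one or several PIDs.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the PIDs whose full command line matches
# "ollama serve", one per line. Returns non-zero when nothing matches.
find_ollama_pids() {
    pgrep -f 'ollama serve'
}

# Hypothetical helper: send SIGTERM to every PID passed as an argument,
# skipping PIDs that no longer exist.
stop_pids() {
    local pid
    for pid in "$@"; do
        if kill -0 "$pid" 2>/dev/null; then
            echo "Stopping process: $pid"
            kill "$pid" 2>/dev/null || true
        fi
    done
}

# Usage (left commented out so that sourcing this sketch has no side effects):
# stop_pids $(find_ollama_pids)
```

Note that `pgrep -f` matches against the whole command line, so any other process whose arguments contain the string `ollama serve` would also be selected.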

tokk-nv marked this pull request as ready for review August 24, 2025 07:46

OriNachum requested a review from Copilot August 24, 2025 15:51

Copilot AI reviewed Aug 24, 2025


OriNachum approved these changes Aug 24, 2025
