Skip to content

Latest commit

 

History

History
702 lines (486 loc) · 24.9 KB

File metadata and controls

702 lines (486 loc) · 24.9 KB

Provider Feature Compatibility Reference

Last Updated: 2025-12-31 Test Suite: continuous-test-suite.ts (19 comprehensive tests) Providers Tested: 11 providers across CSV, PDF, MCP tools, business tools, and enterprise features


Executive Summary

After comprehensive testing across 11 AI providers (plus 4 newly integrated providers), we have identified 4 production-ready providers with 100% feature compatibility and documented specific technical limitations and configuration requirements for all others.

Production-Ready Providers (100% Compatibility) ⭐⭐⭐

Provider Test Score Duration Status Best For
Google AI Studio 19/19 (100%) 401s ✅ Perfect Fast prototyping, full multimodal support
Vertex AI 19/19 (100%) 449s ✅ Perfect Enterprise deployments, excellent performance
OpenAI 19/19 (100%) 1413s ✅ Perfect Industry standard, comprehensive features
LiteLLM 19/19 (100%) 552s ✅ Perfect Universal proxy for 100+ models

All features supported:

  • ✅ CSV processing (6/6 tests)
  • ✅ PDF processing (6/6 tests)
  • ✅ MCP external tools (4/4 tests)
  • ✅ Business tools (2/2 tests)
  • ✅ Enterprise features (1/1 test)

Complete Feature Support Matrix

Provider CSV PDF MCP Tools Business Tools Structured Output Enterprise Score Status
Google AI Studio ✅ 6/6 ✅ 6/6 ✅ 4/4 ✅ 2/2 ⚠️ Partial* ✅ 1/1 19/19 Production
Vertex AI ✅ 6/6 ✅ 6/6 ✅ 4/4 ✅ 2/2 ⚠️ Partial* ✅ 1/1 19/19 Production
LiteLLM ✅ 6/6 ✅ 6/6 ✅ 4/4 ✅ 2/2 ✅ Full ✅ 1/1 19/19 Production
OpenAI ✅ 6/6 ✅ 6/6 ✅ 4/4 ✅ 2/2 ✅ Full ✅ 1/1 19/19 Production
Azure OpenAI ✅ 6/6 ❌ 0/6 ✅ 4/4 ✅ 2/2 ✅ Full ✅ 1/1 13/19 Production*
Mistral ✅ 6/6 ❌ 0/6 ⚠️ 2/4 ❌ 0/2 ✅ Full ✅ 1/1 9/19 Development
Ollama ⚠️ 3/6 ⚠️ 1/6 ❌ 0/4 ❌ 0/2 ⚠️ Limited ✅ 1/1 7/19 Development
Anthropic 🔧 0/6 🔧 0/6 🔧 0/4 🔧 0/2 ✅ Full ✅ 1/1 2/19** Config
Bedrock 🔧 0/6 🔧 0/6 🔧 0/4 🔧 0/2 ✅ Full ✅ 1/1 2/19** Config
Hugging Face 🔧 0/6 🔧 0/6 🔧 0/4 🔧 0/2 ⚠️ Limited ✅ 1/1 2/19** Config
SageMaker 🔧 0/6 🔧 0/6 🔧 0/4 🔧 0/2 ⚠️ Limited ✅ 1/1 2/19** Config
DeepSeek ✅ 6/6 ❌ 0/6 ✅ 4/4 ✅ 2/2 ✅ Full ✅ 1/1 N/A† Cloud
NVIDIA NIM ✅ 6/6 ❌ 0/6 ⚠️ Model ⚠️ Model ⚠️ Model ✅ 1/1 N/A† Cloud
LM Studio ⚠️ Model ❌ 0/6 ⚠️ Model ⚠️ Model ⚠️ Model ✅ 1/1 N/A† Local
llama.cpp ⚠️ Model ❌ 0/6 ⚠️ Model ⚠️ Model ⚠️ Model ✅ 1/1 N/A† Local

*Google providers: Cannot combine tools + schemas (use disableTools: true). Google API limitation, not NeuroLink bug.

†Not yet run through the standard 19-test suite. Capability flags based on provider source code audit (PR #997).

Legend:

  • ✅ Fully supported
  • ⚠️ Partially supported
  • ❌ Not supported (technical limitation)
  • 🔧 Configuration/billing issue
  • * Production-ready for non-PDF workloads
  • ** Configuration issue, not technical limitation

Model-Level Feature Compatibility

Gemini 3 Models

Model Streaming Tools Vision Extended Thinking JSON Schema
gemini-3-flash ✓†
gemini-3-pro ✓†

JSON Schema Limitation: Gemini 3 models support JSON Schema for structured output, but cannot combine tools with JSON Schema in the same request. When using structured output with a schema, you must disable tools by setting disableTools: true. This is a Google API limitation, not a NeuroLink bug.

Example Usage:

// Structured output with JSON Schema (tools must be disabled)
await neurolink.generate({
  prompt: "Extract user information",
  schema: UserSchema,
  provider: "google-ai-studio",
  model: "gemini-3-flash",
  disableTools: true, // Required when using schema
});

// Tools work normally without schema
await neurolink.generate({
  prompt: "Search for documents",
  provider: "google-ai-studio",
  model: "gemini-3-pro",
  // Tools enabled by default
});

Provider Tier Classification

Tier 1: Perfect (100%) - Production Ready for All Features ⭐⭐⭐

Recommended for production use with full feature support

Google AI Studio

  • Score: 19/19 (100%)
  • Duration: 401 seconds
  • Strengths: Fastest test execution, reliable, full multimodal support
  • Use Cases:
    • Rapid prototyping with free tier
    • Production deployments requiring speed
    • Full CSV + PDF + image processing
    • MCP tool integration
  • Setup: Simple API key configuration

Vertex AI

  • Score: 19/19 (100%)
  • Duration: 449 seconds
  • Strengths: Enterprise-grade, excellent performance, Google Cloud integration
  • Use Cases:
    • Enterprise deployments with SLA requirements
    • Google Cloud Platform integration
    • Multi-region deployments
    • Advanced analytics pipelines
  • Setup: GCP service account or ADC

OpenAI

  • Score: 19/19 (100%)
  • Duration: 1413 seconds (slower due to rate limits)
  • Strengths: Industry standard, comprehensive ecosystem, extensive documentation
  • Use Cases:
    • Production applications requiring proven stability
    • Integration with OpenAI ecosystem
    • GPT-4o and o1 model access
  • Setup: API key configuration
  • Note: Longer duration due to conservative rate limiting (30,000 TPM)

LiteLLM

  • Score: 19/19 (100%)
  • Duration: 552 seconds
  • Strengths: Universal proxy for 100+ models, automatic load balancing
  • Use Cases:
    • Multi-provider routing and fallback
    • Access to 100+ models through single interface
    • Cost optimization across providers
    • Load balancing and caching
  • Setup: LiteLLM proxy server + provider credentials

Structured Output Support Details

Full Support (✅):

  • OpenAI, Anthropic, Azure OpenAI, Bedrock, Mistral, LiteLLM
  • Can use tools and schemas simultaneously
  • No configuration required

Partial Support (⚠️):

  • Google AI Studio and Vertex AI (Gemini models)
  • Limitation: Cannot combine tools with schemas
  • Solution: Use disableTools: true when using schemas
  • Reason: Google API limitation (documented by Google)
  • Future: Future Gemini versions may support both - check official documentation for updates

Example:

// Google providers require disableTools
await neurolink.generate({
  schema: MySchema,
  provider: "vertex",
  disableTools: true, // Required for Google
});

// Other providers work without restriction
await neurolink.generate({
  schema: MySchema,
  provider: "openai", // No restriction
});

Tier 2: Good (68%) - Production Ready for CSV + Tools ⭐⭐

Recommended for production use when PDF support is not required

Azure OpenAI

  • Score: 13/19 (68.4%)
  • Duration: 351 seconds
  • Status: ⚠️ Production-ready with limitations

✅ Passing Tests (13/19):

  • ✅ CSV processing (6/6) - All CSV tests pass
  • ✅ MCP external tools (4/4) - Full tool integration support
  • ✅ Business tools (2/2) - Custom tool execution works
  • ✅ Enterprise features (1/1) - Proxy and compliance support

❌ Failing Tests (6/19):

  • ❌ All PDF tests (6/6) - Model limitation
    • CLI Generate PDF
    • CLI Stream PDF
    • CLI Stream Two PDF Comparison
    • CLI Stream PDF and CSV
    • SDK Generate PDF
    • SDK Stream PDF

Root Cause:

Error: Invalid Value: 'file'. This model does not support file content types.

Technical Explanation: Azure OpenAI models do not support the file content type required for PDF processing in the Vercel AI SDK. This is a model architecture limitation, not a configuration issue.

Production Recommendation:

  • ✅ Use for: CSV data analysis, MCP tool integration, business logic
  • ❌ Avoid for: PDF processing
  • 🔄 Fallback strategy: Use Vertex AI or Google AI Studio for PDF requirements

Tier 3: Partial (36-47%) - Development/Testing Only ⭐

NOT recommended for production use - limited feature support

Mistral AI

  • Score: 9/19 (47.4%)
  • Duration: 363 seconds
  • Status: ⚠️ Development/testing only

✅ Passing Tests (9/19):

  • ✅ CSV processing (6/6) - All CSV tests pass
  • ✅ SDK tools (2/2) - SDK Generate and Stream work
  • ✅ Enterprise features (1/1) - Proxy support

❌ Failing Tests (10/19):

  • ❌ All PDF tests (6/6) - API limitation
  • ❌ CLI external tools (2/2) - CLI tool integration issues
  • ❌ Business tools (2/2) - Limited tool support

Root Cause (PDF failures):

Error: UnsupportedFunctionalityError: 'File content parts in user messages' functionality not supported.

Technical Explanation: Mistral's API fundamentally does not support file content parts in user messages. This is a core API limitation, not a bug or configuration issue.

Production Recommendation:

  • ✅ Use for: CSV data analysis in SDK mode
  • ❌ Avoid for: PDF processing, CLI tool integration
  • 📚 Reference: See MISTRAL_PDF_FIX_SUMMARY.md for detailed investigation

Ollama

  • Score: 7/19 (36.8%)
  • Duration: 1236 seconds
  • Status: ⚠️ Local development only

✅ Passing Tests (7/19):

  • ✅ Some CSV tests (3/6) - Partial support
  • ✅ SDK tools (2/2) - Basic tool execution
  • ✅ CLI Stream PDF and CSV (1/1) - Limited multimodal
  • ✅ Enterprise features (1/1) - Local proxy support

❌ Failing Tests (12/19):

  • ❌ Most CSV tests (3/6) - Inconsistent results
  • ❌ Most PDF tests (5/6) - Model-dependent
  • ❌ CLI external tools (2/2) - Tool integration issues
  • ❌ Business tools (2/2) - Limited support

Technical Explanation: Ollama is designed for local model execution. Performance and feature support varies significantly based on the specific model being used (Llama, Mistral, etc.).

Production Recommendation:

  • ✅ Use for: Local development, privacy-critical testing
  • ❌ Avoid for: Production workloads, consistent behavior requirements
  • 🎯 Best for: Experimentation with local models

Tier 4: Limited (10.5%) - Configuration Issues Only 🔧

Configuration/billing issues preventing testing - NOT technical limitations

These providers are currently limited to 2/19 tests passing due to configuration or billing issues, not technical capabilities. With proper setup, they are expected to achieve much higher compatibility scores.

Provider Score Issue Type Fix Required Expected Score After Fix
Anthropic 2/19 (10.5%) 💳 Billing Add API credits 90%+ (full multimodal)
Bedrock 2/19 (10.5%) 🔑 Credentials Fix AWS token 70%+ (model-dependent)
Hugging Face 2/19 (10.5%) 💳 Billing Add payment method 60%+ (model-dependent)
SageMaker 2/19 (10.5%) 🔑 Credentials Fix AWS token 60%+ (model-dependent)

Anthropic (Claude) - API Credit Exhaustion

Error:

APICallError: Your credit balance is too low to access the Anthropic API.
Please go to Plans & Billing to upgrade or purchase credits.

Status: All 17 test failures are due to insufficient API credits, NOT technical limitations.

Passing Tests (2/19):

  • ✅ CLI Stream CSV and Screenshot (skipped - no fixture available)
  • ✅ Enterprise Proxy Support (no API call required)

Expected Capability: Anthropic Claude models (3.5 Sonnet, 3.7 Sonnet) support multimodal content including images and PDFs. Expected to achieve 90%+ compatibility once credits are added.

Fix: Add credits at https://console.anthropic.com/settings/plans

AWS Bedrock - Credential Issue

Error:

BedrockServiceException: The security token included in the request is invalid
Region: ap-south-1

Status: AWS credentials are invalid or expired.

Passing Tests (2/19):

  • ✅ CLI Stream CSV and Screenshot (skipped)
  • ✅ Enterprise Proxy Support

Expected Capability: Bedrock provides access to multiple foundation models (Claude, Llama, Titan) and should support multimodal features once credentials are configured. Expected 70%+ compatibility (varies by model).

Fix:

# Check current credentials
aws sts get-caller-identity

# Configure valid credentials
aws configure

Hugging Face - Payment Required

Error:

APICallError: Payment Required

Status: Payment/billing configuration needed.

Passing Tests (2/19):

  • ✅ CLI Stream CSV and Screenshot (skipped)
  • ✅ Enterprise Proxy Support

Expected Capability: Hugging Face provides access to open-source models via inference endpoints. Multimodal support depends on selected model. Expected 60%+ compatibility after billing setup.

Fix: Add payment method to Hugging Face account

AWS SageMaker - Credential Issue

Error:

SageMaker endpoint invocation failed: The security token included in the request is invalid

Status: AWS credentials are invalid or expired (same as Bedrock).

Passing Tests (2/19):

  • ✅ CLI Stream CSV and Screenshot (skipped)
  • ✅ Enterprise Proxy Support

Expected Capability: SageMaker allows deployment of custom models. Feature support depends on the deployed model. Expected 60%+ compatibility after credential fix.

Fix: Update AWS credentials (same as Bedrock)


Technical Limitations Summary

Azure OpenAI

  • Limitation: Model does not support file content type for PDFs
  • Impact: Cannot process PDF documents natively
  • Workaround: Extract text from PDFs before sending to Azure, or use fallback provider
  • Affected Features: All PDF processing (6 tests)

Mistral

  • Limitation: API does not support file content parts in user messages
  • Impact: Cannot process PDF documents at all
  • Workaround: None available - fundamental API limitation
  • Affected Features: All PDF processing (6 tests), CLI tool integration (2 tests)
  • Reference: See MISTRAL_PDF_FIX_SUMMARY.md for investigation details

Ollama

  • Limitation: Local model performance varies significantly by model
  • Impact: Inconsistent results across different models and operations
  • Workaround: Carefully select models, use for development/testing only
  • Affected Features: Various tests show inconsistent behavior

DeepSeek

  • Limitation: No vision / multimodal support; no PDF support
  • Impact: Cannot process images or documents
  • Workaround: Use OpenAI or Anthropic for vision/PDF workflows; DeepSeek for text-only tasks
  • Affected Features: All PDF tests (6/6), image processing

NVIDIA NIM

  • Limitation: Tool calling and vision are model-dependent; no PDF support
  • Impact: Not all hosted models support tools or vision
  • Workaround: Choose a tool-capable model (e.g., Llama 3.3 70B Instruct); use a vision-capable model for multimodal tasks
  • Affected Features: Model-dependent — check https://build.nvidia.com/models for capabilities per model

LM Studio

  • Limitation: All capabilities depend on the currently loaded model; requires LM Studio app running
  • Impact: ECONNREFUSED error if app is not started or no model is loaded
  • Workaround: Start LM Studio, load a model, click "Start Server"
  • Affected Features: CSV/PDF/tool support all model-dependent; structured output reliability varies on small models

llama.cpp

  • Limitation: Tool calling requires --jinja server flag; all capabilities depend on loaded GGUF model; requires llama-server process running
  • Impact: 400 error on tool calls if server was not started with --jinja; ECONNREFUSED if server is not running
  • Workaround: Start llama-server with: ./llama-server -m model.gguf --port 8080 --jinja
  • Affected Features: Tool support model + flag dependent; structured output reliability varies on small quantized models

Production Deployment Recommendations

For Maximum Feature Compatibility (100%)

Recommended Providers:

  • Google AI Studio - Best for: Speed, free tier, prototyping
  • Vertex AI - Best for: Enterprise, GCP integration, SLA requirements
  • OpenAI - Best for: Proven stability, ecosystem integration
  • LiteLLM - Best for: Multi-provider routing, 100+ model access

All features available:

  • ✅ CSV data analysis
  • ✅ PDF document processing
  • ✅ Image analysis
  • ✅ MCP external tool integration
  • ✅ Custom business tools
  • ✅ Enterprise proxy support

For CSV + Tools (No PDFs Required)

Recommended Providers:

  • Azure OpenAI - Best for: Microsoft ecosystem, enterprise security, Azure integration

Features available:

  • ✅ CSV data analysis (68% compatibility)
  • ✅ MCP external tools
  • ✅ Custom business tools
  • ✅ Enterprise features
  • ❌ PDF processing (use fallback provider)

Fallback Strategy:

// Primary provider for CSV and tools
const primaryProvider = "azure";

// Fallback to Vertex for PDF processing
const pdfProvider = "vertex";

if (hasPDFFiles(input)) {
  result = await neurolink.generate({ ...options, provider: pdfProvider });
} else {
  result = await neurolink.generate({ ...options, provider: primaryProvider });
}

For Development/Testing

Recommended Providers:

  • Mistral - Best for: CSV-only workflows, European compliance
  • Ollama - Best for: Local development, privacy testing

Use Cases:

  • CSV data analysis only
  • Privacy-critical testing
  • Local development without cloud dependencies
  • Experimentation with different models

Not Recommended For:

  • Production deployments
  • PDF processing requirements
  • Critical business workflows

Test Suite Details

Test Categories (19 total tests)

CSV Processing Tests (6 tests)

  1. CLI Generate CSV - Generate mode with CSV input
  2. CLI Stream CSV - Streaming mode with CSV input
  3. CLI Stream Two CSV Comparison - Compare multiple CSV files
  4. CLI Stream CSV and Screenshot - Mixed CSV and image analysis
  5. SDK Generate CSV - SDK generate with CSV
  6. SDK Stream CSV - SDK streaming with CSV

PDF Processing Tests (6 tests)

  1. CLI Generate PDF - Generate mode with PDF input
  2. CLI Stream PDF - Streaming mode with PDF input
  3. CLI Stream Two PDF Comparison - Compare multiple PDF files
  4. CLI Stream PDF and CSV - Mixed PDF and CSV analysis
  5. SDK Generate PDF - SDK generate with PDF
  6. SDK Stream PDF - SDK streaming with PDF

MCP External Tools Tests (4 tests)

  1. CLI Generate - External MCP tools via CLI generate
  2. CLI Stream - External MCP tools via CLI stream
  3. SDK Generate - External MCP tools via SDK generate
  4. SDK Stream - External MCP tools via SDK stream

Business Tools Tests (2 tests)

  1. SDK Business Tools - Custom tool registration and execution
  2. CLI Business Tools - Custom tools via CLI interface

Enterprise Features Tests (1 test)

  1. Enterprise Proxy Support - Proxy configuration and environment handling

Test Execution

Sequential Execution: Tests run one provider at a time to avoid resource contention and rate limit issues.

Rate Limiting:

  • OpenAI: 60-second delay between tests (30,000 TPM limit)
  • Other providers: 10-second delay between tests

Total Duration: Approximately 30-40 minutes for all 11 providers


Configuration Fixes Needed

Immediate Actions Required

  1. Anthropic: Add API credits

  2. Bedrock: Fix AWS credentials

    aws configure
    # Or update AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
    • Expected improvement: 2/19 → 13+/19 (70%+)
  3. SageMaker: Fix AWS credentials (same as Bedrock)

    • Expected improvement: 2/19 → 11+/19 (60%+)
  4. Hugging Face: Add payment method

No Fix Available

  1. Azure OpenAI: PDF limitation is a model architecture constraint

    • Recommendation: Use for CSV and tools, fallback to Vertex/Google AI Studio for PDFs
  2. Mistral: PDF limitation is a fundamental API constraint

    • Recommendation: Use for CSV-only workflows in SDK mode

Test Logs

All test logs are available in /tmp/neurolink-sequential-tests/:

  • test-openai.log - OpenAI 19/19 (100%)
  • test-vertex.log - Vertex 19/19 (100%)
  • test-google-ai-studio.log - Google AI Studio 19/19 (100%)
  • test-litellm.log - LiteLLM 19/19 (100%)
  • test-azure.log - Azure 13/19 (68%)
  • test-mistral.log - Mistral 9/19 (47%)
  • test-ollama.log - Ollama 7/19 (37%)
  • test-anthropic.log - Anthropic 2/19 (billing issue)
  • test-bedrock.log - Bedrock 2/19 (credential issue)
  • test-huggingface.log - Hugging Face 2/19 (billing issue)
  • test-sagemaker.log - SageMaker 2/19 (credential issue)

Recent Fixes and Improvements

Fix 1: File Handling System Prompt Enhancement (2025-11-02)

Providers affected: OpenAI, Vertex AI Issue: AI attempting to use GitHub MCP get_file_contents for local files Root Cause: File paths visible in context, AI confused about tool usage

Solution: Enhanced system prompt in src/lib/utils/messageBuilder.ts (lines 622-657) with file handling guidance:

if (hasCSVFiles || hasPDFFiles) {
  systemPrompt += `\n\nIMPORTANT FILE HANDLING INSTRUCTIONS:
- File content (${fileTypes.join(", ")}, images) is already processed and included in this message
- DO NOT use GitHub tools (get_file_contents, search_code, etc.) for local files
- Analyze the provided file content directly without attempting to fetch files
- GitHub MCP tools are ONLY for remote repository operations
- Use the file content shown in this message for your analysis`;
}

Result:

  • OpenAI: 18/19 → 19/19 (100%)
  • Vertex: CLI Stream PDF and CSV test passing

Fix 2: Case-Insensitive Test Validation (2025-11-02)

Provider affected: Vertex AI Issue: Test expecting "strict" but Vertex responding "Strict mode" Root Cause: Case-sensitive string matching with provider-specific capitalization

Solution: Case-insensitive comparison in test/continuous-test-suite.ts (lines 801-806):

// Before
const foundData = expectedData.filter((data) => result.content.includes(data));

// After
const contentLower = result.content.toLowerCase();
const foundData = expectedData.filter((data) =>
  contentLower.includes(data.toLowerCase()),
);

Result: Vertex: 18/19 → 19/19 (100%)


Conclusion

Primary Achievement:4 providers at 100% compatibility

The comprehensive testing reveals a mature ecosystem with multiple production-ready providers. Most "failures" are configuration/billing issues rather than technical limitations.

Key Insights:

  1. Production-Ready Options: 4 providers (Google AI Studio, Vertex AI, OpenAI, LiteLLM) provide full feature support
  2. Partial Support is Useful: Azure OpenAI at 68% is excellent for non-PDF workloads
  3. Technical Limitations are Clear: Only Azure and Mistral have actual feature limitations
  4. Configuration is Key: 4 providers need credential/billing fixes, not code changes

Next Steps for Users:

  1. For new projects: Start with Google AI Studio (free tier) or Vertex AI (enterprise)
  2. For existing Azure users: Use Azure for CSV/tools, add Vertex fallback for PDFs
  3. For cost optimization: Implement LiteLLM routing across multiple providers
  4. For privacy: Use Ollama for local development and testing

Maintenance:

  • Re-run test suite after provider API updates
  • Monitor provider changelog for new feature releases
  • Update this document quarterly or when adding new providers