Azure-Samples
diff --git a/.env.template b/.env.template
Lines changed: 13 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 75 additions & 5 deletions
diff --git a/api_documentation.md b/api_documentation.md
Lines changed: 61 additions & 2 deletions
diff --git a/demo/mistral-dataset/output_schema.json b/demo/mistral-dataset/output_schema.json
Lines changed: 46 additions & 0 deletions
diff --git a/demo/mistral-dataset/system_prompt.txt b/demo/mistral-dataset/system_prompt.txt
Lines changed: 19 additions & 0 deletions
diff --git a/frontend/process_files.py b/frontend/process_files.py
Lines changed: 3 additions & 3 deletions
@@ -18,6 +18,19 @@ AZURE_OPENAI_ENDPOINT=https://your-openai-account.openai.azure.com/
 AZURE_OPENAI_KEY=your-openai-api-key
 AZURE_OPENAI_MODEL_DEPLOYMENT_NAME=gpt-4
 
+# OCR Provider Configuration
+# Choose which OCR provider to use: "azure" or "mistral" (default: azure)
+OCR_PROVIDER=azure
+
+# Azure Document Intelligence Configuration (for OCR_PROVIDER=azure)
+DOCUMENT_INTELLIGENCE_ENDPOINT=https://your-doc-intelligence.cognitiveservices.azure.com/
+
+# Mistral Document AI Configuration (for OCR_PROVIDER=mistral)
+# Only required if OCR_PROVIDER is set to "mistral"
+MISTRAL_DOC_AI_ENDPOINT=https://your-endpoint.services.ai.azure.com/providers/mistral/azure/ocr
+MISTRAL_DOC_AI_KEY=your-mistral-api-key
+MISTRAL_DOC_AI_MODEL=mistral-document-ai-2505
+
 # To get your Principal ID, run:
 # az ad signed-in-user show --query id --output tsv
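
A minimal sketch of how a backend could read the OCR settings above and fail fast when the selected provider is missing its variables. The helper `load_ocr_settings` and the `OcrSettings` dataclass are hypothetical names used for illustration, not part of this change; only the variable names and the `azure` default come from the template.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

SUPPORTED_PROVIDERS = ("azure", "mistral")


@dataclass(frozen=True)
class OcrSettings:
    """Hypothetical container for the resolved OCR configuration."""
    provider: str
    endpoint: str
    api_key: Optional[str] = None  # only populated for the Mistral provider
    model: Optional[str] = None    # only populated for the Mistral provider


def load_ocr_settings(env: Mapping[str, str] = os.environ) -> OcrSettings:
    """Build OcrSettings from environment-style variables (illustrative helper)."""
    # The template documents "azure" as the default provider.
    provider = env.get("OCR_PROVIDER", "azure").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported OCR_PROVIDER: {provider!r}")

    if provider == "azure":
        required = ["DOCUMENT_INTELLIGENCE_ENDPOINT"]
    else:
        required = ["MISTRAL_DOC_AI_ENDPOINT", "MISTRAL_DOC_AI_KEY"]

    # Report every missing variable at once instead of failing one by one.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing settings for {provider!r}: {', '.join(missing)}")

    if provider == "azure":
        return OcrSettings(provider, env["DOCUMENT_INTELLIGENCE_ENDPOINT"])
    return OcrSettings(
        provider,
        env["MISTRAL_DOC_AI_ENDPOINT"],
        api_key=env["MISTRAL_DOC_AI_KEY"],
        model=env.get("MISTRAL_DOC_AI_MODEL", "mistral-document-ai-2505"),
    )
```

Passing the mapping as a parameter keeps the helper testable without touching the real process environment; calling `load_ocr_settings()` with no arguments reads `os.environ`.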
 
 
@@ -35,6 +35,7 @@ Traditional OCR solutions extract text but miss the context. AI-only approaches
 
 ### 🔍 **Intelligent Document Understanding**
 - **Hybrid AI Pipeline**: Combines OCR precision with LLM reasoning
+- **Multiple OCR Providers**: Azure Document Intelligence or Mistral Document AI
 - **Context-Aware Extraction**: Understands relationships between data points
 - **Multi-Format Support**: PDFs, images, forms, invoices, medical records
 - **Zero-Shot Learning**: Works on new document types without training
@@ -81,9 +82,12 @@ graph TB
     
     subgraph "🧠 AI Processing Engine"
         B --> D
-        D --> E[🔍 Azure Document Intelligence]
+        D --> E{🔍 OCR Provider}
+        E -->|Azure| E1[Azure Document Intelligence]
+        E -->|Mistral| E2[Mistral Document AI]
         D --> F[🤖 GPT-4 Vision]
-        E --> G[⚙️ Hybrid Processing Pipeline]
+        E1 --> G[⚙️ Hybrid Processing Pipeline]
+        E2 --> G
         F --> G
     end
     
@@ -105,6 +109,8 @@ graph TB
     style C fill:#e8f5e8,stroke:#388e3c,stroke-width:2px
     style D fill:#fff3e0,stroke:#f57c00,stroke-width:2px
     style E fill:#fce4ec,stroke:#c2185b,stroke-width:2px
+    style E1 fill:#fce4ec,stroke:#c2185b,stroke-width:2px
+    style E2 fill:#fce4ec,stroke:#c2185b,stroke-width:2px
     style F fill:#e0f2f1,stroke:#00695c,stroke-width:2px
     style G fill:#fff8e1,stroke:#ffa000,stroke-width:2px
     style H fill:#f1f8e9,stroke:#558b2f,stroke-width:2px
@@ -124,7 +130,7 @@ graph TB
 | **📱 Frontend UI** | Streamlit (Optional) | Interactive document management interface |
 | **📁 Document Storage** | Azure Blob Storage | Secure, scalable document repository |
 | **🗄️ Metadata Database** | Azure Cosmos DB | Results, configurations, and analytics |
-| **🔍 OCR Engine** | Azure Document Intelligence | Structured text and layout extraction |
+| **🔍 OCR Engine** | Azure Document Intelligence or Mistral Document AI | Structured text and layout extraction |
 | **🧠 AI Reasoning** | Azure OpenAI (GPT-4 Vision) | Contextual understanding and extraction |
 | **🏗️ Container Registry** | Azure Container Registry | Private, secure container images |
 | **🔒 Security** | Managed Identity + RBAC | Zero-credential architecture |
@@ -299,7 +305,71 @@ Datasets are managed through the Streamlit frontend interface (deployed automati
 
 ---
 
-## 🖥️ Frontend Interface: User-Friendly Document Management
+### OCR Provider Configuration
+
+ARGUS supports **two OCR providers** for document text extraction:
+
+- **Azure Document Intelligence** (Default): Microsoft's enterprise OCR service with advanced layout understanding
+- **Mistral Document AI**: Mistral's document processing service with markdown-optimized output
+
+<details>
+<summary><b>🔧 Configure OCR Provider</b></summary>
+
+**Via Frontend (Recommended)**:
+1. Navigate to the **Settings** tab in the web interface
+2. Select the **OCR Provider** section
+3. Choose your provider:
+   - **Azure**: Uses Azure Document Intelligence (automatically configured during deployment)
+   - **Mistral**: Requires additional configuration (endpoint, API key, model name)
+4. For Mistral, enter:
+   - **Mistral Endpoint**: Your Mistral Document AI API endpoint URL
+   - **Mistral API Key**: Your Mistral API authentication key
+   - **Mistral Model**: Model name (default: `mistral-document-ai-2505`)
+5. Click **"Update OCR Provider"** to apply changes
+
+**Via Environment Variables**:
+Set the following environment variables in your deployment:
+
+```bash
+# Choose OCR provider
+OCR_PROVIDER=mistral  # or "azure" (default)
+
+# Mistral-specific configuration (only needed if OCR_PROVIDER=mistral)
+MISTRAL_DOC_AI_ENDPOINT=https://your-endpoint.services.ai.azure.com/providers/mistral/azure/ocr
+MISTRAL_DOC_AI_KEY=your-mistral-api-key
+MISTRAL_DOC_AI_MODEL=mistral-document-ai-2505
+```
+
+**Update via Azure Portal**:
+1. Navigate to Azure Portal → Container Apps → Your Backend App
+2. Go to **Settings** → **Environment variables**
+3. Add/update the variables listed above
+4. **Restart** the container app
+
+**Update via Azure CLI**:
+```bash
+# Switch to Mistral
+az containerapp update \
+  --name <your-backend-app-name> \
+  --resource-group <your-resource-group> \
+  --set-env-vars \
+    OCR_PROVIDER="mistral" \
+    MISTRAL_DOC_AI_ENDPOINT="https://your-endpoint.../ocr" \
+    MISTRAL_DOC_AI_KEY="your-api-key" \
+    MISTRAL_DOC_AI_MODEL="mistral-document-ai-2505"
+
+# Switch back to Azure
+az containerapp update \
+  --name <your-backend-app-name> \
+  --resource-group <your-resource-group> \
+  --set-env-vars OCR_PROVIDER="azure"
+```
+
+**Note**: OCR provider selection is configured at the solution level and applies to all document processing operations.
+
+</details>
+
+---
 
 The Streamlit frontend is **automatically deployed** with `azd up` and provides a user-friendly interface for document management.
 
@@ -677,7 +747,7 @@ Contributors will be recognized in:
 | Resource | Description | Link |
 |----------|-------------|------|
 | **📚 Documentation** | Complete setup and usage guides | [docs/](docs/) |
-| **🐛 Issue Tracker** | Bug reports and feature requests | [GitHub Issues](https://github.com/Azure-Samples/ARGUS/issues) |
+| **🐛 Issue Tracker** | Bug reports and feature requests | [GitHub Issues](https://github.com/Azure-Samples/ARGUS/issues) |
 | **💡 Discussions** | Community Q&A and ideas | [GitHub Discussions](https://github.com/Azure-Samples/ARGUS/discussions) |
 | **📧 Team Contact** | Direct contact for enterprise needs | See team section below |
 
 
@@ -154,11 +154,21 @@ Retrieve current system configuration from Cosmos DB including datasets, prompts
           "invoice_number": {"type": "string"},
           "total_amount": {"type": "number"}
         }
+      },
+      "processing_options": {
+        "include_ocr": true,
+        "include_images": true,
+        "enable_summary": true,
+        "enable_evaluation": true,
+        "ocr_provider": "azure"
       }
     },
     "medical-dataset": {
       "system_prompt": "Extract medical information...",
-      "output_schema": {...}
+      "output_schema": {...},
+      "processing_options": {
+        "ocr_provider": "mistral"
+      }
     }
   }
 }
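
A minimal sketch of how a consumer of this configuration document could resolve the effective `processing_options` for one dataset. The function name `resolve_processing_options` is hypothetical. The `True` defaults for `include_ocr` and `include_images` match the frontend's `processing_options.get(..., True)` calls, and `azure` is the documented default provider; the defaults for `enable_summary` and `enable_evaluation` are assumptions. The fallback to OCR when both inputs are disabled mirrors the auto-correction in `frontend/process_files.py`.

```python
from typing import Any, Dict

# Defaults for enable_summary and enable_evaluation are assumed, not confirmed.
DEFAULT_PROCESSING_OPTIONS: Dict[str, Any] = {
    "include_ocr": True,
    "include_images": True,
    "enable_summary": True,
    "enable_evaluation": True,
    "ocr_provider": "azure",
}

VALID_OCR_PROVIDERS = ("azure", "mistral")


def resolve_processing_options(config: Dict[str, Any], dataset: str) -> Dict[str, Any]:
    """Merge a dataset's processing_options over the defaults (illustrative helper)."""
    datasets = config.get("datasets", {})
    if dataset not in datasets:
        raise KeyError(f"Unknown dataset: {dataset!r}")

    # Dataset-level values override the defaults key by key.
    overrides = datasets[dataset].get("processing_options", {})
    options = {**DEFAULT_PROCESSING_OPTIONS, **overrides}

    if options["ocr_provider"] not in VALID_OCR_PROVIDERS:
        raise ValueError(f"Unsupported ocr_provider: {options['ocr_provider']!r}")

    # Extraction needs at least one input; fall back to OCR like the frontend does.
    if not (options["include_ocr"] or options["include_images"]):
        options["include_ocr"] = True
    return options


example_config = {
    "datasets": {
        "medical-dataset": {"processing_options": {"ocr_provider": "mistral"}},
    }
}
print(resolve_processing_options(example_config, "medical-dataset")["ocr_provider"])
```

With the example document above, the call prints `mistral`, while every option the dataset omits keeps its default.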
@@ -769,12 +779,61 @@ For production file uploads, you need to:
       "system_prompt": "string",
       "output_schema": "object",
       "max_pages": "number",
-      "options": "object"
+      "processing_options": {
+        "include_ocr": "boolean",
+        "include_images": "boolean",
+        "enable_summary": "boolean",
+        "enable_evaluation": "boolean",
+        "ocr_provider": "string (azure|mistral)"
+      }
     }
   }
 }
 ```
 
+### OCR Provider Configuration
+
+ARGUS supports two OCR providers for document text extraction:
+
+1. **Azure Document Intelligence** (default)
+   - Uses Azure's Document Intelligence service
+   - Requires `DOCUMENT_INTELLIGENCE_ENDPOINT` environment variable
+   - Configured with `"ocr_provider": "azure"`
+
+2. **Mistral Document AI** (alternative)
+   - Uses Mistral's Document AI API
+   - Requires `MISTRAL_DOC_AI_ENDPOINT` and `MISTRAL_DOC_AI_KEY` environment variables
+   - Configured with `"ocr_provider": "mistral"`
+   - Supports base64-encoded PDFs and images
+   - Can use structured extraction with bbox annotation
+
+**Example Configuration with Mistral:**
+```json
+{
+  "id": "configuration",
+  "partitionKey": "configuration",
+  "datasets": {
+    "medical-dataset": {
+      "system_prompt": "Extract medical information...",
+      "output_schema": {...},
+      "processing_options": {
+        "include_ocr": true,
+        "include_images": true,
+        "enable_summary": true,
+        "enable_evaluation": true,
+        "ocr_provider": "mistral"
+      }
+    }
+  }
+}
+```
+
+**Environment Variables Required for Mistral:**
+```bash
+MISTRAL_DOC_AI_ENDPOINT=https://your-endpoint.services.ai.azure.com/providers/mistral/azure/ocr
+MISTRAL_DOC_AI_KEY=your-mistral-api-key
+```
+
 ### Event Grid Event Model
 ```json
 {
 
@@ -0,0 +1,46 @@
+{
+    "Customer Name": "",
+    "Invoice Number": "",
+    "Date": "",
+    "Billing info": {
+        "Customer": "",
+        "Customer ID": "",
+        "Address": "",
+        "Phone": ""
+    },
+    "Payment Due": "",
+    "Salesperson": "",
+    "Payment Terms": "",
+    "Shipping info": {
+        "Recipient": "",
+        "Address": "",
+        "Phone": ""
+    },
+    "Delivery Date": "",
+    "Shipping Method": "",
+    "Shipping Terms": "",
+    "Table": {
+        "Items": [
+            {
+                "Qty": "",
+                "Item#": "",
+                "Description": "",
+                "Unit price": "",
+                "Discount": "",
+                "Line total": ""
+            }
+        ],
+        "Total Discount": "",
+        "Subtotal": "",
+        "Sales Tax": "",
+        "Total": ""
+    },
+    "Footer": {
+        "Customer Name": "",
+        "Address": "",
+        "Website": "",
+        "Phone number": "",
+        "Fax number": "",
+        "Email": ""
+    }
+}
@@ -0,0 +1,19 @@
+You are an expert document processing assistant. Your task is to extract structured information from invoices.
+
+Carefully analyze the provided document and extract all relevant information according to the schema provided.
+
+For each field:
+- Extract the exact text as it appears in the document
+- If a field is not present, leave it as an empty string
+- For numerical values, extract them exactly as shown (including currency symbols if present)
+- For dates, preserve the original format
+- For tables, extract all rows of items
+
+Pay special attention to:
+1. Invoice number and date
+2. Billing and shipping addresses
+3. All line items in the table
+4. Total amounts and taxes
+5. Contact information in the footer
+
+Be thorough and accurate in your extraction.
@@ -196,20 +196,20 @@ def process_files_tab():
 
             with col_a:
                 include_ocr = st.checkbox(
-                    "📄 Run OCR and use it in GPT Extraction", 
+                    "📄 Run OCR Processing", 
                     value=processing_options.get("include_ocr", True),
                     help="Extract and analyze the text content from your documents using Optical Character Recognition (OCR). This captures all written information including tables, forms, and structured data. Essential for text-heavy documents like contracts, invoices, and reports. When enabled, the AI can understand and extract information from the document's text content."
                 )
 
                 include_images = st.checkbox(
-                    "🖼️ Split in Images and use them in GPT Extraction", 
+                    "🖼️ Run GPT Vision", 
                     value=processing_options.get("include_images", True),
                     help="Process document pages as images so the AI can visually understand layouts, charts, diagrams, handwritten notes, and visual elements that OCR might miss. This is particularly valuable for forms with checkboxes, complex layouts, signatures, charts, or documents where visual context matters. Combines with OCR for the most comprehensive analysis."
                 )
 
             # Validation: Ensure at least one of OCR or Images is enabled
             if not include_ocr and not include_images:
-                st.error("⚠️ **Validation Error**: You must enable at least one of 'Include OCR Text' or 'Include Images' for GPT extraction to work properly.")
+                st.error("⚠️ **Validation Error**: You must enable at least one of 'Run OCR Processing' or 'Run GPT Vision' for GPT extraction to work properly.")
                 # Force at least one to be true
                 include_ocr = True
-                st.warning("🔧 **Auto-correction**: Automatically re-enabled 'Include OCR Text' to ensure proper functionality.")
+                st.warning("🔧 **Auto-correction**: Automatically re-enabled 'Run OCR Processing' to ensure proper functionality.")