---
title: ChatOCIGenAI integration
description: Integrate with ChatOCIGenAI chat model using LangChain Python.
---

This doc will help you get started with Oracle Cloud Infrastructure (OCI) Generative AI chat models. OCI Generative AI is a fully managed service providing state-of-the-art, customizable large language models covering a wide range of use cases through a single API. Access ready-to-use pretrained models or create and host fine-tuned custom models on dedicated AI clusters.

For detailed documentation, see the OCI Generative AI documentation and API reference.

## Overview

### Integration details

| Class | Package | Serializable | JS support | Downloads | Version |
| :--- | :--- | :---: | :---: | :---: | :---: |
| ChatOCIGenAI | langchain-oci | beta | | PyPI - Downloads | PyPI - Version |

### Model features

| Tool calling | Structured output | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ✅ | ✅ | ✅ | ✅ (Gemini) | ✅ (Gemini) | ✅ | ✅ | | |

## Setup

### Installation

```bash
pip install -qU langchain-oci oci
```

Or with uv:

```bash
uv add langchain-oci oci
```

### Credentials

Set up authentication with the OCI CLI (creates `~/.oci/config`):

```bash
oci setup config
```

For other auth methods (session tokens, instance principals), see OCI SDK authentication.
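For reference, a generated `~/.oci/config` has roughly this shape (placeholder values shown; a real file contains your user and tenancy OCIDs, key fingerprint, and key path):

```ini
[DEFAULT]
user=ocid1.user.oc1..<unique_ID>
fingerprint=<your_key_fingerprint>
key_file=~/.oci/oci_api_key.pem
tenancy=ocid1.tenancy.oc1..<unique_ID>
region=us-chicago-1
```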

## Instantiation

```python
from langchain_oci import ChatOCIGenAI

llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
    model_kwargs={"temperature": 0.7, "max_tokens": 500},  # Optional
)
```

Key parameters:

- `model_id` - The model to use (see available models)
- `service_endpoint` - Regional endpoint (`us-chicago-1`, `eu-frankfurt-1`, etc.)
- `compartment_id` - Your OCI compartment OCID
- `model_kwargs` - Model settings like `temperature`, `max_tokens`

## Invocation

```python
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
print(ai_msg.content)
```

```text
J'adore la programmation.
```

## Multi-turn Conversations

```python
from langchain.messages import HumanMessage, AIMessage

messages = [
    HumanMessage(content="Hi, I'm Alice."),
    AIMessage(content="Hello Alice! How can I help you today?"),
    HumanMessage(content="What's my name?"),
]

response = llm.invoke(messages)
print(response.content)
```

```text
Your name is Alice.
```

## Streaming

Get responses as they're generated:

```python
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
```

## Async

Use async for concurrent requests or non-blocking applications:

```python
import asyncio

# Run these inside an async context (e.g. a notebook or an `async def`)

# Async generation
response = await llm.ainvoke("What is 2+2?")

# Async streaming
async for chunk in llm.astream("Tell me a story"):
    print(chunk.content, end="")

# Run multiple requests concurrently
results = await asyncio.gather(
    llm.ainvoke("What is 2+2?"),
    llm.ainvoke("What is 3+3?"),
)
```

## Tool Calling

Give models access to external functions (APIs, databases, etc.):

```python
from langchain.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get the weather for a city."""
    # In production, call a weather API
    return f"Weather in {city}: 72°F, sunny"

llm_with_tools = llm.bind_tools([get_weather])
response = llm_with_tools.invoke("What's the weather in Chicago?")

# Model decides to call the tool
print(response.tool_calls)
# [{'name': 'get_weather', 'args': {'city': 'Chicago'}, 'id': 'call_1'}]
```
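Inspecting `tool_calls` is only half the loop: your code must execute each call and send the results back to the model (in LangChain, as `ToolMessage`s appended to the conversation). A minimal stdlib sketch of the dispatch step, using plain dicts in the same shape as the `tool_calls` printed above:

```python
def get_weather(city: str) -> str:
    """Stubbed tool; in production, call a weather API."""
    return f"Weather in {city}: 72°F, sunny"

# Registry mapping tool names (as the model emits them) to functions
TOOLS = {"get_weather": get_weather}

def run_tool_calls(tool_calls):
    """Execute each tool call and pair its result with the call id,
    so the results can be routed back to the model."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["name"]]
        results.append({"tool_call_id": call["id"],
                        "content": fn(**call["args"])})
    return results

calls = [{"name": "get_weather", "args": {"city": "Chicago"}, "id": "call_1"}]
print(run_tool_calls(calls))
# [{'tool_call_id': 'call_1', 'content': 'Weather in Chicago: 72°F, sunny'}]
```

In a real agent loop you would wrap each result in a `ToolMessage(content=..., tool_call_id=...)`, append it to `messages`, and invoke the model again for the final answer.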

Parallel tools (Llama 4+ only) execute multiple tools simultaneously:

```python
llm = ChatOCIGenAI(model_id="meta.llama-4-scout-17b-16e-instruct", ...)
llm_with_tools = llm.bind_tools(
    [get_weather, get_time],  # get_time is another @tool, defined like get_weather
    parallel_tool_calls=True,
)
# "Weather in Chicago and time in NYC?" → calls both tools at once
```

## Structured Output

Extract data into Pydantic models for type-safe parsing:

```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    email: str

structured_llm = llm.with_structured_output(Person)
result = structured_llm.invoke("John is 30 years old, email john@example.com")

print(result.name)   # "John"
print(result.age)    # 30
print(result.email)  # "john@example.com"
```

## Vision & Multimodal

Analyze images with vision-capable models:

```python
from langchain.messages import HumanMessage
from langchain_oci import ChatOCIGenAI, load_image

llm = ChatOCIGenAI(model_id="meta.llama-3.2-90b-vision-instruct", ...)

message = HumanMessage(content=[
    {"type": "text", "text": "What's in this image?"},
    load_image("./photo.jpg"),  # Or use a URL
])

response = llm.invoke([message])
```

Vision-capable models: Llama 3.2 Vision, Gemini 2.0/2.5, Grok 4, Command A Vision

### Gemini Multimodal (PDF, Video, Audio)

Gemini models process PDFs, videos, and audio:

```python
import base64
from langchain.messages import HumanMessage

llm = ChatOCIGenAI(model_id="google.gemini-2.5-flash", ...)

# Load file as base64
with open("document.pdf", "rb") as f:
    data = base64.b64encode(f.read()).decode()

message = HumanMessage(content=[
    {"type": "text", "text": "Summarize this document"},
    {"type": "media", "data": data, "mime_type": "application/pdf"},
])

response = llm.invoke([message])
```

Supported formats: PDF, MP4/MOV video, MP3/WAV audio (Gemini 2.0/2.5 only)
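Since each media content part needs an explicit `mime_type`, a small helper can build the part from a local file, guessing the type with the stdlib `mimetypes` module. The `media_part` name is illustrative, not part of `langchain-oci`; the dict shape mirrors the example above:

```python
import base64
import mimetypes

def media_part(path: str) -> dict:
    """Build a Gemini media content part: base64 data + guessed MIME type."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        raise ValueError(f"Cannot guess MIME type for {path}")
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode()
    return {"type": "media", "data": data, "mime_type": mime}

print(mimetypes.guess_type("report.pdf")[0])  # application/pdf
print(mimetypes.guess_type("clip.mp4")[0])    # video/mp4
```

A `HumanMessage` content list can then combine a text part with `media_part("document.pdf")` exactly as in the example above.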

## Configuration

Control model behavior with model_kwargs:

```python
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    model_kwargs={
        "temperature": 0.7,  # Creativity (0-1)
        "max_tokens": 500,   # Response length limit
        "top_p": 0.9,        # Nucleus sampling
    },
    # ... other params
)
```

## Available Models

| Provider | Example Models | Key Features |
| :--- | :--- | :--- |
| Meta | Llama 3.2/3.3/4 (Scout, Maverick) | Vision, parallel tools |
| Google | Gemini 2.0/2.5 Flash, Pro | PDF, video, audio |
| xAI | Grok 3, Grok 4 | Vision, reasoning |
| Cohere | Command R+, Command A | RAG, vision |

See the OCI model catalog for the complete list and regional availability.

## API Reference

For detailed documentation of all ChatOCIGenAI features and configurations, see the API reference.

## Related