| title | ChatOCIGenAI integration |
|---|---|
| description | Integrate with ChatOCIGenAI chat model using LangChain Python. |
This doc will help you get started with Oracle Cloud Infrastructure (OCI) Generative AI chat models. OCI Generative AI is a fully managed service providing state-of-the-art, customizable large language models covering a wide range of use cases through a single API. Access ready-to-use pretrained models or create and host fine-tuned custom models on dedicated AI clusters.
For detailed documentation, see the OCI Generative AI documentation and API reference.
| Class | Package | Serializable | JS support | Downloads | Version |
|---|---|---|---|---|---|
| ChatOCIGenAI | langchain-oci | beta | ❌ | | |
| Tool calling | Structured output | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
|---|---|---|---|---|---|---|---|---|
| ✅ | ✅ | ✅ | ✅ (Gemini) | ✅ (Gemini) | ✅ | ✅ | ✅ | ❌ |
Install the `langchain-oci` and `oci` packages:

```bash
uv add langchain-oci oci
```

Set up authentication with the OCI CLI (creates `~/.oci/config`):

```bash
oci setup config
```

For other auth methods (session tokens, instance principals), see OCI SDK authentication.
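For example, instance principal auth on an OCI compute host can be selected at construction time. This is a sketch: the `auth_type` and `auth_profile` parameters are assumptions carried over from the langchain-community version of this class, so verify them against the API reference for your installed `langchain-oci` version.

```python
from langchain_oci import ChatOCIGenAI

# Sketch: auth_type/auth_profile are assumed to mirror the
# langchain-community ChatOCIGenAI parameters; verify before use.
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
    auth_type="INSTANCE_PRINCIPAL",  # or "SECURITY_TOKEN" for session tokens
)
```

With the default API-key auth, `auth_profile` would instead pick a named profile from `~/.oci/config`.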
```python
from langchain_oci import ChatOCIGenAI

llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="ocid1.compartment.oc1..your-compartment-id",
    model_kwargs={"temperature": 0.7, "max_tokens": 500},  # Optional
)
```

Key parameters:

- `model_id` - The model to use (see available models)
- `service_endpoint` - Regional endpoint (`us-chicago-1`, `eu-frankfurt-1`, etc.)
- `compartment_id` - Your OCI compartment OCID
- `model_kwargs` - Model settings like `temperature` and `max_tokens`
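The regional endpoint appears to follow a single pattern. This is an inference from the `us-chicago-1` example above, not a documented guarantee, so confirm the endpoint for your region in the OCI documentation:

```python
def service_endpoint(region: str) -> str:
    # Pattern inferred from the us-chicago-1 example; verify per region.
    return f"https://inference.generativeai.{region}.oci.oraclecloud.com"

print(service_endpoint("eu-frankfurt-1"))
# https://inference.generativeai.eu-frankfurt-1.oci.oraclecloud.com
```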
```python
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
print(ai_msg.content)
```

```text
J'adore la programmation.
```
```python
from langchain.messages import HumanMessage, AIMessage

messages = [
    HumanMessage(content="Hi, I'm Alice."),
    AIMessage(content="Hello Alice! How can I help you today?"),
    HumanMessage(content="What's my name?"),
]
response = llm.invoke(messages)
print(response.content)
```

```text
Your name is Alice.
```
Get responses as they're generated:
```python
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
```

Use async for concurrent requests or non-blocking applications:
```python
import asyncio

# Async generation
response = await llm.ainvoke("What is 2+2?")

# Async streaming
async for chunk in llm.astream("Tell me a story"):
    print(chunk.content, end="")

# Run multiple requests concurrently
results = await asyncio.gather(
    llm.ainvoke("What is 2+2?"),
    llm.ainvoke("What is 3+3?"),
)
```

Give models access to external functions (APIs, databases, etc.):
```python
from langchain.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get the weather for a city."""
    # In production, call a weather API
    return f"Weather in {city}: 72°F, sunny"

llm_with_tools = llm.bind_tools([get_weather])
response = llm_with_tools.invoke("What's the weather in Chicago?")

# Model decides to call the tool
print(response.tool_calls)
# [{'name': 'get_weather', 'args': {'city': 'Chicago'}, 'id': 'call_1'}]
```

Parallel tools (Llama 4+ only) execute multiple tools simultaneously:
```python
llm = ChatOCIGenAI(model_id="meta.llama-4-scout-17b-16e-instruct", ...)

llm_with_tools = llm.bind_tools(
    [get_weather, get_time],  # get_time: another @tool, defined like get_weather
    parallel_tool_calls=True,
)
# "Weather in Chicago and time in NYC?" → calls both tools at once
```

Extract data into Pydantic models for type-safe parsing:
```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    email: str

structured_llm = llm.with_structured_output(Person)
result = structured_llm.invoke("John is 30 years old, email john@example.com")

print(result.name)   # "John"
print(result.age)    # 30
print(result.email)  # "john@example.com"
```

Analyze images with vision-capable models:
```python
from langchain.messages import HumanMessage
from langchain_oci import ChatOCIGenAI, load_image

llm = ChatOCIGenAI(model_id="meta.llama-3.2-90b-vision-instruct", ...)

message = HumanMessage(content=[
    {"type": "text", "text": "What's in this image?"},
    load_image("./photo.jpg"),  # Or use a URL
])
response = llm.invoke([message])
```

Vision-capable models: Llama 3.2 Vision, Gemini 2.0/2.5, Grok 4, Command A Vision.
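If the `load_image` helper doesn't fit your pipeline, the generic LangChain base64 image content block may also work. This is a sketch: the `image_block` helper name is ours, and acceptance of base64 data URLs is an assumption to verify for your chosen model.

```python
import base64

def image_block(path: str) -> dict:
    """Build a base64 image content block in the generic LangChain shape (sketch)."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode()
    return {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{data}"}}

# Hypothetical usage, mirroring the example above:
# message = HumanMessage(content=[
#     {"type": "text", "text": "What's in this image?"},
#     image_block("./photo.jpg"),
# ])
```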
Gemini models process PDFs, videos, and audio:

```python
import base64
from langchain.messages import HumanMessage

llm = ChatOCIGenAI(model_id="google.gemini-2.5-flash", ...)

# Load file as base64
with open("document.pdf", "rb") as f:
    data = base64.b64encode(f.read()).decode()

message = HumanMessage(content=[
    {"type": "text", "text": "Summarize this document"},
    {"type": "media", "data": data, "mime_type": "application/pdf"},
])
response = llm.invoke([message])
```

Supported formats: PDF, MP4/MOV video, MP3/WAV audio (Gemini 2.0/2.5 only).
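The same `{"type": "media"}` block shape applies to video and audio; a small helper (the `media_block` name is ours, not part of the library) keeps the base64 step in one place:

```python
import base64

def media_block(path: str, mime_type: str) -> dict:
    """Wrap a local file as a media content block, per the shape shown above."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode()
    return {"type": "media", "data": data, "mime_type": mime_type}

# e.g. media_block("clip.mp4", "video/mp4") or media_block("note.wav", "audio/wav")
```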
Control model behavior with `model_kwargs`:

```python
llm = ChatOCIGenAI(
    model_id="meta.llama-3.3-70b-instruct",
    model_kwargs={
        "temperature": 0.7,  # Creativity (0-1)
        "max_tokens": 500,   # Response length limit
        "top_p": 0.9,        # Nucleus sampling
    },
    # ... other params
)
```

| Provider | Example Models | Key Features |
|---|---|---|
| Meta | Llama 3.2/3.3/4 (Scout, Maverick) | Vision, parallel tools |
| Google | Gemini 2.0/2.5 Flash, Pro | PDF, video, audio |
| xAI | Grok 3, Grok 4 | Vision, reasoning |
| Cohere | Command R+, Command A | RAG, vision |
See the OCI model catalog for the complete list and regional availability.
For detailed documentation of all ChatOCIGenAI features and configurations, see the API reference.