LLM tool-calling primitives for Temporal activities — define tools once, use with Anthropic or OpenAI.
A Temporal Activity is a function that Temporal monitors and retries automatically on failure. An activity can record progress in its heartbeat details, and Temporal hands the last recorded details to the next attempt; that is the mechanism AgenticSession uses to resume a crashed LLM conversation partway through instead of starting over.
ToolRegistry.runToolLoop works standalone in any function — no Temporal server needed. Add AgenticSession only when you need crash-safe resume inside a Temporal activity.
AgenticSession requires a running Temporal worker — it reads and writes heartbeat state from the active activity context. Use ToolRegistry.runToolLoop standalone for scripts, one-off jobs, or any code that runs outside a Temporal worker.
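To make the split between the plain tool loop and the crash-safe session concrete, here is a stdlib-only simulation of the idea: a loop that dispatches scripted tool calls to handlers and saves a checkpoint after every turn, so a second attempt picks up where the first one died. This is a conceptual sketch, not the library's implementation; `ResumableToolLoopSketch`, `Turn`, `Checkpoint`, and the `crashBeforeTurn` parameter are all hypothetical names invented for illustration, and the in-memory `saved` field merely stands in for heartbeat details.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Conceptual sketch only; none of these names come from temporal-tool-registry. */
class ResumableToolLoopSketch {
    /** One scripted model turn: a tool call, or a final answer when tool is null. */
    record Turn(String tool, String argument, String finalText) {}

    /** Stand-in for heartbeat details: which turn comes next, plus the transcript so far. */
    record Checkpoint(int nextTurn, List<String> transcript) {}

    // Survives a simulated crash, the way heartbeat details survive an activity retry.
    private Checkpoint saved = new Checkpoint(0, List.of());

    /** Runs the script from the last checkpoint; crashBeforeTurn < 0 means never crash. */
    List<String> run(List<Turn> script,
                     Map<String, Function<String, String>> handlers,
                     int crashBeforeTurn) {
        List<String> transcript = new ArrayList<>(saved.transcript());
        for (int i = saved.nextTurn(); i < script.size(); i++) {
            if (i == crashBeforeTurn) {
                throw new IllegalStateException("simulated worker crash");
            }
            Turn turn = script.get(i);
            if (turn.tool() == null) {
                transcript.add("done:" + turn.finalText());
            } else {
                // Dispatch the call and feed the handler's string back as the tool result.
                Function<String, String> handler =
                        handlers.getOrDefault(turn.tool(), arg -> "unknown tool");
                transcript.add("call:" + turn.tool() + "(" + turn.argument() + ")");
                transcript.add("result:" + handler.apply(turn.argument()));
            }
            // Checkpoint after every completed turn.
            saved = new Checkpoint(i + 1, List.copyOf(transcript));
        }
        return transcript;
    }

    public static void main(String[] args) {
        List<String> flagged = new ArrayList<>();
        Map<String, Function<String, String>> handlers = Map.of(
                "flag_issue", arg -> { flagged.add(arg); return "recorded"; });
        List<Turn> script = List.of(
                new Turn("flag_issue", "stale API", null),
                new Turn("flag_issue", "unused import", null),
                new Turn(null, null, "analysis complete"));

        ResumableToolLoopSketch loop = new ResumableToolLoopSketch();
        try {
            loop.run(script, handlers, 1); // first attempt dies before turn 1
        } catch (IllegalStateException e) {
            System.out.println("attempt 1 failed: " + e.getMessage());
        }
        // The retry resumes at turn 1, so "stale API" is flagged only once.
        System.out.println(loop.run(script, handlers, -1));
        System.out.println(flagged);
    }
}
```

Without the checkpoint, the second attempt would replay turn 0 and call the `flag_issue` handler twice for the same finding, which is the duplication that crash-safe resume is meant to avoid.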
New to Temporal? → https://docs.temporal.io/develop
Python or TypeScript user? Those SDKs also ship framework-level integrations (openai_agents, google_adk_agents, langgraph, @temporalio/ai-sdk) for teams already using a specific agent framework. ToolRegistry is the equivalent story for direct Anthropic/OpenAI calls, and shares the same API surface across all six Temporal SDKs.
Build the unmerged branch with JDK 21 or earlier. JDK 25's stricter compiler checks cause spurious failures even though the module targets
--release 11. Example: JAVA_HOME=$JDK21 ./gradlew :temporal-tool-registry:publishToMavenLocal
Add to your build.gradle:
dependencies {
// Replace VERSION with the latest release from https://search.maven.org
implementation 'io.temporal:temporal-tool-registry:VERSION'
// Add only the LLM SDK(s) you use:
implementation 'com.anthropic:anthropic-java:VERSION' // Anthropic
implementation 'com.openai:openai-java:VERSION' // OpenAI
}

Tool definitions use JSON Schema for inputSchema. The quickstart uses a single string field; for richer schemas, refer to the JSON Schema docs.
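import io.temporal.activity.ActivityMethod;
import io.temporal.toolregistry.*;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;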
@ActivityMethod
public List<String> analyze(String prompt) throws Exception {
List<String> results = new ArrayList<>();
ToolRegistry registry = new ToolRegistry();
registry.register(
ToolDefinition.builder()
.name("flag_issue")
.description("Flag a problem found in the analysis")
.inputSchema(Map.of(
"type", "object",
"properties", Map.of("description", Map.of("type", "string")),
"required", List.of("description")))
.build(),
(Map<String, Object> input) -> {
results.add((String) input.get("description"));
return "recorded"; // this string is sent back to the LLM as the tool result
});
AnthropicConfig cfg = AnthropicConfig.builder()
.apiKey(System.getenv("ANTHROPIC_API_KEY"))
.build();
Provider provider = new AnthropicProvider(cfg, registry,
"You are a code reviewer. Call flag_issue for each problem you find.");
ToolRegistry.runToolLoop(provider, registry, prompt);
return results;
}

The default model is "claude-sonnet-4-6" (Anthropic) or "gpt-4o" (OpenAI). Override with the model() builder method:
AnthropicConfig cfg = AnthropicConfig.builder()
.apiKey(System.getenv("ANTHROPIC_API_KEY"))
.model("claude-3-5-sonnet-20241022")
.build();

Model IDs are defined by the provider; see the Anthropic or OpenAI docs for current names.
OpenAIConfig cfg = OpenAIConfig.builder()
.apiKey(System.getenv("OPENAI_API_KEY"))
.build();
Provider provider = new OpenAIProvider(cfg, registry, "your system prompt");
ToolRegistry.runToolLoop(provider, registry, "" /* system prompt: "" defers to provider default */, prompt);

For multi-turn LLM conversations that must survive activity retries, use
AgenticSession.runWithSession. It saves conversation history via
Activity.getExecutionContext().heartbeat() on every turn and restores it on retry.
@ActivityMethod
public List<Object> longAnalysis(String prompt) throws Exception {
List<Object> results = new ArrayList<>();
AgenticSession.runWithSession(session -> {
ToolRegistry registry = new ToolRegistry();
registry.register(
ToolDefinition.builder().name("flag").description("...").inputSchema(Map.of("type", "object")).build(),
input -> { session.addResult(input); return "ok"; /* sent back to LLM */ });
AnthropicConfig cfg = AnthropicConfig.builder()
.apiKey(System.getenv("ANTHROPIC_API_KEY")).build();
Provider provider = new AnthropicProvider(cfg, registry, "your system prompt");
session.runToolLoop(provider, registry, prompt);
results.addAll(session.getResults()); // capture after loop completes
});
return results;
}

import io.temporal.toolregistry.testing.*;
@Test
public void testAnalyze() throws Exception {
ToolRegistry registry = new ToolRegistry();
registry.register(
ToolDefinition.builder().name("flag").description("d")
.inputSchema(Map.of("type", "object")).build(),
input -> "ok");
MockProvider provider = new MockProvider(
MockResponse.toolCall("flag", Map.of("description", "stale API")),
MockResponse.done("analysis complete"));
List<Map<String, Object>> msgs =
ToolRegistry.runToolLoop(provider, registry, "analyze");
assertTrue(msgs.size() > 2);
}

To run the integration tests against live Anthropic and OpenAI APIs:
RUN_INTEGRATION_TESTS=1 \
ANTHROPIC_API_KEY=sk-ant-... \
OPENAI_API_KEY=sk-proj-... \
./gradlew test --tests "*.ToolRegistryTest.testIntegration*"

Tests skip automatically when RUN_INTEGRATION_TESTS is unset. Real API calls
incur billing — expect a few cents per full test run.
session.getResults() accumulates application-level
results during the tool loop. Elements are serialized to JSON inside each heartbeat
checkpoint, so they must be plain Map instances with JSON-serializable values. A non-serializable
value raises a non-retryable ApplicationError at heartbeat time rather than silently
losing data on the next retry.
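If you would rather catch a bad value in the tool handler than at heartbeat time, a conservative shape check before calling addResult is one option. The helper below is hypothetical and not part of temporal-tool-registry; it assumes "JSON-serializable" means only null, String, Number, Boolean, List, and String-keyed Map, which may be stricter than what the library's serializer actually accepts.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-check helper; not part of temporal-tool-registry. */
class JsonShapeCheck {
    /** True if value is built only from null, String, Number, Boolean, List, and String-keyed Map. */
    static boolean isJsonShaped(Object value) {
        if (value == null || value instanceof String
                || value instanceof Number || value instanceof Boolean) {
            return true;
        }
        if (value instanceof List<?> list) {
            return list.stream().allMatch(JsonShapeCheck::isJsonShaped);
        }
        if (value instanceof Map<?, ?> map) {
            return map.entrySet().stream().allMatch(
                    e -> e.getKey() instanceof String && isJsonShaped(e.getValue()));
        }
        // Records, POJOs, Instants, and similar types need converting to a Map or String first.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isJsonShaped(Map.of("type", "smell", "file", "Foo.java")));
        System.out.println(isJsonShaped(Map.of("seenAt", Instant.EPOCH)));
    }
}
```

The first call in main passes the check and the second does not, because an Instant is neither a JSON scalar nor a container; converting it with toString() before adding the result would make it pass.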
Convert your domain type to a plain Map at the tool-call site and back after the session:
record Result(String type, String file) {}
// Inside tool handler:
session.addResult(Map.of("type", "smell", "file", "Foo.java"));
// After session (using Jackson for convenient mapping):
// requires jackson-databind in your build.gradle:
// implementation 'com.fasterxml.jackson.core:jackson-databind:VERSION'
ObjectMapper mapper = new ObjectMapper();
List<Result> results = session.getResults().stream()
.map(m -> mapper.convertValue(m, Result.class))
.toList();

Individual LLM calls inside the tool loop are unbounded by default. A hung HTTP
connection holds the activity open until Temporal's ScheduleToCloseTimeout
fires — potentially many minutes. Set a per-turn timeout on the provider client:
AnthropicConfig cfg = AnthropicConfig.builder()
.apiKey(System.getenv("ANTHROPIC_API_KEY"))
.timeout(Duration.ofSeconds(30))
.build();
Provider provider = new AnthropicProvider(cfg, registry, "your system prompt");
// provider now enforces 30s per turn

Recommended timeouts:
| Model type | Recommended per-turn timeout |
|---|---|
| Standard (Claude 3.x, GPT-4o) | 30 s |
| Reasoning (o1, o3, extended thinking) | 300 s |
Call setScheduleToCloseTimeout on the activity options to bound the entire conversation:
ActivityOptions opts = ActivityOptions.newBuilder()
.setScheduleToCloseTimeout(Duration.ofMinutes(10))
.build();
MyActivities stub = Workflow.newActivityStub(MyActivities.class, opts);

The per-turn client timeout and ScheduleToCloseTimeout are complementary:
- Per-turn timeout fires if one LLM call hangs (protects against a single stuck turn)
- ScheduleToCloseTimeout bounds the entire conversation, including all retries (protects against runaway multi-turn loops)
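A quick way to sanity-check that the two timeouts fit together is to multiply them out. The sketch below is back-of-the-envelope arithmetic only: it assumes your conversation is capped at a known number of turns, and it ignores tool execution time and retry backoff. `maxTurns` and `attempts` are illustrative parameters, not library settings.

```java
import java.time.Duration;

/** Back-of-the-envelope budgeting; the turn and attempt counts are assumptions. */
class TimeoutBudget {
    /** Worst case for one attempt: every turn runs right up to the per-turn timeout. */
    static Duration worstCaseAttempt(Duration perTurn, int maxTurns) {
        return perTurn.multipliedBy(maxTurns);
    }

    /** Whether scheduleToClose leaves room for the given number of worst-case attempts. */
    static boolean fits(Duration scheduleToClose, Duration perTurn, int maxTurns, int attempts) {
        Duration needed = worstCaseAttempt(perTurn, maxTurns).multipliedBy(attempts);
        return scheduleToClose.compareTo(needed) >= 0;
    }

    public static void main(String[] args) {
        Duration perTurn = Duration.ofSeconds(30);
        Duration scheduleToClose = Duration.ofMinutes(10);
        // 8 turns at 30 s each is 4 minutes per attempt.
        System.out.println(worstCaseAttempt(perTurn, 8));
        System.out.println(fits(scheduleToClose, perTurn, 8, 2));
        System.out.println(fits(scheduleToClose, perTurn, 8, 3));
    }
}
```

With a 30 s per-turn timeout and at most 8 turns, a 10-minute ScheduleToCloseTimeout covers two worst-case attempts (8 minutes) but not three (12 minutes).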
ToolRegistry.fromMcpTools converts a list of McpTool descriptors into a populated
registry. Handlers default to no-ops that return an empty string; override them with
register after construction.
// mcpTools is List<McpTool> — populate from your MCP client.
ToolRegistry registry = ToolRegistry.fromMcpTools(mcpTools);
// Override specific handlers before running the loop.
registry.register(
ToolDefinition.builder().name("read_file") /* ... */ .build(),
input -> readFile((String) input.get("path")));

McpTool mirrors the MCP protocol's Tool object: name, description, and
inputSchema (a Map<String, Object> containing a JSON Schema object).
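For readers who have not seen an MCP tool descriptor, the following stdlib-only sketch shows the three-field shape and the "no-op by default, override by name" behavior described above. It does not use the library: `McpToolSketch` and `fromTools` are hypothetical stand-ins for McpTool and ToolRegistry.fromMcpTools, and the handler table is a plain Map rather than a ToolRegistry.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Stdlib-only illustration; McpToolSketch and fromTools are hypothetical stand-ins. */
class McpRegistrySketch {
    /** Mirrors the three fields of an MCP Tool: name, description, inputSchema. */
    record McpToolSketch(String name, String description, Map<String, Object> inputSchema) {}

    /** Builds a handler table in which every tool starts as a no-op returning "". */
    static Map<String, Function<Map<String, Object>, String>> fromTools(List<McpToolSketch> tools) {
        Map<String, Function<Map<String, Object>, String>> handlers = new LinkedHashMap<>();
        for (McpToolSketch tool : tools) {
            handlers.put(tool.name(), input -> "");
        }
        return handlers;
    }

    public static void main(String[] args) {
        McpToolSketch readFile = new McpToolSketch(
                "read_file",
                "Read a file from disk",
                Map.of("type", "object",
                       "properties", Map.of("path", Map.of("type", "string")),
                       "required", List.of("path")));
        Map<String, Function<Map<String, Object>, String>> handlers = fromTools(List.of(readFile));

        // Before the override, the default handler returns an empty string.
        System.out.println(handlers.get("read_file").apply(Map.of("path", "a.txt")).isEmpty());

        // Replace the no-op with a real handler, keyed by the same tool name.
        handlers.put("read_file", input -> "contents of " + input.get("path"));
        System.out.println(handlers.get("read_file").apply(Map.of("path", "a.txt")));
    }
}
```

The practical consequence is that any tool you forget to override will answer the model with an empty string, so it is worth overriding every tool the model is likely to call before running the loop.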