MalformedError('No key could be detected.') When Using BigQuery Tool in LangGraph Cloud Deployment #3325

Open
@johannescastner

Description

Checked other resources

  • This is a bug, not a usage question. For questions, please use GitHub Discussions.
  • I added a clear and detailed title that summarizes the issue.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I included a self-contained, minimal example that demonstrates the issue INCLUDING all the relevant imports. The code runs AS IS to reproduce the issue.

Example Code

import os
import json
import asyncio
from typing import Type
import logging
# Core dependencies
from pydantic import BaseModel, Field
# Google Cloud
from google.cloud import bigquery
from google.oauth2 import service_account
# LangChain & LangGraph
from langchain_openai import ChatOpenAI
from langchain_core.tools import StructuredTool
from langgraph.prebuilt import create_react_agent

# Configure logging
logging.basicConfig(level=logging.INFO)

# CONFIGURATION
PROJECT_ID = os.getenv("PROJECT_ID", "datawarehouse-447422")
RAW_DATASET_ID = os.getenv("RAW_DATASET_ID", "linkedin_raw")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# INPUT SCHEMA
class BigQueryListTablesInput(BaseModel):
    dataset_name: str = Field(..., description="Name of the BigQuery dataset to list tables from")

# BIGQUERY CLIENT INITIALIZATION
def get_bigquery_client() -> bigquery.Client:
    """Initialize BigQuery client with proper credentials"""
    if creds_json := os.getenv("GOOGLE_CLOUD_CREDENTIALS_JSON"):
        logging.info("Using service account credentials from environment variable.")
        credentials = service_account.Credentials.from_service_account_info(json.loads(creds_json))
        return bigquery.Client(credentials=credentials, project=credentials.project_id)
    logging.info("Using default project ID for BigQuery client.")
    return bigquery.Client(project=PROJECT_ID)

# TOOL IMPLEMENTATION
async def list_bigquery_tables(dataset_name: str) -> str:
    """List tables in a BigQuery dataset"""
    logging.info(f"Received dataset_name: {dataset_name}")
    if not dataset_name:
        raise ValueError("Missing required input: dataset_name")
    
    try:
        logging.info("Starting BigQuery client initialization...")
        client = get_bigquery_client()
        logging.info(f"BigQuery client initialized successfully with project ID: {client.project}")
        
        logging.info(f"Creating dataset reference for dataset: {dataset_name}")
        dataset_ref = client.dataset(dataset_name)
        logging.info(f"Dataset reference created: {dataset_ref.path}")
        
        logging.info("Listing tables in the dataset...")
        tables = client.list_tables(dataset_ref)
        table_ids = ", ".join(table.table_id for table in tables)
        logging.info(f"Table IDs: {table_ids}")
        
        return table_ids or "No tables found"
    except Exception as e:
        logging.error(f"Error listing tables: {e}")
        raise

# TOOL REGISTRATION
tools = [
    StructuredTool.from_function(
        name="list_bigquery_tables",
        description="Lists tables in a BigQuery dataset. Input: JSON object with 'dataset_name'",
        args_schema=BigQueryListTablesInput,
        # list_bigquery_tables is async, so it must be passed as `coroutine`
        # only; passing it as `func` as well would make sync invocation
        # return an un-awaited coroutine.
        coroutine=list_bigquery_tables,
    ),
]

# AGENT CREATION
def create_agent():
    llm = ChatOpenAI(
        model_name="gpt-3.5-turbo",
        temperature=0,
        max_tokens=1200,
        openai_api_key=OPENAI_API_KEY
    )
    return create_react_agent(llm, tools)

# Initialize the agent graph
graph = create_agent()

# Simulate agent flow locally (for testing purposes)
async def simulate_agent_flow():
    # Simulate input generation
    dataset_name = "linkedin_raw"
    
    # Test the tool
    print(await list_bigquery_tables(dataset_name))

# Run simulation locally
if __name__ == "__main__":
    asyncio.run(simulate_agent_flow())
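
Before deploying, it may help to sanity-check the GOOGLE_CLOUD_CREDENTIALS_JSON secret in isolation. The sketch below is a hypothetical helper (not part of the reproduction above); it checks for the most common cause of google-auth's "No key could be detected." error, namely a private_key whose newlines were stored as literal `\n` escape sequences by a secrets UI:

```python
import json


def check_creds_json(creds_json: str) -> list[str]:
    """Return a list of problems found in a service-account JSON string.

    A common cause of "No key could be detected." is a private_key whose
    newlines were stored as literal backslash-n sequences by a secrets UI.
    """
    try:
        info = json.loads(creds_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]

    problems = []
    for field in ("project_id", "client_email", "private_key"):
        if field not in info:
            problems.append(f"missing field: {field}")

    key = info.get("private_key", "")
    if key and not key.startswith("-----BEGIN"):
        problems.append("private_key does not start with a PEM header")
    if "\\n" in key:
        problems.append("private_key contains literal \\n instead of real newlines")
    return problems
```

Running `check_creds_json(os.environ["GOOGLE_CLOUD_CREDENTIALS_JSON"])` inside the deployed container (e.g. from a debug endpoint or log line) would show whether the secret survives deployment intact.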

Error Message and Stack Trace (if applicable)

Error listing tables: No key could be detected.

Description

I am encountering a persistent MalformedError('No key could be detected.') error when deploying an agent with a BigQuery tool (list_bigquery_tables) to LangGraph Cloud. The same code works flawlessly in a local environment, which suggests the issue lies within the LangGraph Cloud deployment or its interaction with external APIs like BigQuery.

Steps to Reproduce

1. Deploy the following minimal code to LangGraph Cloud.
2. Set the required secrets (GOOGLE_CLOUD_CREDENTIALS_JSON and OPENAI_API_KEY) in the LangGraph Cloud environment.
3. Trigger the list_bigquery_tables tool by asking the agent to list tables in the linkedin_raw dataset.

Expected Behavior
The tool should successfully list all tables in the specified dataset and return their names.

Actual Behavior
The tool fails with the following error:

Error listing tables: No key could be detected.

Troubleshooting Steps Taken

1. Validated secrets: confirmed that GOOGLE_CLOUD_CREDENTIALS_JSON and OPENAI_API_KEY are correctly set in LangGraph Cloud.
2. Tested locally: verified that the code works locally with the same service account credentials.
3. Added logging: enhanced logging to capture the entire execution flow, including project ID, dataset reference, and table listing.
4. Checked permissions: ensured the service account has the BigQuery Admin role.
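
If the deployed secret does turn out to contain escaped newlines, one hedged workaround (a hypothetical helper, not verified against LangGraph Cloud) is to normalize the private_key before handing the parsed dict to google-auth:

```python
import json
from typing import Any


def load_service_account_info(creds_json: str) -> dict[str, Any]:
    """Parse service-account JSON, repairing literal \\n in private_key.

    Some deployment UIs store multi-line secrets with escaped newlines,
    which makes google-auth raise MalformedError("No key could be detected.").
    """
    info = json.loads(creds_json)
    key = info.get("private_key", "")
    if "\\n" in key:
        info["private_key"] = key.replace("\\n", "\n")
    return info
```

get_bigquery_client above could then call `service_account.Credentials.from_service_account_info(load_service_account_info(creds_json))` instead of parsing the raw string directly.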

Hypotheses

1. LangGraph Cloud restrictions:
   - LangGraph Cloud might block or limit outbound HTTP requests to external APIs like BigQuery.
   - There could be constraints on the size or format of responses returned by tools.
2. Agent-tool integration:
   - The agent might mishandle the tool's response, leading to a malformed output.
   - There could be a mismatch between the expected and actual response formats.
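
One quick experiment to rule out the agent-tool integration hypothesis is to return errors as strings instead of re-raising, so the agent receives a well-formed response either way. This is a sketch: `flaky_tool` is a stand-in for the real BigQuery call that fails in the cloud.

```python
import asyncio


async def flaky_tool(dataset_name: str) -> str:
    # Stand-in for the real BigQuery call that raises in the cloud.
    raise ValueError("No key could be detected.")


async def safe_tool(dataset_name: str) -> str:
    # Catch and return the error so the agent sees a plain string
    # instead of an exception bubbling out of the tool.
    try:
        return await flaky_tool(dataset_name)
    except Exception as exc:
        return f"Error: {exc}"


print(asyncio.run(safe_tool("linkedin_raw")))  # → Error: No key could be detected.
```

If the agent run then completes and the model reports the error text, the failure is confined to credential loading rather than to the tool/agent response format.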

Request for Assistance

Could the maintainers of LangGraph Cloud clarify the following points?

1. Are there any restrictions on outbound HTTP requests to external APIs like BigQuery?
2. Are there specific requirements for tool response formats or schemas?
3. Could this issue be related to the runtime environment or permissions in LangGraph Cloud?

This example is fully self-contained, minimal, and reproducible. It includes all relevant imports, configuration, and logging to help diagnose the issue. Please let me know if further clarification is needed!

System Info

Python Version: 3.9+
Required Libraries: google-cloud-bigquery, langchain-openai, langchain-core, langgraph
Deployment Environment: LangGraph Cloud
Google Service Account Role: BigQuery Admin
OpenAI Model: gpt-3.5-turbo
