Cancelled error with langgraph runs #1601
-
I intermittently get a CancelledError() with my LangGraph runs.
The error trace is below:
Replies: 12 comments 8 replies
-
I get the same error when the execution time reaches about 60 s.
-
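Hi @kedarsp-informa you shouldn't use asyncio.run in add_node; we have native support for async functions in nodes. Using asyncio.run would create a new event loop for each node, which is very inefficient and can cause issues such as cancellation not propagating from graph to node.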
-
I get the same error when attempting to run a subprocess to install a linter on startup. I am using LangGraph Cloud. I'm unsure what "shouldn't use asyncio.run in add_node" means. This is a very simple graph; it successfully runs my subprocess to install the linter, but fails afterwards. It deploys successfully without this line:
subprocess.run(["bash", "build_linter.sh"], check=True)
Registering graph with id 'agent'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
-
@nfcampos I have this issue too. I wrote to support and shared traces. In my case there are no asyncio.run calls, only asyncio.gather (which is needed for parallelisation). The rest is async/await.
-
Could this issue be related to having no listeners on the stream (as the run is purely a background one)? It seems to occur only under those circumstances on LangGraph Cloud.
-
We are also facing this and are not calling asyncio.run in any node. We're getting a timeout from Redis, which is caused by the cancelled error.
-
I'm running into a similar issue with the prebuilt ReAct agent:
My environment:
Unfortunately I've not been able to reproduce it consistently.
-
I have a similar issue:
CancelledError('Cancelled by cancel scope 3e9aadaf5ed0', <Task cancelled name='Task-393' coro=<AsyncExitStack.__aexit__() done, defined at /usr/local/lib/python3.11/contextlib.py:698>>)
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/langgraph/pregel/__init__.py", line 2080, in astream
  File "/usr/local/lib/python3.11/site-packages/langgraph/pregel/runner.py", line 495, in atick
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 428, in wait
  File "/usr/local/lib/python3.11/asyncio/tasks.py", line 535, in _wait
asyncio.exceptions.CancelledError: Cancelled by cancel scope 3e9aadaf5ed0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/langgraph/pregel/__init__.py", line 2033, in astream
  File "/usr/local/lib/python3.11/site-packages/langgraph/pregel/loop.py", line 1103, in __aexit__
  File "/usr/local/lib/python3.11/contextlib.py", line 698, in __aexit__
asyncio.exceptions.CancelledError: ('Cancelled by cancel scope 3e9aadaf5ed0', <Task cancelled name='Task-393' coro=<AsyncExitStack.__aexit__() done, defined at /usr/local/lib/python3.11/contextlib.py:698>>)
-
We're facing a similar issue.
As reported above, it seems to trigger sometimes when a node takes more than 1 minute to execute, though not reliably. We don't use asyncio.run(). @nfcampos this discussion is marked as resolved but it is not. Should we create a new one?
-
I had been getting a similar error with the prebuilt ReAct agent for a while. Adding a step timeout to the compiled graph solved this for me.
-
Here is how I increased the timeout for my LangGraph code:
async def get_message_stream(self, message: Optional[str], base64_image: Optional[str], thread_id: str):
-
You may also want to try setting the env var
Some cancellations may be caused because your instance is being restarted due to failed health checks. A common cause of failed health checks is that you have some synchronous processes blocking the main event loop and slowing the server down. We have made a lot of improvements to address this but you may want to keep this in mind if you are seeing inexplicable cancellations in your code today. We are working to add even more isolation in the future.
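To make the point about blocking the event loop concrete, here is a minimal stdlib-only sketch (not LangGraph code). The names blocking_work, heartbeat and ticks_during_work are made up for illustration, and the heartbeat merely stands in for something like a health check. It counts how many heartbeats get through while a synchronous call runs directly on the loop versus when the same call is offloaded with asyncio.to_thread:

```python
import asyncio
import time


def blocking_work(seconds: float) -> str:
    # Stand-in for any synchronous call (e.g. subprocess.run or a sync HTTP client).
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks: list, interval: float, count: int) -> None:
    # Stand-in for a periodic health check served by the same event loop.
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.monotonic())


async def ticks_during_work(offload: bool) -> int:
    ticks: list = []
    hb = asyncio.create_task(heartbeat(ticks, 0.02, 5))
    await asyncio.sleep(0)  # give the heartbeat task a chance to start
    if offload:
        # Runs in a worker thread; the event loop stays free to serve heartbeats.
        await asyncio.to_thread(blocking_work, 0.2)
    else:
        # Runs on the event loop thread; nothing else can run until it returns.
        blocking_work(0.2)
    seen = len(ticks)  # heartbeats that got through while the work was running
    await hb
    return seen


if __name__ == "__main__":
    print("blocking call on the loop:", asyncio.run(ticks_during_work(offload=False)))
    print("offloaded with to_thread: ", asyncio.run(ticks_during_work(offload=True)))
```

In the blocking case no heartbeat can run until the call returns, which is the kind of stall that might make a health check fail. Whether this is what triggers the restarts in a given deployment is an assumption to verify against your own logs.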
Hi @kedarsp-informa you shouldn't use asyncio.run in add_node; we have native support for async functions in nodes. Using asyncio.run would create a new event loop for each node, which is very inefficient and can cause issues such as cancellation not propagating from graph to node.
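The cancellation point can be illustrated with a small stdlib-only sketch. It does not use LangGraph; it assumes that a synchronous node calling asyncio.run ends up in a worker thread with its own event loop, which asyncio.to_thread imitates here. All function names are hypothetical. Cancelling the outer task reaches the natively awaited coroutine, but not the one running inside asyncio.run:

```python
import asyncio


async def work(log: list, label: str) -> None:
    try:
        await asyncio.sleep(0.2)
        log.append(f"{label}: finished")
    except asyncio.CancelledError:
        log.append(f"{label}: cancelled")
        raise


async def native_node(log: list) -> None:
    # Async node: awaited directly on the caller's event loop.
    await work(log, "native")


def sync_node_with_asyncio_run(log: list) -> None:
    # Sync node that spins up its own event loop for the async work.
    asyncio.run(work(log, "asyncio.run"))


async def wrapped_node(log: list) -> None:
    # Assumption: the sync node is executed in a worker thread.
    await asyncio.to_thread(sync_node_with_asyncio_run, log)


async def cancel_after(coro, delay: float) -> None:
    task = asyncio.create_task(coro)
    await asyncio.sleep(delay)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass


async def demo() -> list:
    log: list = []
    await cancel_after(native_node(log), 0.05)
    await cancel_after(wrapped_node(log), 0.05)
    await asyncio.sleep(0.3)  # give the worker thread time to finish
    return log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The native coroutine logs that it was cancelled, while the asyncio.run variant keeps going in its private loop and logs that it finished, even though the task that started it was cancelled.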