-
Notifications
You must be signed in to change notification settings - Fork 44.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(backend): Ensure validity of OAuth credentials during graph execution #8191
base: master
Are you sure you want to change the base?
feat(backend): Ensure validity of OAuth credentials during graph execution #8191
Conversation
✅ Deploy Preview for auto-gpt-docs ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
- Move `KeyedMutex` to `autogpt_libs.utils.synchronize` - Add locks to transactions in `SupabaseIntegrationCredentialsStore`
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request. |
…entials JIT Also: - Add exposed `acquire_credentials`, `release_credentials` methods to `AgentServer` - Add Pydantic model auto-deserialization support to `@expose` decorator - Make `ExecutionManager.agent_server_client` cached
…-are-fresh-before-using-them-in
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly. |
d48fbfe
to
fda2cf6
Compare
fda2cf6
to
5f3ef42
Compare
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request. |
…-are-fresh-before-using-them-in
Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good, though we are using KeyedMutex in many places, where this is planned to be killed:
#8197
We need to sync on this or make a helper lies in a single place for an easy replacement
@@ -14,26 +17,29 @@ | |||
|
|||
|
|||
class SupabaseIntegrationCredentialsStore: | |||
locks = KeyedMutex() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: This will be gone very soon: https://github.com/Significant-Gravitas/AutoGPT/pull/8197/files#diff-39d77f5b792b124e499a265155ab201f0a211a16c57129cec51d32ba4f0c695d
Let me know if there will be any issue migrating this to Redis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be an issue, other than extra time spent. I just need the locking service to be available within autogpt_libs
.
key = (key,) | ||
key = (self.supabase.supabase_url, *key) | ||
|
||
self.locks.lock(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to add a key prefix here to avoid clashing with another possible usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that the point of locking resources? It should clash if it's simultaneously in use elsewhere.
- The atomic unit of data here is the entire user metadata object (because it needs to be read/written in its entirety), so that dictates the lock scope:
with self.locked(user_id)
locks
is shared across allSupabaseIntegrationCredentialsStore
instances in a process- I'm using
supabase_url
as a key prefix to account for the edge case of multiple connections to different Supabase instances
# Release lock on credentials ASAP | ||
if credentials: | ||
api_client.release_credentials(user_id, credentials.id) | ||
credentials = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this if the finally block already does the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the comment says:
Release lock on credentials ASAP
because between this and the finally
block being called, some other stuff happens and I want to avoid the extra latency
@@ -731,7 +724,7 @@ def add_execution( | |||
raise Exception(f"Graph #{graph_id} not found.") | |||
|
|||
graph.validate_graph(for_run=True) | |||
node_input_credentials = self._get_node_input_credentials(graph, user_id) | |||
self._validate_node_input_credentials(graph, user_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to do this on each node execution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? This checks if the credentials exist / are available. If this is all good at the start of graph execution, we can start executing. If halfway through execution the user somehow deletes the in-use credentials, there's no use in re-checking if they exist before trying to use them because it will result in execution failure either way.
@@ -720,6 +711,8 @@ def cleanup(self): | |||
|
|||
@property | |||
def agent_server_client(self) -> "AgentServer": | |||
# Since every single usage of this property happens from a different thread, | |||
# there is no value in caching it. | |||
return get_agent_server_client() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sharing client between the different thread is forbidden.
The comment is contradicting with the reality though, what's the objective? should we remove the property decorator then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put that comment there to prevent future confusion and time wasted on this. I tried to make the property cached because it seemed like a waste to re-create the client for every single use. Then I found out that that's illegal and will cause errors because a client can only be owned by one thread. I reverted the technical change and left a note for the next person who might think it's a good idea to re-use the client.
autogpt_platform/backend/backend/server/integrations/creds_manager.py
Outdated
Show resolved
Hide resolved
key = (self.store.supabase.supabase_url, *key) | ||
self._locks.unlock(key) | ||
|
||
@contextmanager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point we might as well bring this feature to synchronize.py
|
||
def delete(self, user_id: str, credentials_id: str) -> None: | ||
with self._locked(user_id, credentials_id): | ||
self.store.delete_creds_by_id(user_id, credentials_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually need lock for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this lock, we would allow deleting a set of credentials while it's in use.
@@ -40,9 +52,52 @@ def wrapper(*args, **kwargs): | |||
logger.exception(msg) | |||
raise Exception(msg, e) | |||
|
|||
# Register custom serializers and deserializers for annotated Pydantic models | |||
for name, annotation in func.__annotations__.items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Though can we extract this out as a function, we definitely need a test for the peeling logic, something like test_types.py testing things like Union[A,B], A|B, list[dict[str, int]]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO properly documenting the intended use and/or limitations is more efficient and effective here than building exhaustive unit tests. That's what the "Gotcha" in the docstring is for. Not because unit tests are bad, but predicting use cases is hard and accounting for all of the potential use cases in unit tests would take up way too much time and make this unnecessarily complex.
This implementation is based on how we currently use it. It supports Annotated
and Union
, and IMO that's as complex as it should get with parameter or return annotations containing Pydantic models.
The complexity of the peeling logic also increases with the complexity of the types that you want it to support. Regarding your examples:
list[dict[str, int]]
isn't relevant here, because we only care about Pydantic modelsUnion[A, B]
==A | B
, which is supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extracted the type peeling logic into a helper function: 94fba3c
…-are-fresh-before-using-them-in
This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request. |
Changes 🏗️
backend
@expose
decoratorbackend.executor
backend.server
IntegrationCredentialsManager
to handle and synchronize credentials-related operationsbackend.server.integrations
module withrouter
,creds_manager
, andutils
in itautogpt_libs
SupabaseIntegrationCredentialsStore
to ensure thread-safetyKeyedMutex
toautogpt_libs.utils.synchronize