Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backend): Ensure validity of OAuth credentials during graph execution #8191

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

Pwuts
Copy link
Member

@Pwuts Pwuts commented Sep 26, 2024

Changes 🏗️

backend

  • Added Pydantic model (de)serialization support to @expose decorator

backend.executor

  • Change credential injection mechanism to acquire credentials just before execution
    • Also locks the credentials for the duration of the execution

backend.server

  • Add IntegrationCredentialsManager to handle and synchronize credentials-related operations
  • Make backend.server.integrations module with router, creds_manager, and utils in it

autogpt_libs

  • Add mutexes to SupabaseIntegrationCredentialsStore to ensure thread-safety
  • Move KeyedMutex to autogpt_libs.utils.synchronize

@github-actions github-actions bot added platform/backend AutoGPT Platform - Back end size/m labels Sep 26, 2024
Copy link

netlify bot commented Sep 26, 2024

Deploy Preview for auto-gpt-docs ready!

Name Link
🔨 Latest commit fab700b
🔍 Latest deploy log https://app.netlify.com/sites/auto-gpt-docs/deploys/66fd4f3502b98900096b4776
😎 Deploy Preview https://deploy-preview-8191--auto-gpt-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

- Move `KeyedMutex` to `autogpt_libs.utils.synchronize`
- Add locks to transactions in `SupabaseIntegrationCredentialsStore`
@github-actions github-actions bot added size/l conflicts Automatically applied to PRs with merge conflicts and removed size/m labels Sep 27, 2024
Copy link
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

…entials JIT

Also:
- Add exposed `acquire_credentials`, `release_credentials` methods to `AgentServer`
- Add Pydantic model auto-deserialization support to `@expose` decorator
- Make `ExecutionManager.agent_server_client` cached
@github-actions github-actions bot added size/xl and removed size/l labels Sep 27, 2024
@github-actions github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label Sep 27, 2024
Copy link
Contributor

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@Pwuts Pwuts marked this pull request as ready for review September 27, 2024 22:33
@Pwuts Pwuts requested a review from a team as a code owner September 27, 2024 22:33
@Pwuts Pwuts requested review from aarushik93, majdyz and ntindle and removed request for a team September 27, 2024 22:33
@Pwuts Pwuts force-pushed the reinier/open-1891-ensure-oauth-credentials-are-fresh-before-using-them-in branch from d48fbfe to fda2cf6 Compare September 27, 2024 22:48
@Pwuts Pwuts force-pushed the reinier/open-1891-ensure-oauth-credentials-are-fresh-before-using-them-in branch from fda2cf6 to 5f3ef42 Compare September 27, 2024 23:00
@github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label Sep 29, 2024
Copy link
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

@CLAassistant
Copy link

CLAassistant commented Sep 30, 2024

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label Sep 30, 2024
Copy link
Contributor

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@Pwuts Pwuts changed the title feat(platform/executor): Ensure validity of OAuth credentials during graph execution feat(backend): Ensure validity of OAuth credentials during graph execution Sep 30, 2024
@Pwuts Pwuts enabled auto-merge (squash) September 30, 2024 16:41
majdyz
majdyz previously approved these changes Sep 30, 2024
Copy link
Contributor

@majdyz majdyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, though we are using KeyedMutex in many places, where this is planned to be killed:
#8197

We need to sync on this or make a helper lies in a single place for an easy replacement

@@ -14,26 +17,29 @@


class SupabaseIntegrationCredentialsStore:
locks = KeyedMutex()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: This will be gone very soon: https://github.com/Significant-Gravitas/AutoGPT/pull/8197/files#diff-39d77f5b792b124e499a265155ab201f0a211a16c57129cec51d32ba4f0c695d

Let me know if there will be any issue migrating this to Redis.

Copy link
Member Author

@Pwuts Pwuts Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be an issue, other than extra time spent. I just need the locking service to be available within autogpt_libs.

key = (key,)
key = (self.supabase.supabase_url, *key)

self.locks.lock(key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to add a key prefix here to avoid clashing with another possible usage

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that the point of locking resources? It should clash if it's simultaneously in use elsewhere.

  • The atomic unit of data here is the entire user metadata object (because it needs to be read/written in its entirety), so that dictates the lock scope: with self.locked(user_id)
  • locks is shared across all SupabaseIntegrationCredentialsStore instances in a process
  • I'm using supabase_url as a key prefix to account for the edge case of multiple connections to different Supabase instances

# Release lock on credentials ASAP
if credentials:
api_client.release_credentials(user_id, credentials.id)
credentials = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this if the finally block already does the same

Copy link
Member Author

@Pwuts Pwuts Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like the comment says:

Release lock on credentials ASAP

because between this and the finally block being called, some other stuff happens and I want to avoid the extra latency

@@ -731,7 +724,7 @@ def add_execution(
raise Exception(f"Graph #{graph_id} not found.")

graph.validate_graph(for_run=True)
node_input_credentials = self._get_node_input_credentials(graph, user_id)
self._validate_node_input_credentials(graph, user_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to do this on each node execution?

Copy link
Member Author

@Pwuts Pwuts Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? This checks if the credentials exist / are available. If this is all good at the start of graph execution, we can start executing. If halfway through execution the user somehow deletes the in-use credentials, there's no use in re-checking if they exist before trying to use them because it will result in execution failure either way.

@@ -720,6 +711,8 @@ def cleanup(self):

@property
def agent_server_client(self) -> "AgentServer":
# Since every single usage of this property happens from a different thread,
# there is no value in caching it.
return get_agent_server_client()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sharing client between the different thread is forbidden.
The comment is contradicting with the reality though, what's the objective? should we remove the property decorator then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put that comment there to prevent future confusion and time wasted on this. I tried to make the property cached because it seemed like a waste to re-create the client for every single use. Then I found out that that's illegal and will cause errors because a client can only be owned by one thread. I reverted the technical change and left a note for the next person who might think it's a good idea to re-use the client.

key = (self.store.supabase.supabase_url, *key)
self._locks.unlock(key)

@contextmanager
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point we might as well bring this feature to synchronize.py


def delete(self, user_id: str, credentials_id: str) -> None:
with self._locked(user_id, credentials_id):
self.store.delete_creds_by_id(user_id, credentials_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need lock for it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this lock, we would allow deleting a set of credentials while it's in use.

@@ -40,9 +52,52 @@ def wrapper(*args, **kwargs):
logger.exception(msg)
raise Exception(msg, e)

# Register custom serializers and deserializers for annotated Pydantic models
for name, annotation in func.__annotations__.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!
Though can we extract this out as a function, we definitely need a test for the peeling logic, something like test_types.py testing things like Union[A,B], A|B, list[dict[str, int]]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO properly documenting the intended use and/or limitations is more efficient and effective here than building exhaustive unit tests. That's what the "Gotcha" in the docstring is for. Not because unit tests are bad, but predicting use cases is hard and accounting for all of the potential use cases in unit tests would take up way too much time and make this unnecessarily complex.

This implementation is based on how we currently use it. It supports Annotated and Union, and IMO that's as complex as it should get with parameter or return annotations containing Pydantic models.

The complexity of the peeling logic also increases with the complexity of the types that you want it to support. Regarding your examples:

  • list[dict[str, int]] isn't relevant here, because we only care about Pydantic models
  • Union[A, B] == A | B, which is supported

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted the type peeling logic into a helper function: 94fba3c

@github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label Oct 3, 2024
Copy link
Contributor

github-actions bot commented Oct 3, 2024

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
conflicts Automatically applied to PRs with merge conflicts platform/backend/executor platform/backend AutoGPT Platform - Back end size/xl
Projects
Status: 👍🏼 Mergeable
Development

Successfully merging this pull request may close these issues.

Ensure OAuth credentials are fresh before using them in graph execution
3 participants