Skip to content

Conversation

@jaswanth8888
Copy link

Description

  • Added token validator middleware in core utils which validates jwt token using provided identity provider
  • Added keycloak class which validates, verifies,decodes a given JWT token with the realm url provided
  • made changes dataprep-redis to use token validator
  • Ingest,deleting a documnet would need admin role, get documents can be done both admin and users
  • if JWT_AUTH(default false) is set to false Verification wouldn't happen and service works as is

Issues

n/a

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

pyjwt

Tests

Describe the tests that you ran to verify your changes.

@codecov
Copy link

codecov bot commented Nov 10, 2024

Codecov Report

Attention: Patch coverage is 24.32432% with 56 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
comps/cores/mega/keycloak.py 30.00% 28 Missing ⚠️
comps/cores/mega/utils.py 17.64% 28 Missing ⚠️
Files with missing lines Coverage Δ
comps/cores/mega/keycloak.py 30.00% <30.00%> (ø)
comps/cores/mega/utils.py 21.64% <17.64%> (-0.86%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@chensuyue chensuyue added this to the v1.1 milestone Nov 13, 2024
Copy link
Collaborator

@mkbhanda mkbhanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we end up in a denial of service type attack with too many documents uploaded and running out of space. May need to address this space aspect to trigger a warning to admin or at the very least add a comment to that effect here in case someone wants to adopt this sample, then the things they need to watch out for. On document delete, do we remove all vectors associated with the given document?

Copy link
Collaborator

@Ruoyu-y Ruoyu-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the PR, i guess the purpose is to allow only 'ADMIN_ROLE' user to upload and delete files and allow both 'USER_ROLE' and 'ADMIN_ROLE' user to get the existing files. Is that correct?
And from my understanding, authentication and authorization is not part of the logic of data preparation(comments are welcome), so it might not be that proper to make this part inside the component itself. Otherwise, any other component needs authentication and authorization have to add this part in there code. As now most of the pipelines involve a gateway, it is better to trigger authentication and authorization there. Comments are welcome.

@amberjain1
Copy link
Collaborator

amberjain1 commented Nov 13, 2024

Could we end up in a denial of service type attack with too many documents uploaded and running out of space. May need to address this space aspect to trigger a warning to admin or at the very least add a comment to that effect here in case someone wants to adopt this sample, then the things they need to watch out for. On document delete, do we remove all vectors associated with the given document?

@mkbhanda,
This is a valid comment for data-prep service itself. Though this PR is not intending to touch this part or addressing this issue.
DDoS related issue shall be handled as a separate PR.

@amberjain1 amberjain1 closed this Nov 13, 2024
@amberjain1 amberjain1 reopened this Nov 13, 2024
@mkbhanda
Copy link
Collaborator

From the PR, i guess the purpose is to allow only 'ADMIN_ROLE' user to upload and delete files and allow both 'USER_ROLE' and 'ADMIN_ROLE' user to get the existing files. Is that correct? And from my understanding, authentication and authorization is not part of the logic of data preparation(comments are welcome), so it might not be that proper to make this part inside the component itself. Otherwise, any other component needs authentication and authorization have to add this part in there code. As now most of the pipelines involve a gateway, it is better to trigger authentication and authorization there. Comments are welcome.

@Ruoyu-y I agree with your summarization on which roles can do what. @amberjain1 and @jaswanth8888 would you please comment on her suggestion on where authentication and authorization should be handled -- as in the gateway

@jaswanth8888
Copy link
Author

Could we end up in a denial of service type attack with too many documents uploaded and running out of space. May need to address this space aspect to trigger a warning to admin or at the very least add a comment to that effect here in case someone wants to adopt this sample, then the things they need to watch out for. On document delete, do we remove all vectors associated with the given document?

On document delete we are deleting the vectors even before. so we didn't touch that part.

@jaswanth8888
Copy link
Author

From the PR, i guess the purpose is to allow only 'ADMIN_ROLE' user to upload and delete files and allow both 'USER_ROLE' and 'ADMIN_ROLE' user to get the existing files. Is that correct? And from my understanding, authentication and authorization is not part of the logic of data preparation(comments are welcome), so it might not be that proper to make this part inside the component itself. Otherwise, any other component needs authentication and authorization have to add this part in there code. As now most of the pipelines involve a gateway, it is better to trigger authentication and authorization there. Comments are welcome.

@Ruoyu-y I agree with your summarization on which roles can do what. @amberjain1 and @jaswanth8888 would you please comment on her suggestion on where authentication and authorization should be handled -- as in the gateway

@Ruoyu-y Different Api's would need different Role authorization. so, we imported the authorization middleware and used it in data prep. if any other components to implement this they can import and use it

@Ruoyu-y
Copy link
Collaborator

Ruoyu-y commented Nov 14, 2024

From the PR, i guess the purpose is to allow only 'ADMIN_ROLE' user to upload and delete files and allow both 'USER_ROLE' and 'ADMIN_ROLE' user to get the existing files. Is that correct? And from my understanding, authentication and authorization is not part of the logic of data preparation(comments are welcome), so it might not be that proper to make this part inside the component itself. Otherwise, any other component needs authentication and authorization have to add this part in there code. As now most of the pipelines involve a gateway, it is better to trigger authentication and authorization there. Comments are welcome.

@Ruoyu-y I agree with your summarization on which roles can do what. @amberjain1 and @jaswanth8888 would you please comment on her suggestion on where authentication and authorization should be handled -- as in the gateway

@Ruoyu-y Different Api's would need different Role authorization. so, we imported the authorization middleware and used it in data prep. if any other components to implement this they can import and use it

I understand that different APIs require different role authorization. But what i mean is that this authentication and authorization step shall not be part of the component or for certain API. Otherwise, the piece of code should duplicate for multiple components whatever needs an authentication

From the PR, i guess the purpose is to allow only 'ADMIN_ROLE' user to upload and delete files and allow both 'USER_ROLE' and 'ADMIN_ROLE' user to get the existing files. Is that correct? And from my understanding, authentication and authorization is not part of the logic of data preparation(comments are welcome), so it might not be that proper to make this part inside the component itself. Otherwise, any other component needs authentication and authorization have to add this part in there code. As now most of the pipelines involve a gateway, it is better to trigger authentication and authorization there. Comments are welcome.

@Ruoyu-y I agree with your summarization on which roles can do what. @amberjain1 and @jaswanth8888 would you please comment on her suggestion on where authentication and authorization should be handled -- as in the gateway

@Ruoyu-y Different Api's would need different Role authorization. so, we imported the authorization middleware and used it in data prep. if any other components to implement this they can import and use it

I understand that different API requires different Role authorization. But should the authentication and authorization logic reside within component or the API itself(handled by the API call itself)? Shall the gateway be the component that manage all these stuff?
Like if i would like to allow user with role A to be able to access API B and API C. Instead of setting rules in both API B and API C, is it better to set in a central component(gateway) with a single rule?

@chensuyue
Copy link
Collaborator

The ChatQnA test on gaudi failed https://github.com/opea-project/GenAIComps/actions/runs/11820755586/job/32959963824?pr=868, please check.

@chensuyue
Copy link
Collaborator

We have some cross test when the PR update core part.

@chensuyue chensuyue removed this from the v1.1 milestone Nov 18, 2024
@chensuyue
Copy link
Collaborator

Release the v1.1 milestone after sync with @jaswanth8888.

@xiguiw
Copy link
Collaborator

xiguiw commented Feb 25, 2025

From the PR, i guess the purpose is to allow only 'ADMIN_ROLE' user to upload and delete files and allow both 'USER_ROLE' and 'ADMIN_ROLE' user to get the existing files. Is that correct? And from my understanding, authentication and authorization is not part of the logic of data preparation(comments are welcome), so it might not be that proper to make this part inside the component itself. Otherwise, any other component needs authentication and authorization have to add this part in there code. As now most of the pipelines involve a gateway, it is better to trigger authentication and authorization there. Comments are welcome.

@Ruoyu-y I agree with your summarization on which roles can do what. @amberjain1 and @jaswanth8888 would you please comment on her suggestion on where authentication and authorization should be handled -- as in the gateway

@Ruoyu-y Different Api's would need different Role authorization. so, we imported the authorization middleware and used it in data prep. if any other components to implement this they can import and use it

I understand that different APIs require different role authorization. But what i mean is that this authentication and authorization step shall not be part of the component or for certain API. Otherwise, the piece of code should duplicate for multiple components whatever needs an authentication

From the PR, i guess the purpose is to allow only 'ADMIN_ROLE' user to upload and delete files and allow both 'USER_ROLE' and 'ADMIN_ROLE' user to get the existing files. Is that correct? And from my understanding, authentication and authorization is not part of the logic of data preparation(comments are welcome), so it might not be that proper to make this part inside the component itself. Otherwise, any other component needs authentication and authorization have to add this part in there code. As now most of the pipelines involve a gateway, it is better to trigger authentication and authorization there. Comments are welcome.

@Ruoyu-y I agree with your summarization on which roles can do what. @amberjain1 and @jaswanth8888 would you please comment on her suggestion on where authentication and authorization should be handled -- as in the gateway

@Ruoyu-y Different Api's would need different Role authorization. so, we imported the authorization middleware and used it in data prep. if any other components to implement this they can import and use it

I understand that different API requires different Role authorization. But should the authentication and authorization logic reside within component or the API itself(handled by the API call itself)? Shall the gateway be the component that manage all these stuff? Like if i would like to allow user with role A to be able to access API B and API C. Instead of setting rules in both API B and API C, is it better to set in a central component(gateway) with a single rule?

Good point.
Authorisation strategy seems are not clear. It's for the system level or micro-service level?
If componnent level, its not only in data-prep.
How to unify all component access?

@xiguiw
Copy link
Collaborator

xiguiw commented Mar 26, 2025

Remind

Not active for months.
@jaswanth8888
Are you still working on this? There is CI failure.

@chensuyue chensuyue self-requested a review as a code owner April 18, 2025 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A3 Maintain

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants