[Reference PR] VRL - GV Part #276

Draft
wants to merge 87 commits into
base: develop
87 commits
ed63954
added external_service, configured_addon, authorized_account
opaduchak Mar 28, 2025
535639a
fixed leftovers
opaduchak Mar 28, 2025
f0fcddd
cleaned up interface for link addons
opaduchak Mar 28, 2025
1daba47
added resource type
opaduchak Mar 28, 2025
5299545
fixed supportedFeatures enum
opaduchak Mar 31, 2025
c441323
added proper resource types
opaduchak Mar 31, 2025
434e150
added validator to resource type
opaduchak Mar 31, 2025
30ea16a
fixed some things which were left over
opaduchak Apr 1, 2025
47c34ed
Apply suggestions from code review
opaduchak Apr 2, 2025
d625575
more complete implementation of link structures
opaduchak Apr 4, 2025
9a32d59
squashed migrations, fixed unit tests fixed resource type for configu…
opaduchak Apr 4, 2025
eb9e065
Merge pull request #255 from opaduchak/feature/ENG-7568
adlius Apr 8, 2025
2c1974f
enable sharing sessions between GV and OSF
adlius Feb 8, 2025
f86e13e
wip
opaduchak Apr 8, 2025
bb42594
implemented celery communication from GV to OSF
opaduchak Apr 9, 2025
2c99397
removed unnecessary stuff
opaduchak Apr 9, 2025
5515b34
added rabbitmq to tests
opaduchak Apr 9, 2025
f84c286
fixed host for rabbitmq
opaduchak Apr 9, 2025
9fbde27
add redis to CI workflow
adlius Apr 9, 2025
236377c
add REDIS_HOST env var
adlius Apr 9, 2025
5e379fa
fix
adlius Apr 9, 2025
1890581
fix tests
adlius Apr 9, 2025
5e3ef39
Implemented dataverse for VRL
opaduchak Apr 9, 2025
bc4a201
cleaned dataverse implementation up
opaduchak Apr 11, 2025
686cc5d
Merge remote-tracking branch 'upstream/develop' into feature/verified…
cslzchen Apr 15, 2025
aeb0995
Merge pull request #261 from opaduchak/feature/ENG-7757
cslzchen Apr 15, 2025
793a22f
CR followup
adlius Apr 17, 2025
1c5254e
Merge branch 'feature/verified-resource-linking' into share-sessions
adlius Apr 17, 2025
3dca7b8
fix linting
adlius Apr 17, 2025
4c9b674
Merge remote-tracking branch 'origin/share-sessions' into share-sessions
adlius Apr 17, 2025
cbdba0b
fixed link config to have both web url and api url
opaduchak Apr 18, 2025
30ac7e0
Merge pull request #262 from opaduchak/feature/ENG-7778
cslzchen Apr 22, 2025
9c7a55c
added PoC to get all verified links for given node
opaduchak Apr 22, 2025
8bff195
fixed permissions
opaduchak Apr 22, 2025
4481512
fix test
adlius Apr 23, 2025
2c6931b
Merge pull request #227 from CenterForOpenScience/share-sessions
cslzchen Apr 23, 2025
ef53298
fixed comments
opaduchak Apr 24, 2025
0231a36
fixed comment for verified link class
opaduchak Apr 24, 2025
e5ff0d1
fixed permissions
opaduchak Apr 24, 2025
6d47039
Merge pull request #266 from opaduchak/feature/ENG-7853
cslzchen Apr 24, 2025
7d5f81b
implemented sending celery task when links change
opaduchak Apr 24, 2025
e17fa44
fixed user guid
opaduchak Apr 24, 2025
bd48926
fixed Resource types to be consistent with datacite
opaduchak Apr 25, 2025
37db09c
fixed access token auth for user-references
opaduchak Apr 25, 2025
06df870
fixed tests
opaduchak Apr 25, 2025
fdc4209
Merge pull request #270 from opaduchak/hotfix/user-reference-token
cslzchen Apr 25, 2025
9cf0e95
fixed PAT for IsAuthenticated checks
opaduchak Apr 25, 2025
a9ec3f1
Merge pull request #271 from opaduchak/hotfix/user-reference-token
cslzchen Apr 25, 2025
623fab6
Merge pull request #268 from opaduchak/feature/ENG-7760
cslzchen Apr 25, 2025
2a44bb2
final fix for PAT
opaduchak Apr 28, 2025
47d2fca
Merge pull request #272 from opaduchak/hotfix/user-reference-token
cslzchen Apr 28, 2025
4e68fd8
Merge pull request #269 from opaduchak/feature/ENG-7769
cslzchen Apr 29, 2025
2da64a8
Merge remote-tracking branch 'upstream/develop' into feature/verified…
cslzchen Apr 29, 2025
2f2ef0e
Re-run lock and update poetry.lock
cslzchen Apr 29, 2025
92fd2c0
Fix migration conflicts from develop
cslzchen Apr 30, 2025
75a14ba
Fix resource_type and enum serializer
cslzchen Apr 30, 2025
dad9fc9
Merge pull request #274 from cslzchen/feature/fix-migrations
cslzchen Apr 30, 2025
3af8f7f
Both target_id and resource_type can be blank or null
cslzchen May 5, 2025
9e3b900
Update ConfiguredLinkAddonSerializer: allow null and blank
cslzchen May 5, 2025
706b7a0
Update VerifiedLinkSerializer
cslzchen May 5, 2025
16521c4
fixed Hybrid service type not being displayed in admin
opaduchak May 6, 2025
034a34d
fixed helper to be private
opaduchak May 6, 2025
9b20430
Fix typo in condition check
cslzchen May 6, 2025
f044698
Use required=False for target_id and resource_type in serializer
cslzchen May 6, 2025
e3f0127
Add comments on why we eased restrictions due to FE limitation
cslzchen May 6, 2025
dcf47fa
Skip allow_null and allow_blank for read-only fields in serializer
cslzchen May 6, 2025
8210b92
Merge pull request #280 from opaduchak/fix/ENG-7931
cslzchen May 6, 2025
3776048
Merge pull request #278 from cslzchen/fix/null-resource-type
adlius May 7, 2025
b9983eb
fixed requests failing when dataverses contain unpublished datasets
opaduchak May 7, 2025
d781641
Merge pull request #281 from opaduchak/fix/ENG-7943
cslzchen May 7, 2025
01a2786
fix
adlius May 12, 2025
5b0a63b
Merge pull request #282 from CenterForOpenScience/fix-serializer
cslzchen May 12, 2025
7d311e1
Fix naming for OutputManagementPlan
cslzchen May 12, 2025
ef5cc36
Merge pull request #283 from cslzchen/fix-resource-type-naming
cslzchen May 12, 2025
5a228fa
fixed invalid serializer for AuthorizedLinkAccounts accessed from Use…
opaduchak May 15, 2025
387b79c
made EnumNameMultipleChoiceField sorted
opaduchak May 16, 2025
4d8a877
Merge pull request #287 from opaduchak/feature/ENG-8022
cslzchen May 16, 2025
c876d88
added dataverse to fill_external_services script
opaduchak May 20, 2025
57ca18f
Merge remote-tracking branch 'upstream/develop' into feature/verified…
cslzchen May 20, 2025
db9250b
Merge pull request #289 from opaduchak/feature/ENG-8023
cslzchen May 23, 2025
338020b
Reorder KnownAddonImps and AddonImpNumbers
cslzchen May 23, 2025
fb49231
Merge pull request #291 from cslzchen/feature/eng-8075
cslzchen May 23, 2025
1b3516f
Add int_supported_features to link type external service
cslzchen May 27, 2025
2b53b15
renamed task__update_doi_metadata_with_verified_links
opaduchak May 27, 2025
df4d173
Merge pull request #296 from opaduchak/feature/ENG-8079
cslzchen May 27, 2025
f6c088b
Update supported features for LINK external service
cslzchen May 27, 2025
74e8fed
Merge pull request #295 from cslzchen/feature/link_int_supported_feature
cslzchen May 28, 2025
16 changes: 16 additions & 0 deletions .github/workflows/run_gravyvalet_tests.yml
@@ -14,6 +14,20 @@ jobs:
         postgres-version: ['15']
     runs-on: ubuntu-latest
     services:
+      redis:
+        image: redis
+        # Set health checks to wait until redis has started
+        options: >-
+          --health-cmd "redis-cli ping"
+          --health-interval 10s
+          --health-timeout 5s
+          --health-retries 5
+        ports:
+          - 6379:6379
+      rabbitmq:
+        image: rabbitmq:latest
+        ports:
+          - 5672:5672
       postgres:
         image: postgres:${{ matrix.postgres-version }}
         env:
@@ -54,7 +68,9 @@ jobs:
         run: poetry run python -Werror manage.py test
         env:
           DEBUG: 1
+          AMQP_BROKER_URL: "amqp://guest:guest@localhost:5672"
           POSTGRES_HOST: localhost
           POSTGRES_DB: gravyvalettest
           POSTGRES_USER: postgres
           SECRET_KEY: oh-so-secret
+          REDIS_HOST: redis://localhost:6379
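The workflow change above hands the test run two service URLs through `AMQP_BROKER_URL` and `REDIS_HOST`. How the project's settings consume them is not shown in this diff, so the following is only a minimal sketch: the helper name `read_service_endpoints` and the fallback defaults are hypothetical, and only the two variable names and their CI values come from the workflow.

```python
import os
from urllib.parse import urlsplit


def read_service_endpoints(environ=None) -> dict[str, tuple[str, int]]:
    """Return (host, port) for the broker and redis URLs (hypothetical helper).

    `environ` defaults to os.environ; the fallback URLs mirror the CI values
    and are an assumption, not project defaults.
    """
    env = os.environ if environ is None else environ
    amqp = urlsplit(env.get("AMQP_BROKER_URL", "amqp://guest:guest@localhost:5672"))
    redis = urlsplit(env.get("REDIS_HOST", "redis://localhost:6379"))
    return {
        "amqp": (amqp.hostname, amqp.port),
        "redis": (redis.hostname, redis.port),
    }
```

Note that `REDIS_HOST` carries a full `redis://` URL rather than a bare hostname, so a consumer would need to parse it as a URL, as the sketch does.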
Empty file added addon_imps/link/__init__.py
Empty file.
260 changes: 260 additions & 0 deletions addon_imps/link/dataverse.py
@@ -0,0 +1,260 @@
from __future__ import annotations

import asyncio
import re
from dataclasses import dataclass
from http import HTTPStatus
from typing import Literal

from django.core.exceptions import ValidationError

from addon_toolkit.interfaces.link import (
    ItemResult,
    ItemSampleResult,
    ItemType,
    LinkAddonHttpRequestorImp,
)


DATAVERSE_REGEX = re.compile(r"^dataverse/(?P<id>\d*)$")
DATASET_REGEX = re.compile(r"^dataset/(?P<persistent_id>.*)$")
FILE_REGEX = re.compile(r"^file/(?P<persistent_id>.*)$")


@dataclass
class DataverseLinkImp(LinkAddonHttpRequestorImp):
    """storage on dataverse

    see https://guides.dataverse.org/en/latest/api/native-api.html
    """

    async def build_url_for_id(self, item_id: str) -> str:
        if match := DATASET_REGEX.match(item_id):
            entity_type = "dataset"
        elif match := FILE_REGEX.match(item_id):
            entity_type = "file"
        else:
            raise ValidationError(f"Invalid {item_id=}")

        persistent_id = match["persistent_id"]

        return self._make_url(entity_type, persistent_id)

    def _make_url(self, entity_type: Literal["file", "dataset"], persistent_id):
        return f"{self.config.external_web_url}/{entity_type}.xhtml?persistentId={persistent_id}"

    async def get_external_account_id(self, _: dict[str, str]) -> str:
        try:
            async with self.network.GET("api/v1/users/:me") as response:
                if not response.http_status.is_success:
                    raise ValidationError(
                        "Could not get dataverse account id, check your API Token"
                    )
                content = await response.json_content()
                return content.get("data", {}).get("id")
        except ValueError as exc:
            if "relative url may not alter the base url" in str(exc).lower():
                raise ValidationError(
                    "Invalid host URL. Please check your Dataverse base URL."
                )
            raise

    async def list_root_items(self, page_cursor: str = "") -> ItemSampleResult:
        async with self.network.GET(
            "api/mydata/retrieve",
            query=[
                ["selected_page", page_cursor],
                *[("role_ids", role) for role in range(1, 9)],
                (
                    "dvobject_types",
                    "Dataverse",
                ),  # only published dataverses may contain published datasets
                ("published_states", "Published"),
            ],
        ) as response:
            content = await response.json_content()
            if resp_data := content.get("data"):
                return parse_mydata(resp_data)
            return ItemSampleResult(items=[], total_count=0)

    async def get_item_info(self, item_id: str) -> ItemResult:
        if not item_id:
            return ItemResult(item_id="", item_name="", item_type=ItemType.FOLDER)
        elif match := DATAVERSE_REGEX.match(item_id):
            entity = await self._fetch_dataverse(match["id"])
        elif match := DATASET_REGEX.match(item_id):
            # DATASET_REGEX only defines the `persistent_id` group, so
            # match["id"] would raise IndexError; look up by persistent id.
            entity = await self._fetch_dataset(persistent_id=match["persistent_id"])
        elif match := FILE_REGEX.match(item_id):
            entity = await self._fetch_file(match["persistent_id"])
        else:
            raise ValueError(f"Invalid item id: {item_id}")

        return entity

    async def list_child_items(
        self,
        item_id: str,
        page_cursor: str = "",
        item_type: ItemType | None = None,
    ) -> ItemSampleResult:
        if not item_id:
            return await self.list_root_items(page_cursor)
        elif match := DATAVERSE_REGEX.match(item_id):
            items = await self._fetch_dataverse_items(match["id"])
            return ItemSampleResult(
                items=items,
                total_count=len(items),
            )
        elif match := DATASET_REGEX.match(item_id):
            items = await self._fetch_dataset_files(
                persistent_id=match["persistent_id"]
            )
            return ItemSampleResult(
                items=items,
                total_count=len(items),
            )
        else:
            return ItemSampleResult(items=[], total_count=0)

    async def _fetch_dataverse_items(self, dataverse_id) -> list[ItemResult]:
        async with self.network.GET(
            f"api/dataverses/{dataverse_id}/contents"
        ) as response:
            response_content = await response.json_content()
            items = await asyncio.gather(
                *[
                    self._get_dataverse_or_dataset_item(item)
                    for item in response_content["data"]
                ]
            )
            return [item for item in items if item]

    async def _get_dataverse_or_dataset_item(self, item: dict):
        match item["type"]:
            case "dataset":
                return await self._fetch_dataset(dataset_id=item["id"])
            case "dataverse":
                return parse_dataverse_as_subitem(item)
        raise ValueError(f"Invalid item type: {item['type']}")

    async def _fetch_file(self, dataverse_id) -> ItemResult:
        async with self.network.GET(
            "api/files/:persistentId", query={"persistentId": dataverse_id}
        ) as response:
            return self._parse_datafile(await response.json_content())

    async def _fetch_dataverse(self, dataverse_id) -> ItemResult:
        async with self.network.GET(f"api/dataverses/{dataverse_id}") as response:
            return parse_dataverse(await response.json_content())

    async def _fetch_dataset_with_parser(
        self,
        dataset_id: str = None,
        persistent_id: str = None,
        parser=None,
    ) -> ItemResult | list[ItemResult] | None:
        url = f"api/datasets/{':persistentId' if persistent_id else dataset_id}/versions/:latest-published"
        query = {"persistentId": persistent_id} if persistent_id else {}
        async with self.network.GET(url, query=query) as response:
            if response.http_status == HTTPStatus.NOT_FOUND:
                return None
            return parser(await response.json_content())

    async def _fetch_dataset(
        self, dataset_id: str = None, persistent_id: str = None
    ) -> ItemResult:
        return await self._fetch_dataset_with_parser(
            dataset_id, persistent_id, parser=self._parse_dataset
        )

    async def _fetch_dataset_files(
        self, dataset_id: str = None, persistent_id: str = None
    ) -> list[ItemResult]:
        return await self._fetch_dataset_with_parser(
            dataset_id, persistent_id, parser=self._parse_dataset_files
        )

    def _parse_datafile(self, data: dict):
        if data.get("data"):
            data = data["data"]

        doi = data["dataFile"]["persistentId"]
        return ItemResult(
            item_id=f"file/{doi}",
            item_name=data["label"],
            item_type=ItemType.RESOURCE,
            item_link=self._make_url("file", doi),
            doi=doi,
        )

    def _parse_dataset_files(self, data: dict) -> list[ItemResult]:
        if data.get("data"):
            data = data["data"]
        try:
            return [self._parse_datafile(file) for file in data["files"]]
        except (KeyError, IndexError) as e:
            raise ValueError(f"Invalid dataset response:{e=}")

    def _parse_dataset(self, data: dict) -> ItemResult:
        if data.get("data"):
            data = data["data"]
        try:
            doi = data["datasetPersistentId"]
            return ItemResult(
                item_id=f"dataset/{doi}",
                item_name=[
                    item
                    for item in data["metadataBlocks"]["citation"]["fields"]
                    if item["typeName"] == "title"
                ][0]["value"],
                item_type=ItemType.FOLDER,
                item_link=self._make_url("dataset", doi),
                doi=doi,
            )
        except (KeyError, IndexError) as e:
            raise ValueError(f"Invalid dataset response: {e=}")


###
# module-local helpers


def parse_dataverse_as_subitem(data: dict):
    return ItemResult(
        item_type=ItemType.FOLDER,
        item_name=data["title"],
        item_id=f'dataverse/{data["id"]}',
    )


def parse_dataverse(data: dict):
    if data.get("data"):
        data = data["data"]
    return ItemResult(
        item_type=ItemType.FOLDER,
        item_name=data["name"],
        item_id=f'dataverse/{data["id"]}',
    )


def parse_mydata(data: dict):
    if data.get("data"):
        data = data["data"]
    return ItemSampleResult(
        items=[
            ItemResult(
                item_id=f"dataverse/{file['entity_id']}",
                item_name=file["name"],
                item_type=ItemType.FOLDER,
            )
            for file in data["items"]
        ],
        total_count=data["total_count"],
        next_sample_cursor=(
            data["pagination"]["nextPageNumber"]
            if data["pagination"]["hasNextPageNumber"]
            else None
        ),
    )
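The item-id scheme used throughout `dataverse.py` (`dataverse/<id>`, `dataset/<persistent_id>`, `file/<persistent_id>`, or an empty string for the root) can be exercised without any network access. The sketch below copies the three regexes from the file and routes an id to its kind. The functions `classify_item_id` and `make_web_url` and the example host are hypothetical names added for illustration; they are not part of the PR.

```python
import re

# Copied from addon_imps/link/dataverse.py
DATAVERSE_REGEX = re.compile(r"^dataverse/(?P<id>\d*)$")
DATASET_REGEX = re.compile(r"^dataset/(?P<persistent_id>.*)$")
FILE_REGEX = re.compile(r"^file/(?P<persistent_id>.*)$")


def classify_item_id(item_id: str) -> tuple[str, str]:
    """Return (kind, identifier) for an item id (hypothetical helper)."""
    if not item_id:
        return ("root", "")
    if match := DATAVERSE_REGEX.match(item_id):
        return ("dataverse", match["id"])
    if match := DATASET_REGEX.match(item_id):
        # Only the `persistent_id` group exists on this pattern.
        return ("dataset", match["persistent_id"])
    if match := FILE_REGEX.match(item_id):
        return ("file", match["persistent_id"])
    raise ValueError(f"Invalid item id: {item_id}")


def make_web_url(external_web_url: str, kind: str, persistent_id: str) -> str:
    """Mirror the URL shape built by `_make_url` (hypothetical standalone form)."""
    return f"{external_web_url}/{kind}.xhtml?persistentId={persistent_id}"


kind, pid = classify_item_id("dataset/doi:10.5072/FK2/EXAMPLE")
url = make_web_url("https://dataverse.example.org", kind, pid)
```

Because `DATASET_REGEX` and `FILE_REGEX` define only a `persistent_id` group, indexing such a match with `"id"` raises `IndexError`, which is worth keeping in mind when reading `get_item_info`.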
36 changes: 35 additions & 1 deletion addon_service/addon_imp/instantiation.py
@@ -17,6 +17,12 @@
     ComputingAddonImp,
     ComputingConfig,
 )
+from addon_toolkit.interfaces.link import (
+    LinkAddonClientRequestorImp,
+    LinkAddonHttpRequestorImp,
+    LinkAddonImp,
+    LinkConfig,
+)
 from addon_toolkit.interfaces.storage import (
     StorageAddonClientRequestorImp,
     StorageAddonHttpRequestorImp,
@@ -26,6 +32,7 @@


 if TYPE_CHECKING:
+    from addon_service.authorized_account.link.models import AuthorizedLinkAccount
     from addon_service.authorized_account.models import AuthorizedAccount
     from addon_service.models import (
         AuthorizedCitationAccount,
@@ -37,14 +44,16 @@
 async def get_addon_instance(
     imp_cls: type[AddonImp],
     account: AuthorizedAccount,
-    config: StorageConfig | CitationConfig | ComputingConfig,
+    config: StorageConfig | CitationConfig | ComputingConfig | LinkConfig,
 ) -> AddonImp:
     if issubclass(imp_cls, StorageAddonImp):
         return await get_storage_addon_instance(imp_cls, account, config)
     elif issubclass(imp_cls, CitationAddonImp):
         return await get_citation_addon_instance(imp_cls, account, config)
     elif issubclass(imp_cls, ComputingAddonImp):
         return await get_computing_addon_instance(imp_cls, account, config)
+    elif issubclass(imp_cls, LinkAddonImp):
+        return await get_link_addon_instance(imp_cls, account, config)
     raise ValueError(f"unknown addon type {imp_cls}")


@@ -134,3 +143,28 @@ async def get_computing_addon_instance(


 get_computing_addon_instance__blocking = async_to_sync(get_computing_addon_instance)
+
+
+async def get_link_addon_instance(
+    imp_cls: type[LinkAddonImp], account: AuthorizedLinkAccount, config: LinkConfig
+) -> LinkAddonImp:
+    """create an instance of a `LinkAddonImp`"""
+
+    assert issubclass(imp_cls, AddonImp)
+    assert imp_cls is not LinkAddonImp, "Addons shouldn't directly extend LinkAddonImp"
+    if issubclass(imp_cls, LinkAddonHttpRequestorImp):
+        imp = imp_cls(
+            network=GravyvaletHttpRequestor(
+                client_session=await get_singleton_client_session(),
+                prefix_url=config.external_api_url,
+                account=account,
+            ),
+            config=config,
+        )
+    if issubclass(imp_cls, LinkAddonClientRequestorImp):
+        imp = imp_cls(credentials=await account.get_credentials__async(), config=config)
+
+    return imp
+
+
+get_link_addon_instance__blocking = async_to_sync(get_link_addon_instance)
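`get_link_addon_instance` picks a constructor by checking which requestor base class the imp extends. If a class matches neither branch, `imp` is never assigned and `return imp` would fail with `UnboundLocalError`. The sketch below shows the same dispatch with an explicit error for that case. All class and function names in it (`LinkImpBase`, `HttpRequestorImp`, `ClientRequestorImp`, `build_link_imp`) are simplified stand-ins invented for illustration, not the toolkit's real types.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class LinkImpBase:
    """Stand-in for the shared link-addon base (hypothetical)."""
    config: Any = None


@dataclass
class HttpRequestorImp(LinkImpBase):
    """Stand-in for imps that talk to the service over HTTP."""
    network: Any = None


@dataclass
class ClientRequestorImp(LinkImpBase):
    """Stand-in for imps that use a client library with credentials."""
    credentials: Any = None


def build_link_imp(imp_cls, *, network, credentials, config):
    """Instantiate imp_cls according to its requestor base class."""
    if issubclass(imp_cls, HttpRequestorImp):
        return imp_cls(network=network, config=config)
    if issubclass(imp_cls, ClientRequestorImp):
        return imp_cls(credentials=credentials, config=config)
    # Fail loudly instead of returning an unbound local.
    raise ValueError(f"unsupported link addon type {imp_cls}")


class ExampleHttpImp(HttpRequestorImp):
    """A concrete imp, analogous in role to DataverseLinkImp."""


imp = build_link_imp(ExampleHttpImp, network="requestor", credentials=None, config="cfg")
```

Returning from each branch also makes it obvious that exactly one constructor runs per call.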
6 changes: 2 additions & 4 deletions addon_service/addon_operation_invocation/serializers.py
@@ -9,6 +9,7 @@
 )
 from addon_service.common import view_names
 from addon_service.common.enum_serializers import EnumNameChoiceField
+from addon_service.common.get_user_uri import get_user_uri
 from addon_service.common.invocation_status import InvocationStatus
 from addon_service.common.serializer_fields import (
     CustomPolymorphicResourceRelatedField,
@@ -113,10 +114,7 @@ def create(self, validated_data):
         _imp_cls = _thru_account.imp_cls
         _operation = _imp_cls.get_operation_declaration(_operation_name)
         _request = self.context["request"]
-        _user_uri = (
-            _request.session.get("user_reference_uri")
-            or f"{settings.OSF_BASE_URL}/anonymous"
-        )
+        _user_uri = get_user_uri(_request) or f"{settings.OSF_BASE_URL}/anonymous"
         _user, _ = UserReference.objects.get_or_create(user_uri=_user_uri)
         return AddonOperationInvocation(
             operation=AddonOperationModel(_imp_cls.ADDON_INTERFACE, _operation),