Open
Description
Issue checklist
- This is a feature request/enhancement, not a bug.
- I searched through the GitHub issues and this feature/enhancement has not been requested before.
- I have installed the latest version of Foundry DevTools and don't use an unsupported Python version.
- Others could also benefit from this feature or enhancement and it is not a very specific use case.
Feature use case
Implementing transforms.external.systems in Foundry DevTools would allow me to:
- Improve my code structure: To follow software engineering best practices, I typically split my external transforms into smaller components (e.g. Python classes) that I can unit test. At the moment this is very tedious, because all of that logic has to live outside the module that contains the external transform: importing that module raises an error, since transforms.external.systems is not implemented.
- Develop external transforms faster: Being able to run external transforms locally would speed up development substantially, as I would not have to wait for checks to run, etc.
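To illustrate the import problem from the first point, here is a minimal sketch of a test-only workaround: registering a placeholder transforms.external.systems module before the transform module is imported. The helper name install_external_systems_stub and the stub classes are hypothetical and are not part of Foundry DevTools; the stub only makes the import succeed and does not fetch credentials or apply egress policies.

```python
import importlib
import sys
import types


class EgressPolicy:
    """Placeholder that only stores the policy RID (assumption, not the real class)."""

    def __init__(self, rid):
        self.rid = rid


class Credential:
    """Placeholder that only stores the credential RID (assumption, not the real class)."""

    def __init__(self, rid):
        self.rid = rid


def use_external_systems(**sources):
    """No-op decorator: accepts the declared sources and returns the target unchanged."""

    def decorator(target):
        return target

    return decorator


def _get_or_create_package(name):
    """Import the package if it exists, otherwise register an empty stand-in."""
    try:
        return importlib.import_module(name)
    except ImportError:
        module = types.ModuleType(name)
        module.__path__ = []  # mark as a package so submodules can be registered
        sys.modules[name] = module
        return module


def install_external_systems_stub():
    """Hypothetical test helper: make `transforms.external.systems` importable."""
    parent = _get_or_create_package("transforms")
    external = _get_or_create_package("transforms.external")
    systems = types.ModuleType("transforms.external.systems")
    systems.EgressPolicy = EgressPolicy
    systems.Credential = Credential
    systems.use_external_systems = use_external_systems
    sys.modules["transforms.external.systems"] = systems
    # Expose the submodules as attributes, as a normal import would.
    external.systems = systems
    if not hasattr(parent, "external"):
        parent.external = external
    return systems
```

Calling install_external_systems_stub() at the top of a test module (or in a conftest.py) before importing the transform would let unit tests reach classes defined next to the transform. This remains a workaround for testing only; it does not run the transform.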
Description of the Feature
The feature should allow me to execute external transforms locally. For this, it should also fetch the credentials that I have configured for the external transform.
Example transform:
from transforms.api import Output, transform
from transforms.external.systems import (
    Credential,
    EgressPolicy,
    use_external_systems,
)

from myproject.ingest import MyExternalConnector


@use_external_systems(
    egress_mdigital=EgressPolicy(
        "ri.resource-policy-manager.global.network-egress-policy.123"
    ),
    egress_oauth=EgressPolicy(
        "ri.resource-policy-manager.global.network-egress-policy.123"
    ),
    creds=Credential("ri.credential..credential.123"),
)
@transform(
    raw_dataset=Output("ri.foundry.main.dataset.143"),
)
def compute(
    egress_mdigital, egress_oauth, creds, raw_dataset: Output, ctx
):
    client_id = creds.get("datasource_client_id")
    client_secret = creds.get("datasource_client_secret")
    connector = MyExternalConnector(
        client_id=client_id,
        client_secret=client_secret,
    )
    pdf_data = connector.download_data()
    df_data = ctx.spark_session.createDataFrame(pdf_data)
    raw_dataset.write_dataframe(df_data)
Example local execution:
from myproject.datasets import my_external_transform
df = my_external_transform.compute.compute()
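As a rough idea of what a local implementation could do, the sketch below injects the declared sources into the compute function as keyword arguments and resolves credential values from environment variables. Everything here is an assumption for illustration: the names LocalCredential and use_external_systems_locally, and the FDT_CRED_ environment variable prefix, are hypothetical and do not describe how Foundry DevTools or Foundry itself resolves credentials.

```python
import functools
import os


class LocalCredential:
    """Hypothetical local stand-in for Credential that reads secrets from env vars."""

    def __init__(self, rid, env_prefix="FDT_CRED_"):
        self.rid = rid
        self._env_prefix = env_prefix  # assumed naming scheme, not a real convention

    def get(self, name):
        # e.g. "datasource_client_id" -> "FDT_CRED_DATASOURCE_CLIENT_ID"
        key = self._env_prefix + name.upper()
        try:
            return os.environ[key]
        except KeyError:
            raise KeyError(f"Set {key} to run this transform locally") from None


def use_external_systems_locally(**sources):
    """Hypothetical decorator: pass the declared sources to the function as kwargs."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Explicitly passed keyword arguments override the declared sources.
            return func(*args, **{**sources, **kwargs})

        return wrapper

    return decorator


@use_external_systems_locally(
    creds=LocalCredential("ri.credential..credential.123"),
)
def read_client_id(creds):
    # Mirrors the creds.get(...) call in the example transform above.
    return creds.get("datasource_client_id")
```

With FDT_CRED_DATASOURCE_CLIENT_ID set in the environment, read_client_id() returns its value. A real implementation would presumably fetch the configured credentials from Foundry rather than from the environment, which is what this request asks for.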
Alternatives you considered
No response
Additional Context
No response
Waschenbacher