2 changes: 1 addition & 1 deletion docs/source/developer_guide/metadata.md
@@ -279,4 +279,4 @@ Explicit migration centers around the schemaspace from which instances are being

When called, `SchemasProvider.migrate()` enumerates instances of the given schema, updates those instances (persisting changes as necessary), and returns the list of migrated instances to the caller (i.e., the `migrate()` method on `Schemaspace`). (The code for the migration of `component-registry` instances to the `component-catalogs` schemaspace can be found [here](https://github.com/elyra-ai/elyra/blob/05bbdf22fa25b0a65f72c9054337f32fe5fde460/elyra/metadata/schemasproviders.py#L144-L212).)
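The contract described above can be sketched as follows (an illustrative stand-in, not Elyra's actual implementation; the class, instance shape, and migration step are hypothetical — only the `migrate()` responsibilities mirror the text):

```python
from typing import Any, Dict, List


class DemoSchemasProvider:
    """Illustrative provider holding persisted instances keyed by name."""

    def __init__(self, instances: Dict[str, Dict[str, Any]]) -> None:
        self.instances = instances

    def migrate(self, schema_name: str) -> List[str]:
        """Enumerate instances of the given schema, update each one as
        necessary, and return the names of the migrated instances."""
        migrated: List[str] = []
        for name, instance in self.instances.items():
            if instance.get("schema_name") != schema_name:
                continue  # not an instance of the schema being migrated
            if "runtime_type" not in instance:  # hypothetical migration step
                instance["runtime_type"] = "KUBEFLOW_PIPELINES"
                migrated.append(name)  # persistence would happen here
        return migrated
```

Calling `migrate()` a second time returns an empty list, since all instances have already been updated.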

To drive explicit migration, the `elyra-metadata` CLI tool has been updated with a `migrate` command - which acts on the _pre-migration_ schemaspace. This option essentially calls the `migrate()` method on the given `Schemaspace` instance - which then invokes the appropriate `SchemasProvider` to migrate its schema instances. See [Migrating user-defined component registries to 3.3](../user_guide/pipeline-components.html#migrating-user-defined-component-registries-to-3-3) for details.
33 changes: 27 additions & 6 deletions docs/source/user_guide/runtime-conf.md
@@ -346,17 +346,17 @@ Specify `access_key_id` and `secret_access_key` as `cos_username` and `cos_passw

##### Cloud Object Storage Authentication Type (cos_auth_type)

Authentication type Elyra uses to gain access to Cloud Object Storage. This setting is required. Supported types are:
Authentication type Elyra uses to gain access to S3-compatible Cloud Object Storage. This setting is required. Supported types are:
- Username and password (`USER_CREDENTIALS`). This authentication type requires a username and password. Caution: this authentication mechanism exposes the credentials in plain text. When running Elyra on Kubernetes, it is highly recommended to use the `KUBERNETES_SECRET` authentication type instead.
- Username, password, and Kubernetes secret (`KUBERNETES_SECRET`). This authentication type requires a username, password, and the name of an existing Kubernetes secret in the target runtime environment. Refer to section [Cloud Object Storage Credentials Secret](#cloud-object-storage-credentials-secret) for details.
- Kubernetes secret (`KUBERNETES_SECRET`). This authentication type requires the name of an existing Kubernetes secret in both the target runtime environment (namespace) and the namespace in which the Elyra JupyterLab environment runs. When running in the context of Elyra and JupyterLab, the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set. Refer to section [Cloud Object Storage Credentials Secret](#cloud-object-storage-credentials-secret) for details.
- IAM roles for service accounts (`AWS_IAM_ROLES_FOR_SERVICE_ACCOUNTS`). Supported for AWS only. Refer to the [AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) for details.
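For the `KUBERNETES_SECRET` type, the Elyra-side requirement boils down to both AWS credential variables being present and non-empty. This can be checked with a small helper (a sketch; the helper name is illustrative, not part of Elyra):

```python
import os


def have_cos_env_credentials() -> bool:
    """Return True when AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are
    both set to non-empty values in the current environment."""
    return all(
        len(os.environ.get(var, "").strip()) > 0
        for var in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    )
```

Note that a value consisting only of whitespace counts as missing, matching the `.strip()` checks in the validation code.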

##### Cloud Object Storage Credentials Secret (cos_secret)

Kubernetes secret that's defined in the specified user namespace, containing the Cloud Object Storage username and password.
If specified, this secret must exist on the Kubernetes cluster hosting your pipeline runtime in order to successfully
execute pipelines. This setting is optional but is recommended for use in shared environments to avoid exposing a user's
Cloud Object Storage credentials.
Cloud Object Storage credentials. This setting additionally requires the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be set for use within Elyra itself.

Example: `my-cos-secret`

@@ -375,19 +375,40 @@ data:
AWS_SECRET_ACCESS_KEY: <BASE64_ENCODED_YOUR_AWS_SECRET_ACCESS_KEY>
```
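The base64-encoded values in the secret above can be produced with a short script, for example (a sketch; the function name is illustrative, while the key names match the secret shown):

```python
import base64


def cos_secret_manifest(name: str, access_key_id: str, secret_access_key: str) -> dict:
    """Build an Opaque Kubernetes secret manifest carrying base64-encoded
    Cloud Object Storage credentials under the AWS_* key names."""
    def b64(value: str) -> str:
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            "AWS_ACCESS_KEY_ID": b64(access_key_id),
            "AWS_SECRET_ACCESS_KEY": b64(secret_access_key),
        },
    }
```

Serializing the returned dictionary to YAML (e.g., with `yaml.safe_dump`) yields a manifest ready for `kubectl apply`.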

This secret must be present in the target runtime environment namespace as well as in every namespace in which Elyra itself runs (e.g., the namespace of a Kubeflow notebook running Elyra).
An operations-friendly way to make the keys from the Kubernetes secret available to the notebook container is via `envFrom`:
https://kubernetes.io/docs/tasks/inject-data-application/distribute-credentials-secure/#configure-all-key-value-pairs-in-a-secret-as-container-environment-variables
This exposes all keys of the secret as environment variables in the Kubeflow Notebook container.

```yaml
kind: Notebook
...
spec:
  ...
  containers:
    - name: jupyterlab
      image: <elyra-kubeflow-jupyterlab-image>
      envFrom:
        - secretRef:
            name: cos-secret
      ...
```
In Kubeflow notebooks, you can also inject environment variables from Kubernetes secrets (`envFrom`) via `PodDefault` specs:
https://v0-7.kubeflow.org/docs/notebooks/setup/
In Open Data Hub, you either have to supply the two environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY manually via the Dashboard Workbench GUI (best stored in a secret), or patch the notebook container spec with the `envFrom` information manually.
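A `PodDefault` that injects the secret via `envFrom` might look like the following (a sketch, assuming your Kubeflow version's `PodDefault` supports `envFrom`; the resource name, namespace placeholder, and selector label are illustrative):

```yaml
apiVersion: kubeflow.org/v1alpha1
kind: PodDefault
metadata:
  name: cos-secret-env
  namespace: <your-user-namespace>
spec:
  desc: Inject Cloud Object Storage credentials from the cos-secret secret
  selector:
    matchLabels:
      cos-secret-env: "true"
  envFrom:
    - secretRef:
        name: cos-secret
```

Notebook pods labeled `cos-secret-env: "true"` would then receive all keys of `cos-secret` as environment variables.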

##### Cloud Object Storage username (cos_username)

Username used to connect to Object Storage, if credentials are required for the selected authentication type.
Username used to connect to Object Storage, if user credentials are required for the selected authentication type. This field can be left empty for the `KUBERNETES_SECRET` authentication type.

Example: `minio`

##### Cloud Object Storage password (cos_password)

Password for cos_username, if credentials are required for the selected authentication type.
Password for `cos_username`, if user credentials are required for the selected authentication type. This field can be left empty for the `KUBERNETES_SECRET` authentication type.

Example: `minio123`


### Verifying runtime configurations

The [Elyra examples repository contains a basic pipeline](https://github.com/elyra-ai/examples/pipelines/setup_validation) that you can use to verify your runtime configurations:
16 changes: 6 additions & 10 deletions elyra/pipeline/airflow/airflow_metadata.py
@@ -14,6 +14,7 @@
# limitations under the License.
#

import os
from typing import Any

from elyra.metadata.manager import MetadataManager
@@ -29,25 +29,20 @@
def on_load(self, **kwargs: Any) -> None:
super().on_load(**kwargs)

update_required = False

if self.metadata.get("git_type") is None:
# Inject git_type property for metadata persisted using Elyra < 3.5:
self.metadata["git_type"] = SupportedGitTypes.GITHUB.name
update_required = True

if self.metadata.get("cos_auth_type") is None:
# Inject cos_auth_type property for metadata persisted using Elyra < 3.4:
# - cos_username and cos_password must be present
# - cos_secret may be present (above statement also applies in this case)
if self.metadata.get("cos_username") and self.metadata.get("cos_password"):
if len(self.metadata.get("cos_secret", "")) == 0:
if len(self.metadata.get("cos_secret", "").strip()) == 0:

self.metadata["cos_auth_type"] = "USER_CREDENTIALS"
else:
self.metadata["cos_auth_type"] = "KUBERNETES_SECRET"
update_required = True

if update_required:
# save changes
MetadataManager(schemaspace="runtimes").update(self.name, self, for_migration=True)

@@ -79,12 +75,12 @@
)
elif self.metadata["cos_auth_type"] == "KUBERNETES_SECRET":
if (
len(self.metadata.get("cos_username", "").strip()) == 0
or len(self.metadata.get("cos_password", "").strip()) == 0
or len(self.metadata.get("cos_secret", "").strip()) == 0
len(self.metadata.get("cos_secret", "").strip()) == 0
or "AWS_ACCESS_KEY_ID" not in os.environ
or "AWS_SECRET_ACCESS_KEY" not in os.environ
):
raise ValueError(
"Username, password, and Kubernetes secret are required "
"Environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and the Kubernetes secret name, are required "
"for the selected Object Storage authentication type."
)
elif self.metadata["cos_auth_type"] == "AWS_IAM_ROLES_FOR_SERVICE_ACCOUNTS":
9 changes: 5 additions & 4 deletions elyra/pipeline/kfp/kfp_metadata.py
@@ -14,6 +14,7 @@
# limitations under the License.
#

import os
from typing import Any

from elyra.metadata.manager import MetadataManager
@@ -131,12 +132,12 @@ def pre_save(self, **kwargs: Any) -> None:
)
elif self.metadata["cos_auth_type"] == "KUBERNETES_SECRET":
if (
len(self.metadata.get("cos_username", "").strip()) == 0
or len(self.metadata.get("cos_password", "").strip()) == 0
or len(self.metadata.get("cos_secret", "").strip()) == 0
len(self.metadata.get("cos_secret", "").strip()) == 0
or "AWS_ACCESS_KEY_ID" not in os.environ
or "AWS_SECRET_ACCESS_KEY" not in os.environ
):
raise ValueError(
"Username, password, and Kubernetes secret are required "
"Environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and the Kubernetes secret name, are required "
"for the selected Object Storage authentication type."
)
elif self.metadata["cos_auth_type"] == "AWS_IAM_ROLES_FOR_SERVICE_ACCOUNTS":
@@ -502,7 +502,7 @@ def test_collect_envs(processor):
assert "USER_NO_VALUE" not in envs

# Repeat with non-None secret - ensure user and password envs are not present, but others are
envs = processor._collect_envs(test_operation, cos_secret="secret", cos_username="Alice", cos_password="secret")
envs = processor._collect_envs(test_operation, cos_secret="secret")

assert envs["ELYRA_RUNTIME_ENV"] == "airflow"
assert "AWS_ACCESS_KEY_ID" not in envs
2 changes: 0 additions & 2 deletions elyra/tests/pipeline/kfp/conftest.py
@@ -257,8 +257,6 @@ def create_runtime_config(rt_metadata_manager: MetadataManager, customization_op

if customization_options.get("use_cos_credentials_secret"):
kfp_runtime_config["metadata"]["cos_auth_type"] = "KUBERNETES_SECRET"
kfp_runtime_config["metadata"]["cos_username"] = "my_name"
kfp_runtime_config["metadata"]["cos_password"] = "my_password"
kfp_runtime_config["metadata"]["cos_secret"] = "secret-name"
else:
kfp_runtime_config["metadata"]["cos_auth_type"] = "USER_CREDENTIALS"
2 changes: 1 addition & 1 deletion elyra/tests/pipeline/test_processor.py
@@ -190,7 +190,7 @@ def test_collect_envs(processor: KfpPipelineProcessor):
assert "USER_NO_VALUE" not in envs

# Repeat with non-None secret - ensure user and password envs are not present, but others are
envs = processor._collect_envs(test_operation, cos_secret="secret", cos_username="Alice", cos_password="secret")
envs = processor._collect_envs(test_operation, cos_secret="secret")

assert envs["ELYRA_RUNTIME_ENV"] == "kfp"
assert "AWS_ACCESS_KEY_ID" not in envs
20 changes: 15 additions & 5 deletions elyra/util/cos.py
@@ -26,7 +26,7 @@

class CosClient(LoggingConfigurable):
"""
MinIO-based Object Storage client, enabling Elyra to upload and download
MinIO-based S3-compatible Object Storage client, enabling Elyra to upload and download
files. This client is configurable via traitlets.
"""

@@ -46,7 +46,7 @@
or len(os.environ.get("AWS_SECRET_ACCESS_KEY", "").strip()) == 0
):
raise RuntimeError(
"Cannot connect to object storage. No credentials "
"Cannot connect to S3-compatible object storage. No credentials "
" were provided and environment variables "
" AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not "
" properly defined."
@@ -65,22 +65,32 @@
auth_type = config.metadata["cos_auth_type"]
self.endpoint = urlparse(config.metadata["cos_endpoint"])
self.bucket = config.metadata["cos_bucket"]
if auth_type in ["USER_CREDENTIALS", "KUBERNETES_SECRET"]:
if auth_type == "USER_CREDENTIALS":

cred_provider = providers.StaticProvider(
access_key=config.metadata["cos_username"],
secret_key=config.metadata["cos_password"],
)
elif auth_type == "KUBERNETES_SECRET":
if "AWS_ACCESS_KEY_ID" in os.environ and "AWS_SECRET_ACCESS_KEY" in os.environ:
cred_provider = providers.EnvAWSProvider()

else:
raise RuntimeError(

"Cannot connect to S3-compatible object storage. No credentials "
" were provided and environment variables "
" AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not "
" properly defined."
)
elif auth_type == "AWS_IAM_ROLES_FOR_SERVICE_ACCOUNTS":
if os.environ.get("AWS_ROLE_ARN") is None or os.environ.get("AWS_WEB_IDENTITY_TOKEN_FILE") is None:
raise RuntimeError(
"Cannot connect to object storage. "
"Cannot connect to S3-compatible object storage. "
f"Authentication provider '{auth_type}' requires "
"environment variables AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE."
)
# Verify that AWS_WEB_IDENTITY_TOKEN_FILE exists
if Path(os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"]).is_file() is False:
raise RuntimeError(
"Cannot connect to object storage. The value of environment "
"Cannot connect to S3-compatible object storage. The value of environment "
"variable AWS_WEB_IDENTITY_TOKEN_FILE references "
f"'{os.environ['AWS_WEB_IDENTITY_TOKEN_FILE']}', which is not a valid file."
)