Skip to content

WIP: Test CI#12

Closed
mprahl wants to merge 2 commits into
opendatahub-io:orgnization-supportfrom
mprahl:multi-tenancy-backend-only
Closed

WIP: Test CI#12
mprahl wants to merge 2 commits into
opendatahub-io:orgnization-supportfrom
mprahl:multi-tenancy-backend-only

Conversation

@mprahl

@mprahl mprahl commented Nov 21, 2025

Copy link
Copy Markdown

Summary by CodeRabbit

  • New Features

    • Added workspace (multi-tenancy) support for logical isolation of MLflow resources within shared infrastructure.
    • Introduced workspace management APIs: create, list, get, update, and delete workspaces.
    • Added workspace-scoped permissions to control resource access per workspace.
    • Introduced workspace-aware REST API endpoints and workspace-prefixed routes.
    • Added CLI flags --enable-workspaces and --workspace-store-uri to configure multi-tenant deployments.
  • Documentation

    • Added comprehensive workspace configuration, setup, and permissions documentation.
    • Added workspace architecture overview and provider guidance.

✏️ Tip: You can customize this high-level summary in your review settings.

- Introduce the workspace store abstraction (SQLAlchemy + REST), name validation, and registry/URI utilities with CLI flags and env vars.
- Expose workspace CRUD endpoints in the Flask server with proper guards/feature flag wiring plus MlflowClient surface methods.
- Add the Workspace entity, tests for the stores/registry/client, and docs entries so the new multi-tenant provider can be exercised end-to-end.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
@coderabbitai

coderabbitai Bot commented Nov 21, 2025

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request introduces comprehensive workspace (multi-tenancy) support to MLflow, enabling isolated resource groupings within a single deployment. It includes new workspace management APIs, permission controls, workspace-scoped artifact handling, database schema changes, server routing updates, and extensive test coverage.

Changes

Cohort / File(s) Summary
Workspace API & Core
mlflow/__init__.py, mlflow/tracking/client.py, mlflow/tracking/_workspace/fluent.py, mlflow/tracking/_workspace/client.py, mlflow/tracking/_workspace/context.py, mlflow/tracking/_workspace/registry.py, mlflow/tracking/_workspace/utils.py
New fluent and client-facing APIs for workspace CRUD operations with context management, provider resolution, and URI handling.
Workspace Store Implementations
mlflow/store/workspace/abstract_store.py, mlflow/store/workspace/sqlalchemy_store.py, mlflow/store/workspace/rest_store.py, mlflow/store/workspace/__init__.py, mlflow/store/workspace/utils.py
Abstract store interface, SQLAlchemy-backed provider with DB persistence, REST-backed remote provider, and utility helpers.
CLI & Server Configuration
mlflow/cli/__init__.py, mlflow/environment_variables.py, mlflow/utils/server_cli_utils.py
New CLI options --enable-workspaces and --workspace-store-uri, environment variables for workspace configuration, and validation logic.
Server-side Workspace Routing & Handlers
mlflow/server/__init__.py, mlflow/server/handlers.py, mlflow/server/workspace_helpers.py
Workspace-aware route registration with /workspaces/<name> prefixes, request lifecycle handlers, RBAC resource mapping, and endpoint aggregation.
Authentication & Permissions
mlflow/server/auth/__init__.py, mlflow/server/auth/client.py, mlflow/server/auth/config.py, mlflow/server/auth/routes.py, mlflow/server/auth/sqlalchemy_store.py, mlflow/server/auth/basic_auth.ini
Workspace permission endpoints, AuthContext, permission resolution, workspace-scoped access control, and configuration.
Auth Database Models & Migrations
mlflow/server/auth/db/models.py, mlflow/server/auth/db/utils.py, mlflow/server/auth/db/migrations/versions/2ed73881770d_workspace_permissions.py
New SqlWorkspacePermission ORM model, updated SqlRegisteredModelPermission with workspace column, Alembic migration creating workspace_permissions table.
Entity Workspace Support
mlflow/entities/experiment.py, mlflow/entities/model_registry/registered_model.py, mlflow/entities/model_registry/model_version.py, mlflow/entities/evaluation_dataset.py, mlflow/entities/webhook.py, mlflow/entities/workspace.py
Added workspace property to key entities with resolution, serialization, and deserialization support.
Store Workspace Integration
mlflow/store/workspace_aware_mixin.py, mlflow/store/tracking/abstract_store.py, mlflow/store/tracking/sqlalchemy_store.py, mlflow/store/tracking/rest_store.py, mlflow/store/model_registry/abstract_store.py, mlflow/store/model_registry/sqlalchemy_store.py, mlflow/store/model_registry/base_rest_store.py, mlflow/store/tracking/dbmodels/models.py, mlflow/store/model_registry/dbmodels/models.py
Workspace-scoped query filtering, workspace-aware entity construction, capability queries, and workspace isolation enforcement in tracking/model registry stores.
Artifact Repository Workspace Support
mlflow/store/artifact/artifact_repo.py, mlflow/store/artifact/artifact_repository_registry.py, mlflow/store/artifact/http_artifact_repo.py
Base for_workspace hook, registry-level workspace adaptation, and HTTP repo workspace-scoped URI handling and artifact path prefixing.
Tracing & OTLP Workspace Support
mlflow/server/otel_api.py, mlflow/tracing/utils/otlp.py
Workspace-scoped OTLP ingestion endpoint and path constant.
REST & Utilities
mlflow/utils/rest_utils.py, mlflow/tracking/_tracking_service/client.py, mlflow/tracking/default_experiment/registry.py, mlflow/tracking/fluent.py, mlflow/utils/workspace_utils.py, mlflow/models/model.py, mlflow/store/db/utils.py
Workspace-aware endpoint prefixing, experiment URL generation with workspace segments, default experiment workspace logic, entity workspace resolution, and DB table tracking.
Documentation
docs/sidebarsSelfHosting.ts, docs/api_reference/api_inventory.txt, docs/api_reference/source/rest-api.rst, docs/docs/self-hosting/architecture/overview.mdx, docs/docs/self-hosting/index.mdx, docs/docs/self-hosting/workspaces/\*
New multi-tenancy documentation, workspace configuration guide, getting-started tutorial, permissions documentation, workspace provider architecture, and API updates.
Workflow & Tests
.github/workflows/master.yml, tests/\*
GitHub Actions workflow expanded to include workspace SQLAlchemy store tests; extensive new test modules covering workspace endpoints, permissions, store implementations, REST integration, and isolation semantics.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Client as MlflowClient
    participant WS as WorkspaceContext
    participant Store as TrackingStore
    participant Auth as AuthStore
    participant DB as Database

    User->>Client: set_workspace("team-a")
    Client->>WS: set_current_workspace("team-a")
    WS->>WS: Validate & store workspace
    
    User->>Client: create_experiment("exp1")
    Client->>WS: get_current_workspace()
    WS-->>Client: "team-a"
    Client->>Store: create_experiment(name, workspace="team-a")
    Store->>DB: INSERT experiment (workspace='team-a')
    DB-->>Store: experiment_id=1
    Store-->>Client: Experiment(id=1, workspace='team-a')
    Client-->>User: Experiment

    rect rgb(200, 220, 255)
        Note over User,DB: Workspace-scoped isolation
        User->>WS: set_current_workspace("team-b")
        WS-->>WS: Switch context
        User->>Client: get_experiment(experiment_id=1)
        Client->>WS: get_current_workspace()
        WS-->>Client: "team-b"
        Client->>Store: get_experiment(1, workspace="team-b")
        Store->>DB: SELECT WHERE id=1 AND workspace='team-b'
        DB-->>Store: NOT FOUND
        Store-->>Client: MlflowException(RESOURCE_NOT_FOUND)
        Client-->>User: Error: Experiment not accessible
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45–90 minutes

  • Areas requiring extra attention:
    • Database migration logic (mlflow/server/auth/db/migrations/versions/2ed73881770d_workspace_permissions.py) — careful review of downgrade path and conflict detection
    • Permission resolution precedence and workspace fallback logic in mlflow/server/auth/__init__.py and mlflow/server/auth/sqlalchemy_store.py — intricate permission inheritance and validation
    • Workspace-scoped query filtering across tracking and model registry stores (mlflow/store/tracking/sqlalchemy_store.py, mlflow/store/model_registry/sqlalchemy_store.py) — ensure no data leakage between workspaces
    • Request lifecycle and routing in mlflow/server/handlers.py and mlflow/server/workspace_helpers.py — workspace context propagation and endpoint prefixing correctness
    • Artifact repository workspace adaptation and path prefixing in mlflow/store/artifact/http_artifact_repo.py — ensure artifact isolation and correct path construction

🐰 Hopping through workspaces with glee,
Tenants now isolated, finally free!
Multi-team MLflow dreams come true,
With permissions and scope, something new!
Isolation tested from A to Z,
Workspace magic—hip hip hooray! 🎉

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
Check name Status Explanation Resolution
Title check ⚠️ Warning The PR title 'WIP: Test CI' is vague and does not accurately describe the substantial changes in this PR, which involve implementing multi-tenancy with workspaces across MLflow. Revise the title to clearly describe the main objective, such as 'Add multi-tenancy workspace support to MLflow server and tracking' or 'Implement workspace-scoped resource isolation and RBAC'.
Description check ⚠️ Warning The PR description is entirely empty with no explanation of the changes, objectives, or testing approach despite the template being available and the PR containing extensive workspace multi-tenancy implementation. Fill in the PR description using the provided template, including a summary of workspace features, related issues, testing details, and appropriate release note classification.
Docstring Coverage ⚠️ Warning Docstring coverage is 41.26% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mprahl

mprahl commented Nov 21, 2025

Copy link
Copy Markdown
Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Nov 21, 2025

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions

Copy link
Copy Markdown

@mprahl Thank you for the contribution! Could you fix the following issue(s)?

⚠ Invalid PR template

This PR does not appear to have been filed using the MLflow PR template. Please copy the PR template from here and fill it out.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
mlflow/server/auth/db/models.py (1)

64-82: Address missing database migration for workspace column

The review's first concern is valid: adding a nullable=False workspace column without a migration will break on existing deployments. There is no alembic migration directory or migration scripts in this repository.

The second concern is inaccurate: all call sites properly set workspace. The only instantiation (sqlalchemy_store.py:233) explicitly sets workspace=workspace_name via _get_active_workspace_name(), and all query operations consistently filter by workspace. The code correctly ensures workspace is always provided.

Action required:

  • Create a database migration to add the workspace column with a default value (e.g., DEFAULT_WORKSPACE_NAME) for existing rows, then alter it to nullable=False.
mlflow/cli/__init__.py (1)

564-608: workspace_store_uri is claimed to be ignored but still passed to backend initialization

When enable_workspaces is False and workspace_store_uri is provided, you:

  • Print a message via click.echo that --workspace-store-uri is being ignored (lines 568–573).
  • Later, still pass workspace_store_uri into initialize_backend_stores (line 628).

This is inconsistent: the CLI tells users their value is ignored, but the backend may receive and act on it. That can lead to surprising behavior depending on how initialize_backend_stores interprets workspace_store_uri.

If the intent is “completely ignore workspace_store_uri unless workspaces are enabled,” then workspace_store_uri should be nulled out before backend initialization in the disabled case:

-    if enable_workspaces:
-        os.environ[MLFLOW_ENABLE_WORKSPACES.name] = "true"
-        if workspace_store_uri:
-            os.environ[MLFLOW_WORKSPACE_URI.name] = workspace_store_uri
-    elif workspace_store_uri:
-        click.echo(
-            "Ignoring --workspace-store-uri because workspaces are not enabled. "
-            "Use --enable-workspaces to activate workspace mode.",
-            err=True,
-        )
+    if enable_workspaces:
+        os.environ[MLFLOW_ENABLE_WORKSPACES.name] = "true"
+        if workspace_store_uri:
+            os.environ[MLFLOW_WORKSPACE_URI.name] = workspace_store_uri
+    elif workspace_store_uri:
+        click.echo(
+            "Ignoring --workspace-store-uri because workspaces are not enabled. "
+            "Use --enable-workspaces to activate workspace mode.",
+            err=True,
+        )
+        workspace_store_uri = None

You may also want to mirror this behavior around the later MLFLOW_WORKSPACE_URI.set(...) / warning block so the logged warning and the passed argument stay consistent.

Also applies to: 618-622, 624-629

mlflow/tracking/client.py (1)

1872-1885: Clarify create_experiment behavior when workspaces are enabled

The docstring now says that artifact_location “cannot be specified” when workspaces are enabled, but this method still accepts and forwards a non-None artifact_location to the tracking client. If the restriction is enforced only server‑side, consider rephrasing to something like “may be rejected by the server when workspaces are enabled,” or, if the client is also expected to enforce it, add a small guard here to raise when both workspaces are enabled and artifact_location is provided.

mlflow/server/handlers.py (1)

536-585: Workspace gating ignores MLFLOW_ARTIFACT_ROOT despite the error message

initialize_backend_stores currently checks only the default_artifact_root argument:

if MLFLOW_ENABLE_WORKSPACES.get() and not default_artifact_root:
    raise MlflowException.invalid_parameter_value(
        "--enable-workspaces requires --default-artifact-root or the "
        "MLFLOW_ARTIFACT_ROOT environment variable to be set."
    )

but it does not actually consider the MLFLOW_ARTIFACT_ROOT env var here, even though _get_tracking_store later uses it. This will reject valid configurations where users rely solely on MLFLOW_ARTIFACT_ROOT while enabling workspaces.

To align behavior with the message and _get_tracking_store semantics, derive the effective artifact root in the same way before enforcing the check, e.g.:

 def initialize_backend_stores(
     backend_store_uri: str | None = None,
     registry_store_uri: str | None = None,
     default_artifact_root: str | None = None,
     workspace_store_uri: str | None = None,
 ) -> None:
-    if MLFLOW_ENABLE_WORKSPACES.get() and not default_artifact_root:
-        raise MlflowException.invalid_parameter_value(
-            "--enable-workspaces requires --default-artifact-root or the "
-            "MLFLOW_ARTIFACT_ROOT environment variable to be set."
-        )
+    from mlflow.server import ARTIFACT_ROOT_ENV_VAR
+
+    effective_artifact_root = default_artifact_root or os.environ.get(ARTIFACT_ROOT_ENV_VAR)
+    if MLFLOW_ENABLE_WORKSPACES.get() and not effective_artifact_root:
+        raise MlflowException.invalid_parameter_value(
+            "--enable-workspaces requires --default-artifact-root or the "
+            "MLFLOW_ARTIFACT_ROOT environment variable to be set."
+        )
 
-    tracking_store = _get_tracking_store(backend_store_uri, default_artifact_root)
+    tracking_store = _get_tracking_store(backend_store_uri, default_artifact_root)

This preserves existing _get_tracking_store behavior while making the validation accurate.

mlflow/server/auth/__init__.py (1)

246-270: Clarify comment regarding workspace fallback behavior.

The comment at lines 261-264 states "fallback_permission is only None when workspaces are not enabled," but this appears inverted based on the implementation. When workspaces are enabled and a workspace_fallback is provided, it should return a Permission object or NO_PERMISSIONS. The None return value actually indicates workspaces are disabled or unsupported, allowing fallback to auth_config.default_permission.

Consider revising the comment for clarity:

-            # fallback_permission is only None when workspaces are not enabled.
-            # fallback_permissions defaults to NO_PERMISSIONS. In effect, this means that
-            # auth_config.default_permission is not supported when workspaces are enabled
-            # to keep workspace isolation.
+            # fallback_permission is None when workspaces are disabled or unsupported.
+            # When workspaces are enabled, workspace_fallback returns a Permission object
+            # (possibly NO_PERMISSIONS) to maintain workspace isolation without falling back
+            # to auth_config.default_permission.
🧹 Nitpick comments (48)
tests/store/tracking/test_sqlalchemy_store.py (1)

7791-7807: Consider also asserting the model remains deleted after logging tags

The test correctly verifies that set_logged_model_tags writes tags even when the logged model has been soft-deleted by inspecting SqlLoggedModelTag. To fully mirror the params test above and guard against accidental “resurrection” of a model in future refactors, you could optionally add an assertion that get_logged_model(model.model_id) still raises "not found" after the tag write.

 def test_set_logged_model_tags_on_deleted_model(store: SqlAlchemyStore):
@@
     store.set_logged_model_tags(model.model_id, [LoggedModelTag("tag1", "apple")])
@@
     assert tags_dict == {"tag1": "apple"}
+
+    # Ensure write does not resurrect the model
+    with pytest.raises(MlflowException, match="not found"):
+        store.get_logged_model(model.model_id)
mlflow/server/auth/entities.py (1)

1-1: Workspace-aware permission entities look consistent; consider normalizing workspace in WorkspacePermission too

Using resolve_entity_workspace_name(workspace) in RegisteredModelPermission plus "workspace" in to_json/from_json provides a backward-compatible way to add workspace context and normalize names from older payloads. The new WorkspacePermission class enforces that all required fields are present and serializes cleanly. If you want consistent workspace normalization across entities, you might optionally call resolve_entity_workspace_name on the workspace argument in WorkspacePermission.__init__ as well; otherwise, it’s fine as-is if this class is only used where an explicit, already-normalized workspace is required.

Also applies to: 142-191, 244-288

tests/helpers/db_mocks.py (1)

7-22: Shared managed-session mock looks correct; consider extendability

The mock factory mirrors the expected _get_managed_session_maker surface (accepts *args, **kwargs, returns a contextmanager, yields a session whose query() is chainable), which is sufficient for the current tests that expect empty query results (first()None).

If future tests need to assert on returned rows, you might consider:

  • Allowing _mock_query to be parameterized (e.g., via closure or attributes on the session) so first() / all() can return controlled values instead of always None.
tests/tracking/_model_registry/test_utils.py (1)

20-21: Using a shared managed-session mock keeps this test focused on store wiring

Patching _get_managed_session_maker with mock_get_managed_session_maker is appropriate here: the mock’s signature is compatible and avoids hitting real DB session machinery while the test only validates SqlAlchemyStore construction and db_uri.

As a small enhancement, you could assert that _get_managed_session_maker was invoked at least once (e.g., via assert mock_get_managed_session_maker.called) to ensure the test fails if the managed-session path stops being used.

Also applies to: 282-299

mlflow/tracking/default_experiment/registry.py (1)

4-7: Workspace-aware default experiment resolution looks sound; minor polish possible

The new get_experiment_id flow for workspace-enabled, non-Databricks URIs is reasonable:

  • It respects existing providers first, then only engages the workspace-aware path.
  • Uses MlflowClient to get_experiment_by_name / create_experiment for Experiment.DEFAULT_EXPERIMENT_NAME.
  • Correctly handles concurrent creation via RESOURCE_ALREADY_EXISTS and re-fetches the experiment.
  • Preserves prior behavior for Databricks URIs and when workspaces are disabled.

Two small, non-blocking improvements:

  1. Consider factoring the workspace-specific block into a private helper (e.g., _get_workspace_default_experiment_id()), which would make get_experiment_id() easier to read and simpler to test in isolation.
  2. When raising the INVALID_STATE MlflowException, including the current workspace name in the message (if cheaply available) would aid debugging misconfigured environments.

Also applies to: 79-105

tests/server/test_workspace_handlers.py (1)

16-76: Workspace handler tests are thorough; align dummy stores with real return types

These tests give good coverage of:

  • Creating a workspace and ensuring a default experiment is created in the correct workspace.
  • Rejecting deletion for non-empty workspaces (extra experiments or default experiment with runs).
  • Allowing deletion when only a default experiment without runs exists.
  • Correctly scoping artifact routes and repository paths to the active workspace.

One small robustness improvement: in your dummy tracking stores, create_experiment currently doesn’t return an experiment ID, whereas real tracking stores do. Returning a simple placeholder ID (e.g., "0" or an incrementing string) would keep the test doubles closer to the production interface and reduce the risk of subtle failures if _create_workspace_handler or _delete_workspace_handler start relying on the return value.

Also applies to: 81-124, 149-251

tests/tracking/test_rest_tracking.py (2)

116-162: Client and workspace fixtures are well-isolated; note interaction with env-driven workspaces

The fixture changes look solid:

  • store_type now skips file backends when MLFLOW_ENABLE_WORKSPACES is set, which matches the fact that file stores don’t support workspaces.
  • mlflow_client and workspace_mlflow_client both:
    • Force-reset backend stores before use.
    • Use ServerThread in a try / finally and restore tracking_uri / registry_uri afterward.
    • Clear handler globals (handlers._tracking_store, _model_registry_store, _workspace_store) and workspace_helpers._workspace_store to avoid cross-test leakage.
  • The autouse reset_workspace_context fixture ensures clear_workspace() runs around every test, further containing workspace state.

One thing to be aware of: if MLFLOW_ENABLE_WORKSPACES is set in the process environment before pytest runs (outside these fixtures), all store_type == "file" parametrizations in this module will be skipped. That seems consistent with the capabilities of the file backend, but it does mean enabling workspaces globally will also bypass file-backed integration coverage here.

Also applies to: 164-240


4732-5179: GraphQL helpers and workspace GraphQL tests are well-structured and align with workspace semantics

The new GraphQL utilities and tests look well thought out:

  • _graphql_post centralizes:
    • Choosing /graphql vs /workspaces/{workspace}/graphql based on get_current_workspace().
    • URL construction with urllib.parse.urljoin.
    • Consistent JSON payload and headers.
  • _graphql_error_messages provides a tolerant way to extract human-readable error messages from varied GraphQL error payload shapes, which keeps assertions concise in tests.
  • The suite of workspace GraphQL tests exercises:
    • Experiments, runs, model versions, and artifacts being accessible only within the correct workspace and returning either empty results or meaningful “not found” errors across workspaces.
    • Default workspace behavior when no workspace context is active.
    • Permission routing (test_workspace_permission_resource_type_detection) by verifying that experiment and registered-model permission helpers derive the correct (resource_type, workspace_name) pairs before delegating to _workspace_permission.

Overall, these tests give solid coverage for both the GraphQL routing layer and the workspace-aware semantics of the schema. I don’t see any correctness issues; future refactors can reuse _graphql_post and _graphql_error_messages in other modules if similar tests are added.

tests/store/artifact/test_http_artifact_repo.py (1)

399-410: Workspace rewrite test matches expected URI shape; consider adding no-workspace case

The rewrite from /artifacts/workspaces/team-a/... to /workspaces/team-a/artifacts/... looks correct and tightly validates _get_workspace_scoped_artifact_uri(). You might add a complementary test where neither the path nor environment specifies a workspace (expecting (original_uri, None)) to lock in the no-op behavior when workspace resolution fails.

tests/server/auth/db/test_cli.py (2)

20-29: Make table-name assertion more robust using a set of names

Including workspace_permissions in the expected tables is correct. To make this a bit more future-proof (and consistent with test_auth_and_tracking_store_coexist), you could assert on a set of table names instead of sorted(tables) (list of tuples), e.g. {t[0] for t in tables}. That avoids churn if additional auth tables are added later.


121-124: Legacy upgrade test aligns with new migration; consider also checking workspace_permissions

Asserting that scorer_permissions exists and that alembic_version_auth is 2ed73881770d matches the new migration. For completeness, you might also assert that workspace_permissions is present in tables here, since the same migration is responsible for adding it.

docs/docs/self-hosting/workspaces/index.mdx (1)

1-152: Clear overview; only minor optional clarifications

This page reads cleanly and is technically coherent with the rest of the workspace work (resource scoping, artifact isolation, resolution order, and backward compatibility with the default workspace). If you want to tighten it further, you could briefly call out mlflow.get_workspace() in the “Quick Start” or “Workspace Resolution” sections as a way for operators to verify which workspace a client is actually using, since you already link it in the API Reference.

tests/entities/model_registry/test_registered_model.py (2)

10-31: Workspace threading in tests looks good; small cleanup opportunity

Plumbing workspace through _check and including "workspace": DEFAULT_WORKSPACE_NAME in the various as_dict expectations correctly validates the new field on RegisteredModel. One minor nit: assert registered_model.last_updated_timestamp == last_updated_timestamp appears twice in _check (Lines 24 and 26); you can safely drop one of them to avoid redundant assertions.

Also applies to: 39-52, 93-107, 120-139, 147-167


174-189: Avoid hard‑coding the default workspace name in the repr expectation

The updated __str__ expectation including workspace='default' is consistent with the new behavior, but it now duplicates knowledge of the default workspace name. Since you already import DEFAULT_WORKSPACE_NAME, consider building the expected string with that constant (e.g., via an f-string) so the test won’t break if the default workspace identifier ever changes.

docs/docs/self-hosting/workspaces/configuration.mdx (1)

1-255: Configuration doc is thorough; only tiny optional clarity tweaks

This configuration guide is comprehensive and aligns with the implementation details implied by the tests (SQL-only, supports_workspaces(), migration/rollback behavior, env vars). If you want, you could annotate the supports_workspaces() snippet as illustrative (e.g., a comment noting it’s the interface stores must implement) to avoid it being read as literal top-level code, but that’s purely cosmetic.

tests/store/model_registry/test_rest_store.py (2)

161-174: LGTM! Consider using a fixture or helper method for setting workspace support.

The test correctly verifies that an error is raised when workspace support is disabled but an active workspace is set. However, directly mutating store._workspace_support accesses a private attribute. If the implementation changes, this test may become fragile.

Consider adding a helper method or fixture to configure workspace support state for testing.


177-190: LGTM! Whitespace handling is correctly validated.

The test verifies that whitespace-only workspace strings are ignored when workspace support is disabled. The logic is sound and the test provides good edge case coverage.

Same recommendation as the previous test: consider using a helper method to set _workspace_support instead of direct private attribute access.

mlflow/tracking/_tracking_service/client.py (1)

31-32: Workspace-aware experiment URLs and artifact_location docs look correct

  • Using get_current_workspace() plus quote(..., safe="") to build /#/workspaces/{workspace}/experiments/{id} keeps the URL hash router compatible and avoids unsafe workspace characters.
  • The create_experiment docstring accurately delegates enforcement of artifact_location under workspaces to the server, which is appropriate for a client wrapper.

If you want to reduce drift over time, you could optionally reuse a shared helper (e.g., the workspace resolution used in mlflow.utils.rest_utils) for determining the active workspace, but this is not blocking.

Also applies to: 61-62, 292-295, 776-787

tests/server/test_workspace_endpoints.py (1)

13-105: Good coverage for workspace CRUD endpoints; minor opportunity to tighten assertions

The tests cover the full workspace CRUD surface (list/create/get/update/delete), validate response payloads, and correctly inject a mock store via _get_workspace_store. This should give good confidence in the handlers.

As a minor improvement, for create_workspace and update_workspace you might assert the call arguments on mock_workspace_store.create_workspace / update_workspace (e.g., assert_called_once_with(...)) to ensure the handler passes through the expected Workspace payload, not just that it was invoked.

tests/utils/test_rest_utils.py (1)

10-37: Workspace-aware route prefixing tests cover the important behaviors

The new imports and _maybe_prefix_with_workspace* tests exercise the key cases: MLflow tracking and artifact APIs are scoped under /workspaces/<name>, non-MLflow routes are left untouched, double-prefixing is avoided, and environment-driven workspace values are trimmed or ignored when blank. This gives good confidence around the new workspace prefixing logic; adding an explicit test for the precedence when both a WorkspaceContext and MLFLOW_WORKSPACE are set could be a future nice-to-have, but is not strictly necessary.

Also applies to: 112-160

mlflow/store/workspace/utils.py (1)

9-57: Default-workspace helper is clear; consider handling missing method more defensively

The helper cleanly normalizes NotImplementedError and MlflowException with INVALID_PARAMETER_VALUE into (None, False) and re-raises everything else, which matches the documented contract and gives callers an explicit capability flag.

If you expect third-party or legacy workspace stores that might not implement get_default_workspace at all, you could make this a bit more defensive by treating a missing method as "no default workspace support" instead of leaking an AttributeError, for example:

-    try:
-        workspace = workspace_store.get_default_workspace()
+    try:
+        get_default_workspace = workspace_store.get_default_workspace  # may raise AttributeError
+        workspace = get_default_workspace()
     except NotImplementedError:
         ...

This is optional, but would make the behavior nicer for custom providers.

tests/tracking/test_client_workspace.py (1)

7-52: Workspace URI resolution test is solid; make the dummy client’s __init__ more flexible

The test nicely validates the resolution precedence (explicit workspace_uri argument, then MLFLOW_WORKSPACE_URI, then tracking URI fallback) by recording what DummyWorkspaceClient sees.

To make the test more resilient to future changes in WorkspaceProviderClient’s constructor, consider loosening the dummy’s signature so it tolerates extra positional/keyword arguments while still capturing workspace_uri, for example:

-    class DummyWorkspaceClient:
-        def __init__(self, workspace_uri: str | None = None):
-            recorded["workspace_uri"] = workspace_uri
+    class DummyWorkspaceClient:
+        def __init__(self, *_, workspace_uri: str | None = None, **__):
+            recorded["workspace_uri"] = workspace_uri

This keeps the assertion logic the same but avoids test breakage if the real client’s initializer evolves.

mlflow/store/workspace_aware_mixin.py (2)

26-30: Clarify _workspaces_enabled semantics wrt env var type

If MLFLOW_ENABLE_WORKSPACES.get() already returns a boolean (as in other MLflow env flags), wrapping it in bool(...) is redundant and can hide misconfiguration (e.g., unexpected non-bool values always becoming True). Consider either:

  • Returning the value directly, or
  • Explicitly normalizing/validating the type if this env var might be non-boolean.

This keeps the behavior easier to reason about and aligns with other env flag helpers.


59-80: _workspace_session helper looks solid; consider adding a return type

The context manager correctly couples workspace resolution with ManagedSessionMaker usage and should make callers less error-prone. For extra clarity and editor support, you might add a return type like @contextmanager def _workspace_session(self) -> Iterator[tuple[Session, str]]: (or the appropriate SQLAlchemy session type) where feasible.

tests/store/workspace/test_sqlalchemy_store.py (1)

11-35: Fixture seeding default workspace bypasses store API

The workspace_store fixture seeds the default workspace directly via SqlWorkspace and a raw session. That’s fine for tight control, but it also means get_default_workspace()’s creation path isn’t exercised here.

If you want the tests to also validate the public API for default-workspace creation, consider:

  • Either using workspace_store.get_default_workspace() in the fixture to ensure it creates/seeds the default, or
  • Adding a dedicated test that starts from an empty DB and asserts that get_default_workspace() creates and returns the default record.

Not mandatory, but would tighten end‑to‑end coverage.

mlflow/server/workspace_helpers.py (2)

19-24: Avoid duplicating the static-prefix env var constant

STATIC_PREFIX_ENV_VAR = "_MLFLOW_STATIC_PREFIX" is defined here, while mlflow.server.__init__ imports STATIC_PREFIX_ENV_VAR from mlflow.server.handlers. Maintaining the same literal in multiple modules risks drift.

Consider centralizing this constant in a single module (e.g., mlflow.server.handlers or a small shared config module) and importing it here, so any future change only needs to be done once.


187-218: Document/make explicit the memoization behavior of _get_workspace_store

_get_workspace_store memoizes _workspace_store on first call and then ignores any different workspace_uri / tracking_uri arguments later. That’s consistent with the docstring, but in a multi-threaded server:

  • Two concurrent “first” calls may instantiate the store twice (benign but slightly wasteful).
  • The first caller’s resolved URIs permanently win, which may surprise future maintainers if they try to vary URIs per call.

Given that, the implementation is acceptable, but it’d be good to:

  • Explicitly mention “first call wins, subsequent arguments are ignored” in any higher-level docs, and/or
  • Add a lightweight guard (e.g., log a warning if _workspace_store is already set but different URIs are passed).

This helps avoid subtle misconfiguration surprises.

mlflow/server/__init__.py (1)

84-125: Reuse _strip_workspace_kwarg to avoid duplicating “drop workspace_name” logic

_workspace_aware_route correctly registers both standard and workspace‑prefixed routes and strips workspace_name before calling the underlying handler. However, you now have two slightly different implementations of the same concern:

  • _strip_workspace_kwarg in mlflow.server.workspace_helpers
  • The inline workspace_wrapper(workspace_name, **kwargs) here

To keep behavior consistent and reduce maintenance overhead, consider:

  • Using _strip_workspace_kwarg(func) here and routing that as the workspace variant handler, or
  • Moving the shared stripping logic into a single helper and having both places call it.

That way, any future change to how workspace_name is handled only needs to be made once.

mlflow/server/auth/client.py (1)

49-61: _request empty-response handling and workspace permission APIs look good; minor robustness/naming nits

  • The _request change to return {} for 204 or empty bodies avoids JSON decoding errors for endpoints that don’t return a payload, which matches how many of the “update/delete” methods already ignore the response.
  • list_workspace_permissions, set_workspace_permission, delete_workspace_permission, and list_user_workspace_permissions follow the existing permission-API patterns and correctly URL-encode workspace_name via _workspace_endpoint, which is important for names containing / or spaces.

Two minor points you might consider:

  1. Assumption about list responses
    list_workspace_permissions and list_user_workspace_permissions unconditionally index resp["permissions"]. If a server implementation ever returns 204 or {} on “no data”, this will raise. Safer would be something like for p in resp.get("permissions", []), while tests can still assert that the server usually returns a populated "permissions" field.

  2. Route constant naming vs. semantics
    Using LIST_WORKSPACE_PERMISSIONS as the endpoint template for both listing and creating/updating/deleting permissions is RESTful (same path, different methods), but the name may mislead readers into thinking it’s GET‑only. If it’s easy, consider either renaming the constant to something more generic (e.g. WORKSPACE_PERMISSIONS_ENDPOINT) or adding a brief comment here to clarify that this single path is used for GET/POST/DELETE.

Both are small polish items; the overall design and behavior look sound.

Also applies to: 672-749

mlflow/tracking/_workspace/context.py (1)

15-23: Validation behavior and DEFAULT_WORKSPACE_NAME bypass

set_current_workspace validates only non-None workspaces that differ from DEFAULT_WORKSPACE_NAME. This implicitly treats the default as always valid and avoids WorkspaceNameValidator for it. If DEFAULT_WORKSPACE_NAME ever changes or gains looser constraints than regular workspaces, this could diverge from the validator’s rules.

Might be worth an inline comment or test that locks in the intended invariant (“default is always valid and not subject to WorkspaceNameValidator”), so future changes don’t accidentally break this assumption.

tests/server/auth/test_sqlalchemy_store.py (2)

325-356: Regex patterns and workspace expectations for registered model permissions

The new assertions around workspace == DEFAULT_WORKSPACE_NAME and the error-message patterns look consistent with the new workspace column semantics.

Two minor notes:

  • The regex-based messages (duplicate_permission_pattern, missing_permission_message) interpolate DEFAULT_WORKSPACE_NAME directly into the pattern. If workspace names are ever allowed to contain regex metacharacters, these patterns could start misbehaving. Using re.escape(DEFAULT_WORKSPACE_NAME) in the pattern construction would make the tests robust to such changes.
  • These tests implicitly assume that operations performed without an explicit WorkspaceContext land in DEFAULT_WORKSPACE_NAME. That contract should be explicitly documented / tested at the store layer as well to avoid subtle regressions.

Also applies to: 378-393, 415-419, 435-436


466-539: Workspace permission precedence and accessibility coverage

The workspace-permission tests cover the key behaviors well:

  • test_set_workspace_permission_creates_and_updates validates creation, update, and invalid resource types.
  • test_get_workspace_permission_precedence exercises precedence across ("*", "*"), user-specific, and resource-specific entries, plus “no entry” and invalid type.
  • Listing and deletion tests (list_workspace_permissions, list_user_workspace_permissions_includes_wildcards, delete_workspace_permission, delete_workspace_permissions_for_workspace, list_accessible_workspace_names) cover wildcard inclusion, per-workspace filtering, and NO_PERMISSIONS handling.

One possible enhancement: add a case where a workspace has both a global wildcard grant and a user-specific NO_PERMISSIONS entry, to confirm that explicit deny overrides wildcard allow. That would lock in the intended deny/allow precedence semantics.

Also applies to: 560-594, 596-614

mlflow/store/artifact/http_artifact_repo.py (1)

40-47: Workspace-aware artifact URI rewriting and helpers

The _get_workspace_scoped_artifact_uri logic covers three important cases:

  • Already workspace-scoped URIs (/mlflow-artifacts/workspaces/{ws}/...) – detected and returned as-is with the workspace extracted.
  • Legacy tail-form URIs (/mlflow-artifacts/artifacts/workspaces/{ws}/...) – normalized to the new layout while preserving the workspace and rest-of-path.
  • Global URIs with an active workspace – injects /workspaces/{encoded_ws} after /mlflow-artifacts when appropriate.

The static helpers _get_artifacts_endpoint_prefix and _with_workspace_prefix keep the path manipulations centralized and make the MPU endpoints consistent with the artifacts endpoints. Overall this looks correct and keeps backward compatibility for expected shapes.

One thing to keep in mind is that all of this assumes artifact_uri contains /mlflow-artifacts somewhere in the path. If HttpArtifactRepository is ever instantiated with an arbitrary HTTP URI that doesn’t follow this convention, split(endpoint, maxsplit=1) in callers will raise. If that isn’t already guaranteed by the caller, consider either validating the pattern early or raising a clearer error.

Also applies to: 48-87, 88-103

tests/store/tracking/test_sqlalchemy_store_workspace.py (1)

37-613: Workspace tests look solid; consider DRYing engine cleanup in startup tests

The workspace-scoping, artifact, trace, and startup / migration tests are thorough and align well with the intended semantics. I don’t see functional issues here. If you want, you could factor the repeated store._dispose_engine() + SqlAlchemyStore._db_uri_sql_alchemy_engine_map.pop(...) pattern in the startup-related tests into a small helper to keep those tests a bit cleaner, but that’s optional.

mlflow/cli/__init__.py (1)

564-573: Redundant env var handling for MLFLOW_ENABLE_WORKSPACES / MLFLOW_WORKSPACE_URI

You both mutate os.environ[...] directly and then call MLFLOW_ENABLE_WORKSPACES.set(enable_workspaces) and (conditionally) MLFLOW_WORKSPACE_URI.set(workspace_store_uri). Functionally this should be harmless, but it’s redundant and makes the precedence rules harder to reason about.

Consider relying on the MLFLOW_* environment variable helpers exclusively in this block, and using click.echo / logging only for user messaging. That keeps the codepath for how these env vars are set in one place.

This is a minor cleanup; no behavior change is required for this PR.

Also applies to: 598-607

mlflow/store/workspace/abstract_store.py (1)

12-77: Abstract workspace store and name validator look consistent

The AbstractStore surface and default behaviors are clear and give downstream implementations a well-defined contract, and WorkspaceNameValidator correctly enforces the documented constraints (length, pattern, and reserved names) with good error messages.

If you want to trim a bit of duplication, is_valid() could internally call validate() and catch MlflowException.invalid_parameter_value rather than re-implementing the checks, but that’s purely a style choice. Functionally this is good as-is.

Also applies to: 79-129

mlflow/tracking/client.py (1)

211-237: Constructor and lazy workspace client are well-structured

The workspace_uri resolution, _workspace_uri caching, and lazy _workspace_client construction follow the same pattern as the registry client and surface a clear FEATURE_DISABLED error when the URI scheme is unsupported. The dedicated get_workspace_uri() accessor and thin CRUD delegations make the new workspace APIs straightforward to use and maintain.

You might optionally add a type annotation for self._workspace_client (e.g., WorkspaceProviderClient | None) to help static analysis and readers.

Also applies to: 242-257, 295-339

mlflow/store/workspace/sqlalchemy_store.py (1)

75-89: Confirm default-workspace lifecycle and artifact-root semantics

Two behaviors are worth double‑checking against the broader workspace design:

  • get_default_workspace() simply calls get_workspace(DEFAULT_WORKSPACE_NAME), so it will raise RESOURCE_DOES_NOT_EXIST unless something else bootstraps that row. If callers assume the default always exists, you may want a “get or create” path, or explicitly document that this can fail.
  • resolve_artifact_root() ignores workspace_name and always returns (default_artifact_root, True). If the second return value is meant to indicate per‑workspace artifact rooting, this is fine as long as the caller is the one that actually namespaces by workspace. If the intent was for this store to return workspace‑specific roots, you may want to incorporate workspace_name into the path or add a comment clarifying why it’s unused.

Please verify these assumptions match how the workspace machinery and tests expect this store to behave.

Also applies to: 90-94, 95-102

mlflow/store/workspace/rest_store.py (1)

13-71: REST workspace store wiring and error handling look consistent

The REST‐backed implementation cleanly mirrors other MLflow REST stores: endpoints are well‑formed, verify_rest_response is used for general validation, and create_workspace correctly turns 201 responses (including empty bodies) into Workspace objects while mapping 400/409 conflicts to RESOURCE_ALREADY_EXISTS. The fallback to invalid_parameter_value for get_default_workspace also makes the “no REST default workspace” contract explicit.

If the workspace API ever returns non–already‑exists 409 errors, you might consider narrowing the conflict mapping to cases where error_code == "RESOURCE_ALREADY_EXISTS" instead of any 409, but this is mostly defensive and not required if the backend’s contract is strict.

Also applies to: 72-101

tests/test_cli.py (1)

13-43: Nice test harness for workspace defaults and env isolation

Overriding the --enable-workspaces default for tests, the autouse _reset_workspace_env fixture, _workspace_disabled_env() for subprocesses, and the invoke_server() helper together do a good job of preventing tests from inheriting ambient workspace configuration while still allowing explicit workspace scenarios. This should make the CLI/server tests much more deterministic across different developer environments.

In invoke_server(), you could tighten the flag detection to check for exact matches (arg == "--enable-workspaces") instead of substring membership, though it’s unlikely to matter in practice.

Also applies to: 45-71

mlflow/server/handlers.py (2)

2936-3043: Workspace-scoped artifact paths are well-factored

The _workspace_scoped_repo_path helper and its use in _download_artifact, _upload_artifact, _list_artifacts_mlflow_artifacts, and _delete_artifact_mlflow_artifacts cleanly centralize the workspace prefixing logic while leaving the client-facing paths unchanged. The early return when no active workspace and the guard against double workspaces/ prefixes keep behavior backward compatible outside multi-tenancy. Looks good.


4043-4168: Workspace-prefixed path helpers are consistent with service endpoint generation

The interplay between _get_paths(.., include_workspace_variants=...), _add_workspace_prefix, _insert_workspace_segment, and get_workspace_prefixed_path looks coherent: workspace variants are added only for /mlflow and /mlflow-artifacts routes, admin /mlflow/workspaces endpoints are correctly excluded from double-prefixing, and static prefixes are preserved. No issues here from a routing perspective.

mlflow/store/tracking/sqlalchemy_store.py (4)

285-296: Artifact root resolution & isolation checks: behavior is solid, but clarify reserved path contract

The new _resolve_workspace_artifact_root, _get_artifact_location, and _validate_artifact_isolation_constraints logic appears correct:

  • Workspace-aware artifact roots are resolved via the provider when available.
  • workspaces/<name>/<experiment_id> is appended only when resolve_artifact_root signals should_append.
  • Startup checks prevent enabling workspaces when:
    • The resolved root ends with workspaces or workspaces/<name>, or
    • Existing single-tenant experiments already store artifacts under the reserved .../workspaces/ prefix.
  • Existing multi-tenant databases (SqlExperiment.workspace != DEFAULT_WORKSPACE_NAME) skip the legacy artifact layout check.

This is a good safety net and seems backward compatible. Two minor points:

  1. When resolved_root is None, you correctly fall back to self.artifact_root_uri; worth making sure workspace providers never return ("", ...) which would be treated as falsy the same way.
  2. The reserved-prefix conflict check is skipped as soon as any non-default workspace exists. That’s correct for migrated multi-tenant DBs, but if a misconfigured early multi-tenant rollout accidentally created non-default workspaces under a conflicting artifact prefix, this guard will no longer protect them.

Consider documenting in mlflow.store.workspace.utils (or here in a short comment) that:

  • Providers must either return None or a non-empty string for resolved_root.
  • Once a non-default workspace row exists, we treat the DB as “authoritative multi-tenant” and skip legacy layout validation.

No code changes required; just clarifying the contract to avoid surprises.

Also applies to: 449-461, 472-527


1166-1217: Model-level metrics logging: good validation, minor nit on unused provided_experiment_id

The refactor of _log_model_metrics adds:

  • Validation that experiment_id (if provided) is an int and matches run.experiment_id.
  • A dedup set to avoid double-logging identical Metric objects.
  • Full reuse of _validate_metric and sanitize_metric_value.

This is a solid improvement. Only small nit:

  • provided_experiment_id is assigned but not used after validation — you could in principle drop that temporary variable and just use the experiment_id parameter directly.

Optionally simplify:

-        provided_experiment_id = experiment_id
-        if provided_experiment_id is not None:
+        if experiment_id is not None:
             try:
-                provided_experiment_id_int = int(provided_experiment_id)
+                provided_experiment_id_int = int(experiment_id)

Functionality is unchanged; this is just minor cleanup.


4671-4687: get_dataset_experiment_ids workspace behavior: silent filtering is intentional but worth documenting

When workspaces are enabled, get_dataset_experiment_ids now:

  • Attempts _get_dataset_record; if it raises, returns [] instead of surfacing RESOURCE_DOES_NOT_EXIST.
  • Still calls search_entities_by_source unscoped, which operates purely on SqlEntityAssociation.

Net effect:

  • For a dataset not visible in the active workspace, the method quietly returns an empty list.
  • For visible datasets, it returns all associated experiments, even if some experiments are from other workspaces (because SqlEntityAssociation isn’t workspace-scoped here).

To make behavior less surprising:

  • Either document that get_dataset_experiment_ids is “best-effort” and may drop associations if the dataset isn’t in the active workspace, or
  • Tighten experiment lookup by:
    • Filtering returned experiment IDs through _get_experiment / workspace context, or
    • Adding a workspace column to associations or joining back to SqlExperiment where needed.

If you want to keep compatibility, I’d at least add a short comment noting why we swallow MlflowException and return an empty list when the dataset is not in the active workspace.


4915-4963: Workspace-aware query helpers: good abstraction layer

The new helpers:

  • _runs_query
  • _trace_query
  • _logged_model_query
  • _dataset_query
  • _dataset_workspace_filter
  • _get_logged_model_record
  • _get_dataset_record

are well factored and consistently applied. They:

  • Keep workspace logic centralized.
  • Minimize duplication and reduce the likelihood of missing a workspace filter in future methods.

It might be worth adding a brief docstring to WorkspaceAwareMixin describing how these helpers are expected to be used (i.e., “all entity fetch operations should go through *_query helper to enforce workspace scoping”), but that’s purely documentation.

Also applies to: 4978-5003

mlflow/server/auth/__init__.py (2)

1099-1105: Consider bounded cache for path regex compilation.

The _re_compile_path function uses @functools.lru_cache(maxsize=None), which creates an unbounded cache. If path patterns can vary dynamically (e.g., through user input or workspace names), this could lead to unbounded memory growth.

Consider either:

  1. Setting a reasonable maxsize (e.g., 512 or 1024)
  2. Verifying that the number of unique paths is bounded by a small constant
-@functools.lru_cache(maxsize=None)
+@functools.lru_cache(maxsize=512)
 def _re_compile_path(path: str) -> re.Pattern:

1374-1380: Consider explicit key validation for clarity.

The set subtraction for validating required keys is concise but may be less readable for future maintainers.

Consider a more explicit validation:

 def _validate_workspace_permission_payload(payload: dict[str, Any]) -> tuple[str, str, str]:
-    missing = {"username", "resource_type", "permission"} - payload.keys()
-    if missing:
+    required = ["username", "resource_type", "permission"]
+    missing = [k for k in required if k not in payload]
+    if missing:
         raise MlflowException.invalid_parameter_value(
             "Workspace permission payload missing keys: " + ", ".join(sorted(missing))
         )
     return payload["username"], payload["resource_type"], payload["permission"]

Comment thread mlflow/cli/__init__.py Outdated
Comment on lines +381 to +391
@click.option(
"--workspace-store-uri",
envvar=MLFLOW_WORKSPACE_URI.name,
metavar="URI",
default=None,
help=(
"URI for storing workspace metadata. Acceptable URIs mirror the tracking backend, e.g. "
"'sqlite:///path/to/file.db', 'postgresql://user:pass@host/db', or 'http://tracking'. "
"If not specified, the tracking URI is used."
),
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Duplicate --workspace-store-uri Click option will break the CLI

You define --workspace-store-uri twice, both mapped to the same workspace_store_uri parameter (lines 381–391 and 477–486). Click does not support registering the same option flag twice on a single command; this is very likely to raise at import/command construction time or cause confusing behavior.

You should keep only one definition. Given the help text, it looks like the later block (near --enable-workspaces) is the intended one for the server, and the earlier one should be removed.

Example fix:

-@click.option(
-    "--workspace-store-uri",
-    envvar=MLFLOW_WORKSPACE_URI.name,
-    metavar="URI",
-    default=None,
-    help=(
-        "URI for storing workspace metadata. Acceptable URIs mirror the tracking backend, e.g. "
-        "'sqlite:///path/to/file.db', 'postgresql://user:pass@host/db', or 'http://tracking'. "
-        "If not specified, the tracking URI is used."
-    ),
-)
 @click.option(
     "--default-artifact-root",

Also applies to: 477-486, 495-519

🤖 Prompt for AI Agents
In mlflow/cli/__init__.py around lines 381-391 (and note related blocks at
477-486 and 495-519), there is a duplicate Click option definition for
--workspace-store-uri which will break CLI construction; remove the earlier
option block (lines 381-391) so only the intended server-side option remains
near the --enable-workspaces block, ensure the remaining Click.option uses the
correct parameter name (workspace_store_uri) and envvar, and verify no other
duplicate flags remain in the file.

Comment on lines +105 to +120
#: Specifies the active workspace for client operations.
#: (default: ``None``)
MLFLOW_WORKSPACE = _EnvironmentVariable("MLFLOW_WORKSPACE", str, None)

#: Specifies the workspace provider URI. Falls back to the tracking URI when unset.
#: (default: ``None``)
MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None)

#: Specifies the registry URI.
#: (default: ``None``)
MLFLOW_REGISTRY_URI = _EnvironmentVariable("MLFLOW_REGISTRY_URI", str, None)

#: Specifies the workspace provider backend URI.
#: Defaults to the tracking URI when unset.
MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Deduplicate MLFLOW_WORKSPACE_URI and MLFLOW_ENABLE_WORKSPACES declarations

MLFLOW_WORKSPACE_URI and MLFLOW_ENABLE_WORKSPACES are each defined twice in this module with slightly different docstrings. While Python will just overwrite the first binding, the duplication is confusing and increases the chance of doc/comment drift.

Suggest keeping a single definition for each with the more descriptive comment, for example:

-#: Specifies the workspace provider URI. Falls back to the tracking URI when unset.
-#: (default: ``None``)
-MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None)
-
-#: Specifies the workspace provider backend URI.
-#: Defaults to the tracking URI when unset.
-MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None)
+#: Specifies the workspace provider backend URI. Falls back to the tracking URI when unset.
+#: (default: ``None``)
+MLFLOW_WORKSPACE_URI = _EnvironmentVariable("MLFLOW_WORKSPACE_URI", str, None)
@@
-#: Enables workspace-aware multi-tenancy features on the MLflow server.
-#: (default: ``False``)
-MLFLOW_ENABLE_WORKSPACES = _BooleanEnvironmentVariable("MLFLOW_ENABLE_WORKSPACES", False)
+#: Enables workspace-aware (multi-tenant) behavior for MLflow servers and clients.
+#: When set, requests can include a workspace. Some workspace providers support default workspaces.
+#: (default: ``False``)
+MLFLOW_ENABLE_WORKSPACES = _BooleanEnvironmentVariable("MLFLOW_ENABLE_WORKSPACES", False)
@@
-#: Enables workspace-aware (multi-tenant) behavior for MLflow servers and clients.
-#: When set, requests can include a workspace. Some workspace providers support default workspaces.
-#: (default: ``False``)
-MLFLOW_ENABLE_WORKSPACES = _BooleanEnvironmentVariable("MLFLOW_ENABLE_WORKSPACES", False)

This keeps the behavior identical while avoiding duplicated symbols and comments.

Also applies to: 450-452, 1143-1146

🤖 Prompt for AI Agents
In mlflow/environment_variables.py around lines 105-120 (also review duplicates
at 450-452 and 1143-1146), there are duplicate definitions for
MLFLOW_WORKSPACE_URI (and MLFLOW_ENABLE_WORKSPACES elsewhere) with conflicting
docstrings; remove the redundant definitions and keep a single declaration for
each variable using the more descriptive/docstringed version. Ensure the
remaining declaration uses the correct name, type, and default, remove the
overwritten binding(s) and update or consolidate comments so each environment
variable is declared exactly once in the module.

Comment on lines +186 to +197
@dataclass(frozen=True)
class AuthContext:
username: str | None
is_admin: bool

@classmethod
def from_authorization(cls, auth: Authorization) -> "AuthContext":
return cls(
username=getattr(auth, "username", None),
is_admin=getattr(auth, "is_admin", False),
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the is_admin attribute source on Authorization objects.

The from_authorization method uses getattr(auth, "is_admin", False) to extract admin status. Ensure that the Authorization objects passed to this method consistently have the is_admin attribute set, or confirm that the default of False is always appropriate when the attribute is missing.

Run the following script to verify where is_admin is set on Authorization objects:


🏁 Script executed:

#!/bin/bash
# Search for where is_admin is set on authorization/request.authorization objects
rg -nP --type=py -C3 '\bauthorization\.(is_admin|username)\s*=' 
rg -nP --type=py -C3 'request\.authorization' | rg -A5 -B5 'is_admin'

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for authenticate_request_basic_auth and custom Authorization handling
rg -nP --type=py 'def authenticate_request|class.*Authorization|\.is_admin' -C2 | head -100

Length of output: 5881


🏁 Script executed:

#!/bin/bash
# Find all usages of AuthContext.from_authorization and AuthContext constructor calls
rg -nP --type=py 'AuthContext\.from_authorization|AuthContext\(' -C3

Length of output: 8815


🏁 Script executed:

#!/bin/bash
# Search for code that might extend or modify Authorization objects
rg -nP --type=py 'request\.authorization\s*=' -C5
# Search for custom Authorization class or patching
rg -nP --type=py 'class.*Authorization|Authorization\.' -C2 | head -80

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Get full implementation of authenticate_request_basic_auth and related functions
rg -nP --type=py -A20 'def authenticate_request_basic_auth'
# Also check line 808 context to see correct pattern
rg -nP --type=py -B5 -A5 'store\.get_user.*is_admin'

Length of output: 3286


Fix required: AuthContext.from_authorization() cannot extract is_admin from Werkzeug Authorization object, which doesn't have this attribute.

Werkzeug's Authorization class only provides username and password (for Basic auth). The is_admin attribute exists on the User entity in the store, not on the Authorization object. The getattr(auth, "is_admin", False) call will always return False, breaking all admin authorization checks throughout the codebase (lines 330–331, 721–722, 740–741, 757–758, and elsewhere).

Correct pattern exists at line 808 in sender_is_admin(): store.get_user(username).is_admin.

Solution: Modify AuthContext.from_authorization() to fetch is_admin from the store after extracting the username:

@dataclass(frozen=True)
class AuthContext:
    username: str | None
    is_admin: bool

    @classmethod
    def from_authorization(cls, auth: Authorization) -> "AuthContext":
        username = getattr(auth, "username", None)
        is_admin = False
        if username:
            user = store.get_user(username)
            is_admin = user.is_admin if user else False
        return cls(username=username, is_admin=is_admin)
🤖 Prompt for AI Agents
In mlflow/server/auth/__init__.py around lines 186 to 197,
AuthContext.from_authorization() incorrectly attempts to read is_admin from the
Werkzeug Authorization object (which only has username/password) so is_admin is
always False; change the method to extract the username from auth and then, if
username is present, query the user from the store (e.g.,
store.get_user(username)) and set is_admin based on that user entity (defaulting
to False if no user), then return the AuthContext with the resolved username and
is_admin.

Comment on lines +881 to 886
auth = authenticate_request()
username = auth.username
return _get_permission_from_store_or_default(
lambda: store.get_experiment_permission(experiment_id, username).permission
lambda: store.get_experiment_permission(experiment_id, username).permission,
workspace_fallback=lambda: _workspace_permission_for_experiment(auth, experiment_id),
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Inconsistent authentication context extraction.

This function directly accesses auth.username (line 882) without using the AuthContext pattern introduced elsewhere in this PR (e.g., lines 509-510). Other permission getters create an AuthContext to safely handle the username and is_admin attributes.

For consistency and to avoid potential AttributeError if Authorization objects don't always have a username attribute, consider:

 def _get_permission_from_run_id_or_uuid() -> Permission:
     """
     Get permission for Flask routes that use either run_id or run_uuid parameter.
     """
     run_id = request.args.get("run_id") or request.args.get("run_uuid")
     if not run_id:
         raise MlflowException(
             "Request must specify run_id or run_uuid parameter",
             INVALID_PARAMETER_VALUE,
         )
     run = _get_tracking_store().get_run(run_id)
     experiment_id = run.info.experiment_id
     auth = authenticate_request()
-    username = auth.username
+    auth_ctx = AuthContext.from_authorization(auth)
     return _get_permission_from_store_or_default(
-        lambda: store.get_experiment_permission(experiment_id, username).permission,
+        lambda: store.get_experiment_permission(experiment_id, auth_ctx.username).permission,
         workspace_fallback=lambda: _workspace_permission_for_experiment(auth, experiment_id),
     )

Apply similar changes to functions at lines 899-915, 923-942, 950-976, 984-1012, and 1015-1031.

🤖 Prompt for AI Agents
In mlflow/server/auth/__init__.py around lines 881-886, the code reads
authenticate_request() and directly accesses auth.username which is inconsistent
with the AuthContext pattern used elsewhere and can raise AttributeError;
replace direct attribute access by constructing an AuthContext (e.g., ctx =
AuthContext.from_authenticate_request(auth) or equivalent used in the PR), then
use ctx.username and ctx.is_admin in the permission lookup and
workspace_fallback lambda; apply the same change to the other permission-getter
functions at lines 899-915, 923-942, 950-976, 984-1012, and 1015-1031 so each
builds an AuthContext from authenticate_request() and uses
ctx.username/ctx.is_admin instead of direct auth attributes in all lambdas and
calls.

Comment thread mlflow/server/handlers.py Outdated
Comment on lines +4195 to +4238
def get_endpoints(
get_handler=get_handler,
include_workspace_endpoints=True,
include_workspace_variants=True,
):
"""
Returns:
List of tuples (path, handler, methods)
"""
return (
get_service_endpoints(MlflowService, get_handler)
+ get_service_endpoints(ModelRegistryService, get_handler)
+ get_service_endpoints(MlflowArtifactsService, get_handler)
+ get_service_endpoints(WebhookService, get_handler)
endpoints = (
get_service_endpoints(
MlflowService, get_handler, include_workspace_variants=include_workspace_variants
)
+ get_service_endpoints(
ModelRegistryService,
get_handler,
include_workspace_variants=include_workspace_variants,
)
+ get_service_endpoints(
MlflowArtifactsService,
get_handler,
include_workspace_variants=include_workspace_variants,
)
+ get_service_endpoints(
WebhookService, get_handler, include_workspace_variants=include_workspace_variants
)
+ [(_add_static_prefix("/graphql"), _graphql, ["GET", "POST"])]
+ _workspace_endpoints()
)

if include_workspace_variants:
endpoints.append(
(
_add_static_prefix("/workspaces/<workspace_name>/graphql"),
_strip_workspace_kwarg(_graphql),
["GET", "POST"],
)
)

if include_workspace_endpoints:
endpoints += _workspace_endpoints()

return endpoints

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

get_endpoints adds _workspace_endpoints() twice, causing duplicate route registrations

In get_endpoints:

endpoints = (
    get_service_endpoints(...)
    ...
    + [(_add_static_prefix("/graphql"), _graphql, ["GET", "POST"])]
    + _workspace_endpoints()
)

...

if include_workspace_endpoints:
    endpoints += _workspace_endpoints()

_workspace_endpoints() is always included once, and then again when include_workspace_endpoints is True (the default). This will emit duplicate (path, handler, methods) tuples for all workspace admin routes, which is likely to produce warnings or errors when binding Flask routes.

A minimal fix is to include workspace endpoints only under the flag and drop the unconditional addition:

 def get_endpoints(
     get_handler=get_handler,
     include_workspace_endpoints=True,
     include_workspace_variants=True,
 ):
@@
-    endpoints = (
-        get_service_endpoints(
-            MlflowService, get_handler, include_workspace_variants=include_workspace_variants
-        )
-        + get_service_endpoints(
-            ModelRegistryService,
-            get_handler,
-            include_workspace_variants=include_workspace_variants,
-        )
-        + get_service_endpoints(
-            MlflowArtifactsService,
-            get_handler,
-            include_workspace_variants=include_workspace_variants,
-        )
-        + get_service_endpoints(
-            WebhookService, get_handler, include_workspace_variants=include_workspace_variants
-        )
-        + [(_add_static_prefix("/graphql"), _graphql, ["GET", "POST"])]
-        + _workspace_endpoints()
-    )
+    endpoints = (
+        get_service_endpoints(
+            MlflowService, get_handler, include_workspace_variants=include_workspace_variants
+        )
+        + get_service_endpoints(
+            ModelRegistryService,
+            get_handler,
+            include_workspace_variants=include_workspace_variants,
+        )
+        + get_service_endpoints(
+            MlflowArtifactsService,
+            get_handler,
+            include_workspace_variants=include_workspace_variants,
+        )
+        + get_service_endpoints(
+            WebhookService, get_handler, include_workspace_variants=include_workspace_variants
+        )
+        + [(_add_static_prefix("/graphql"), _graphql, ["GET", "POST"])]
+    )
@@
-    if include_workspace_endpoints:
-        endpoints += _workspace_endpoints()
+    if include_workspace_endpoints:
+        endpoints += _workspace_endpoints()
 
     return endpoints

This keeps the flag meaningful and avoids double-registration.

🤖 Prompt for AI Agents
In mlflow/server/handlers.py around lines 4195-4238, _workspace_endpoints() is
being appended unconditionally to the initial endpoints tuple and then appended
again when include_workspace_endpoints is True, causing duplicate route
registrations; remove the unconditional "+ _workspace_endpoints()" from the
initial endpoints construction and only add workspace endpoints inside the
existing if include_workspace_endpoints: block (keeping the separate
include_workspace_variants handling for the workspace GraphQL variant as-is) so
workspace routes are registered exactly once.

Comment thread mlflow/utils/rest_utils.py Outdated
Comment on lines 9 to 10
from urllib.parse import quote

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Wire _maybe_prefix_with_workspace into the REST call path

The helpers _resolve_active_workspace() and _maybe_prefix_with_workspace() correctly encapsulate:

  • How to determine the active workspace (from tracking context or MLFLOW_WORKSPACE), and
  • How to prefix both /api/2.0/mlflow-artifacts and general /api/.../mlflow/... endpoints, with proper URL-encoding and guards against double-prefixing of /workspaces routes.

However, within this file they are not currently invoked by http_request, http_request_safe, call_endpoint, or call_endpoints, so callers that go through the standard proto-based path (call_endpoint / call_endpoints) will still hit un-prefixed endpoints unless they manually call _maybe_prefix_with_workspace, which is unlikely and brittle.

To make workspace behavior effective for the usual tracking/registry APIs without impacting arbitrary consumers of http_request, you can prefix in call_endpoint:

 def call_endpoint(
     host_creds,
     endpoint,
     method,
     json_body,
     response_proto,
     extra_headers=None,
     retry_timeout_seconds=None,
 ):
@@
-    # Convert json string to json dictionary, to pass to requests
-    if json_body is not None:
-        json_body = json.loads(json_body)
-
-    call_kwargs = {
-        "host_creds": host_creds,
-        "endpoint": endpoint,
-        "method": method,
-    }
+    # Convert json string to json dictionary, to pass to requests
+    if json_body is not None:
+        json_body = json.loads(json_body)
+
+    # Apply workspace prefixing for MLflow REST endpoints when an active workspace is set.
+    endpoint = _maybe_prefix_with_workspace(endpoint)
+
+    call_kwargs = {
+        "host_creds": host_creds,
+        "endpoint": endpoint,
+        "method": method,
+    }

call_endpoints already delegates to call_endpoint, so it will pick up the same behavior. This keeps the workspace logic centralized here, in line with how other REST routing helpers work.

Also applies to: 24-25, 62-111, 621-649

🤖 Prompt for AI Agents
In mlflow/utils/rest_utils.py (around lines 9-10, 24-25, 62-111 and 621-649),
call_endpoint and the logic that constructs per-endpoint request paths must run
the existing _maybe_prefix_with_workspace(path) before performing the REST call
so the active workspace is applied and URL-encoded (do not change http_request
or http_request_safe). Modify call_endpoint to pass the endpoint path through
_maybe_prefix_with_workspace (and ensure call_endpoints continues to delegate to
call_endpoint so it inherits the behavior), and add any necessary unit checks to
avoid double-prefixing of /workspaces routes and to preserve artifact vs general
API routing as implemented by the helper.

@mprahl mprahl force-pushed the multi-tenancy-backend-only branch 2 times, most recently from bb88389 to e545042 Compare November 21, 2025 21:30
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
@mprahl mprahl force-pushed the multi-tenancy-backend-only branch from e545042 to d6b41b1 Compare November 21, 2025 22:37
@mprahl mprahl closed this Nov 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant