WIP: Test CI#15
Conversation
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
…lflow#18838) Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
… count > 5 (mlflow#18841) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
…field (mlflow#17744) Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Signed-off-by: Harutaka Kawamura <harutaka@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
…#18859) Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
…mlflow#18860) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
…#18861) Signed-off-by: Harutaka Kawamura <hkawamura0130@gmail.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
…runners (mlflow#18865) Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com> Signed-off-by: Ben Wilson <39283302+BenWilson2@users.noreply.github.com>
…lflow#18825) Signed-off-by: Alkis Polyzotis <alkis.polyzotis@databricks.com> Co-authored-by: Claude <noreply@anthropic.com>
…flow#18877) Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Signed-off-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com> Signed-off-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com> Co-authored-by: Serena Ruan <82044803+serena-ruan@users.noreply.github.com>
…flow#18882) Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
…#18497) Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Signed-off-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com>
Signed-off-by: Daniel Lok <daniel.lok@databricks.com> Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
…lflow#18893) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
…irect judge invocation (mlflow#18897) Signed-off-by: Xiang Shen <xshen.shc@gmail.com>
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
…low#18382) Signed-off-by: Manu Compen <manu.compen@justeattakeaway.com> Co-authored-by: Manu Compen <manu.compen@justeattakeaway.com> Co-authored-by: Serena Ruan <82044803+serena-ruan@users.noreply.github.com>
…234 (mlflow#18920) Signed-off-by: Henrikas <henrikasgir@gmail.com>
…mlflow#18915) Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis PR implements comprehensive multi-tenancy support via workspaces in MLflow, adding workspace management APIs, tenant-isolated resource scoping, workspace-aware permissions, new SQLAlchemy/REST store implementations, database schema updates for workspace permissions, artifact repository isolation, and extensive documentation with associated CLI configuration options and server lifecycle integration. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as MLflow Client
participant REST as MLflow REST API
participant Auth as Auth Layer
participant Store as Workspace Store
participant DB as Database
Client->>REST: POST /mlflow/workspaces (name, description)
REST->>Auth: validate_can_create_workspace()
Auth->>Store: create_workspace()
Store->>DB: INSERT INTO workspaces (name, description)
DB-->>Store: workspace_id
Store-->>REST: Workspace(name, desc)
REST-->>Client: 201 Created
Note over Client,DB: Setting Active Workspace Context
Client->>Client: set_current_workspace("team-a")
Client->>REST: POST /mlflow/experiments (name) [X-MLFLOW-WORKSPACE: team-a]
REST->>Auth: workspace context via X-MLFLOW-WORKSPACE header
Auth->>Store: resolve workspace & check permissions
Store-->>Auth: Workspace("team-a")
Auth-->>REST: Permission check passed
REST->>DB: INSERT INTO experiments (workspace, name)
DB-->>REST: experiment_id
REST-->>Client: Experiment(workspace="team-a")
sequenceDiagram
participant User as User/Client
participant Server as MLflow Server
participant PermStore as Permission Store
participant WorkspaceStore as Workspace Store
participant Default as Default Workspace
User->>Server: GET /mlflow/workspaces
Server->>PermStore: list_accessible_workspace_names(username)
PermStore->>WorkspaceStore: list all workspaces
loop For each workspace
PermStore->>PermStore: check user has RBAC permission
Note over PermStore: Permission Resolution Order:<br/>1. Explicit resource permission<br/>2. Workspace-level permission<br/>3. Default permission
end
PermStore-->>Server: [accessible workspace names]
Server-->>User: [filtered list]
Note over User,Default: Workspace Resolution Order
rect rgb(200, 220, 255)
User->>Server: Request with/without X-MLFLOW-WORKSPACE header
Server->>Server: 1. Check request header
Server->>WorkspaceStore: 2. Check active context
Server->>WorkspaceStore: 3. Use default workspace
WorkspaceStore-->>Server: Resolved workspace
end
Estimated code review effort🎯 5 (Critical) | ⏱️ ~150+ minutes Specific areas requiring extra attention:
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings, 1 inconclusive)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/store/tracking/test_rest_store.py (1)
2882-2947: Avoid real network timeout dependence intest_log_spans_with_version_checkThe last part of
test_log_spans_with_version_checkrelies on a real network timeout and assertselapsed_time < 5, which can be brittle under slow or noisy CI/network conditions even if the code is correct. You already validate the timeout andretry_timeout_secondswiring intest_server_version_check_caching.Consider refactoring the “could not identify MLflow server version” scenario to stay fully mocked (e.g., have
http_requestimmediately raise or return a sentinel) and assert on the arguments passed tohttp_requestinstead of wall‑clock time, so the test remains deterministic and fast.Also applies to: 2950-3013
mlflow/server/handlers.py (1)
3080-3175: Multipart upload workspace scoping assumes updated artifact repo APIThe multipart upload handlers now:
- Derive a workspace-prefixed
repo_pathfrom the URLartifact_path.- Still use
request_message.pathas the logical path while passingrepo_pathas an extra argument tocreate_multipart_upload,complete_multipart_upload, andabort_multipart_upload.This is reasonable, but it relies on all
MultipartUploadMixinimplementations being updated to accept the new extrarepo_pathparameter in that exact position. If any third-party or out-of-tree artifact repos exist, this could be a subtle breaking change. It may be worth:
- Keeping the old signature as a fallback via
inspect.signature/*args, **kwargs, or- Explicitly documenting the new interface as a required upgrade for custom artifact repos.
Given this PR’s scope, I’d at least add a short comment near these calls noting the expectation on
MultipartUploadMixinimplementers.
♻️ Duplicate comments (3)
tests/tracking/test_mlflow_artifacts.py (1)
21-40: Server launch correctly forces workspaces off, once CLI flag existsAppending
--no-enable-workspaceshere matches the test intent of preventing the artifacts server from inheriting workspace-related env vars. This will behave as expected once theservercommand exposes the--enable-workspaces/--no-enable-workspacespair (see comment inmlflow/cli/__init__.py). No further changes needed in this helper.tests/test_cli.py (2)
1000-1007: Env-file passthrough test correctly pins workspaces off by default
test_server_with_env_filenow invokes the CLI with--no-enable-workspaces, ensuring the env-file behavior is validated without enabling workspaces by default. Once theservercommand exposes the--enable-workspaces/--no-enable-workspacestoggle, this test will accurately cover the intended path and confirm thatenv_fileis forwarded to_run_server.
39-43: Test helpers assume--no-enable-workspacesexists; must be backed by CLI
invoke_serverunconditionally prepends--no-enable-workspaceswhen neither workspace flag is present, and several tests (e.g.,test_server_with_env_file) pass--no-enable-workspacesexplicitly. However, theservercommand currently defines only--enable-workspaces, so these invocations will fail until the CLI gains a matching negative flag.Once
mlflow/cli/__init__.pyis updated to declare--enable-workspaces/--no-enable-workspacesas a toggle option, this helper makes sense and provides a robust way to keep workspaces disabled by default in tests, regardless of any inherited env vars.No changes needed here beyond ensuring the CLI option is updated as suggested in the CLI module.
Also applies to: 55-71
🧹 Nitpick comments (46)
tests/store/tracking/test_sqlalchemy_store.py (1)
7791-7807: Tags-on-deleted-model behavior validated; consider mirroring the resurrection checkThis test mirrors the params case and correctly asserts that
SqlLoggedModelTagentries are written for a deleted model ID. For symmetry withtest_log_logged_model_params_on_deleted_model, you could optionally also assert thatget_logged_model(model.model_id)still raises to document that tag writes similarly don’t resurrect the model.mlflow/models/model.py (1)
1159-1178: Run param merging and experiment_id propagation look correct; consider payload size & helper signatureThe new logic to pull
run_obj.data.params, merge them with explicitparams(with explicit values taking precedence), and passexperiment_idfromrun_obj.info.experiment_idinto_create_logged_modelis semantically sound and aligns with attaching the logged model tightly to its source run/experiment.Two minor points to keep in mind:
- For runs with many parameters,
params = {**run_params, **(params or {})}will now copy the entire run param set into every logged model created in that run. If you expect very large param payloads, you might want to either restrict which run params are propagated or introduce an explicit allowlist to avoid bloating the LoggedModel record.- This call site now depends on
_create_logged_modelaccepting anexperiment_idkeyword. Please make sure the helper’s signature (and any other call sites) is updated accordingly so you don’t hit unexpectedTypeErrorat runtime.mlflow/store/artifact/http_artifact_repo.py (1)
39-60: Propagateworkspace_componentthrough MPU path construction for consistencyRight now
_get_workspace_scoped_artifact_uriand_get_artifacts_endpoint_prefixpreserve existing behavior, and workspace handling is effectively encoded in thepathportion. However:
- In
_construct_mpu_uri_and_path, you call_get_workspace_scoped_artifact_uri()but discardworkspace_componentand then hard‑codeNonewhen calling_get_artifacts_endpoint_prefix.- If
_get_artifacts_endpoint_prefixis later updated to actually vary byworkspace_component(as its signature suggests), MPU endpoints will silently diverge fromlist_artifactsbehavior.To keep things consistent and future‑proof without changing current behavior, consider:
- def _construct_mpu_uri_and_path(self, base_endpoint, artifact_path): - artifact_uri, _ = self._get_workspace_scoped_artifact_uri() - artifacts_endpoint = self._get_artifacts_endpoint_prefix(None) + def _construct_mpu_uri_and_path(self, base_endpoint, artifact_path): + artifact_uri, workspace_component = self._get_workspace_scoped_artifact_uri() + artifacts_endpoint = self._get_artifacts_endpoint_prefix(workspace_component)This is a no‑op today (since the parameter is intentionally unused), but it avoids a subtle bug if workspace‑specific prefixes are introduced later.
Also applies to: 101-107, 136-140
mlflow/server/otel_api.py (1)
59-65: Convert_get_workspace_store()failures to HTTP errors as wellThe new
_workspace_request_contextnicely mapsMlflowExceptionfrom workspace resolution (e.g. missing active workspace) intoHTTPException, but there’s a small gap:def _workspace_request_context(request: Request): if not MLFLOW_ENABLE_WORKSPACES.get(): yield None return store = _get_workspace_store() try: ... except MlflowException as e: raise HTTPException(status_code=e.get_http_status_code(), detail=e.message)If
_get_workspace_store()itself raises anMlflowException(e.g., unsupported or malformed workspace URI), that exception will bypass theexceptblock and surface as a 500 rather than a structured HTTP error, unlike the other workspace failures you’re handling.To make workspace‑related misconfiguration errors consistent, consider:
- if not MLFLOW_ENABLE_WORKSPACES.get(): - yield None - return - - store = _get_workspace_store() - try: + if not MLFLOW_ENABLE_WORKSPACES.get(): + yield None + return + + try: + store = _get_workspace_store() workspace_name = request.headers.get(WORKSPACE_HEADER_NAME) ...This keeps behavior unchanged for the happy path while ensuring all
MlflowExceptions coming from workspace resolution (including store construction) are mapped to appropriate HTTP responses.Also applies to: 85-96, 179-195, 199-225
mlflow/store/artifact/artifact_repo.py (1)
13-14: Good workspace hook onArtifactRepository, minor contract suggestionUsing
typing_extensions.Selfhere is appropriate, andfor_workspace()returningselfby default provides a non-breaking extension point for workspace-aware implementations.If you expect concrete repositories to wrap/adjust configuration per workspace, consider documenting that
for_workspace()should be idempotent for the sameworkspace_name(e.g., avoid wrapping multiple times on repeated calls) to keep behavior predictable, but this is an optional nicety rather than a blocker.Also applies to: 397-407
tests/tracking/_tracking_service/test_utils.py (1)
38-39: Session maker mocking improves isolation; optional assertion for coveragePatching
mlflow.store.db.utils._get_managed_session_makerwithmock_get_managed_session_makerin the SQLAlchemy store tests is a good move: it keeps these tests hermetic and avoids exercising real DB session lifecycle code while still validating_get_storewiring.If you want slightly stronger coverage, you could assert that
mock_get_managed_session_makerwas called in each test, but that’s optional given the current goals.Also applies to: 191-195, 231-233
mlflow/entities/workspace.py (1)
14-16: Simple and correct serialization method.The
to_dict()method provides a straightforward way to serialize the Workspace entity to a dictionary. The implementation is correct and follows common patterns.Consider using
dataclasses.asdict()for consistency with the dataclass pattern:+from dataclasses import asdict, dataclass + @dataclass(frozen=True, slots=True) class Workspace: """Minimal metadata describing a workspace.""" name: str description: str | None = None def to_dict(self) -> dict[str, str | None]: - return {"name": self.name, "description": self.description} + return asdict(self)However, the current explicit approach is also fine and may be more maintainable if the dictionary structure needs to differ from the dataclass fields in the future.
mlflow/store/artifact/artifact_repository_registry.py (1)
20-21: Workspace-aware artifact repository wrapping is sound; consider a small robustness tweakThe pattern of instantiating the repository first and then delegating to
for_workspace(workspace_name)when available cleanly layers workspace scoping on top of existing implementations without breaking non-workspace repos.To make this a bit more defensive against accidental non-callable attributes named
for_workspace, you could tighten the check:- repository_instance = repository( - artifact_uri, tracking_uri=tracking_uri, registry_uri=registry_uri - ) - - workspace_name = get_current_workspace() - if workspace_name and hasattr(repository_instance, "for_workspace"): - repository_instance = repository_instance.for_workspace(workspace_name) + repository_instance = repository( + artifact_uri, tracking_uri=tracking_uri, registry_uri=registry_uri + ) + + workspace_name = get_current_workspace() + for_workspace = getattr(repository_instance, "for_workspace", None) + if workspace_name and callable(for_workspace): + repository_instance = for_workspace(workspace_name)Behavior remains the same for correctly implemented repositories, but avoids surprises if a plugin accidentally exposes a non-callable
for_workspaceattribute.Also applies to: 78-92
tests/entities/model_registry/test_registered_model.py (1)
174-189: Avoid hard-coding'default'in repr expectationThe string repr test currently bakes in
"workspace='default'", which could get out of sync ifDEFAULT_WORKSPACE_NAMEever changes.You can keep the assertion resilient by tying it to the constant you already import:
- assert ( - str(rmd) == "<RegisteredModel: aliases={}, creation_timestamp=1000, " - "deployment_job_id=None, deployment_job_state=None, description='something about a model'," - " last_updated_timestamp=2002, " - "latest_versions=['1', '2', '3'], name='myname', tags={}, workspace='default'>" - ) + expected_workspace = DEFAULT_WORKSPACE_NAME + assert ( + str(rmd) == "<RegisteredModel: aliases={}, creation_timestamp=1000, " + "deployment_job_id=None, deployment_job_state=None, description='something about a model'," + " last_updated_timestamp=2002, " + f"latest_versions=['1', '2', '3'], name='myname', tags={}, workspace='{expected_workspace}'>" + )This keeps the test aligned with whatever the configured default workspace name is.
tests/entities/test_experiment.py (1)
72-90: Consider adding a proto round-trip check for non-default workspacesThe dict-based round-trip for a non-default workspace is solid. To fully pin down behavior, consider an additional assertion like
Experiment.from_proto(exp.to_proto()).workspacefor a non-default workspace (either expecting it to reset to the default or to preserve the explicit value, per intended contract).tests/server/test_init.py (1)
9-10: Uvicorn env propagation test is good; consider simplifyingsys.platformpatchThe parametrized
test_run_server_with_uvicorncleanly verifies that_MLFLOW_SGI_NAMEis set and thatMLFLOW_FLASK_SERVER_SECRET_KEYis only included inextra_envwhen present in the environment, which matches the intended behavior.One small improvement: instead of
mock.patch("sys.platform", return_value="linux"), consider patching with a plain string (mock.patch("sys.platform", "linux")) so any equality checks onsys.platformoperate on an actual string rather than aMagicMock.Also applies to: 162-204
docs/docs/self-hosting/workspaces/index.mdx (1)
51-80: Doc matches current workspace semantics; keep it aligned with provider behaviorThe sections on artifact isolation and workspace resolution match the store logic:
- Artifact URIs are constructed by
_get_artifact_location, which uses the workspace provider’sresolve_artifact_root(...)->(root, should_append)to decide whether to add theworkspaces/<workspace>prefix, matching the documented layout.create_experimentnow rejects client-specifiedartifact_locationwhen workspaces are enabled, consistent with the “Client-specified artifact locations” note.- Workspace resolution order in the doc should remain consistent with the actual
_get_active_workspace/ provider implementation; if the default SQL workspace provider ever changes itsresolve_artifact_rootor default-workspace behavior, this page will need updating.No blocking issues from a docs/behavior perspective.
Also applies to: 109-118
tests/entities/test_evaluation_dataset.py (1)
207-225: Non-default workspace round-trip coverage looks solidThis test cleanly verifies that a non-default workspace is preserved through
to_dict/from_dictforEvaluationDataset, with_recordsand_experiment_idsinitialized to safe empties. If you ever want an extra guard on constructor behavior, you could also assertdataset.workspace == workspacebefore serialization, but the current coverage is already sufficient.mlflow/__init__.py (1)
266-273: Workspace APIs are correctly wired into the publicmlflownamespaceImporting the six workspace functions from
mlflow.tracking._workspace.fluentunderif not IS_TRACING_SDK_ONLYand adding them to__all__aligns the public API with the new docs (mlflow.set_workspace(),mlflow.list_workspaces(), etc.) without impacting the tracing-only package. If you want richer IDE completion, you could also mirror these names in theTYPE_CHECKINGimport block, but that’s optional.Also applies to: 332-407
mlflow/store/tracking/sqlalchemy_store.py (1)
648-788: Central workspace-scoped query helpers are used consistently across runs, traces, and logged modelsThe introduction and use of:
_workspace_session()in experiment-facing methods (_search_experiments,get_experiment,get_experiment_by_name, delete/restore/rename),_runs_query()for all run-based lookups (_get_run,_get_deleted_runs,_list_run_infos,_search_runs,_get_run_inputs),_trace_query()for trace info (_get_sql_trace_info,search_traces,_delete_traces,_get_trace,batch_get_traces,_get_sql_assessment),_logged_model_query()and_get_logged_model_record()for logged model CRUD/tagging,ensures that when workspaces are enabled, every higher-level operation is automatically constrained to the active workspace via joins to
SqlExperiment.workspace. The explicit use of:
_get_run(..., session=session)before metric/history operations, and_get_sql_trace_info(...)or_get_logged_model_record(...)before mutating trace/logged-model tags or deleting rows,prevents cross-workspace access to runs, traces, and logged models.
Given these helpers are now the choke points for entity visibility, it’s important that any future queries over
SqlRun,SqlTraceInfo,SqlLoggedModel, orSqlEvaluationDatasetin this store also route through them rather than rawsession.query(...). Within this file, that convention appears to be followed.Also applies to: 1833-1899, 2128-2152, 2178-2208, 2952-3070, 3489-3492, 4915-4963
mlflow/utils/rest_utils.py (1)
23-24: Workspace header resolution logic looks solid; consider centralizing normalizationThe new flow does the right things overall:
- Prefers
get_current_workspace()overMLFLOW_WORKSPACEand ignores empty strings.- Only sets
WORKSPACE_HEADER_NAMEwhen a workspace is resolved and the endpoint is not a/mlflow/workspacesadmin path.- Uses
setdefaultso explicitextra_headerscan override the workspace header.Two minor refinements to consider:
_resolve_active_workspace()is annotated to returnstr | Nonebut can currently return a non‑strobject ifget_current_workspace()does so. It would be safer/clearer either to:
- Explicitly cast to
strwhen building the header value, or- Delegate to a single normalization helper (e.g.
resolve_entity_workspace_name) so all call sites share the same semantics._should_include_workspace_header()uses a substring match on"/mlflow/workspaces". That’s fine for current routes, but if you later add non‑admin endpoints whose paths also contain that segment, they’ll be excluded from header injection as well. If that’s not desired, tightening the match (e.g. to known prefixes) might be worth it.These are optional cleanup items; the current behavior is functionally consistent with the workspaces design.
If you want an automated check for non‑string workspace values being passed to headers, you can temporarily log
type(workspace)inside_resolve_active_workspace()in your dev environment and exercise a few workspace contexts to confirm they’re always strings.Also applies to: 44-45, 62-78, 80-93, 157-160
mlflow/server/auth/db/migrations/versions/2ed73881770d_workspace_permissions.py (1)
19-77: Migration is carefully designed; conflict guard on downgrade is a nice touchThe migration looks robust:
Upgrade path:
- Creates
workspace_permissionswith a composite PK and helper indexes onusernameandworkspace.- Adds a nullable
workspacecolumn toregistered_model_permissions, backfills existing rows to"default", then enforcesNOT NULL.- Detects the legacy
unique_name_userconstraint via the inspector before dropping it and replaces it withunique_workspace_name_useron(workspace, name, user_id).Downgrade path:
- Explicitly checks for
(name, user_id)combinations spanning multiple workspaces before droppingworkspace, preventing silent data conflicts.- Cleanly drops the workspace permissions table and restores the original unique constraint.
I don’t see correctness issues here. If you expect frequent lookups by
(workspace, username, resource_type), you might later consider adding a composite index to match that pattern, but that can be deferred until profiling shows a need.Also applies to: 79-105
tests/server/auth/db/test_cli.py (1)
20-29: CLI upgrade tests align with new schema; consider also asserting workspace table in legacy pathThe updated tests correctly track the new migration state:
test_upgradenow includesworkspace_permissionsin the expected table set, usingsorted()to avoid order sensitivity.test_auth_and_tracking_store_coexistasserts that bothregistered_model_permissionsandworkspace_permissionscoexist with tracking tables, which is exactly what we want after a combined upgrade.test_upgrade_from_legacy_databaseverifies the new head revision (2ed73881770d) and thatscorer_permissionsexists after upgrading a legacy DB.One optional enhancement: in
test_upgrade_from_legacy_database, also assert that"workspace_permissions" in tablesto ensure the workspace schema is created correctly in the legacy‑upgrade flow as well, not just in a fresh DB.Also applies to: 50-57, 118-124
docs/docs/self-hosting/workspaces/configuration.mdx (1)
1-258: Workspace configuration docs are clear and consistent; minor cleanup possibleThe documentation does a good job explaining:
- How to enable workspaces (flag vs
MLFLOW_ENABLE_WORKSPACES) and how clients select the active workspace (mlflow.set_workspace,MLFLOW_WORKSPACE, provider default).- That the
X-MLFLOW-WORKSPACEheader is used for scoping while keeping/apiand/ajax-apipaths stable, matching the client behavior inrest_utils.- Startup validation, migration/rollback behavior (including conflict checks), and the
create_experiment()limitation are all called out explicitly.- The CLI, Docker Compose, and Kubernetes examples are realistic and easy to reuse.
Only small nit:
Linkis imported at the top but not used; you can drop that import if the MDX linter complains, otherwise it’s harmless.Overall, this doc looks ready to ship.
mlflow/server/auth/client.py (1)
672-749: Consider adding usage examples to workspace permission methods.The workspace permission methods lack usage examples, while other methods in this file include comprehensive docstring examples (see lines 76-94 for
create_userexample). Adding examples would improve developer experience and maintain consistency.Example for
list_workspace_permissions:def list_workspace_permissions(self, workspace_name: str): """ List the permissions configured for the specified workspace. Args: workspace_name: The workspace name. Returns: A list of :py:class:`mlflow.server.auth.entities.WorkspacePermission` objects. .. code-block:: python :caption: Example from mlflow.server.auth.client import AuthServiceClient client = AuthServiceClient("tracking_uri") permissions = client.list_workspace_permissions("team-a") for perm in permissions: print(f"user: {perm.username}, resource: {perm.resource_type}, perm: {perm.permission}") """mlflow/entities/model_registry/model_version.py (1)
190-214: Consider adding workspace handling comments to proto methods for consistency.Unlike
Experiment(lines 87-88, 118-119) andWebhook(lines 340-341, 364-365), thefrom_protoandto_protomethods here lack comments explaining that workspace is intentionally derived from active context and not persisted in ProtoModelVersion. Adding these comments would improve consistency and clarity across workspace-aware entities.In
from_proto, add a comment before the return statement:@classmethod def from_proto(cls, proto) -> "ModelVersion": # input: mlflow.protos.model_registry_pb2.ModelVersion # returns: ModelVersion entity + # Workspace is intentionally derived from the request context (falling back to the active + # workspace resolver) and therefore is not persisted in the ProtoModelVersion. model_version = cls(In
to_proto, add a comment before the return statement:model_version.aliases.extend(self.aliases) if self.deployment_job_state is not None: ModelVersionDeploymentJobState.to_proto(self.deployment_job_state) + # Workspace is intentionally omitted because it is derived from the active context rather + # than read from ProtoModelVersion serialization. # TODO: Include params, metrics, and model ID in proto return model_versionAlso applies to: 216-248
mlflow/store/model_registry/base_rest_store.py (2)
1-11: Workspace probing logic is sound; consider not permanently caching transient failuresThe probe endpoint, timeout, and status-code check (treating 200/401/403 as “workspace-capable”) make sense and the debug logging on failures is helpful. One thing to watch is that any initial network exception or 5xx/non-404 response will cause
_probe_workspace_support()to returnFalse, and that value is then cached in_workspace_supportfor the lifetime of theBaseRestStoreinstance. In situations where the server is briefly misconfigured or experiencing a transient outage, this will make all future workspace-using calls look like the server simply “does not support workspaces”.You may want to distinguish “definitive no” (e.g., 404) from transient/ambiguous failures and avoid caching the latter, so a later call can re-probe instead of locking in a false negative.
Also applies to: 20-53
69-81: Workspace gating before REST calls is good; ensure_resolve_active_workspaceand header propagation stay alignedNormalizing the active workspace (stripping whitespace and treating empty strings as
None) and only enforcing workspace support when a workspace is actually requested is a good pattern, and the FEATURE_DISABLED error message is clear.Since
workspacehere is only used to decide whether to gate and to format the error message—and not passed intocall_endpoint—the actual propagation of the workspace (e.g., via headers) remains the responsibility ofrest_utils. It’s worth double‑checking that:
rest_utils._resolve_active_workspace()always returns eitherstrorNone(no other truthy sentinels), and- the same resolution logic is used wherever the workspace header is set,
so that the gating condition here precisely matches when the client will actually send a workspace to the server.
mlflow/store/workspace_aware_mixin.py (1)
22-80: Workspace mixin design is solid; consider minor robustness and naming clarificationsThis mixin nicely centralizes “active workspace + SQLAlchemy session” handling and gives a clean
_workspace_session()pattern for callers.A couple of small tweaks you might consider:
- In
_get_active_workspace,get_current_workspace()is used as‑is. For consistency with other utilities (e.g.,resolve_entity_workspace_name), you could strip whitespace and treat empty strings as unset before returning or raising.supports_workspaces()always returnsTrue, even whenMLFLOW_ENABLE_WORKSPACESis not set. If callers interpret this as a runtime capability check, that could be surprising; either wiring_workspaces_enabled()into this method or clarifying in the docstring that it reflects structural support (not whether the feature is enabled) would make intent clearer.These are non‑blocking but would make the behavior more predictable and aligned with the rest of the workspace utilities.
docs/docs/self-hosting/workspaces/permissions.mdx (1)
35-40: Permissions documentation is clear; consider aligning experiment identifier in examplesThe explanation of how workspace permissions interact with resource‑level permissions and
list_workspaces()is clear and the curl/Python snippets are very helpful.One small nit: earlier you refer to an experiment as
exp-123, while the later Python example usesmlflow.get_experiment("123"). To avoid any confusion over whether that argument is meant to be an ID vs a name, it might be worth aligning the identifier format (e.g., consistently using123orexp-123) across the narrative and code samples.Also applies to: 115-139
mlflow/server/auth/db/models.py (1)
104-125: Workspace permission model design looks sound; consider future-proofing relationships
SqlWorkspacePermission’s composite primary key(workspace, username, resource_type)plus indices onusernameandworkspaceshould perform well for the expected access patterns. If you later need to surface workspace permissions viaSqlUser(or ensure referential integrity), consider adding a FK tousers.usernameand a relationship onSqlUser; for now, this standalone design is acceptable.tests/server/test_workspace_handlers.py (1)
127-149: Before/teardown handler test is good; consider asserting teardown effectThe artifact-route test verifies that
workspace_before_request_handlersetsg.mlflow_workspacefrom the header and returnsNone, which is valuable. You might optionally also assert thatworkspace_teardown_request_handlerclears any workspace-related state (for example, checking thatgno longer hasmlflow_workspace) to catch future regressions in teardown behavior.mlflow/tracking/_workspace/utils.py (1)
1-52: Workspace URI resolution and precedence look correctThe precedence (explicit arg → in‑process override /
MLFLOW_WORKSPACE_URI→ resolved tracking URI) matches the docstring, and the lazy import avoids circular dependencies cleanly. The additional synchronization toMLFLOW_WORKSPACE_URIinset_workspace_uriis reasonable as a “global override” mechanism; callers that use this in tests should remember to callset_workspace_uri(None)to fully reset both the in‑memory and env-backed state.tests/entities/model_registry/test_model_version.py (1)
7-8: Solid workspace coverage; consider avoiding hard-coded default name in repr expectationThe additions nicely validate that:
- Default workspace is surfaced on
ModelVersion(_checkandexpected_dict["workspace"]usingDEFAULT_WORKSPACE_NAME), and- Non-default workspaces round-trip via
dict(model_version)→from_dictionaryand appear in__str__.One minor robustness tweak:
test_string_reprcurrently hard-codes"workspace='default'"instead of relying onDEFAULT_WORKSPACE_NAME, which is already imported. If the default workspace name ever changes, this test will be the one brittle spot.You could update the assertion as:
- assert str(model_version) == ( + assert str(model_version) == ( "<ModelVersion: aliases=[], creation_timestamp=12, current_stage='Archived', " "deployment_job_state=None, " "description='This is a test model.', last_updated_timestamp=100, metrics=None, " "model_id=None, name='myname', params=None, run_id='some run', " "run_link='http://localhost:5000/path/to/run', source='path/to/a/notebook', " "status='PENDING_REGISTRATION', status_message='Copying!', tags={}, user_id='user one', " - "version='43', workspace='default'>" + f\"version='43', workspace='{DEFAULT_WORKSPACE_NAME}'>\" )Everything else in this file looks consistent with the new workspace semantics.
Also applies to: 27-28, 43-44, 88-108, 159-185, 188-214
mlflow/tracking/_workspace/context.py (1)
18-23: Consider documenting why the default workspace skips validation
set_current_workspacevalidates all non‑Nonenames exceptDEFAULT_WORKSPACE_NAME. If this is intentional (e.g., to allow a special sentinel that wouldn't pass normal validation), a brief comment in the function docstring or next to the conditional would help future readers understand why the default is treated differently.mlflow/tracking/_workspace/client.py (1)
22-27: Optional: add a return type annotation forproviderTo help static analysis and IDEs, consider annotating
providerwith the expected interface (e.g., aWorkspaceStoreprotocol or concrete base class) so callers get better type hints when chaining throughclient.provider.mlflow/store/workspace/rest_store.py (1)
44-71: Handle empty success bodies for non‑201 creates as wellIn
create_workspace, you already handle the case where a201response has an empty body by synthesizing theWorkspacefrom the request. For other 2xx success codes (e.g., 200 or 204), the current branch:response = verify_rest_response(response, endpoint) data = json.loads(response.text) if response.text else {} return Workspace(name=data["name"], description=data.get("description"))will raise a
KeyErrorif the server legitimately returns a success status with no body.You can mirror the 201-path fallback for this case:
- response = verify_rest_response(response, endpoint) - data = json.loads(response.text) if response.text else {} - return Workspace(name=data["name"], description=data.get("description")) + response = verify_rest_response(response, endpoint) + data = json.loads(response.text) if response.text else {} + if not data: + data = {"name": workspace.name, "description": workspace.description} + return Workspace(name=data["name"], description=data.get("description"))This keeps client behavior robust against servers that treat workspace creation as a no-body success.
tests/server/test_handlers.py (1)
289-383: Regex in_build_endpoint_case_idlikely not doing intended sanitizationIn
_build_endpoint_case_id, the pattern:sanitized_path = re.sub(r"[\\W]+", "_", path.strip("/"))matches a literal backslash+
Wsequence rather than non‑word characters. That means endpoint IDs may still contain<,>,:etc., instead of being normalized, which is harmless but defeats the purpose of the helper.You probably meant to use
\W:- sanitized_path = re.sub(r"[\\W]+", "_", path.strip("/")) + sanitized_path = re.sub(r"[\W]+", "_", path.strip("/"))This keeps the behavior self‑documenting and ensures stable, clean pytest IDs.
mlflow/entities/evaluation_dataset.py (1)
35-48: Workspace plumbing and deleted‑dataset handling in EvaluationDataset look sound
- Adding an optional
workspaceargument (normalized viaresolve_entity_workspace_name) plus a read‑onlyworkspaceproperty, and including it into_dict/from_dict, keeps the entity API consistent with other workspace‑aware objects without breaking existing callers.- The
to_dfchange to catchMlflowExceptionwithRESOURCE_DOES_NOT_EXISTand return an empty DataFrame (while caching_records = []) is a reasonable, non‑surprising fallback when the dataset has been removed in the backend.If not already covered elsewhere, it would be worth adding a focused test that simulates
get_dataset/_load_dataset_recordsraisingRESOURCE_DOES_NOT_EXISTand asserts thatto_df()returns an empty DataFrame and doesn’t re‑query on subsequent calls.Also applies to: 59-60, 92-98, 326-348, 426-442, 463-475
mlflow/store/workspace/abstract_store.py (1)
79-129: Consider de-duplicating validation logic betweenis_validandvalidate
WorkspaceNameValidator.is_validandWorkspaceNameValidator.validateboth implement the same sequence of checks (type, length, pattern, reserved set) in parallel. This is correct but slightly brittle: any future change to the rules must be updated in two places.Consider refactoring to centralize the rules (e.g., implement
is_validin terms of a private_errors(name)helper, or havevalidatecallis_validand only be responsible for constructing error messages) to avoid divergence over time.tests/tracking/test_rest_tracking.py (1)
4733-5191: GraphQL helpers and workspace-aware GraphQL tests are coherentThe
_graphql_post/_graphql_error_messageshelpers, together with thetest_workspace_graphql_*cases, correctly:
- Inject the workspace header from
get_current_workspace()so tests exercise the same behavior as real clients.- Validate both positive cases (data returned in the active workspace) and negative ones (errors or empty results across workspaces).
- Cover model versions, runs, artifacts, and default-workspace semantics.
One optional enhancement (non-blocking) would be future tests around GraphQL batching, since
_graphql_postand_resolve_rbac_resource_typecurrently assume a singleoperationNameper request, but this is sufficient for the current scope.mlflow/server/workspace_helpers.py (2)
25-79: RBAC resource and GraphQL operation maps are sensible and aligned with endpointsThe
_RBAC_RESOURCE_PREFIX_MAPand_GRAPHQL_OPERATION_RESOURCE_MAPcover the major REST/GraphQL surfaces (experiments, traces, datasets, logged models, registry, workspaces, gateway) and cleanly collapse them into the three RBAC resource types. Defaulting unknown GraphQL operations to"experiments"via_DEFAULT_GRAPHQL_RESOURCEis a reasonable fallback for now.As the schema evolves, it’ll be important to keep these maps synchronized with new endpoints/operations, but the structure here makes that straightforward.
153-185: Workspace store resolution and caching are appropriate with clear failure modes
_get_workspace_storeenforcesMLFLOW_ENABLE_WORKSPACES, resolvesworkspace_uriviaworkspace_utils.resolve_workspace_uri, and returns a process-wide cached store instance. The error message when the URI can’t be resolved is explicit and should help diagnose misconfiguration.Given the memoization behavior, tests and any dynamic reconfiguration correctly reset
_workspace_storeexternally (as seen in the test fixture). If you expect hot-reload of workspace configuration in long-lived processes, a small public helper to clear/reload this cache could be a nice future enhancement, but it’s not required for correctness.mlflow/server/handlers.py (3)
535-584: Workspace backend initialization checks are sensibleThe extended
initialize_backend_storeslogic correctly:
- Enforces
--enable-workspaces+default_artifact_rootpairing.- Ensures tracking and (optional) registry backends advertise
supports_workspaces().- Optionally validates that the auth backend supports workspace permissions.
- Initializes the workspace store once workspaces are enabled.
This is a good place to centralize capability checks. One minor suggestion: if
workspace_store_uriis misconfigured, surfacing that via a more targeted error message (rather than just_get_workspace_storefailure) would aid operators, but it’s not a blocker.
2935-3041: Workspace-scoped artifact repository paths – logic is mostly solid, watch edge cases
_workspace_scoped_repo_pathand its integration into_download_artifact,_upload_artifact,*_list_*, and*_delete_*correctly:
- Prefix artifact operations with
workspaces/<current_workspace>when a workspace is active.- Treat pre-prefixed paths (
workspaces/...) as idempotent to avoid double-prefixing.- Preserve client-visible paths (responses still expose logical names like basenames).
A couple of small points to keep in mind:
- Because workspace names become path segments, it’s critical that
WorkspaceNameValidatorforbids path separators and..—which appears to be assumed here. If that’s not already enforced, adding it there is preferable to re-validating here.- The
normalized or posixpath.join("workspaces", workspace)branch instartswith("workspaces/")is effectively dead code sincenormalizedis never empty in that branch; you can simplify this without behavior change later.Functionally, this is fine as-is.
4089-4106: Extendedget_endpointsAPI remains backwards compatibleAdding the
include_workspace_endpointsparameter with a default ofTrueis backwards compatible with existing call sites that only passget_handler. The workspace endpoints themselves are further guarded by_disable_if_workspaces_disabled, so they won’t be usable when multi-tenancy is off but won’t break routing.If there are non-server callers relying on the exact endpoint list, consider documenting the new flag so they can opt out explicitly if needed, but behavior here looks correct.
tests/server/auth/test_auth.py (2)
418-454: Workspace-aware test scaffolding is solid; PK shape assumption is the only subtle dependencyThe
_TrackingStore/_RegistryStoreshims andworkspace_permission_setupfixture do a good job of:
- Modeling experiment/run/trace/logged-model → workspace mappings.
- Initializing a real
SqlAlchemyStoreagainst a temp SQLite DB.- Installing the store into
auth_moduleand wiring_get_tracking_store/_get_model_registry_store.- Managing
workspace_contexttokens and disposing the engine infinally.The only implicit assumption is that
SqlWorkspacePermission’s primary key matches the(workspace, username, resource_type)tuple used insession.get. That’s consistent with the new store code, but if the model ever changes, these tests will crash rather than fail logically—which is acceptable for unit tests.Also applies to: 457-462, 465-510
645-769: Experiment + artifact proxy validator tests cover manage/read/none permutations wellThe group of tests around experiments and experiment artifact proxies:
- Verify that
MANAGEgrants full access (read/update/delete/create).- Ensure
READallows only safe reads and blocks writes and create.- Confirm experiment artifact proxy validators handle both explicit experiment IDs and “uploads/...” paths that fall back to workspace-level permissions.
- Validate
filter_experiment_idsbehavior when workspace permissions drop toNO_PERMISSIONS.Use of
workspace_permission_setup,_set_workspace_permission, andworkspace_contexttokens is careful and avoids cross-test leakage. The coverage here is strong.mlflow/server/auth/sqlalchemy_store.py (2)
5-6: Workspace flag + active workspace resolution are reasonable, but error message is strict
_workspaces_enabledusesMLFLOW_ENABLE_WORKSPACES.get()and_get_active_workspace_namefalls back toDEFAULT_WORKSPACE_NAMEwhen the feature is disabled, which maintains old behavior.- When the feature is enabled,
_get_active_workspace_namerequires a current workspace and raisesinvalid_parameter_valueif none is set. That’s a deliberate safety rail to avoid cross-tenant leakage from the auth store.This strictness is good for multi-tenancy, but it does mean any caller of the auth store in workspace mode must ensure
workspace_contextis set (e.g., via the new before-request handler). That seems aligned with the rest of the PR.Also applies to: 19-27, 29-33, 35-55
389-452: Accessible-workspaces and permission precedence logic are well defined
list_accessible_workspace_namesaggregates all workspaces where the user (or"*") has any permission withcan_read, which is exactly what list-workspaces style endpoints need.get_workspace_permissionimplements a clear precedence order (exact user+resource, user+*,*+resource,*+*) and validates permission values before converting to aPermissionobject.One minor enhancement you might consider later: document this precedence explicitly in the public API for workspace permissions so clients understand how wildcard grants interact, but the implementation here is sound.
mlflow/store/model_registry/sqlalchemy_store.py (1)
402-405: Minor consistency nit: avoid unusedworkspacefrom_workspace_sessionIn both
search_registered_modelsandsearch_model_versions, theworkspacevalue returned by_workspace_session()is unused because the downstream helpers callcls._get_active_workspace()again.It’s not incorrect, but for clarity and a single source of truth, consider either:
- Passing the
workspacefrom_workspace_session()into the helper methods, or- Dropping the tuple unpacking if
_workspace_session()doesn’t need to expose it.This would make it clearer where the workspace context really comes from.
Also applies to: 1181-1184
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (103)
.github/workflows/master.yml(1 hunks)docs/api_reference/api_inventory.txt(14 hunks)docs/api_reference/source/rest-api.rst(1 hunks)docs/docs/self-hosting/architecture/overview.mdx(1 hunks)docs/docs/self-hosting/index.mdx(1 hunks)docs/docs/self-hosting/workspaces/configuration.mdx(1 hunks)docs/docs/self-hosting/workspaces/getting-started.mdx(1 hunks)docs/docs/self-hosting/workspaces/index.mdx(1 hunks)docs/docs/self-hosting/workspaces/permissions.mdx(1 hunks)docs/docs/self-hosting/workspaces/workspace-providers.mdx(1 hunks)docs/sidebarsSelfHosting.ts(1 hunks)mlflow/__init__.py(5 hunks)mlflow/cli/__init__.py(6 hunks)mlflow/entities/evaluation_dataset.py(7 hunks)mlflow/entities/experiment.py(6 hunks)mlflow/entities/model_registry/model_version.py(4 hunks)mlflow/entities/model_registry/registered_model.py(6 hunks)mlflow/entities/webhook.py(7 hunks)mlflow/entities/workspace.py(1 hunks)mlflow/environment_variables.py(1 hunks)mlflow/models/model.py(2 hunks)mlflow/server/__init__.py(4 hunks)mlflow/server/auth/__init__.py(35 hunks)mlflow/server/auth/basic_auth.ini(1 hunks)mlflow/server/auth/client.py(4 hunks)mlflow/server/auth/config.py(2 hunks)mlflow/server/auth/db/migrations/versions/2ed73881770d_workspace_permissions.py(1 hunks)mlflow/server/auth/db/models.py(4 hunks)mlflow/server/auth/db/utils.py(0 hunks)mlflow/server/auth/entities.py(5 hunks)mlflow/server/auth/routes.py(1 hunks)mlflow/server/auth/sqlalchemy_store.py(6 hunks)mlflow/server/handlers.py(21 hunks)mlflow/server/otel_api.py(6 hunks)mlflow/server/workspace_helpers.py(1 hunks)mlflow/store/artifact/artifact_repo.py(2 hunks)mlflow/store/artifact/artifact_repository_registry.py(2 hunks)mlflow/store/artifact/http_artifact_repo.py(4 hunks)mlflow/store/db/utils.py(2 hunks)mlflow/store/model_registry/abstract_store.py(1 hunks)mlflow/store/model_registry/base_rest_store.py(3 hunks)mlflow/store/model_registry/dbmodels/models.py(7 hunks)mlflow/store/model_registry/sqlalchemy_store.py(37 hunks)mlflow/store/tracking/abstract_store.py(1 hunks)mlflow/store/tracking/dbmodels/models.py(6 hunks)mlflow/store/tracking/rest_store.py(5 hunks)mlflow/store/tracking/sqlalchemy_store.py(89 hunks)mlflow/store/workspace/__init__.py(1 hunks)mlflow/store/workspace/abstract_store.py(1 hunks)mlflow/store/workspace/dbmodels/models.py(1 hunks)mlflow/store/workspace/rest_store.py(1 hunks)mlflow/store/workspace/sqlalchemy_store.py(1 hunks)mlflow/store/workspace/utils.py(1 hunks)mlflow/store/workspace_aware_mixin.py(1 hunks)mlflow/tracking/_tracking_service/client.py(6 hunks)mlflow/tracking/_workspace/__init__.py(1 hunks)mlflow/tracking/_workspace/client.py(1 hunks)mlflow/tracking/_workspace/context.py(1 hunks)mlflow/tracking/_workspace/fluent.py(1 hunks)mlflow/tracking/_workspace/registry.py(1 hunks)mlflow/tracking/_workspace/utils.py(1 hunks)mlflow/tracking/client.py(6 hunks)mlflow/tracking/default_experiment/registry.py(2 hunks)mlflow/tracking/fluent.py(2 hunks)mlflow/utils/rest_utils.py(6 hunks)mlflow/utils/server_cli_utils.py(1 hunks)mlflow/utils/workspace_utils.py(1 hunks)tests/entities/model_registry/test_model_version.py(5 hunks)tests/entities/model_registry/test_registered_model.py(8 hunks)tests/entities/test_evaluation_dataset.py(1 hunks)tests/entities/test_experiment.py(4 hunks)tests/entities/test_webhook.py(2 hunks)tests/genai/datasets/test_fluent.py(1 hunks)tests/helpers/__init__.py(1 hunks)tests/helpers/db_mocks.py(1 hunks)tests/server/auth/db/test_cli.py(3 hunks)tests/server/auth/test_auth.py(6 hunks)tests/server/auth/test_client.py(5 hunks)tests/server/auth/test_sqlalchemy_store.py(9 hunks)tests/server/test_handlers.py(3 hunks)tests/server/test_init.py(3 hunks)tests/server/test_otel_api.py(1 hunks)tests/server/test_workspace_endpoints.py(1 hunks)tests/server/test_workspace_handlers.py(1 hunks)tests/store/model_registry/test_rest_store.py(1 hunks)tests/store/model_registry/test_sqlalchemy_store.py(2 hunks)tests/store/model_registry/test_sqlalchemy_store_workspace.py(1 hunks)tests/store/tracking/test_rest_store.py(12 hunks)tests/store/tracking/test_sqlalchemy_store.py(3 hunks)tests/store/tracking/test_sqlalchemy_store_workspace.py(1 hunks)tests/store/workspace/test_rest_store.py(1 hunks)tests/store/workspace/test_sqlalchemy_store.py(1 hunks)tests/store/workspace/test_workspace_validator.py(1 hunks)tests/test_cli.py(29 hunks)tests/tracking/_model_registry/test_utils.py(2 hunks)tests/tracking/_tracking_service/test_utils.py(3 hunks)tests/tracking/test_client_workspace.py(1 hunks)tests/tracking/test_mlflow_artifacts.py(1 hunks)tests/tracking/test_rest_tracking.py(13 hunks)tests/tracking/test_workspace_registry.py(1 hunks)tests/tracking/test_workspace_utils.py(1 hunks)tests/utils/test_rest_utils.py(3 hunks)tests/utils/test_workspace_utils.py(1 hunks)
💤 Files with no reviewable changes (1)
- mlflow/server/auth/db/utils.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (26)
- GitHub Check: pyfunc (3)
- GitHub Check: pyfunc (2)
- GitHub Check: pyfunc (1)
- GitHub Check: pyfunc (4)
- GitHub Check: database
- GitHub Check: python (2)
- GitHub Check: java
- GitHub Check: python (3)
- GitHub Check: python (1)
- GitHub Check: windows (3)
- GitHub Check: windows (1)
- GitHub Check: windows (4)
- GitHub Check: windows (2)
- GitHub Check: python-skinny
- GitHub Check: genai
- GitHub Check: flavors
- GitHub Check: evaluate (1)
- GitHub Check: models (1)
- GitHub Check: evaluate (2)
- GitHub Check: models (2)
- GitHub Check: build (skinny)
- GitHub Check: build (dev)
- GitHub Check: build (tracing)
- GitHub Check: r
- GitHub Check: build
- GitHub Check: protect
| if enable_workspaces: | ||
| os.environ[MLFLOW_ENABLE_WORKSPACES.name] = "true" | ||
| if workspace_store_uri: | ||
| os.environ[MLFLOW_WORKSPACE_URI.name] = workspace_store_uri | ||
| elif workspace_store_uri: | ||
| click.echo( | ||
| "Ignoring --workspace-store-uri because workspaces are not enabled. " | ||
| "Use --enable-workspaces to activate workspace mode.", | ||
| err=True, | ||
| ) | ||
|
|
There was a problem hiding this comment.
Conflicting behavior vs messaging for --workspace-store-uri without --enable-workspaces
When enable_workspaces is false but workspace_store_uri is provided, the code:
- Emits a CLI message saying “Ignoring --workspace-store-uri because workspaces are not enabled…”
- Still passes
workspace_store_uriintoinitialize_backend_stores(...)(as verified intests/test_cli.py::test_server_workspace_uri_warns_without_enable), and - Logs a warning via
_logger.warningabout the workspace URI being provided without enabling workspaces.
So the CLI text claims the URI is ignored, but the backend initialization still receives it, which is confusing for users.
Recommend aligning the message with the actual behavior (or vice versa). For example, if you intend to keep passing the URI through, adjust the CLI message to remove “Ignoring” and just warn that workspaces remain disabled:
- elif workspace_store_uri:
- click.echo(
- "Ignoring --workspace-store-uri because workspaces are not enabled. "
- "Use --enable-workspaces to activate workspace mode.",
- err=True,
- )
+ elif workspace_store_uri:
+ click.echo(
+ "--workspace-store-uri was provided but workspaces are not enabled. "
+ "Workspace APIs will remain disabled unless you pass --enable-workspaces.",
+ err=True,
+ )You may also want to consider whether you need both the click.echo and _logger.warning paths or if one is sufficient, to avoid double-reporting the same condition.
Also applies to: 587-597
| # When true, users inherit default_permission for the reserved 'default' workspace. | ||
| grant_default_workspace_access = false |
There was a problem hiding this comment.
🧩 Analysis chain
Good default for security - explicit permission model.
Setting grant_default_workspace_access = false by default follows the principle of least privilege, requiring explicit permission grants rather than implicit access to the default workspace. The comment clearly explains the behavior.
Verify that this configuration option is documented in the workspace/multi-tenancy documentation:
🏁 Script executed:
#!/bin/bash
# Search for documentation of grant_default_workspace_access
rg -n 'grant_default_workspace_access' --type=rst --type=md
# Also search for related workspace permission documentation
rg -n -C5 'default workspace.*permission|permission.*default workspace' docs/Length of output: 0
I need to conduct a more comprehensive search to determine if this configuration option is documented elsewhere in the codebase.
🏁 Script executed:
#!/bin/bash
# Broader search across all file types for the configuration option
rg -n 'grant_default_workspace_access' --max-count=10
# Search for documentation files that mention workspace configuration
rg -n -i 'workspace.*config|config.*workspace' docs/ --type=md --type=rst | head -20
# Check for configuration schema or reference files
fd -n '*config*' -e json -e yaml -e toml docs/ 2>/dev/null
# Search for where this option is used in code
rg -n 'grant_default_workspace_access' -g '*.py' --max-count=10Length of output: 1972
🏁 Script executed:
#!/bin/bash
# Check the docstring at __init__.py line 299
sed -n '290,310p' mlflow/server/auth/__init__.py
# Check for any workspace permission/config documentation files
find docs/ -type f \( -name '*workspace*' -o -name '*permission*' -o -name '*auth*' \) 2>/dev/null | head -20
# Check the config.py file to see if there's inline documentation
cat mlflow/server/auth/config.pyLength of output: 2319
🏁 Script executed:
#!/bin/bash
# Check if the documentation files mention grant_default_workspace_access
echo "=== Checking basic-http-auth.mdx ==="
rg -n 'grant_default_workspace_access|default_permission' docs/docs/self-hosting/security/basic-http-auth.mdx
echo "=== Checking permissions.mdx ==="
rg -n 'grant_default_workspace_access|default_permission' docs/docs/self-hosting/workspaces/permissions.mdx
# Also check the basic_auth.ini for all documented config options
echo "=== Content of basic_auth.ini ==="
cat mlflow/server/auth/basic_auth.iniLength of output: 470
Add configuration option to documentation for consistency.
The grant_default_workspace_access option is fully implemented and has secure defaults (false). However, it's not documented in docs/docs/self-hosting/security/basic-http-auth.mdx, while the related default_permission option is documented there at line 1065. For consistency and discoverability, this new option should be added to the configuration reference documentation in that file.
The inline comment in mlflow/server/auth/basic_auth.ini is clear and sufficient for direct users, but formal documentation alignment is needed.
🤖 Prompt for AI Agents
In docs/docs/self-hosting/security/basic-http-auth.mdx around line 1065, add
documentation for the grant_default_workspace_access configuration (described in
mlflow/server/auth/basic_auth.ini lines 7-8) to match the existing
default_permission entry: describe the option name, type (boolean), default
(false), purpose (controls whether users inherit default_permission for the
reserved 'default' workspace), and recommended secure default, and include an
example snippet and any notes about security implications and interaction with
default_permission.
| @catch_mlflow_exception | ||
| @_disable_if_workspaces_disabled | ||
| def _list_workspaces_handler(): | ||
| workspaces = _get_workspace_store().list_workspaces() | ||
| return jsonify({"workspaces": [ws.to_dict() for ws in workspaces]}) | ||
|
|
||
|
|
||
| @catch_mlflow_exception | ||
| @_disable_if_workspaces_disabled | ||
| def _create_workspace_handler(): | ||
| payload = _get_request_json(request) or {} | ||
| name = payload.get("name") | ||
| if not name: | ||
| raise MlflowException.invalid_parameter_value("Workspace name must be provided") | ||
|
|
||
| WorkspaceNameValidator.validate(name) | ||
|
|
||
| description = payload.get("description") | ||
| store = _get_workspace_store() | ||
| try: | ||
| workspace = store.create_workspace(Workspace(name=name, description=description)) | ||
| except NotImplementedError: | ||
| raise _workspace_not_supported("Workspace creation is not supported by this provider") | ||
|
|
||
| _ensure_default_workspace_experiment(workspace.name) | ||
| response = jsonify(workspace.to_dict()) | ||
| response.status_code = 201 | ||
| return response | ||
|
|
||
|
|
||
| @catch_mlflow_exception | ||
| @_disable_if_workspaces_disabled | ||
| def _get_workspace_handler(workspace_name: str): | ||
| workspace = _get_workspace_store().get_workspace(workspace_name) | ||
| return jsonify(workspace.to_dict()) | ||
|
|
||
|
|
||
| @catch_mlflow_exception | ||
| @_disable_if_workspaces_disabled | ||
| def _update_workspace_handler(workspace_name: str): | ||
| payload = _get_request_json(request) or {} | ||
|
|
||
| if not payload.keys(): | ||
| raise MlflowException.invalid_parameter_value("Workspace update must have at least one key") | ||
|
|
||
| invalid_keys = payload.keys() - {"description"} | ||
| if invalid_keys: | ||
| raise MlflowException.invalid_parameter_value( | ||
| f"Workspace update had the following invalid keys: {', '.join(invalid_keys)}" | ||
| ) | ||
|
|
||
| store = _get_workspace_store() | ||
| try: | ||
| workspace = store.update_workspace( | ||
| Workspace(name=workspace_name, description=payload["description"]) | ||
| ) | ||
| except NotImplementedError: | ||
| raise _workspace_not_supported("Workspace updates are not supported by this provider") | ||
|
|
||
| return jsonify(workspace.to_dict()) | ||
|
|
||
|
|
||
| @catch_mlflow_exception | ||
| @_disable_if_workspaces_disabled | ||
| def _delete_workspace_handler(workspace_name: str): | ||
| store = _get_workspace_store() | ||
| if _workspace_contains_resources(workspace_name): | ||
| raise MlflowException( | ||
| f"Cannot delete workspace '{workspace_name}' because it contains resources", | ||
| INVALID_STATE, | ||
| ) | ||
| try: | ||
| store.delete_workspace(workspace_name) | ||
| except NotImplementedError: | ||
| raise _workspace_not_supported("Workspace deletion is not supported by this provider") | ||
| return Response(status=204) | ||
|
|
||
|
|
||
| def _workspace_endpoints(): | ||
| endpoints = [] | ||
| for path in _get_paths("/mlflow/workspaces"): | ||
| endpoints.append((path, _list_workspaces_handler, ["GET"])) | ||
| endpoints.append((path, _create_workspace_handler, ["POST"])) | ||
|
|
||
| for path in _get_paths("/mlflow/workspaces/<workspace_name>"): | ||
| endpoints.append((path, _get_workspace_handler, ["GET"])) | ||
| endpoints.append((path, _update_workspace_handler, ["PATCH"])) | ||
| endpoints.append((path, _delete_workspace_handler, ["DELETE"])) | ||
|
|
||
| return endpoints | ||
|
|
There was a problem hiding this comment.
Workspace CRUD handlers are well-structured; consider minor PATCH semantics tightening
The workspace CRUD handlers:
- Validate names via
WorkspaceNameValidator. - Use
_disable_if_workspaces_disabledat the handler level. - Wrap unsupported provider behavior as
FEATURE_DISABLED. - Prevent deleting non-empty workspaces via
_workspace_contains_resources.
One small nit: _update_workspace_handler currently allows an empty "description" field but forbids any other keys and will KeyError if "description" is omitted yet no invalid keys are present. You’re already checking if not payload.keys(); an extra if "description" not in payload branch to raise invalid_parameter_value would make the error mode clearer.
- store = _get_workspace_store()
- try:
- workspace = store.update_workspace(
- Workspace(name=workspace_name, description=payload["description"])
- )
+ if "description" not in payload:
+ raise MlflowException.invalid_parameter_value(
+ "Workspace update must include the 'description' field"
+ )
+ store = _get_workspace_store()
+ try:
+ workspace = store.update_workspace(
+ Workspace(name=workspace_name, description=payload["description"])
+ )🤖 Prompt for AI Agents
mlflow/server/handlers.py around lines 934-1024: _update_workspace_handler can
KeyError when payload is non-empty but missing "description" and currently
permits an empty description; add a validation that explicitly requires the
"description" key for PATCH updates and raise
MlflowException.invalid_parameter_value with a clear message if "description" is
not present, then use payload["description"] as before (or use
payload.get("description") if you want to allow null/empty strings) to avoid
KeyError; keep the existing invalid_keys check and NotImplementedError handling
unchanged.
| def _ensure_no_non_default_workspace_registry_entries(self) -> None: | ||
| """ | ||
| Prevent starting in single-tenant mode when registry objects exist in other workspaces. | ||
| """ | ||
|
|
||
| with self.ManagedSessionMaker() as session: | ||
| row = ( | ||
| session.query(SqlRegisteredModel.name, SqlRegisteredModel.workspace) | ||
| .filter(SqlRegisteredModel.workspace != DEFAULT_WORKSPACE_NAME) | ||
| .first() | ||
| ) | ||
|
|
||
| if row: | ||
| name, workspace = row | ||
| raise MlflowException( | ||
| "Cannot disable workspaces because registered models exist outside the default " | ||
| f"workspace (for example, model '{name}' in workspace '{workspace}'). " | ||
| "Either remove those models or re-enable workspaces before starting.", | ||
| error_code=INVALID_STATE, | ||
| ) | ||
|
|
||
| row = ( | ||
| session.query(SqlWebhook.webhook_id, SqlWebhook.workspace) | ||
| .filter(SqlWebhook.workspace != DEFAULT_WORKSPACE_NAME) | ||
| .first() | ||
| ) | ||
|
|
||
| if row: | ||
| webhook_id, workspace = row | ||
| raise MlflowException( | ||
| "Cannot disable workspaces because webhooks exist outside the default " | ||
| f"workspace (for example, webhook '{webhook_id}' in workspace '{workspace}'). " | ||
| "Either remove those webhooks or re-enable workspaces before starting.", | ||
| error_code=INVALID_STATE, | ||
| ) | ||
|
|
There was a problem hiding this comment.
Consider also treating NULL workspaces as incompatible when disabling workspaces
The checks in _ensure_no_non_default_workspace_registry_entries use workspace != DEFAULT_WORKSPACE_NAME. Depending on how migrations populate the new workspace column, rows with workspace IS NULL will not be caught by this filter but will also not be visible to code that always filters by an explicit workspace.
If there’s any chance of legacy rows with workspace left NULL, you may want to treat them as incompatible with disabling workspaces (or backfill them to the default) to avoid silently “losing” registry/webhook data in single-tenant mode.
- .filter(SqlRegisteredModel.workspace != DEFAULT_WORKSPACE_NAME)
+ .filter(
+ SqlRegisteredModel.workspace.isnot(None),
+ SqlRegisteredModel.workspace != DEFAULT_WORKSPACE_NAME,
+ )
...
- .filter(SqlWebhook.workspace != DEFAULT_WORKSPACE_NAME)
+ .filter(
+ SqlWebhook.workspace.isnot(None),
+ SqlWebhook.workspace != DEFAULT_WORKSPACE_NAME,
+ )Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In mlflow/store/model_registry/sqlalchemy_store.py around lines 159 to 194, the
workspace checks only filter where workspace != DEFAULT_WORKSPACE_NAME and
therefore miss rows where workspace IS NULL; change the checks to treat NULL as
non-default (either by expanding the query filters to include workspace IS NULL
or by backfilling NULL workspaces to DEFAULT_WORKSPACE_NAME before this
validation) so any legacy rows with NULL workspace are considered incompatible
when disabling workspaces and will trigger the same MlflowException as
non-default workspaces.
| ) | ||
| """ | ||
| Workspace name that scopes this dataset. Defaults to ``'default'`` for legacy rows. | ||
| Workspace name that scopes this dataset. |
There was a problem hiding this comment.
Guard against None workspaces in SqlEvaluationDataset.from_mlflow_entity
SqlEvaluationDataset.workspace is nullable=False with a default/server_default, but from_mlflow_entity passes workspace=dataset.workspace directly. If callers construct an EvaluationDataset without setting workspace (leaving it as None), this will attempt to persist NULL into a non‑nullable column and also bypass the DB default.
Consider normalizing None to DEFAULT_WORKSPACE_NAME (or otherwise enforcing a non‑None workspace) here:
@@
- return cls(
- dataset_id=dataset.dataset_id,
- workspace=dataset.workspace,
+ return cls(
+ dataset_id=dataset.dataset_id,
+ workspace=(
+ dataset.workspace
+ if dataset.workspace is not None
+ else DEFAULT_WORKSPACE_NAME
+ ),
name=dataset.name,This keeps the ORM/entity mapping symmetric while avoiding integrity errors when workspace is omitted by the caller.
Also applies to: 1415-1415, 1439-1439
🤖 Prompt for AI Agents
In mlflow/store/tracking/dbmodels/models.py around lines 1324 (and similarly at
1415 and 1439), from_mlflow_entity constructs SqlEvaluationDataset using
dataset.workspace directly which may be None while the DB column is
nullable=False; update from_mlflow_entity to normalize a None workspace to
DEFAULT_WORKSPACE_NAME (or the project-wide default constant) before passing it
into SqlEvaluationDataset so you never persist NULL and still allow DB default
behavior to be honored; import or reference the DEFAULT_WORKSPACE_NAME constant
as needed and apply the same None-to-default normalization at the other two
locations mentioned.
| self._workspace_support: bool | None = None | ||
|
|
||
| def supports_workspaces(self) -> bool: | ||
| if self._workspace_support is None: | ||
| self._workspace_support = self._probe_workspace_support() | ||
| return self._workspace_support | ||
|
|
||
| def _probe_workspace_support(self) -> bool: | ||
| host_creds = self.get_host_creds() | ||
| try: | ||
| response = rest_utils.http_request( | ||
| host_creds=host_creds, | ||
| endpoint="/api/2.0/mlflow/workspaces", | ||
| method="GET", | ||
| timeout=3, | ||
| max_retries=0, | ||
| raise_on_status=False, | ||
| ) | ||
| except Exception as exc: # pragma: no cover - network errors vary | ||
| _logger.debug("Failed to probe workspace support: %s", exc) | ||
| return False | ||
|
|
||
| # These response codes indicate the workspace endpoints are being served. | ||
| if response.status_code in (200, 401, 403): | ||
| return True | ||
|
|
||
| _logger.debug( | ||
| "Workspace endpoint probe returned status %s: %s", | ||
| response.status_code, | ||
| response.text, | ||
| ) | ||
| return False |
There was a problem hiding this comment.
Avoid treating transient network failures as “workspaces unsupported”
_probe_workspace_support catches all exceptions from rest_utils.http_request, logs at debug, and returns False. Because supports_workspaces() caches this False and _call_endpoint then raises FEATURE_DISABLED when a workspace is active, a transient network error during the probe will be surfaced to users as “server does not support workspaces”, rather than as a connectivity problem.
It would be better to propagate the original exception (or wrap it) so callers see a network failure instead of a misleading feature‑flag error. For example:
def _probe_workspace_support(self) -> bool:
host_creds = self.get_host_creds()
- try:
- response = rest_utils.http_request(
- host_creds=host_creds,
- endpoint="/api/2.0/mlflow/workspaces",
- method="GET",
- timeout=3,
- max_retries=0,
- raise_on_status=False,
- )
- except Exception as exc: # pragma: no cover - network errors vary
- _logger.debug("Failed to probe workspace support: %s", exc)
- return False
+ try:
+ response = rest_utils.http_request(
+ host_creds=host_creds,
+ endpoint="/api/2.0/mlflow/workspaces",
+ method="GET",
+ timeout=3,
+ max_retries=0,
+ raise_on_status=False,
+ )
+ except Exception as exc: # pragma: no cover - network errors vary
+ _logger.debug("Failed to probe workspace support: %s", exc)
+ raiseThis keeps the probe best‑effort but avoids caching False in cases where you actually couldn't reach the server.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| self._workspace_support: bool | None = None | |
| def supports_workspaces(self) -> bool: | |
| if self._workspace_support is None: | |
| self._workspace_support = self._probe_workspace_support() | |
| return self._workspace_support | |
| def _probe_workspace_support(self) -> bool: | |
| host_creds = self.get_host_creds() | |
| try: | |
| response = rest_utils.http_request( | |
| host_creds=host_creds, | |
| endpoint="/api/2.0/mlflow/workspaces", | |
| method="GET", | |
| timeout=3, | |
| max_retries=0, | |
| raise_on_status=False, | |
| ) | |
| except Exception as exc: # pragma: no cover - network errors vary | |
| _logger.debug("Failed to probe workspace support: %s", exc) | |
| return False | |
| # These response codes indicate the workspace endpoints are being served. | |
| if response.status_code in (200, 401, 403): | |
| return True | |
| _logger.debug( | |
| "Workspace endpoint probe returned status %s: %s", | |
| response.status_code, | |
| response.text, | |
| ) | |
| return False | |
| self._workspace_support: bool | None = None | |
| def supports_workspaces(self) -> bool: | |
| if self._workspace_support is None: | |
| self._workspace_support = self._probe_workspace_support() | |
| return self._workspace_support | |
| def _probe_workspace_support(self) -> bool: | |
| host_creds = self.get_host_creds() | |
| try: | |
| response = rest_utils.http_request( | |
| host_creds=host_creds, | |
| endpoint="/api/2.0/mlflow/workspaces", | |
| method="GET", | |
| timeout=3, | |
| max_retries=0, | |
| raise_on_status=False, | |
| ) | |
| except Exception as exc: # pragma: no cover - network errors vary | |
| _logger.debug("Failed to probe workspace support: %s", exc) | |
| raise | |
| # These response codes indicate the workspace endpoints are being served. | |
| if response.status_code in (200, 401, 403): | |
| return True | |
| _logger.debug( | |
| "Workspace endpoint probe returned status %s: %s", | |
| response.status_code, | |
| response.text, | |
| ) | |
| return False |
| _INVALID_PARAMETER_VALUE_NAME = databricks_pb2.ErrorCode.Name( | ||
| databricks_pb2.INVALID_PARAMETER_VALUE | ||
| ) | ||
|
|
||
|
|
||
| def get_default_workspace_optional( | ||
| workspace_store, logger: logging.Logger | None = None | ||
| ) -> tuple[Workspace | None, bool]: | ||
| """ | ||
| Attempt to resolve a default workspace from the provider without bubbling opt-out errors. | ||
|
|
||
| Providers can signal that default workspace resolution is unsupported by raising either | ||
| ``NotImplementedError`` or ``MlflowException`` with ``INVALID_PARAMETER_VALUE``. This helper | ||
| normalizes those cases and returns ``(None, False)`` so callers can decide how to proceed. | ||
|
|
||
| Args: | ||
| workspace_store: Workspace store exposing ``get_default_workspace``. | ||
| logger: Optional logger used for debug messaging. | ||
|
|
||
| Returns: | ||
| Tuple of (workspace or None, supports_default_workspace flag). | ||
| """ | ||
| if workspace_store is None: | ||
| return None, False | ||
|
|
||
| provider_name = type(workspace_store).__name__ | ||
| active_logger = logger | ||
|
|
||
| try: | ||
| workspace = workspace_store.get_default_workspace() | ||
| except NotImplementedError: | ||
| if active_logger: | ||
| active_logger.debug( | ||
| "Workspace provider %s does not implement default workspace resolution", | ||
| provider_name, | ||
| ) | ||
| return None, False | ||
| except MlflowException as exc: | ||
| if exc.error_code == _INVALID_PARAMETER_VALUE_NAME: | ||
| if active_logger: | ||
| active_logger.debug( | ||
| "Workspace provider %s does not support default workspace resolution: %s", | ||
| provider_name, | ||
| exc, | ||
| ) | ||
| return None, False | ||
| raise |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Make the INVALID_PARAMETER_VALUE match robust to numeric vs string error codes
The intent to treat NotImplementedError and MlflowException with INVALID_PARAMETER_VALUE as “provider opted out of default workspace resolution” is good, but:
_INVALID_PARAMETER_VALUE_NAME = databricks_pb2.ErrorCode.Name(
databricks_pb2.INVALID_PARAMETER_VALUE
)
...
if exc.error_code == _INVALID_PARAMETER_VALUE_NAME:assumes exc.error_code is the string name. In several places in this codebase exceptions are raised with error_code=databricks_pb2.INVALID_PARAMETER_VALUE (the numeric proto value), so if MlflowException.error_code preserves that numeric form, this comparison will never be true and those opt‑out exceptions will be re‑raised instead of normalized.
To harden this, consider supporting both numeric and name forms, for example:
-from mlflow.protos import databricks_pb2
-
-_INVALID_PARAMETER_VALUE_NAME = databricks_pb2.ErrorCode.Name(
- databricks_pb2.INVALID_PARAMETER_VALUE
-)
+from mlflow.protos import databricks_pb2
+
+_INVALID_PARAMETER_VALUE_CODE = databricks_pb2.INVALID_PARAMETER_VALUE
+_INVALID_PARAMETER_VALUE_NAME = databricks_pb2.ErrorCode.Name(_INVALID_PARAMETER_VALUE_CODE)
@@
- except MlflowException as exc:
- if exc.error_code == _INVALID_PARAMETER_VALUE_NAME:
+ except MlflowException as exc:
+ if exc.error_code in (_INVALID_PARAMETER_VALUE_CODE, _INVALID_PARAMETER_VALUE_NAME):
if active_logger:
active_logger.debug(
"Workspace provider %s does not support default workspace resolution: %s",
provider_name,
exc,
)
return None, FalseThis keeps behavior correct regardless of how MlflowException.error_code is represented internally.
🤖 Prompt for AI Agents
In mlflow/store/workspace/utils.py around lines 9 to 55, the comparison against
_INVALID_PARAMETER_VALUE_NAME only handles the string name of the proto error
code and will miss cases where MlflowException.error_code is the numeric proto
value; update the check to accept both numeric and string forms (e.g., compute
both the numeric constant databricks_pb2.INVALID_PARAMETER_VALUE and its
Name(...) string and test exc.error_code against both or normalize
exc.error_code to the name via databricks_pb2.ErrorCode.Name when possible) so
that exceptions raised with either representation are normalized to (None,
False).
| def set_workspace(workspace: str | None) -> None: | ||
| """Set the active workspace for subsequent MLflow operations.""" | ||
|
|
||
| with _workspace_lock: | ||
| if workspace is None: | ||
| clear_workspace() | ||
| MLFLOW_WORKSPACE.unset() | ||
| return | ||
|
|
||
| if workspace != DEFAULT_WORKSPACE_NAME: | ||
| WorkspaceNameValidator.validate(workspace) | ||
|
|
||
| set_current_workspace(workspace) | ||
| MLFLOW_WORKSPACE.set(workspace) | ||
|
|
||
|
|
||
| @experimental(version="3.7.0") | ||
| def list_workspaces() -> list[Workspace]: | ||
| """Return the list of workspaces available to the current user.""" | ||
|
|
||
| return _workspace_client_call(lambda client: client.list_workspaces()) | ||
|
|
||
|
|
||
| @experimental(version="3.7.0") | ||
| def get_workspace(name: str) -> Workspace: | ||
| """Return metadata for the specified workspace.""" | ||
|
|
||
| return _workspace_client_call(lambda client: client.get_workspace(name)) | ||
|
|
||
|
|
||
| @experimental(version="3.7.0") | ||
| def create_workspace(name: str, description: str | None = None) -> Workspace: | ||
| """Create a new workspace.""" | ||
|
|
||
| WorkspaceNameValidator.validate(name) | ||
| return _workspace_client_call( | ||
| lambda client: client.create_workspace(name=name, description=description) | ||
| ) | ||
|
|
||
|
|
||
| @experimental(version="3.7.0") | ||
| def update_workspace(name: str, description: str | None = None) -> Workspace: | ||
| """Update metadata for an existing workspace.""" | ||
|
|
||
| if name != DEFAULT_WORKSPACE_NAME: | ||
| WorkspaceNameValidator.validate(name) | ||
| return _workspace_client_call( | ||
| lambda client: client.update_workspace(name=name, description=description) | ||
| ) | ||
|
|
||
|
|
||
| @experimental(version="3.7.0") | ||
| def delete_workspace(name: str) -> None: | ||
| """Delete an existing workspace.""" | ||
|
|
||
| if name != DEFAULT_WORKSPACE_NAME: | ||
| WorkspaceNameValidator.validate(name) | ||
| _workspace_client_call(lambda client: client.delete_workspace(name=name)) | ||
|
|
||
|
|
||
| __all__ = [ | ||
| "Workspace", | ||
| "set_workspace", | ||
| "list_workspaces", | ||
| "get_workspace", | ||
| "create_workspace", | ||
| "update_workspace", | ||
| "delete_workspace", | ||
| ] |
There was a problem hiding this comment.
Align set_workspace stability annotation with other workspace APIs
All user-facing workspace helpers in this module (list_workspaces, get_workspace, create_workspace, update_workspace, delete_workspace) are marked:
@experimental(version="3.7.0")but set_workspace is not, even though it’s a new public API and is exported via __all__ and the top-level mlflow namespace.
If the whole workspace surface is intended to be experimental in this release (which is what the decorators suggest), consider also marking set_workspace as experimental for consistency:
- def set_workspace(workspace: str | None) -> None:
+ @experimental(version="3.7.0")
+ def set_workspace(workspace: str | None) -> None:Otherwise, if set_workspace is meant to be stable while the CRUD endpoints are experimental, it would be good to explicitly document that distinction in the user docs.
🤖 Prompt for AI Agents
In mlflow/tracking/_workspace/fluent.py around lines 36 to 104, the
set_workspace function is a new public API exported in __all__ but lacks the
@experimental(version="3.7.0") decorator that the other workspace helpers use;
add the same @experimental(version="3.7.0") decorator immediately above the
set_workspace definition to mark it experimental for consistency with
list/get/create/update/delete_workspace, or if set_workspace is intentionally
stable instead, update the function docstring and project docs to explicitly
state that distinction.
| tracking_uri = get_tracking_uri() | ||
| is_db_uri = is_databricks_uri(tracking_uri) | ||
|
|
||
| if MLFLOW_ENABLE_WORKSPACES.get() and not is_db_uri: | ||
| client = MlflowClient() | ||
| experiment = client.get_experiment_by_name(Experiment.DEFAULT_EXPERIMENT_NAME) | ||
|
|
||
| if experiment is None: | ||
| try: | ||
| experiment_id = client.create_experiment(Experiment.DEFAULT_EXPERIMENT_NAME) | ||
| experiment = client.get_experiment(experiment_id) | ||
| except MlflowException as exc: | ||
| if exc.error_code != ErrorCode.Name(RESOURCE_ALREADY_EXISTS): | ||
| raise | ||
| experiment = client.get_experiment_by_name(Experiment.DEFAULT_EXPERIMENT_NAME) | ||
|
|
||
| if experiment is not None: | ||
| return experiment.experiment_id | ||
|
|
||
| raise MlflowException( | ||
| "Default experiment could not be resolved for the active workspace. " | ||
| "Call mlflow.set_workspace() and ensure the workspace contains or permits creation of " | ||
| f"an experiment named '{Experiment.DEFAULT_EXPERIMENT_NAME}'.", | ||
| error_code=INVALID_STATE, | ||
| ) |
There was a problem hiding this comment.
RESOURCE_ALREADY_EXISTS handling is incorrect; comparison will never match
The error‑handling around concurrent default‑experiment creation isn’t doing what it intends:
except MlflowException as exc:
if exc.error_code != ErrorCode.Name(RESOURCE_ALREADY_EXISTS):
raise
experiment = client.get_experiment_by_name(Experiment.DEFAULT_EXPERIMENT_NAME)RESOURCE_ALREADY_EXISTS is an enum value; ErrorCode.Name(RESOURCE_ALREADY_EXISTS) returns its string name (e.g. "RESOURCE_ALREADY_EXISTS"), while exc.error_code is set to the numeric enum value in existing MlflowException usage. This condition will therefore always be True, and you’ll re‑raise even for the expected “already exists” race.
You likely want to mirror existing patterns and compare directly against the enum value:
- except MlflowException as exc:
- if exc.error_code != ErrorCode.Name(RESOURCE_ALREADY_EXISTS):
- raise
- experiment = client.get_experiment_by_name(Experiment.DEFAULT_EXPERIMENT_NAME)
+ except MlflowException as exc:
+ if exc.error_code != RESOURCE_ALREADY_EXISTS:
+ raise
+ experiment = client.get_experiment_by_name(Experiment.DEFAULT_EXPERIMENT_NAME)You can then drop the now‑unused ErrorCode import if desired.
🤖 Prompt for AI Agents
In mlflow/tracking/default_experiment/registry.py around lines 79 to 103, the
except block compares exc.error_code to ErrorCode.Name(RESOURCE_ALREADY_EXISTS)
(a string), which will never match the numeric enum value in exc.error_code;
change the check to compare exc.error_code directly to RESOURCE_ALREADY_EXISTS
(the enum member/value) so the "already exists" race is handled correctly, and
remove the now-unused ErrorCode import if no longer referenced elsewhere.
2584611 to
956da67
Compare
These workflows are not relevant on forks. Signed-off-by: mprahl <mprahl@users.noreply.github.com>
- Introduce the workspace store abstraction (SQLAlchemy + REST), name validation, and registry/URI utilities with CLI flags and env vars. - Expose workspace CRUD endpoints in the Flask server with proper guards/feature flag wiring plus MlflowClient surface methods. - Add the Workspace entity, tests for the stores/registry/client, and docs entries so the new multi-tenant provider can be exercised end-to-end. Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Signed-off-by: mprahl <mprahl@users.noreply.github.com>
956da67 to
929ffc7
Compare
Summary by CodeRabbit
New Features
Configuration
--enable-workspacesand--workspace-store-uriserver CLI optionsMLFLOW_ENABLE_WORKSPACES,MLFLOW_WORKSPACE,MLFLOW_WORKSPACE_URIDocumentation
✏️ Tip: You can customize this high-level summary in your review settings.