Commit 8c3349a

fix(viz): widen safe-session-id alphabet to accept legacy slugs
The initial strict regex (`^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}-[0-9]{2}-[0-9]{2}$`) rejected every test fixture that uses a short slug like `2026-04-17_CL`, breaking 26 assertions in tests/test-app-routes-live.sh even though the underlying id is benign.

Widen the accepted set to ASCII letters / digits / underscore / dash / period (the union of characters that the on-disk session generator has ever produced plus what the CI fixtures rely on), but keep the extra rules that reject `..`, leading-dot, and path separators.

Quote, backtick, angle-bracket, backslash, newline, and every other JS-string metacharacter are still refused up-front, which is the property the original defense-in-depth was after: hostile disk state cannot break out of the frontend's inline onclick template literals.

Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
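The behaviour described in the message can be sketched as follows. The validator is reproduced from this commit's change to viz/server/app.py; the sample ids and the `render_onclick` helper are hypothetical and exist only to show how an unvalidated id would land inside the inline handler string.

```python
import re

# Reproduced from this commit's change to viz/server/app.py.
_SESSION_ID_RE = re.compile(r'^[A-Za-z0-9_.\-]+$')


def _is_safe_session_id(session_id):
    if not session_id or len(session_id) > 128:
        return False
    if session_id in ('.', '..') or session_id.startswith('.'):
        return False
    if '/' in session_id or '\\' in session_id:
        return False
    return bool(_SESSION_ID_RE.match(session_id))


def render_onclick(session_id):
    # Hypothetical helper: mirrors the frontend template
    # navigate('#/session/${s.id}') as a Python f-string.
    return f"navigate('#/session/{session_id}')"


# Hypothetical sample ids, chosen only for illustration.
samples = [
    '2026-04-17_12-30-05',  # generator format
    '2026-04-17_CL',        # short fixture slug
    '..',                   # parent traversal
    '.hidden',              # leading dot
    'a/b',                  # path separator
    "x');alert(1);//",      # quote closes the JS string early
    'a`b',                  # backtick
    '<script>',             # angle brackets
]

results = {s: _is_safe_session_id(s) for s in samples}
for sample, ok in results.items():
    print(f'{sample!r:28} -> {"accepted" if ok else "rejected"}')

# What the handler would look like if the hostile id were not refused:
print(render_onclick("x');alert(1);//"))
```

Only the first two samples are accepted. The last line prints a handler in which the quote inside the id has terminated the string argument early, which is the breakout the validator is meant to prevent.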
1 parent cc87a86 commit 8c3349a

1 file changed

Lines changed: 28 additions & 13 deletions

viz/server/app.py

@@ -82,22 +82,37 @@ def _get_rlcr_dir():
     return os.path.join(PROJECT_DIR, '.humanize', 'rlcr')
 
 
-# Session ids on disk are produced exclusively by setup-rlcr-loop.sh
-# via `date +%Y-%m-%d_%H-%M-%S`, so every legitimate id matches the
-# tight regex below. Rejecting anything outside this alphabet stops
-# hostile disk state (a session directory created by hand with
-# quotes or angle brackets in its name) from flowing into the
-# frontend's inline `onclick="navigate('#/session/${s.id}')"`
-# template literals. The frontend still uses HTML-escape for DOM
-# attributes, but the inline-handler template is an uncaught
-# surface — making the id shape dependable here is the cheapest
-# defense-in-depth.
-_SESSION_ID_RE = re.compile(r'^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}-[0-9]{2}-[0-9]{2}$')
+# Session ids flow into the frontend's inline onclick template
+# literals:
+# onclick="navigate('#/session/${s.id}')"
+# onclick="opsPreviewIssue('${s.id}')"
+# so any id containing a JS-string metacharacter (quote, backtick,
+# backslash, angle bracket, newline, etc.) would let hostile disk
+# state break out of the surrounding string and inject script.
+# setup-rlcr-loop.sh generates ids that match
+# `YYYY-MM-DD_HH-MM-SS`, but some test fixtures and legacy
+# recoveries use simpler slugs like `2026-04-17_CL`. Accept the
+# full superset of safe characters (ASCII letters, digits,
+# underscore, dash, period — with extra rules rejecting `..`,
+# leading-dot, and path separators) so those still work while
+# every character outside that set is refused up-front.
+_SESSION_ID_RE = re.compile(r'^[A-Za-z0-9_.\-]+$')
 
 
 def _is_safe_session_id(session_id):
-    """Return True iff ``session_id`` matches the generator's format."""
-    return bool(session_id) and bool(_SESSION_ID_RE.match(session_id))
+    """Return True iff ``session_id`` only uses the safe alphabet.
+
+    Rejects anything with path separators, parent-traversal
+    markers, leading dots, or characters that could escape a JS
+    string literal in the frontend's inline onclick handlers.
+    """
+    if not session_id or len(session_id) > 128:
+        return False
+    if session_id in ('.', '..') or session_id.startswith('.'):
+        return False
+    if '/' in session_id or '\\' in session_id:
+        return False
+    return bool(_SESSION_ID_RE.match(session_id))
 
 
 def _get_session_dir(session_id):
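One detail that may be worth checking, offered as an observation and not as part of this commit: in Python, `$` matches at the end of the string and also just before a newline at the end of the string. A pattern anchored as `^...$` and applied with `re.match` therefore lets an id with a single trailing `\n` through the regex step. The sketch below compares the committed pattern with a hypothetical variant (`_STRICT_RE`, not in the source) that uses `re.fullmatch` over the same alphabet.

```python
import re

# Pattern as committed in viz/server/app.py.
_SESSION_ID_RE = re.compile(r'^[A-Za-z0-9_.\-]+$')

# Hypothetical variant: same alphabet, no anchors, used with fullmatch().
_STRICT_RE = re.compile(r'[A-Za-z0-9_.\-]+')


def matches_as_committed(session_id):
    # `$` also matches just before a trailing newline.
    return bool(_SESSION_ID_RE.match(session_id))


def matches_strict(session_id):
    # fullmatch() requires the pattern to consume the whole string.
    return bool(_STRICT_RE.fullmatch(session_id))


trailing_newline = '2026-04-17_CL\n'
print(matches_as_committed(trailing_newline))  # True
print(matches_strict(trailing_newline))        # False
print(matches_strict('2026-04-17_CL'))         # True
```

Replacing `$` with `\Z` in the committed pattern would have the same effect as `fullmatch`; whether a trailing newline can actually reach this function depends on how ids are read from disk or from the route, which the diff does not show.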
