Commit c969152

Merge pull request #82 from bytedance/kangrong/videx_server_docker
Docker: Add a separate videx-server image to support launching the server and executing the videx-sync scripts.
2 parents 284dcda + 7307a01 commit c969152

5 files changed

Lines changed: 650 additions & 4 deletions

build/Dockerfile.videxserver

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
FROM python:3.11-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1 \
    VIDEX_CONTAINER=1 \
    PYTHONPATH=/opt/videx/src

WORKDIR /opt/videx

COPY requirements.txt /opt/videx/requirements.txt
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        gcc \
        python3-dev \
    && pip install -r /opt/videx/requirements.txt \
    && apt-get purge -y --auto-remove gcc python3-dev \
    && rm -rf /var/lib/apt/lists/*

COPY src/ /opt/videx/src/
COPY build/videx_container_entrypoint.py /opt/videx/videx_container_entrypoint.py

# Default/documentation port. You can still map any host port to container 5001 via -p HOST:5001.
EXPOSE 5001

ENTRYPOINT ["python", "/opt/videx/videx_container_entrypoint.py"]
CMD ["server"]
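
With an exec-form `ENTRYPOINT` and `CMD`, Docker appends `CMD` to `ENTRYPOINT` as default arguments, and anything passed after the image name in `docker run` replaces `CMD`. The sketch below models that rule for this image so it is clear what `sys.argv` the entrypoint script receives. `effective_argv` is a hypothetical helper written for illustration, not part of the commit, and `db.example` is a placeholder host.

```python
from typing import List, Sequence

# Values copied from build/Dockerfile.videxserver.
ENTRYPOINT = ["python", "/opt/videx/videx_container_entrypoint.py"]
CMD = ["server"]


def effective_argv(run_args: Sequence[str]) -> List[str]:
    """Model the process argv for `docker run <image> <run_args>`.

    Hypothetical helper: arguments after the image name replace CMD.
    Overriding ENTRYPOINT with --entrypoint is not modeled here.
    """
    return ENTRYPOINT + (list(run_args) if run_args else CMD)


# No arguments: CMD supplies the default `server` mode.
print(effective_argv([]))
# Arguments given: they replace CMD, so the script sees `sync ...`.
print(effective_argv(["sync", "--target", "db.example:3306:app:user:pass"]))
```

This is why `docker run <image>` starts the server while `docker run <image> sync ...` runs the one-shot script.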
build/videx_container_entrypoint.py

Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
"""
VIDEX container entrypoint.

This entrypoint provides two modes:
- `server`: start the long-running VIDEX stats server (default)
- `sync`: run the one-shot sync/env build script and exit

Design notes:
- Keep argument handling minimal.
- Do not rewrite user arguments.
- Best-effort warnings are emitted for common container networking pitfalls
  (e.g. using localhost/127.0.0.1 in --target inside a container).
"""

from __future__ import annotations

import os
import sys
import subprocess
from typing import List, Optional, Tuple

LOCALHOST_NAMES = {"127.0.0.1", "localhost", "::1"}


def _in_container_best_effort() -> bool:
    """
    Best-effort heuristics to detect container environment.
    Used only for warnings (never for rewriting args or failing).
    """
    explicit = os.environ.get("VIDEX_CONTAINER")
    if explicit and explicit.strip().lower() not in {"0", "false", "no"}:
        return True

    if os.path.exists("/.dockerenv"):
        return True

    if os.environ.get("container"):
        return True

    try:
        with open("/proc/1/cgroup", "rt", encoding="utf-8") as f:
            c = f.read()
        hints = ("docker", "containerd", "kubepods", "podman")
        return any(h in c for h in hints)
    except OSError:
        return False


def _usage() -> str:
    return (
        "Usage:\n"
        "  <image> [server]\n"
        "  <image> sync --target HOST:PORT:DB:USER:PASS [--videx ...] [other args]\n"
        "\n"
        "Commands:\n"
        "  server   Start VIDEX server (default).\n"
        "  sync     Run one-shot scripts to collect metadata from --target, then add metadata into videx-server, and create virtual tables in --videx.\n"
        "\n"
        "Notes:\n"
        "  In a container, 127.0.0.1/localhost refers to the container itself.\n"
        "  See doc/VIDEX_SERVER_DOCKER.md for Docker networking tips.\n"
    )


def _extract_flag_value(argv: List[str], name: str) -> Tuple[Optional[str], bool]:
    """
    Extract the value of a CLI flag from argv, supporting:
      --name value
      --name=value

    Returns (value, present):
      - present=False => flag not present
      - present=True and value=None => flag present but missing value
    """
    for i, tok in enumerate(argv):
        if tok == name:
            if i + 1 >= len(argv) or argv[i + 1].startswith("--"):
                return None, True
            return argv[i + 1], True
        if tok.startswith(name + "="):
            return tok.split("=", 1)[1], True
    return None, False


def _parse_target_host(target: str) -> Optional[str]:
    """
    Parse host from a connection string of form:
      host:port:db:user:password

    We only need host for warnings, so do not over-validate.
    If format is unexpected, return None.
    """
    if not target or ":" not in target:
        return None
    host = target.split(":", 1)[0].strip()
    return host or None


def _maybe_warn_localhost_target(argv: List[str]) -> None:
    """
    Print best-effort warnings about using localhost/127.0.0.1 inside containers.
    No rewriting; no hard failure.
    """
    target, present = _extract_flag_value(argv, "--target")

    if not present:
        sys.stderr.write(
            "Warning: 'sync' usually needs --target HOST:PORT:DB:USER:PASS.\n"
            "         The sync script will likely fail without it.\n\n"
        )
        return

    if target is None:
        sys.stderr.write(
            "Warning: '--target' flag is present but has no value.\n"
            "         The sync script will likely fail. Usage:\n\n"
            f"{_usage()}\n"
        )
        return

    host = _parse_target_host(target)
    if not host or host not in LOCALHOST_NAMES:
        return

    if not _in_container_best_effort():
        return

    sys.stderr.write(
        "Warning: You may be running in a container, but the `--target` parameter is configured with 127.0.0.1/localhost.\n"
        "         In a container, localhost usually refers to the container itself.\n"
        "         If your MariaDB/VIDEX runs on the host machine, this may fail.\n\n"
        "Suggestions:\n"
        "  - Docker Desktop (Mac/Windows): try host.docker.internal in --target.\n"
        "  - Linux Docker Engine: add this when running the container:\n"
        "      --add-host=host.docker.internal:host-gateway\n"
        "    then use host.docker.internal in --target.\n"
        "  - If DB runs in the same container / same network namespace, localhost can be correct.\n\n"
    )


def _run_module(module: str, argv: List[str]) -> int:
    cmd = [sys.executable, "-m", module] + argv
    return subprocess.call(cmd)


def _run_server(argv: List[str]) -> int:
    # Runs: python -m sub_platforms.sql_opt.videx.scripts.start_videx_server ...
    return _run_module("sub_platforms.sql_opt.videx.scripts.start_videx_server", argv)


def _run_sync(argv: List[str]) -> int:
    # Runs: python -m sub_platforms.sql_opt.videx.scripts.videx_build_env ...
    return _run_module("sub_platforms.sql_opt.videx.scripts.videx_build_env", argv)


def main() -> int:
    if len(sys.argv) <= 1:
        return _run_server([])

    subcmd = sys.argv[1]
    argv = sys.argv[2:]

    if subcmd in ("-h", "--help", "help"):
        sys.stdout.write(_usage())
        return 0

    if subcmd == "server":
        return _run_server(argv)

    if subcmd == "sync":
        _maybe_warn_localhost_target(argv)
        return _run_sync(argv)

    # Convenience: if user passes flags without 'server', treat as server args.
    if subcmd.startswith("-"):
        return _run_server([subcmd] + argv)

    sys.stderr.write(f"Error: unknown command '{subcmd}'.\n\n{_usage()}\n")
    return 2


if __name__ == "__main__":
    raise SystemExit(main())
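
The flag and host parsing in the entrypoint has a few edge cases that are easier to see with concrete inputs. The sketch below copies the logic of `_extract_flag_value` and `_parse_target_host` into standalone functions (renamed without the leading underscore, for illustration only) and runs them over sample argument lists. The sample connection strings are made up.

```python
from typing import List, Optional, Tuple


def extract_flag_value(argv: List[str], name: str) -> Tuple[Optional[str], bool]:
    # Same logic as _extract_flag_value in the entrypoint, copied for illustration.
    for i, tok in enumerate(argv):
        if tok == name:
            if i + 1 >= len(argv) or argv[i + 1].startswith("--"):
                return None, True
            return argv[i + 1], True
        if tok.startswith(name + "="):
            return tok.split("=", 1)[1], True
    return None, False


def parse_target_host(target: str) -> Optional[str]:
    # Same logic as _parse_target_host: everything before the first colon.
    if not target or ":" not in target:
        return None
    host = target.split(":", 1)[0].strip()
    return host or None


cases = [
    ["--target", "127.0.0.1:3306:db:u:p"],  # space-separated value
    ["--target=localhost:3306:db:u:p"],     # equals form
    ["--target", "--videx", "h:1:d:u:p"],   # next token is a flag: value missing
    ["--videx", "h:1:d:u:p"],               # flag absent
]
for argv in cases:
    value, present = extract_flag_value(argv, "--target")
    host = parse_target_host(value) if value else None
    print(argv, "->", (value, present), "host:", host)
```

One consequence of splitting on the first colon: an unbracketed IPv6 loopback target such as `::1:3306:db:u:p` yields an empty host, so `parse_target_host` returns `None` and the `::1` entry in `LOCALHOST_NAMES` cannot match that form.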

doc/VIDEX_SERVER_DOCKER.md

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
## VIDEX-Server Docker Image Description

The latest public image is:

- `ghcr.io/bytedance/videx-server:0.2.0-preview-test1` (GHCR)

This image supports two entrypoint modes:

- `server` (default): start the VIDEX server
- `sync`: run a one-shot workflow to collect metadata from `--target`, then add metadata into `videx-server`, and create virtual tables in `--videx`

> Recommendation: prefer using a routable IP address (your host/server IP) instead of `localhost/127.0.0.1`,
> to make sure `videx-server` (running in a container) can be reached by `videx-sync` and MariaDB-VIDEX (including `videx-plugin`).
> This is especially important because MariaDB-VIDEX also needs to reach `videx-server`, e.g.:
>
> `SET SESSION VIDEX_SERVER_IP=<VIDEX_SERVER_IP>:<VIDEX_SERVER_PORT>;`

---

## Build image

Build locally from this repo and tag it as `videx-server:0.2.0`:

```bash
docker build -f build/Dockerfile.videxserver -t videx-server:0.2.0 .
```

---

## Quick start

Suppose your machine/server IP is `203.0.113.42` (example only).

### 1) Start the videx-server

Expose container port `5001` to a host port (choose any free host port, like 5001):

```bash
docker run -d --name videx-server \
  -p 5001:5001 \
  ghcr.io/bytedance/videx-server:0.2.0-preview-test1
```

Then open:

- `http://203.0.113.42:5001`
- `http://localhost:5001` (only if you are on the same machine)
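
Before pointing `sync` or MariaDB-VIDEX at the server, it can help to confirm that the published port actually accepts TCP connections from the machine that will call it. The following is a minimal sketch using only the standard library. `port_is_open` is a hypothetical helper, and it checks TCP reachability only, not that the VIDEX HTTP service behind the port is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with your routable host IP to test from another machine.
    print(port_is_open("127.0.0.1", 5001))
```

Running the check from the MariaDB host, with the same address you plan to put in `VIDEX_SERVER_IP`, exercises the path that matters.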

---

### 2) Run sync (one-shot) against MariaDB (recommended: use host/server IP)

`sync` connects to `--target` (your MariaDB), collects metadata, writes metadata into `videx-server`, and creates virtual tables in `--videx`.

#### Command template

```bash
docker run --rm --name videx-sync \
  ghcr.io/bytedance/videx-server:0.2.0-preview-test1 sync \
  --target <TARGET_HOST>:<TARGET_PORT>:<TARGET_DB>:<TARGET_USER>:<TARGET_PASS> \
  [--videx <VIDEX_HOST>:<VIDEX_PORT>:<VIDEX_DB>:<VIDEX_USER>:<VIDEX_PASS>] \
  [--videx_server <VIDEX_SERVER_HOST>:<VIDEX_SERVER_PORT>]
```
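
If the sync step is driven from a script, the command template above can be assembled programmatically. This is a sketch under stated assumptions: `conn_string` and `sync_command` are hypothetical helpers, the image tag is the one named in this document, and the sample values use the documentation-only address `203.0.113.42`.

```python
import shlex
from typing import List, Optional

IMAGE = "ghcr.io/bytedance/videx-server:0.2.0-preview-test1"


def conn_string(host: str, port: int, db: str, user: str, password: str) -> str:
    """Join the five fields in the HOST:PORT:DB:USER:PASS order the template uses."""
    return f"{host}:{port}:{db}:{user}:{password}"


def sync_command(target: str, videx: Optional[str] = None,
                 videx_server: Optional[str] = None) -> List[str]:
    """Build the `docker run ... sync` argv; optional flags are added only when given."""
    cmd = ["docker", "run", "--rm", "--name", "videx-sync", IMAGE, "sync",
           "--target", target]
    if videx:
        cmd += ["--videx", videx]
    if videx_server:
        cmd += ["--videx_server", videx_server]
    return cmd


cmd = sync_command(
    conn_string("203.0.113.42", 15508, "tpch_tiny", "videx", "password"),
    videx_server="203.0.113.42:5001",
)
# Print a shell-quoted form; pass `cmd` to subprocess.run(cmd, check=True) to execute.
print(shlex.join(cmd))
```

Building the argv as a list avoids shell-quoting problems when a password contains spaces or shell metacharacters.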

#### Example (fake IP shown)

Suppose:

- Your machine/server IP is `203.0.113.42` (example only)
- MariaDB is reachable at `203.0.113.42:15508`
- Source database is `tpch_tiny`
- User/password: `videx` / `password`
- `videx-server` is reachable at `203.0.113.42:5001`

Run:

```bash
docker run --rm --name videx-sync \
  ghcr.io/bytedance/videx-server:0.2.0-preview-test1 sync \
  --target 203.0.113.42:15508:tpch_tiny:videx:password \
  --videx 203.0.113.42:15508:videx_tpch_tiny:videx:password \
  --videx_server 203.0.113.42:5001
```

#### Notes

1. If `--videx` is not specified, a default database `videx_{TARGET_DB}` will be created in `--target`.
2. If your videx-server is not the default `203.0.113.42:5001`, pass:
   - `--videx_server <VIDEX_SERVER_HOST>:<VIDEX_SERVER_PORT>`
3. Because MariaDB-VIDEX needs to call back into `videx-server`, you should configure a reachable server address, for example:
   ```sql
   SET SESSION VIDEX_SERVER_IP=<VIDEX_SERVER_IP>:<VIDEX_SERVER_PORT>;
   ```
   This is another reason why using a routable IP (not `localhost`) is recommended.

---

## FAQ

### Q1: I used `localhost` / `127.0.0.1` in `--target` and it failed. Why?

Inside a container, `localhost/127.0.0.1` refers to the container itself. If MariaDB runs on the Docker host (or elsewhere), the container cannot reach it via `localhost`.
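
To check a host value yourself before running `sync`, a loopback test along these lines may help. It is a sketch, not the entrypoint's logic: the entrypoint compares against an exact set of three names, while the hypothetical `is_loopback_host` below also accepts any address in `127.0.0.0/8`, bracketed IPv6 literals, and mixed-case `localhost`.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Return True for 'localhost' (any case) or any loopback IP literal."""
    h = host.strip().strip("[]").lower()
    if h == "localhost":
        return True
    try:
        return ipaddress.ip_address(h).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name); treat as non-loopback.
        return False


for host in ("localhost", "127.0.0.1", "127.0.1.1", "::1",
             "host.docker.internal", "203.0.113.42"):
    print(host, is_loopback_host(host))
```

A `True` result for the host in `--target`, `--videx`, or `--videx_server` means the address will resolve to the sync container itself rather than the Docker host.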

**Linux (Docker Engine) quick fix: use `host.docker.internal` via `--add-host`**

```bash
docker run --rm --name videx-sync \
  --add-host=host.docker.internal:host-gateway \
  ghcr.io/bytedance/videx-server:0.2.0-preview-test1 sync \
  --target host.docker.internal:<PORT>:<DB>:<USER>:<PASS> \
  --videx host.docker.internal:<PORT>:<VIDEX_DB>:<VIDEX_USER>:<VIDEX_PASS> \
  --videx_server host.docker.internal:<VIDEX_SERVER_PORT>
```

However, you must ensure that MariaDB-VIDEX can still reach `videx-server`;
things get tricky if MariaDB-VIDEX itself is also running inside a container.
**In that case, using a routable IP is the most recommended way to ensure reachability.**
