6 changes: 6 additions & 0 deletions .github/workflows/docker.yml
@@ -27,6 +27,12 @@ jobs:
           go-version: "1.26.1"
           cache: true
 
+      - name: Set up Bun
+        # `make build` depends on `make ui` which calls `bun install` + vite.
+        uses: oven-sh/setup-bun@v2
+        with:
+          bun-version: "1.2.21"
+
       - name: build binaries
         run: |
           rm -rf /tmp/ehco
8 changes: 8 additions & 0 deletions .github/workflows/nightly.yml
@@ -48,6 +48,14 @@ jobs:
           go-version: "1.26.1"
           cache: true
 
+      - name: Set up Bun
+        # See release.yml — GoReleaser bypasses the Makefile so the SPA
+        # build runs via .goreleaser.yml's before-hook (`make ui`), which
+        # needs bun on PATH.
+        uses: oven-sh/setup-bun@v2
+        with:
+          bun-version: "1.2.21"
+
       - name: GoReleaser
         uses: goreleaser/goreleaser-action@v6
         with:
9 changes: 9 additions & 0 deletions .github/workflows/release.yml
@@ -24,6 +24,15 @@ jobs:
           go-version: "1.26.1"
           cache: true
 
+      - name: Set up Bun
+        # GoReleaser invokes `go build` directly (bypassing the Makefile),
+        # so the //go:embed all:webui/dist directive needs `make ui` to
+        # run explicitly first — handled via .goreleaser.yml's before
+        # hooks but bun must be on PATH.
+        uses: oven-sh/setup-bun@v2
+        with:
+          bun-version: "1.2.21"
+
       - name: GoReleaser
         uses: goreleaser/goreleaser-action@v6
         with:
14 changes: 14 additions & 0 deletions .github/workflows/test.yml
@@ -44,6 +44,20 @@ jobs:
           go-version: "1.26.1"
           cache: true
 
+      - name: Set up Bun
+        # Pinned to match the bun version that produced bun.lock;
+        # `bun install --frozen-lockfile` is sensitive to bun-major drift.
+        # Bump together with the local dev environment.
+        uses: oven-sh/setup-bun@v2
+        with:
+          bun-version: "1.2.21"
+
+      - name: build SPA
+        # Must run before lint/test/build: //go:embed all:webui/dist
+        # is type-checked by golangci-lint (and resolved by `go build`),
+        # so the dist tree has to exist on disk before any Go tool runs.
+        run: make ui
+
       - name: tidy
         run: make tidy
 
4 changes: 4 additions & 0 deletions .goreleaser.yml
@@ -2,6 +2,10 @@
 before:
   hooks:
     - go mod download
+    # Build the embedded SPA into internal/web/webui/dist before goreleaser
+    # runs `go build` — the //go:embed all:webui/dist directive fails at
+    # compile time if dist/ is missing on a fresh checkout.
+    - make ui
 builds:
   - id: ehco
     main: ./cmd/ehco/main.go
170 changes: 170 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,170 @@
# CLAUDE.md

ehco is a relay/proxy server combining a custom TCP/WS/WSS relay frontend
and an embedded xray-core for vless / trojan / shadowsocks-2022. Single
static Go binary, configured via JSON file or HTTP endpoint.

## Layout

- `cmd/ehco/main.go` — entry point, defers to `internal/cli`.
- `internal/cli/` — urfave/cli app, flag parsing, boot orchestration in
`MustStartComponents`.
- `internal/config/` — top-level `Config`, loaded from file or HTTP. The
same instance is shared across subsystems and reloaded periodically.
- `internal/relay/` — TCP/WS/WSS relay frontend; has its own reloader on
a ticker (`server_reloader.go`).
- `internal/cmgr/` — connection manager for the relay frontend; tracks
active/closed conns.
- `internal/web/` — admin HTTP API (echo). Exposes `/metrics/`,
`/api/v1/...`. Other subsystems mount routes via `webS.APIGroup()`.
- `pkg/xray/` — embedded xray-core. UserPool, connTracker,
meteredOutbound, admin endpoints. See dedicated section below.

## Build / test

```
make lint # golangci-lint; must be clean for CI
make test # full unit suite
make test-e2e # pkg/xray e2e (~15s, real sockets, runs trojan/vless/ss2022 ± UDP + REALITY)
make build # static binary
```

For fast iteration: `go test ./pkg/xray/... -count=1`.

CI runs lint + tests on every push; lint failure blocks merge.

## Boot order is load-bearing

`MustStartComponents` in `internal/cli/config.go` starts subsystems in
this exact order:

1. relay server (goroutine)
2. webS = `web.NewServer(...)` (constructed, not yet listening)
3. `webS.Start()` (goroutine — must come before xray)
4. `xrayS.Setup()` → `RegisterRoutes(webS.APIGroup())` → `Start()`

xray's `UserPool` runs its first sync **synchronously** inside
`xrayS.Start`, and that sync GETs the local `/metrics/` endpoint for
bandwidth recording. If web isn't listening yet, the fetch fails. We
tolerate it (warn + 0 bandwidth + retry next tick), but starting web
first gives that first fetch a chance to succeed instead of logging a
warning and reporting 0 bandwidth. Don't reorder without a reason.

Echo accepts route registration after `Start`, so registering xray's
routes via `APIGroup()` after `webS.Start()` is fine.

## Config gotcha: shared `*Config` + xray-conf UnmarshalJSON

`*config.Config` is a single instance shared by relay's reloader and
xray's reloader. Both call `LoadConfig` periodically. xray-conf has
types whose `UnmarshalJSON` **appends** rather than replaces (notably
`PortList.Range`). Re-decoding into a stale struct accumulates state,
which made xray's `needReload` listener comparison spuriously fire
("old has 2 ranges, new has 1"), and every spurious reload kills all
active conns via `tracker.KillAll`.

`LoadConfig` therefore nils out decoded sub-structs (`c.RelayConfigs =
nil; c.XRayConfig = nil`) before re-unmarshaling. **If you add a new
top-level field with a non-trivial UnmarshalJSON, reset it here too.**

## xray/ package architecture

We embed xray-core (`v1.260206.0` at time of writing) in-process and
**bypass its gRPC control plane**:

- **User CRUD**: instead of `HandlerService.AlterInbound` over gRPC,
call `inbound.Manager.GetHandler(tag).(proxy.UserManager).AddUser/
RemoveUser` directly. xray's gRPC commander is just a wrapper around
this same interface, so we save the loopback round-trip.
- **Traffic stats**: instead of `StatsService.QueryStats`, the
`meteredOutbound` (replaces freedom as the default outbound) wraps
the dialed conn's `buf.Reader/Writer` and bumps atomic counters on
`*User` per chunk. Atomic swap-and-reset on each sync tick.
- **Conn tracking**: `connTracker` registers each Dispatch entry,
holding `*session.Inbound` + `*session.Outbound` pointers directly
(no field duplication). Powers `/api/v1/xray/conns` admin endpoints
for list/kill — xray's native `RemoveUserOperation` only blocks new
conns and won't kick existing ones.

### `stripUnused`

`server.go::stripUnused` removes `cfg.API/Stats/Policy/OutboundConfigs`
and the api-tagged inbound from the parsed xray config before
`core.New`, so xray falls back to `policy.DefaultManager` and
`stats.NoopManager`. Don't re-introduce these without a reason — they
bind ports and accumulate counters we don't read.

### User identity

xray's `protocol.User.Email` carries the **decimal-string user_id** by
convention (set by upstream when posting user configs). Use
`userIDFromInbound(inb)` to parse. **Don't put real emails there**;
nothing else in the system handles them.

### Reload kills all conns

When `needReload` detects a listener change, `Reload` calls `Stop`
which calls `tracker.KillAll()`. This is by design — port changed,
can't keep serving the old listener. So a spurious `needReload` drops
every active user. Make any change to `needReload` carefully, and
prefer comparing structured state (port slices, listen addr) over
proto string-formatting which is mutation-sensitive.
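
One possible shape for a structured comparison is sketched below. It is not
the actual `needReload`: the `listener` type and `sameListener` function are
hypothetical, and treating duplicate ports as equivalent is an assumption
made for the example.

```go
package main

import (
	"fmt"
	"slices"
)

// listener is a hypothetical summary of what an inbound binds to.
type listener struct {
	Addr  string
	Ports []uint32
}

// normalize returns a sorted, de-duplicated copy of ports so that ordering
// and repeated entries do not affect the comparison.
func normalize(ports []uint32) []uint32 {
	out := slices.Clone(ports)
	slices.Sort(out)
	return slices.Compact(out)
}

// sameListener reports whether a and b bind the same address and the same
// ports, ignoring order and duplicates.
func sameListener(a, b listener) bool {
	if a.Addr != b.Addr {
		return false
	}
	return slices.Equal(normalize(a.Ports), normalize(b.Ports))
}

func main() {
	old := listener{Addr: "0.0.0.0", Ports: []uint32{443, 80, 443}}
	same := listener{Addr: "0.0.0.0", Ports: []uint32{80, 443}}
	moved := listener{Addr: "0.0.0.0", Ports: []uint32{80, 8443}}

	fmt.Println(sameListener(old, same))  // true: no reload needed
	fmt.Println(sameListener(old, moved)) // false: port changed
}
```

Comparing normalized slices keeps the decision independent of how the config
happens to be formatted or ordered, which is the property the paragraph above
asks for.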

### Per-cycle traffic reporting

`syncTrafficToServer` runs every `SyncTime` seconds (default 60).
Each cycle, for each user:

- `UploadTraffic / DownloadTraffic` — `atomic.SwapInt64` to 0 on snapshot.
- `IPList` — `mergeLiveIPs(snapshotted user.recentIPs,
tracker.List(userID))`. The merge is essential: `RecordIP` only
fires once per Dispatch (conn open), so long-lived conns spanning
multiple cycles would otherwise show empty IPs after their first
cycle even while traffic flows.
- `TcpCount` — `tracker.CountTCPByUser(userID)`, instantaneous live
count at snapshot time.

`recentIPs` is FIFO with cap `maxRecentIPsPerUser` (10); overflow logs
a warning and drops the oldest.

Bandwidth fetch failure is **non-fatal**: warn + report 0, don't drop
the user traffic upload. If POST itself fails after retries, the
snapshotted batch is **lost** (TODO in code — local replay buffer
would be the right fix). Don't add code paths that snapshot+reset
without handling this.

### `common.Interrupt` errcheck

`xray-core/common.Interrupt(reader_or_writer)` returns an error.
Always discard with `_ = common.Interrupt(...)` — lint will fail
otherwise. The call is best-effort cleanup; xray-core itself ignores
the return.

## Logging

zap, named per subsystem (`zap.L().Named("xray")`, `Named("user_pool")`,
etc.). Sugar is fine for human-readable lines. Important diagnostic
output (e.g. the `syncTrafficToServer payload: ...` line) goes through
`Sugar().Infof` so it shows up at the default log level.

Keep log prefixes and wording stable unless you have a reason to change
them: anyone searching logs for the old text will stop finding matches.

## Code style

- English only in code, comments, identifiers, commit messages.
Conversation with the user can be Chinese.
- Tests live alongside code (`foo.go` → `foo_test.go`).
- Don't comment on *what* the code does. Reserve comments for *why* —
non-obvious constraints, historical incidents, semantics that aren't
visible from naming.
- Match xray-core's idioms when interacting with it (e.g. `*session.X`
pointers held by value, `protocol.MemoryUser` construction). Don't
invent abstractions over xray types where direct use is clearer.

## Commit / PR conventions

- Branch names: `xray/...`, `feat/...`, `fix/...`, `chore/...`.
- Commit subjects: `<area>: <imperative summary>`, lowercase prefix.
Examples in `git log`: `xray: ...`, `fix: ...`, `feat(cli): ...`.
- Open PRs with `gh pr create`. The conversation language is fine in
the PR body, but keep the title in English.
30 changes: 25 additions & 5 deletions Makefile
@@ -1,4 +1,4 @@
-.PHONY: tools lint fmt test build tidy release
+.PHONY: tools lint fmt test test-e2e build tidy release ui ui-dev ui-clean
 
 NAME=ehco
 BINDIR=dist
@@ -61,15 +61,35 @@ fmt: tools
 	@tools/bin/gofumpt -l -w $(FILES) 2>&1 | $(FAIL_ON_STDOUT)
 
 test:
-	go test -tags ${BUILD_TAG_FOR_NODE_EXPORTER} -v -count=1 -timeout=1m ./...
+	go test -tags ${BUILD_TAG_FOR_NODE_EXPORTER} -v -count=1 -timeout=3m ./...
 
-build:
+# Just the pkg/xray e2e suite — runs three protocols × {tcp, udp where supported}
+# plus vless+REALITY against a self-spun client xray + echo backend. Uses real
+# sockets; takes ~15s end to end.
+test-e2e:
+	go test -tags ${BUILD_TAG_FOR_NODE_EXPORTER} -v -count=1 -timeout=3m -run TestE2E ./pkg/xray/...
+
+# SPA build — produces internal/web/webui/dist/ which is //go:embed'd by
+# the web package. `make build` depends on this so a fresh checkout always
+# embeds the latest UI.
+ui:
+	cd internal/web/webui && bun install --frozen-lockfile && bun run build
+
+# Vite dev server with /api and /ws proxied to a locally-running ehco on
+# 127.0.0.1:9000. Use this instead of rebuilding the SPA on every change.
+ui-dev:
+	cd internal/web/webui && bun run dev
+
+ui-clean:
+	rm -rf internal/web/webui/dist internal/web/webui/node_modules
+
+build: ui
 	${GOBUILD} -o $(BINDIR)/$(NAME) cmd/ehco/main.go
 
-build-arm:
+build-arm: ui
 	GOARCH=arm GOOS=linux ${GOBUILD} -o $(BINDIR)/$(NAME) cmd/ehco/main.go
 
-build-linux-amd64:
+build-linux-amd64: ui
 	GOARCH=amd64 GOOS=linux ${GOBUILD} -o $(BINDIR)/$(NAME)_amd64 cmd/ehco/main.go
 
 tidy:
13 changes: 12 additions & 1 deletion internal/cli/config.go
@@ -101,11 +101,19 @@ func MustStartComponents(mainCtx context.Context, cfg *config.Config) {
 		}
 	}()
 
+	var webS *web.Server
 	if cfg.NeedStartWebServer() {
-		webS, err := web.NewServer(cfg, rs, rs, rs.Cmgr)
+		webS, err = web.NewServer(cfg, rs, rs, rs.Cmgr)
 		if err != nil {
 			cliLogger.Fatalf("NewWebServer meet err=%s", err.Error())
 		}
+	}
+
+	// Web server must come up before xray: the xray UserPool's first sync
+	// (which runs synchronously inside xrayS.Start) fetches /metrics/ for
+	// bandwidth recording. Routes registered after Start are still served
+	// since echo's router accepts concurrent additions.
+	if webS != nil {
 		go func() {
 			cliLogger.Fatalf("StartWebServer meet err=%s", webS.Start())
 		}()
@@ -116,6 +124,9 @@ func MustStartComponents(mainCtx context.Context, cfg *config.Config) {
 		if err := xrayS.Setup(); err != nil {
 			cliLogger.Fatalf("Setup XrayServer meet err=%v", err)
 		}
+		if webS != nil {
+			xrayS.RegisterRoutes(webS.APIGroup())
+		}
 		if err := xrayS.Start(mainCtx); err != nil {
 			cliLogger.Fatalf("Start XrayServer meet err=%v", err)
 		}
7 changes: 6 additions & 1 deletion internal/config/config.go
@@ -53,8 +53,13 @@ func (c *Config) LoadConfig(force bool) error {
 		c.l.Warnf("Skip Load Config, last load time: %s", c.lastLoadTime)
 		return nil
 	}
-	// reset
+	// reset: drop all decoded state before re-unmarshaling. Several xray-conf
+	// types implement UnmarshalJSON that append rather than replace (e.g.
+	// PortList.Range), so re-decoding into a stale struct accumulates state
+	// across reloads. Forcing fields to nil makes the decoder allocate fresh
+	// objects.
 	c.RelayConfigs = nil
+	c.XRayConfig = nil
 	c.lastLoadTime = time.Now()
 	if c.NeedSyncFromServer() {
 		if err := c.readFromHttp(); err != nil {
10 changes: 10 additions & 0 deletions internal/web/handler_api.go
@@ -95,6 +95,16 @@ func (s *Server) GetRuleMetrics(c echo.Context) error {
 	return c.JSON(http.StatusOK, metrics)
 }
 
+// AuthInfo reports which auth schemes the server is enforcing so the
+// SPA can render the right login form before any credentials exist.
+// Public — must remain reachable without auth.
+func (s *Server) AuthInfo(c echo.Context) error {
+	return c.JSON(http.StatusOK, map[string]bool{
+		"token": s.cfg.WebToken != "",
+		"basic": s.cfg.WebAuthUser != "" && s.cfg.WebAuthPass != "",
+	})
+}
+
 func (s *Server) CurrentConfig(c echo.Context) error {
 	ret, err := json.Marshal(s.cfg)
 	if err != nil {