Event and alerting service for the Moon platform
- About
- Features
- API Overview
- Prerequisites
- Installation
- Quick Start
- Common Usage
- Development
- License
- Acknowledgments
Marksman (后羿) is the Moon platform’s event and alerting service. It manages datasources (Prometheus, VictoriaMetrics, Elasticsearch, Jaeger), strategy groups and strategies, alert levels, and strategy metrics (expressions, levels, receiver bindings).
- Proto definitions: `proto/marksman/api/v1/`
- HTTP + gRPC via Kratos; CLI via Cobra.
- Self (goddess): Current user info, namespaces list, change email/avatar/remark, refresh token
- User (goddess): User CRUD, list, select, ban/permit
- Member (goddess): List/get/select members in namespace, invite, dismiss, update status
- Namespace (goddess): Namespace management (shared with goddess; requires `namespaceDomain`)
- Captcha (goddess): Get graphic captcha (id, base64 image) for login and other unauthenticated flows
- Datasource: CRUD (including `levelUid` binding), list, select, update status (ENABLED/DISABLED), status time series (per uid, from main TSDB), metric metadata (label names and label values) for metrics datasources (Prometheus, VictoriaMetrics, Elasticsearch, Jaeger)
- MetricQuery: Instant query (Prometheus `/api/v1/query`), range query (`/api/v1/query_range`), and direct HTTP proxy for metric-type datasources
- Strategy group: CRUD, list, select, status; bind receivers (recipient groups)
- Strategy: CRUD, list, status; link to strategy group; type (METRICS/LOGS/TRACE) and driver
- Level: Level CRUD, list, select, status (including `type` = ALERT/DATASOURCE and `bgColor` for display)
- Strategy metric: Save/get metric config (expr, labels, datasourceFilter with include/exclude UIDs and datasource metadata labels, levels); get detail returns `includeDatasources`/`excludeDatasources` for bound include/exclude UIDs; save/update/delete/get metric levels (mode, condition, values, duration); bind receivers per strategy (optional levelUID)
- Alert (real-time): Alert page CRUD (name, color, sort order, filter by strategy group/level/strategy/datasource/datasource level); list real-time alert events by alert page; operate events: batch intervene + intervene (on-call takeover), suppress (until time), recover (manual); alert statistics for dashboard (total active, by level, today recovered, by alert page); user followed alert pages (list/save per user)
- Notification group: CRUD for notification groups (name, remark, metadata, members, webhooks, templates, emailConfigs updated via Create/Update); get/list returns member profile (`memberName`, `memberAvatar`) and bound resource items (`webhookItems`, `templateItems`, `emailConfigItems`) with `uid` + `name`
- Notification group subscription: Get/update subscription filter per notification group (strategy groups, strategies, strategy-level pairs, labels, excludeLabels, datasourceUids, datasourceLevelUids); when multiple dimensions are set, alerts match if at least one dimension matches (OR)
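
The OR semantics of the notification group subscription filter can be sketched as below. This is a minimal illustration, not Marksman's implementation: `SubscriptionFilter`, `AlertEvent` and `Matches` are hypothetical names, only three of the listed dimensions are modelled, and the behaviour of an empty filter (matching nothing) is an assumption. Labels and excludeLabels are left out because the source does not specify how they combine with the other dimensions.

```go
package main

import "fmt"

// SubscriptionFilter is a hypothetical, trimmed-down view of a notification
// group subscription. Only three of the documented dimensions are modelled.
type SubscriptionFilter struct {
	StrategyGroupUIDs []string
	StrategyUIDs      []string
	DatasourceUIDs    []string
}

// AlertEvent is a hypothetical alert carrying the identifiers the filter inspects.
type AlertEvent struct {
	StrategyGroupUID string
	StrategyUID      string
	DatasourceUID    string
}

// contains reports whether uid is present in uids.
func contains(uids []string, uid string) bool {
	for _, u := range uids {
		if u == uid {
			return true
		}
	}
	return false
}

// Matches returns true when at least one configured dimension matches (OR).
// Assumption: a filter with no dimensions set matches nothing.
func (f SubscriptionFilter) Matches(a AlertEvent) bool {
	return contains(f.StrategyGroupUIDs, a.StrategyGroupUID) ||
		contains(f.StrategyUIDs, a.StrategyUID) ||
		contains(f.DatasourceUIDs, a.DatasourceUID)
}

func main() {
	f := SubscriptionFilter{
		StrategyGroupUIDs: []string{"group-1"},
		DatasourceUIDs:    []string{"ds-9"},
	}
	// The datasource dimension matches even though the strategy group differs.
	fmt.Println(f.Matches(AlertEvent{StrategyGroupUID: "group-2", StrategyUID: "s-1", DatasourceUID: "ds-9"}))
	// No configured dimension matches this event.
	fmt.Println(f.Matches(AlertEvent{StrategyGroupUID: "group-2", StrategyUID: "s-1", DatasourceUID: "ds-1"}))
}
```

Because the dimensions are OR-ed, adding a dimension to a filter can only widen the set of alerts a group receives, never narrow it.
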
| Service | Method / HTTP | Description |
|---|---|---|
| Self (goddess) | `GET /v1/self/info` | Current user info |
| | `GET /v1/self/namespaces` | Current user namespaces list |
| | `PUT /v1/self/change-email`, `change-avatar`, `change-remark` | Change email / avatar / remark |
| | `GET /v1/self/refresh-token` | Refresh token |
| User (goddess) | `GET /v1/user/{uid}` | Get user |
| | `GET /v1/users`, `GET /v1/users/select` | List users, select for dropdown |
| | `PUT /v1/user/ban/{uid}`, `PUT /v1/user/permit/{uid}` | Ban / permit user |
| Member (goddess) | `GET /v1/members`, `GET /v1/member/{uid}`, `GET /v1/members/select` | List members, get member, select |
| | `POST /v1/member/invite` | Invite member (email, role) |
| | `DELETE /v1/member/{uid}`, `PUT /v1/member/{uid}/status` | Dismiss member, update status |
| Captcha (goddess) | `GET /v1/captcha` | Get graphic captcha (returns captchaId, captchaB64s) |
| Datasource | `POST /v1/datasource` | Create datasource (name, type, driver, url, metadata, remark, levelUid bound to a DATASOURCE level) |
| | `PUT /v1/datasource/{uid}` | Update datasource (including levelUid) |
| | `PUT /v1/datasource/{uid}/status` | Update status (ENABLED/DISABLED) |
| | `DELETE /v1/datasource/{uid}` | Delete datasource |
| | `GET /v1/datasource/{uid}` | Get datasource |
| | `GET /v1/datasources` | List (keyword, page, pageSize, type, driver, status) |
| | `GET /v1/datasources/select` | Select for dropdown |
| | `GET /v1/datasource/{uid}/status` | Status time series for one datasource (from main TSDB; query: startTime, endTime, stepSeconds; default last 1h, step 60s) |
| | `GET /v1/datasource/{uid}/metrics` | List metrics (name, description, unit, type only; optional match[], limit; default limit 100) |
| | `GET /v1/datasource/{uid}/metric-label-detail` | One metric's label detail: labels + each label's values (query: metric=name) |
| MetricQuery | `POST /v1/metric-query/query` | Instant query (body: uid, query, optional time); returns Prometheus-style JSON |
| | `POST /v1/metric-query/query-range` | Range query (body: uid, query, start, end, step); returns Prometheus-style JSON |
| | `POST /v1/metric-query/proxy` | Direct proxy to datasource (body: uid, path, method, optional body); returns status_code and body |
| Strategy (group) | `POST /v1/strategy-group` | Create strategy group |
| | `PUT /v1/strategy-group/{uid}` | Update strategy group |
| | `PUT /v1/strategy-group/{uid}/status` | Update status (ENABLED/DISABLED) |
| | `DELETE /v1/strategy-group/{uid}` | Delete strategy group |
| | `GET /v1/strategy-group/{uid}` | Get strategy group |
| | `GET /v1/strategy-groups` | List strategy groups |
| | `GET /v1/strategy-groups/select` | Select for dropdown |
| | `POST /v1/strategy-group/{uid}/receivers` | Bind receivers (receiverUIDs) to group |
| Strategy (item) | `POST /v1/strategy` | Create strategy (name, type, driver, strategyGroupUID, status, etc.) |
| | `PUT /v1/strategy/{uid}` | Update strategy |
| | `PUT /v1/strategy/{uid}/status` | Update status |
| | `DELETE /v1/strategy/{uid}` | Delete strategy |
| | `GET /v1/strategy/{uid}` | Get strategy |
| | `GET /v1/strategies` | List strategies (keyword, page, pageSize, status, strategyGroupUID, type, driver) |
| | `GET /v1/strategies/select` | Select strategies for dropdown (keyword, limit, lastUid, status, strategyGroupUids list to filter by groups) |
| Level | `POST /v1/level` | Create level (name, remark, metadata, bgColor, type = ALERT/DATASOURCE) |
| | `PUT /v1/level/{uid}` | Update level (bgColor, type) |
| | `PUT /v1/level/{uid}/status` | Update status |
| | `DELETE /v1/level/{uid}` | Delete level |
| | `GET /v1/level/{uid}` | Get level |
| | `GET /v1/levels` | List levels (page, pageSize, keyword, status) |
| | `GET /v1/levels/select` | Select for dropdown |
| StrategyMetric | `POST /v1/metric/strategy/{strategyUID}` | Save strategy metric (expr, labels, datasourceFilter, summary, description) |
| | `GET /v1/metric/strategy/{strategyUID}` | Get strategy metric (with levels; includeDatasources / excludeDatasources for filter include/exclude UIDs) |
| | `POST /v1/metric/strategy/{strategyUID}/level` | Save metric level (levelUID, mode, condition, values, duration, status) |
| | `PUT /v1/metric/strategy/{strategyUID}/level/{uid}/status` | Update metric level status |
| | `DELETE /v1/metric/strategy/{strategyUID}/level/{uid}` | Delete metric level |
| | `GET /v1/metric/strategy/{strategyUID}/level/{uid}` | Get metric level |
| | `POST /v1/metric/strategy/{strategyUID}/receivers` | Bind receivers (receiverUIDs; optional levelUID) |
| NotificationGroup | `POST /v1/notification-groups` | Create notification group (name, remark, metadata, members, webhooks, templates, emailConfigs) |
| | `PUT /v1/notification-groups/{uid}` | Update notification group (including members, webhooks, templates, emailConfigs) |
| | `PUT /v1/notification-groups/{uid}/status` | Update status (ENABLED/DISABLED) |
| | `DELETE /v1/notification-groups/{uid}` | Delete notification group |
| | `GET /v1/notification-groups/{uid}` | Get notification group (members include memberName, memberAvatar; webhook/template/emailConfig include uid + name) |
| | `GET /v1/notification-groups` | List notification groups (page, pageSize, keyword, status; members include memberName, memberAvatar; webhook/template/emailConfig include uid + name) |
| NotificationGroupSubscription | `GET /v1/notification-groups/{notification_group_uid}/subscription` | Get subscription filter for a notification group |
| | `PUT /v1/notification-groups/{notification_group_uid}/subscription` | Save subscription filter (create or overwrite; strategy_group_uids, strategy_uids, strategy_levels, labels, exclude_labels, datasource_uids, datasource_level_uids; OR when multiple set) |
| Alert (alert page) | `POST /v1/alert/alert-pages` | Create alert page (name, color, sortOrder, filter by strategy group/level/strategy/datasource/datasource level) |
| | `PUT /v1/alert/alert-pages/{uid}` | Update alert page (filter supports strategy group/level/strategy/datasource/datasource level) |
| | `DELETE /v1/alert/alert-pages/{uid}` | Delete alert page |
| | `GET /v1/alert/alert-pages/{uid}` | Get alert page |
| | `GET /v1/alert/alert-pages` | List alert pages (page, pageSize, keyword) |
| Alert (realtime) | `GET /v1/alert/alert-pages/{alertPageUid}/realtime-alerts` | List real-time alert events for page (page, pageSize, status; includes level bgColor and datasource levelName) |
| | `GET /v1/alert/statistics` | Alert dashboard statistics (total active, by level, today recovered, by alert page) |
| | `POST /v1/alert/realtime-alerts/batch-intervene` | Batch assign intervened handler(s) (on-call takeover) |
| | `POST /v1/alert/realtime-alerts/{uid}/intervene` | Mark event as intervened (on-call takeover) |
| | `POST /v1/alert/realtime-alerts/{uid}/suppress` | Suppress event until time (body: suppressUntil RFC3339) |
| | `POST /v1/alert/realtime-alerts/{uid}/recover` | Mark event as manually recovered |
| Alert (user) | `GET /v1/alert/user/alert-pages` | List current user's followed alert pages (personal config) |
| | `PUT /v1/alert/user/alert-pages` | Save current user's followed alert pages (body: alertPageUids, max 10; replaces list) |

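
The defaults listed for `GET /v1/datasource/{uid}/status` (last 1h, step 60s) can be sketched as below. This is a hedged client-side illustration only: `StatusWindow` and `ResolveStatusWindow` are hypothetical names, timestamps are assumed to be Unix seconds, zero is treated as "not provided", and anchoring the default start one hour before the end time is an assumption about how the defaults combine.

```go
package main

import "fmt"

// StatusWindow holds the resolved query parameters for
// GET /v1/datasource/{uid}/status.
type StatusWindow struct {
	StartTime   int64 // Unix seconds (assumed representation)
	EndTime     int64 // Unix seconds (assumed representation)
	StepSeconds int64
}

// ResolveStatusWindow fills in the documented defaults: a one-hour window
// sampled every 60 seconds. Zero means "not provided" in this sketch.
func ResolveStatusWindow(now, startTime, endTime, stepSeconds int64) StatusWindow {
	if endTime == 0 {
		endTime = now
	}
	if startTime == 0 {
		// Assumption: the default window ends at endTime and spans one hour.
		startTime = endTime - 3600
	}
	if stepSeconds <= 0 {
		stepSeconds = 60
	}
	return StatusWindow{StartTime: startTime, EndTime: endTime, StepSeconds: stepSeconds}
}

func main() {
	w := ResolveStatusWindow(1700000000, 0, 0, 0)
	points := (w.EndTime-w.StartTime)/w.StepSeconds + 1
	fmt.Printf("start=%d end=%d step=%d points=%d\n", w.StartTime, w.EndTime, w.StepSeconds, points)
}
```
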
Types: DatasourceType: METRICS, LOGS, TRACE. LevelType: ALERT, DATASOURCE. Drivers: METRICS_PROMETHEUS, METRICS_VICTORIA_METRICS, LOGS_ELASTICSEARCH, TRACE_JAEGER.
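
Each driver name listed above starts with its datasource type. Assuming that naming pattern holds for all drivers, a client could derive the type from the driver as in this sketch. `DatasourceTypeOf` is a hypothetical helper, not part of the Marksman API.

```go
package main

import (
	"fmt"
	"strings"
)

// DatasourceTypeOf derives the datasource type from a driver name by taking
// the text before the first underscore, e.g. METRICS_PROMETHEUS -> METRICS.
// It returns false when the prefix is not one of the documented types.
func DatasourceTypeOf(driver string) (string, bool) {
	prefix, _, found := strings.Cut(driver, "_")
	if !found {
		return "", false
	}
	switch prefix {
	case "METRICS", "LOGS", "TRACE":
		return prefix, true
	}
	return "", false
}

func main() {
	drivers := []string{
		"METRICS_PROMETHEUS",
		"METRICS_VICTORIA_METRICS",
		"LOGS_ELASTICSEARCH",
		"TRACE_JAEGER",
	}
	for _, d := range drivers {
		t, ok := DatasourceTypeOf(d)
		fmt.Println(d, t, ok)
	}
}
```
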
API definitions: Marksman-owned APIs are in `proto/marksman/api/v1/` (e.g. `datasource.proto`, `strategy.proto`, `level.proto`, `strategy_metric.proto`, `alert.proto`). Self, User, Member, Namespace, Captcha, etc. come from goddess at `proto/goddess/api/v1/`. OpenAPI can be generated via `make api`.
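
The suppress endpoint (`POST /v1/alert/realtime-alerts/{uid}/suppress`) takes a `suppressUntil` value in RFC3339 form. A minimal sketch of building that body follows. The `suppressRequest` struct and `SuppressBody` helper are hypothetical, and the assumption that the body carries only this one field comes from the table entry, not from the proto definition.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// suppressRequest mirrors the documented body field; the struct itself is
// hypothetical and not taken from the generated API code.
type suppressRequest struct {
	SuppressUntil string `json:"suppressUntil"`
}

// SuppressBody encodes the suppression deadline, given as Unix seconds,
// as an RFC 3339 timestamp in UTC.
func SuppressBody(untilUnix int64) (string, error) {
	req := suppressRequest{
		SuppressUntil: time.Unix(untilUnix, 0).UTC().Format(time.RFC3339),
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Suppress for two hours from now.
	body, err := SuppressBody(time.Now().Add(2 * time.Hour).Unix())
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```
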
From the Moon repo root or from this directory:

    cd app/marksman   # if at repo root
    make init         # install protoc plugins, wire, etc.
    make build        # generate API/conf/wire and build binary → bin/marksman

    # Build
    make init && make build
    # Run all (HTTP + gRPC) in development
    ./bin/marksman run all --log-level=DEBUG
    # or
    make dev

    # Help
    ./bin/marksman -h
    ./bin/marksman version
    ./bin/marksman run all -h
    ./bin/marksman run grpc -h
    ./bin/marksman run http -h

| Command | Description |
|---|---|
| `./bin/marksman run all` | Run both HTTP and gRPC servers |
| `./bin/marksman run http` | HTTP only |
| `./bin/marksman run grpc` | gRPC only |

| Target | Description |
|---|---|
| `make init` | Install protoc plugins, wire, kratos, etc. |
| `make conf` | Generate config from proto |
| `make api` | Generate Go/HTTP/gRPC/OpenAPI from `proto/marksman` |
| `make wire` | Generate Wire DI |
| `make all` | api + conf + wire |
| `make build` | all + build binary to `bin/marksman` |
| `make dev` | `go run . run all --log-level=DEBUG` |
| `make gen` | Generate DO/data layer (e.g. test with generate tag) |
| `make clean` | Remove `bin/` |
| `make help` | List all targets |

- Regenerate after proto changes

      make api
      make wire   # if service graph changed

- Run without building

      go run . run all
      go run . run http
      go run . run grpc

- Config: See `internal/conf/`; set config path via flag or env. For datasource status time series (`GET /v1/datasource/{uid}/status`), configure mainTsdb in `config/server.yaml`: `driver` (`prometheus` or `victoria_metrics`) and `url`; env overrides: `MARKSMAN_MAIN_TSDB_DRIVER`, `MARKSMAN_MAIN_TSDB_URL`.
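
How the two environment variables might take precedence over the file values can be sketched as follows. This is an illustration under stated assumptions, not the service's configuration code: `MainTSDB` and `ApplyEnvOverrides` are hypothetical names, the file values in `main` are placeholders, and treating an empty environment value as "not set" is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// MainTSDB is a hypothetical struct for the mainTsdb settings
// (driver and url) read from config/server.yaml.
type MainTSDB struct {
	Driver string
	URL    string
}

// ApplyEnvOverrides returns cfg with any environment overrides applied and
// rejects drivers other than prometheus or victoria_metrics.
// lookup has the same shape as os.LookupEnv so it can be swapped in tests.
func ApplyEnvOverrides(cfg MainTSDB, lookup func(string) (string, bool)) (MainTSDB, error) {
	if v, ok := lookup("MARKSMAN_MAIN_TSDB_DRIVER"); ok && v != "" {
		cfg.Driver = v
	}
	if v, ok := lookup("MARKSMAN_MAIN_TSDB_URL"); ok && v != "" {
		cfg.URL = v
	}
	switch cfg.Driver {
	case "prometheus", "victoria_metrics":
		return cfg, nil
	default:
		return cfg, fmt.Errorf("unsupported mainTsdb driver %q", cfg.Driver)
	}
}

func main() {
	// Placeholder values standing in for what the YAML file would provide.
	fromFile := MainTSDB{Driver: "prometheus", URL: "http://localhost:9090"}
	cfg, err := ApplyEnvOverrides(fromFile, os.LookupEnv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("driver=%s url=%s\n", cfg.Driver, cfg.URL)
}
```
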