Commit 17e573b

docs(deploy-queue): updated readme with bulk cancellation, listing outliers and slack integration. (#73)
1 parent 9107b36 commit 17e573b

1 file changed: +183 -9 lines changed

deploy-queue/README.md

Lines changed: 183 additions & 9 deletions
@@ -81,7 +81,7 @@ This tool is designed to be used as a GitHub Action. See [action.yml](action.yml
8181

8282
## CLI Usage
8383

84-
The deploy-queue CLI has four main commands:
84+
The deploy-queue CLI has six main commands:
8585

8686
### 1. Start a Deployment
8787

@@ -131,17 +131,49 @@ deploy-queue finish <DEPLOYMENT_ID>
131131
deploy-queue finish 42
132132
```
133133

134-
### 3. Cancel a Deployment
134+
### 3. Cancel Deployment(s)
135135

136-
Cancel a deployment with an optional note.
136+
#### Cancel by Deployment ID
137+
138+
Cancel a specific deployment by its ID with an optional note.
139+
140+
```bash
141+
deploy-queue cancel deployment <DEPLOYMENT_ID> [--cancellation-note <NOTE>]
142+
```
143+
144+
**Example:**
145+
```bash
146+
deploy-queue cancel deployment 42 --cancellation-note "Cancelled due to failing health checks"
147+
```
148+
149+
#### Bulk Cancel per Component Version
150+
151+
Cancel all deployments for a specific component and version. Use this method only in case of emergencies.
137152

138153
```bash
139-
deploy-queue cancel <DEPLOYMENT_ID> [CANCELLATION_NOTE]
154+
deploy-queue cancel version --component <COMPONENT> --version <VERSION> [--cancellation-note <NOTE>]
140155
```
141156

142157
**Example:**
143158
```bash
144-
deploy-queue cancel 42 "Cancelled due to failing health checks"
159+
deploy-queue cancel version --component api --version v1.2.3 --cancellation-note "Rolling back bad release"
160+
```
161+
162+
#### Bulk Cancel per Location
163+
164+
Cancel all deployments in a specific location (environment, cloud provider, region, and optionally cell). Use this method only in case of emergencies and coordinate with owners of affected deployment jobs beforehand.
165+
166+
```bash
167+
deploy-queue cancel location --environment <ENVIRONMENT> --provider <PROVIDER> --region <REGION> [--cell-index <CELL_INDEX>] [--cancellation-note <NOTE>]
168+
```
169+
170+
**Examples:**
171+
```bash
172+
# Cancel all deployments in a specific cell
173+
deploy-queue cancel location --environment prod --provider aws --region us-west-2 --cell-index 1 --cancellation-note "Emergency maintenance"
174+
175+
# Cancel all deployments in a region (all cells)
176+
deploy-queue cancel location --environment prod --provider aws --region us-west-2 --cancellation-note "Regional outage"
145177
```
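
Because a location-wide cancel affects every deployment in the target, it can help to print the exact command for review before running it. The following is a minimal sketch: `cancel_location_cmd` is a hypothetical wrapper, not part of deploy-queue, and it only uses the flags shown in the usage line above.

```bash
# Print (do not run) a "cancel location" command for review.
# Arguments: environment, provider, region, note, and an optional cell index.
# Hypothetical helper; only the flags come from the README.
cancel_location_cmd() {
  environment=$1
  provider=$2
  region=$3
  note=$4
  cell_index=${5:-}

  # Rebuild the positional parameters as the command line.
  set -- deploy-queue cancel location \
    --environment "$environment" --provider "$provider" --region "$region"

  # Add --cell-index only when a cell was given; omitting it targets all cells.
  if [ -n "$cell_index" ]; then
    set -- "$@" --cell-index "$cell_index"
  fi
  set -- "$@" --cancellation-note "$note"

  # Display only: the note is printed without shell quoting.
  printf '%s\n' "$*"
}

cancel_location_cmd prod aws us-west-2 "Emergency maintenance" 1
cancel_location_cmd prod aws us-west-2 "Regional outage"
```

To execute the command instead of printing it, the final `printf` line could be replaced with `"$@"`, which preserves the quoting of the note.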
146178

147179
### 4. Get Deployment Info
@@ -162,6 +194,43 @@ deploy-queue info 42
162194
42 deployed [email protected]: (Hotfix for critical bug) (https://github.com/org/repo/actions/runs/123)
163195
```
164196

197+
### 5. List Outliers
198+
199+
List deployments that are taking substantially longer than expected. This is useful for identifying stuck or problematic deployments.
200+
201+
**Example:**
202+
```bash
203+
deploy-queue list outliers
204+
```
205+
206+
**Output:**
207+
- Prints outlier deployments in JSON format
208+
- Writes `active-outliers=<JSON>` to `$GITHUB_OUTPUT` if running in GitHub Actions
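
Later workflow steps would normally read this value through the step's outputs, but when debugging it can be useful to inspect the output file directly. The following is a minimal sketch, assuming the entry is written as a single `name=value` line as described above. `read_output` is a hypothetical helper, not part of deploy-queue, and the `[]` value is a placeholder rather than real outlier data.

```bash
# Print the value stored under a key in a GITHUB_OUTPUT-style file
# (one "name=value" entry per line). Hypothetical helper for debugging.
read_output() {
  key=$1
  file=$2
  while IFS= read -r line; do
    case $line in
      # Strip everything up to the first "=", so "=" inside the JSON survives.
      "$key"=*) printf '%s\n' "${line#*=}" ;;
    esac
  done < "$file"
}

# Demo: a temporary file stands in for $GITHUB_OUTPUT.
out=$(mktemp)
printf 'active-outliers=[]\n' > "$out"
read_output active-outliers "$out"
rm -f "$out"
```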
209+
210+
### 6. List Cells
211+
212+
List all known cells for a given environment. This shows which cloud provider/region/cell combinations have had deployments.
213+
214+
```bash
215+
deploy-queue list cells --environment <ENVIRONMENT>
216+
```
217+
218+
**Example:**
219+
```bash
220+
deploy-queue list cells --environment prod
221+
```
222+
223+
**Output:**
224+
```
225+
Known cells for environment prod:
226+
- prod-aws-us-west-2-cell-0
227+
- prod-aws-us-west-2-cell-1
228+
- prod-aws-us-east-1-cell-0
229+
- prod-gcp-us-central1-cell-0
230+
```
231+
232+
Writes `cells=<JSON>` to `$GITHUB_OUTPUT` if running in GitHub Actions.
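
The cell names in the sample output can be split back into the values expected by `cancel location`. The following is a minimal sketch that assumes the layout `<environment>-<provider>-<region>-cell-<index>` inferred from the sample output, with no hyphens inside the environment or provider. That layout is an assumption, not a documented contract, and `parse_cell` is a hypothetical helper.

```bash
# Split a cell name such as "prod-aws-us-west-2-cell-1" into
# "environment provider region index". Assumes the layout
# <environment>-<provider>-<region>-cell-<index> (inferred, not documented).
parse_cell() {
  name=$1
  index=${name##*-cell-}      # text after the last "-cell-"
  prefix=${name%-cell-*}      # text before the last "-cell-"
  environment=${prefix%%-*}   # first hyphen-separated field
  rest=${prefix#*-}           # drop the environment
  provider=${rest%%-*}        # second field
  region=${rest#*-}           # remainder; may itself contain hyphens
  printf '%s %s %s %s\n' "$environment" "$provider" "$region" "$index"
}

parse_cell prod-aws-us-west-2-cell-1     # prints: prod aws us-west-2 1
parse_cell prod-gcp-us-central1-cell-0   # prints: prod gcp us-central1 0
```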
233+
165234
## Configuration
166235

167236
### Environment Variables
@@ -338,6 +407,92 @@ jobs:
338407

339408
This "breaking glass" mechanism allows you to maintain deployment velocity during critical incidents while keeping the safety of the queue for normal operations.
340409

410+
### Slack Notifications
411+
412+
The deploy queue action can send deployment notifications to a Slack channel, allowing teams to monitor deployments in real time.
413+
414+
#### Setup
415+
416+
To enable Slack notifications, you need:
417+
418+
1. **Slack Bot Token** - A bot token with the `chat:write` scope
419+
2. **Slack Channel ID** - The ID of the channel where notifications should be sent
420+
421+
Store these in your GitHub repository:
422+
- `SLACK_BOT_TOKEN` - Your Slack bot OAuth token, stored as a secret
423+
- `SLACK_CHANNEL_ID` - The target Slack channel ID (e.g., `C01234567`), stored as a variable
424+
425+
#### How It Works
426+
427+
The action can send three types of notifications:
428+
429+
1. **Start Notification** - Sent when a deployment begins. It includes:
430+
   - Component name and version
431+
   - Environment, cloud provider, region, and cell
432+
   - Deployment ID
433+
   - Link to the GitHub Actions job
434+
435+
2. **Finish Notification** - Sent when a deployment completes successfully.
436+
   - Posted in the thread of the start notification when `slack-start-message-id` is provided
437+
438+
3. **Cancel Notification** - Sent when a deployment is cancelled.
439+
   - Posted in the thread of the start notification when `slack-start-message-id` is provided
440+
441+
#### Basic Example with Slack Notifications
442+
443+
```yaml
444+
name: Deploy API with Slack Notifications
445+
on:
446+
  push:
447+
    branches: [main]
448+
449+
jobs:
450+
  deploy:
451+
    runs-on: ubuntu-latest
452+
    steps:
453+
      - name: Start deployment
454+
        id: deploy-queue-start
455+
        uses: neondatabase/dev-actions/deploy-queue@v1
456+
        with:
457+
          mode: start
458+
          environment: prod
459+
          cloud-provider: aws
460+
          region: us-west-2
461+
          cell-index: 1
462+
          component: api
463+
          version: ${{ github.sha }}
464+
          slack-channel-id: ${{ vars.SLACK_CHANNEL_ID }}
465+
          slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
466+
467+
      - name: Run actual deployment
468+
        run: |
469+
          # Your deployment commands here
470+
          echo "Deploying..."
471+
472+
      - name: Finish deployment
473+
        if: success()
474+
        uses: neondatabase/dev-actions/deploy-queue@v1
475+
        with:
476+
          mode: finish
477+
          deployment-id: ${{ steps.deploy-queue-start.outputs.deployment-id }}
478+
          slack-channel-id: ${{ vars.SLACK_CHANNEL_ID }}
479+
          slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
480+
          slack-start-message-id: ${{ steps.deploy-queue-start.outputs.slack-start-message-id }}
481+
482+
      - name: Cancel deployment
483+
        if: failure()
484+
        uses: neondatabase/dev-actions/deploy-queue@v1
485+
        with:
486+
          mode: cancel
487+
          deployment-id: ${{ steps.deploy-queue-start.outputs.deployment-id }}
488+
          cancellation-note: "Deployment failed"
489+
          slack-channel-id: ${{ vars.SLACK_CHANNEL_ID }}
490+
          slack-bot-token: ${{ secrets.SLACK_BOT_TOKEN }}
491+
          slack-start-message-id: ${{ steps.deploy-queue-start.outputs.slack-start-message-id }}
492+
```
493+
494+
Passing the `slack-start-message-id` output from the start step to the finish and cancel steps threads the notifications together in Slack, so the full lifecycle of a deployment can be followed in one conversation.
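
For background on how threading works on the Slack side: the `chat.postMessage` Web API method posts a reply into a thread when its `thread_ts` field is set to the timestamp of the parent message. The following is a minimal sketch of such a request body. It assumes that `slack-start-message-id` corresponds to the parent message timestamp, which this README does not confirm; `slack_payload` is a hypothetical helper and the timestamp is a made-up example value.

```bash
# Build a chat.postMessage JSON body. When a parent timestamp is given,
# the message becomes a reply in that thread.
# Sketch only: the text is not JSON-escaped, so it must not contain
# double quotes or backslashes.
slack_payload() {
  channel=$1
  text=$2
  thread_ts=${3:-}
  if [ -n "$thread_ts" ]; then
    printf '{"channel":"%s","text":"%s","thread_ts":"%s"}\n' \
      "$channel" "$text" "$thread_ts"
  else
    printf '{"channel":"%s","text":"%s"}\n' "$channel" "$text"
  fi
}

# Top-level message, then a reply threaded under an example timestamp.
slack_payload C01234567 "Deployment 42 started"
slack_payload C01234567 "Deployment 42 finished" 1700000000.000100
```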
495+
341496
## Development
342497

343498
### Running Tests
@@ -435,10 +590,29 @@ If compilation fails in CI without a database, ensure the `.sqlx/` directory is
435590
### Components
436591

437592
- **CLI (`src/cli.rs`)** - Command-line argument parsing using `clap`
438-
- **Library (`src/lib.rs`)** - Core logic for deployment management
439-
- **Queries (`queries/`)** - SQL queries for blocking detection
440-
- **Migrations (`migrations/`)** - Database schema versioning
441-
- **Tests (`tests/`)** - Integration and unit tests
593+
- **Library (`src/lib.rs`)** - Main entry point and orchestration logic
594+
- **Models (`src/model.rs`)** - Data structures for deployments, cells, and deployment states
595+
- **Handlers (`src/handler/`)** - Business logic for deployment operations
596+
  - `cancel.rs` - Cancel deployment operations (by ID, version, or location)
597+
  - `fetch.rs` - Database queries for fetching deployments, outliers, cells, etc.
598+
  - `list.rs` - List operations for outliers and cells
599+
  - `mod.rs` - Core handlers for enqueue, start, finish, and wait operations
600+
- **Utilities (`src/util/`)** - Helper modules
601+
  - `database.rs` - Database connection and migration management
602+
  - `duration.rs` - Duration formatting and calculations
603+
  - `github.rs` - GitHub Actions output integration
604+
- **Constants (`src/constants.rs`)** - Application constants (e.g., retry intervals)
605+
- **Queries (`queries/`)** - SQL queries for complex operations
606+
  - `blocking_deployments.sql` - Find deployments blocking a given deployment
607+
  - `active_outliers.sql` - Find deployments taking longer than expected
608+
  - `grafana_queries.sql` - Example queries for monitoring dashboards
609+
- **Migrations (`migrations/`)** - Database schema versioning
610+
- **Tests (`tests/`)** - Comprehensive test suite
611+
  - `integration_tests.rs` - End-to-end deployment workflow tests
612+
  - `blocking_deployments_tests.rs` - Blocking logic tests
613+
  - `deployment_analytics_tests.rs` - Duration analytics tests
614+
  - `outlier_detection_tests.rs` - Outlier detection tests
615+
  - `views_tests.rs` - Database view tests
442616

443617
## Queries
444618
