Commit 8473848

Refactor README to enhance clarity and organization of diagnostic scripts and usage examples
1 parent 7ad0609 commit 8473848

1 file changed

Lines changed: 81 additions & 266 deletions

README.md

Original file line number | Diff line number | Diff line change
@@ -1,288 +1,103 @@
1-
# pgtools
2-
3-
A collection of SQL scripts and utilities for monitoring, troubleshooting, and maintaining PostgreSQL databases.
4-
5-
## 👋 New to pgtools?
6-
7-
**[👉 Get Started Here - Complete Beginner's Guide](GETTING-STARTED.md)**
8-
9-
Perfect for new users! This comprehensive guide walks you through installation, first steps, essential workflows, and automation setup.
10-
11-
## 📋 Table of Contents
12-
13-
- [Overview](#overview)
14-
- [Script Categories](#script-categories)
15-
- [Usage Examples](#usage-examples)
16-
- [Contributing](#contributing)
17-
- [License](#license)
18-
19-
### Quick Links
20-
21-
This toolkit provides battle-tested SQL scripts for PostgreSQL database administrators and developers to:
22-
- Monitor database health and performance
23-
- Troubleshoot common issues
24-
- Maintain database integrity
25-
- Optimize query performance
26-
- Manage replication and WAL files
27-
## Script Categories
28-
### 🔍 Monitoring Scripts
29-
**bloating.sql**
30-
- Detects table and index bloat
31-
- Shows dead tuples and wasted space
32-
- Helps identify tables needing VACUUM
33-
34-
**buffer_troubleshoot.sql**
35-
- Analyzes shared buffer usage
36-
- Shows buffer cache hit ratios
37-
- Identifies tables with poor caching
38-
39-
**locks.sql**
40-
- Lists current locks in the database
41-
- Shows lock types and waiting queries
42-
- Essential for deadlock investigation
43-
44-
**postgres_locking_blocking.sql**
45-
- Advanced lock analysis
46-
- Shows blocking and blocked queries
47-
- Includes query details and wait times
48-
49-
**replication.sql**
50-
- Monitors replication lag
51-
- Shows replication slot status
52-
- Checks standby server health
53-
54-
**txid.sql**
55-
- Displays transaction ID information
56-
- Monitors transaction wraparound risk
57-
- Shows age of databases and tables
58-
59-
**connection_pools.sql**
60-
- Monitors connection pooling health and efficiency
61-
- Analyzes connection patterns and potential leaks
62-
- Provides connection pool optimization recommendations
63-
- Works with PgBouncer, Pgpool-II, and native connections
64-
65-
### 🔧 Maintenance Scripts
66-
**switch_pg_wal_file.sql**
67-
- Forces WAL file switching
68-
- Useful for archiving and backup operations
69-
- Requires superuser privileges
70-
71-
**walfile_in_use.sql**
72-
- Shows currently active WAL files
73-
- Displays WAL file location and size
74-
- Helps troubleshoot disk space issues
75-
76-
**Transaction Wraparound**
77-
- Scripts for monitoring and preventing transaction ID wraparound
78-
- Critical for database availability
79-
80-
### 🤖 Maintenance Automation
81-
**auto_maintenance.sh**
82-
- Comprehensive automated maintenance operations (VACUUM, ANALYZE, REINDEX)
83-
- Intelligent threshold-based maintenance with configurable parameters
84-
- Parallel processing with safety controls and dry-run mode
85-
- Large table detection and resource management
86-
87-
**maintenance_scheduler.sql**
88-
- Analysis and scheduling recommendations for maintenance operations
89-
- VACUUM/ANALYZE candidate identification with workload estimation
90-
- Index bloat analysis and autovacuum effectiveness assessment
91-
- Maintenance planning and resource optimization
92-
93-
**statistics_collector.sql**
94-
- Table and index statistics analysis and optimization
95-
- Statistics quality assessment and freshness analysis
96-
- Column distribution analysis with optimization recommendations
97-
- Extended statistics support for PostgreSQL 15+
98-
99-
### 👤 Administration Scripts
100-
**extensions.sql**
101-
- Lists installed PostgreSQL extensions
102-
- Shows extension versions and schemas
103-
- Helps audit database capabilities
104-
105-
**table_ownership.sql**
106-
- Shows table ownership information
107-
- Useful for permission audits
108-
- Helps with database migrations
109-
110-
**ForeignConst.sql**
111-
- Lists foreign key constraints
112-
- Shows constraint details and relationships
113-
- Aids in schema documentation
114-
115-
**NonHypertables.sql**
116-
- Identifies non-hypertables (TimescaleDB specific)
117-
- Useful for TimescaleDB users
118-
- Helps in migration planning
119-
120-
**partition_management.sql**
121-
- Monitors partition health and performance
122-
- Analyzes partition size distribution and balance
123-
- Provides partition maintenance recommendations
124-
- Supports automated partition management strategies
125-
126-
### ⚡ Optimization Scripts
127-
**hot_update_optimization_checklist.sql**
128-
- Checks HOT (Heap-Only Tuple) update optimization
129-
- Identifies inefficient table structures
130-
- Suggests fillfactor adjustments
131-
132-
**missing_indexes.sql**
133-
- Identifies potentially beneficial indexes based on query patterns
134-
- Analyzes sequential scan activity and unused indexes
135-
- Detects foreign key columns missing indexes
136-
- Provides index optimization recommendations
137-
138-
### 📦 Backup & Recovery Scripts
139-
**backup_validation.sql**
140-
- Validates backup completeness and integrity
141-
- Checks WAL archiving status and health
142-
- Analyzes backup readiness and configuration
143-
- Provides backup strategy recommendations
144-
145-
### 🔒 Security Scripts
146-
**permission_audit.sql**
147-
- Comprehensive security audit of roles and permissions
148-
- Identifies overprivileged accounts and security risks
149-
- Analyzes database, schema, and table-level access
150-
- Reviews authentication and Row Level Security (RLS)
151-
152-
### ⚡ Performance Analysis Scripts
153-
**wait_event_analysis.sql**
154-
- Comprehensive analysis of PostgreSQL wait events and performance bottlenecks
155-
- Identifies I/O, locking, and resource contention issues
156-
- Provides detailed wait event categorization and recommendations
157-
- Analyzes connection pooling and background worker efficiency
158-
159-
**query_performance_profiler.sql**
160-
- Detailed query performance analysis using pg_stat_statements
161-
- Identifies slow queries, I/O intensive operations, and resource usage
162-
- Analyzes query variance and performance degradation patterns
163-
- Provides optimization recommendations for query tuning
164-
165-
**resource_monitoring.sql**
166-
- Comprehensive system resource utilization monitoring
167-
- Analyzes memory, I/O, connection, and storage usage patterns
168-
- Monitors autovacuum activity and maintenance requirements
169-
- Provides resource optimization recommendations
170-
171-
### ⚙️ Configuration Management Scripts
172-
**configuration_analysis.sql**
173-
- Comprehensive PostgreSQL configuration analysis and recommendations
174-
- Reviews memory, connection, WAL, and security settings
175-
- Analyzes current parameters against best practices
176-
- Provides workload-specific tuning suggestions
177-
178-
**parameter_tuner.sh** (automation/configuration/)
179-
- Interactive PostgreSQL parameter tuning assistant
180-
- Generates optimized configurations for different workload types (OLTP, OLAP, Web)
181-
- Provides memory and performance setting recommendations
182-
- Supports configuration validation and analysis modes
183-
184-
### 🔗 Integration Tools
185-
**grafana_dashboard_generator.sh** (integration/)
186-
- Generates comprehensive Grafana dashboards for PostgreSQL monitoring
187-
- Supports multiple dashboard types: comprehensive, performance, security, connections
188-
- Provides direct Grafana API integration for automatic dashboard deployment
189-
- Creates customizable monitoring visualizations
190-
191-
**prometheus_exporter.sh** (integration/)
192-
- Custom PostgreSQL metrics exporter for Prometheus
193-
- Exports database statistics, connection metrics, and performance data
194-
- Supports daemon mode for continuous metrics collection
195-
- Provides HTTP endpoint for Prometheus scraping
196-
197-
### 🩺 Troubleshooting Scripts
198-
**postgres_troubleshooting_queries.sql**
199-
- Collection of diagnostic queries
200-
- Quick health checks
201-
- Performance analysis queries
202-
203-
**postgres_troubleshooting_query_pack_01.sql**
204-
- First pack of troubleshooting queries
205-
- Focuses on basic diagnostics
206-
207-
**postgres_troubleshooting_query_pack_02.sql**
208-
- Second pack of troubleshooting queries
209-
- Intermediate level diagnostics
210-
211-
**postgres_troubleshooting_query_pack_03.sql**
212-
- Third pack of troubleshooting queries
213-
- Advanced diagnostics
214-
215-
**postgres_troubleshooting_cheat_sheet.txt**
216-
- Quick reference guide
217-
- Common commands and queries
218-
- Best practices and tips
219-
220-
## Usage Examples
221-
### Check for blocking queries
222-
```bash
223-
psql -U postgres -d mydb -f monitoring/postgres_locking_blocking.sql
224-
```
225-
### Monitor replication lag
226-
```bash
227-
psql -U postgres -d mydb -f monitoring/replication.sql
228-
```
229-
### Identify bloated tables
1+
# pgtools: The First Responder's Toolbelt for PostgreSQL
2+
3+
`pgtools` is a curated collection of safe, read-only diagnostic scripts for PostgreSQL and TimescaleDB, wrapped in a simple command-line interface. It is designed for Support Engineers, DBAs, and developers who need to triage production database issues quickly and without causing harm.
4+
5+
Every script is executed with strict, short timeouts to ensure that diagnostic queries never impact a heavily loaded system.
6+
7+
## Core Principles
8+
9+
- **Zero-Harm Policy**: Every script is read-only and executed with a 5-second `statement_timeout` and 3-second `lock_timeout`.
10+
- **No Dependencies**: The toolbelt relies only on `bash` and `psql`. No Python, Go, or other complex dependencies are required.
11+
- **Ticket-Ready Output**: All output is formatted for easy copy-pasting into Zendesk, Jira, or Markdown documents.
12+
- **Community-Driven**: Built for general PostgreSQL users, with specialized diagnostics for TimescaleDB.
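
One way the timeouts in the Zero-Harm Policy could be applied is through libpq's `PGOPTIONS` environment variable, which passes `-c name=value` settings to the server when the session starts. The sketch below is an assumption about how a wrapper might do this, not the actual contents of `pgtools.sh`. The function name `build_pgoptions` and the extra `default_transaction_read_only` setting are illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: not taken from pgtools.sh.
# Builds a PGOPTIONS string that sets session-level guards before any query runs.
build_pgoptions() {
  local statement_timeout="${1:-5s}"   # the README states 5 seconds
  local lock_timeout="${2:-3s}"        # the README states 3 seconds
  # "--" stops printf from treating the leading "-c" as an option.
  printf -- '-c statement_timeout=%s -c lock_timeout=%s -c default_transaction_read_only=on' \
    "$statement_timeout" "$lock_timeout"
}

# Example use (requires psql and a reachable database; left commented out here):
# PGOPTIONS="$(build_pgoptions)" psql "service=my_customer_db" -f some_script.sql
```

Setting the guards at connection time means they apply to every statement in the session, so an individual script cannot forget to set them. Note that `default_transaction_read_only` is only a default and a session can still override it.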
13+
14+
## Getting Started
15+
16+
### Installation
17+
23018
```bash
231-
psql -U postgres -d mydb -f monitoring/bloating.sql
19+
# 1. Clone the repository
20+
git clone https://github.com/thepostgresguy/pgtools.git
21+
cd pgtools
22+
23+
# 2. Make the wrapper script executable
24+
chmod +x pgtools.sh
23225
```
233-
### Check transaction wraparound risk
26+
27+
### Usage
28+
29+
All commands are run through the `pgtools.sh` wrapper.
30+
23431
```bash
235-
psql -U postgres -d mydb -f monitoring/txid.sql
32+
./pgtools.sh <command> "<connection_string>"
23633
```
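
A wrapper with this calling convention could map each command name to a SQL file and pass the connection string straight to `psql`. The following is a minimal sketch under that assumption. The `sql/` directory layout and the `resolve_script` function are hypothetical and are not taken from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of command dispatch; the sql/ layout is an assumption.
resolve_script() {
  local command="$1"
  case "$command" in
    # Accept only simple names so a command can never escape the sql/ directory.
    ''|*[!a-z0-9-]*) echo "invalid command: $command" >&2; return 1 ;;
    *) printf 'sql/%s.sql\n' "$command" ;;
  esac
}

# Example use (requires psql; left commented out here):
# script="$(resolve_script locks)" && psql "$conn_string" -X -v ON_ERROR_STOP=1 -f "$script"
```

Rejecting anything other than lowercase letters, digits, and hyphens keeps the lookup predictable when a command name is pasted from a ticket.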
237-
### Validate backup readiness
34+
35+
**Example: Check for blocking locks**
23836
```bash
239-
psql -U postgres -d mydb -f backup/backup_validation.sql
37+
./pgtools.sh locks "postgresql://user:pass@host:port/dbname"
24038
```
241-
### Analyze connection pooling efficiency
39+
40+
**Example: Check TimescaleDB chunk stats using a service name**
24241
```bash
243-
psql -U postgres -d mydb -f monitoring/connection_pools.sql
42+
./pgtools.sh chunk-stats "service=my_customer_db"
24443
```
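
The `service=` form relies on libpq's connection service file, which keeps connection details out of the command line. As an illustration, the sketch below writes a throwaway service file and points `PGSERVICEFILE` at it. The host, port, database, and user values are placeholders invented for this example.

```bash
#!/usr/bin/env bash
# Illustrative only: all connection values below are placeholders.
service_file="$(mktemp)"
cat > "$service_file" <<'EOF'
[my_customer_db]
host=db.example.com
port=5432
dbname=appdb
user=readonly_user
EOF

# libpq reads this variable to locate the service file
# (the per-user default is ~/.pg_service.conf).
export PGSERVICEFILE="$service_file"

# Example use (requires a reachable database; left commented out here):
# ./pgtools.sh chunk-stats "service=my_customer_db"
```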
24544

246-
### Automation / HOT report verification
247-
```bash
248-
# Quick automation sanity check (connection, syntax, permissions)
249-
./automation/test_pgtools.sh --fast
45+
## Available Commands
25046

251-
# Full automation suite with integration tests
252-
./automation/test_pgtools.sh --full --verbose
47+
Run `./pgtools.sh` with no arguments to see the full list of commands.
25348

254-
# HOT checklist JSON validation
255-
./automation/run_hot_update_report.sh --format json --database my_database --stdout
49+
### General Diagnostics
25650

257-
# HOT checklist text validation
258-
./automation/run_hot_update_report.sh --format text --database my_database --stdout
51+
* `locks`: Show current lock contention and blocking queries.
52+
* `activity`: Display current query activity from `pg_stat_activity`.
53+
* `top-queries`: Show most time-consuming queries (requires `pg_stat_statements`).
54+
* `bloat`: Identify table and index bloat.
55+
* `replication`: Monitor replication lag and status.
56+
* `disk-usage`: Show disk usage by table and index.
57+
* `cache-hit`: Show table and index cache hit rates.
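
To give a sense of what a read-only diagnostic behind a command such as `locks` might contain, the sketch below prints a query that pairs each blocked session with the sessions blocking it, using PostgreSQL's built-in `pg_blocking_pids()` function and the `pg_stat_activity` view. This is an assumed example rather than the repository's actual script, and `blocking_locks_sql` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a read-only query listing blocked/blocking session pairs.
blocking_locks_sql() {
  cat <<'EOF'
SELECT blocked.pid    AS blocked_pid,
       blocking.pid   AS blocking_pid,
       blocked.query  AS blocked_query,
       blocking.query AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid));
EOF
}

# Example use (requires psql and a reachable database; left commented out here):
# blocking_locks_sql | psql "$conn_string" -X
```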
25958

260-
# Full local pre-commit bundle
261-
./scripts/precommit_checks.sh --database my_database
262-
```
59+
### TimescaleDB Diagnostics
60+
61+
* `chunk-stats`: Show chunk count and size per hypertable.
62+
* `compression-stats`: Show compression ratio and job status per hypertable.
63+
* `cagg-stats`: Show continuous aggregate health and refresh policy status.
64+
* `job-errors`: Show recent errors from background jobs.
65+
* `uncompressed-chunks`: Show chunks that are old but not compressed.
66+
67+
### Administration
26368

264-
## Script Categories
69+
* `permissions`: Audit user and role permissions.
70+
* `ownership`: Display table and object ownership.
26571

266-
- **Monitoring** - Database health, locks, replication, bloating
267-
- **Maintenance** - VACUUM, ANALYZE, statistics collection
268-
- **Automation** - Health checks, scheduling, alerting
269-
- **Administration** - Extensions, ownership, constraints, partitions
270-
- **Optimization** - Index recommendations, HOT updates, missing indexes
271-
- **Performance** - Query profiling, wait events, resource monitoring
272-
- **Security** - Permission audits, compliance checks
273-
- **Troubleshooting** - Diagnostic queries and cheat sheets
274-
- **Backup & Recovery** - Backup validation and integrity checks
275-
- **Configuration** - Parameter tuning and analysis
276-
- **Integration** - Grafana dashboards, Prometheus exporters
72+
## Incident Response Workflow Example
73+
74+
A customer reports "the database is slow." Here's a typical triage flow using `pgtools`:
75+
76+
1. **Check for blocking locks.** Lock contention is a common cause of a sudden slowdown.
77+
```bash
78+
./pgtools.sh locks "<conn_string>"
79+
```
80+
81+
2. **Check current activity.** See what queries are actively running or waiting.
82+
```bash
83+
./pgtools.sh activity "<conn_string>"
84+
```
85+
86+
3. **Check top queries.** If `pg_stat_statements` is enabled, find which queries have consumed the most cumulative execution time.
87+
```bash
88+
./pgtools.sh top-queries "<conn_string>"
89+
```
90+
91+
4. **Check cache hit rate.** A low hit rate suggests that reads are going to disk rather than being served from shared buffers, which can indicate an I/O bottleneck.
92+
```bash
93+
./pgtools.sh cache-hit "<conn_string>"
94+
```
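
To make the cache hit rate in step 4 concrete: it is commonly computed as the number of blocks found in shared buffers divided by the total number of blocks requested. The sketch below does that arithmetic on two counters, such as the summed `heap_blks_hit` and `heap_blks_read` columns of `pg_statio_user_tables`. Whether the `cache-hit` command uses exactly this formula is an assumption, and `cache_hit_ratio` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: percentage of block requests served from shared buffers.
cache_hit_ratio() {
  local hits="$1" reads="$2"
  awk -v h="$hits" -v r="$reads" 'BEGIN {
    total = h + r
    if (total == 0) { print "n/a"; exit }   # no block requests recorded yet
    printf "%.2f\n", 100 * h / total
  }'
}

cache_hit_ratio 990 10   # prints 99.00
```

With 990 buffer hits and 10 disk reads, 990 of 1000 requests were served from memory, which gives 99.00 percent.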
27795

27896
## Contributing
27997

280-
Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
98+
Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines on how to add new diagnostic scripts.
28199

282100
## License
283101

284-
See [LICENSE](LICENSE) file for details.
285-
286-
## Support
102+
This project is licensed under the MIT License - see the LICENSE file for details.
287103

288-
For issues, questions, or contributions, please open an issue in the repository.
