This document outlines the standards and best practices for database design, development, and management at Bayat.
- Introduction
- Database Selection
- Data Modeling
- Naming Conventions
- SQL Standards
- NoSQL Standards
- Schema Management
- Performance Optimization
- Security
- Backup and Recovery
- Monitoring
- Data Access Patterns
Database systems are critical components of our applications. This document provides guidelines for designing, implementing, and maintaining database systems that are secure, performant, and maintainable.
These standards apply to:
- Relational databases (PostgreSQL, MySQL, SQL Server, etc.)
- NoSQL databases (MongoDB, DynamoDB, Cassandra, etc.)
- In-memory databases (Redis, Memcached, etc.)
- Time-series databases (InfluxDB, TimescaleDB, etc.)
- Graph databases (Neo4j, Amazon Neptune, etc.)
Choose the appropriate database type based on:
- Data structure: Structured, semi-structured, or unstructured
- Query patterns: Read-heavy, write-heavy, or balanced
- Scalability requirements: Vertical vs. horizontal scaling
- Consistency requirements: Strong vs. eventual consistency
- Availability requirements: High availability needs
- Transaction requirements: ACID compliance needs
Database Type | Best For | Examples | When to Use |
---|---|---|---|
Relational | Structured data with relationships | PostgreSQL, MySQL | Complex queries, transactions, data integrity |
Document | Semi-structured data | MongoDB, Couchbase | Flexible schema, hierarchical data |
Key-Value | Simple lookups | Redis, DynamoDB | Caching, session storage, high throughput |
Column-Family | Wide column data | Cassandra, HBase | Time-series, high write throughput |
Graph | Highly connected data | Neo4j, Neptune | Relationship-heavy data, path finding |
Time-Series | Time-ordered measurements | InfluxDB, TimescaleDB | Metrics, monitoring, IoT data |
Search | Full-text search | Elasticsearch, Solr | Text search, log analysis |
For new projects, prefer these databases unless specific requirements dictate otherwise:
- Relational: PostgreSQL
- Document: MongoDB
- Key-Value: Redis
- Search: Elasticsearch
- Time-Series: InfluxDB
- Graph: Neo4j
- Use appropriate normalization level (typically 3NF)
- Denormalize strategically for performance when necessary
- Document denormalization decisions
- Each table should represent a single entity or concept
- Use appropriate data types for columns
- Define primary keys for all tables
- Use foreign keys to enforce referential integrity
- Consider using UUIDs for primary keys in distributed systems
- Define clear one-to-many, many-to-many, and one-to-one relationships
- Use junction tables for many-to-many relationships
- Consider using foreign key constraints with appropriate actions:
- ON DELETE: CASCADE, SET NULL, or RESTRICT
- ON UPDATE: CASCADE or RESTRICT
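As an illustration of these guidelines, a minimal PostgreSQL-flavored sketch (table and column names are hypothetical) showing primary keys, a junction table for a many-to-many relationship, and foreign keys with explicit ON DELETE behavior:

```sql
-- Hypothetical schema: students and courses with a many-to-many relationship
CREATE TABLE student (
    id          BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    first_name  TEXT NOT NULL,
    last_name   TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE course (
    id    BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    title TEXT NOT NULL
);

-- Junction table resolving the many-to-many relationship
CREATE TABLE enrollment (
    student_id  BIGINT NOT NULL REFERENCES student (id) ON DELETE CASCADE,
    course_id   BIGINT NOT NULL REFERENCES course (id) ON DELETE RESTRICT,
    enrolled_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (student_id, course_id)
);
```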
- Design for query patterns, not for normalization
- Embed related data that is queried together
- Use references for large sub-documents or frequently changing data
- Consider document size limits
- Design indexes based on query patterns
- Use consistent key naming conventions
- Consider key distribution for sharded systems
- Design for efficient key lookups
- Use composite keys when appropriate
- Design column families based on access patterns
- Group related columns that are accessed together
- Consider time-to-live (TTL) for temporal data
- Design row keys for efficient data distribution and access
- Identify entities as nodes and relationships as edges
- Use properties on both nodes and edges
- Design for traversal efficiency
- Consider bidirectional relationships
- Use clear, descriptive names
- Be consistent across the database
- Avoid reserved words and special characters
- Use singular or plural consistently (prefer singular)
- Use lowercase with underscores for multi-word names in SQL databases
- Use camelCase for NoSQL databases that support it
Object Type | Convention | Example |
---|---|---|
Table | snake_case, singular | `user`, `order_item` |
Column | snake_case | `first_name`, `created_at` |
Primary Key | `id` or `{table_name}_id` | `id`, `user_id` |
Foreign Key | `{referenced_table}_id` | `user_id`, `product_id` |
Index | `idx_{table}_{columns}` | `idx_user_email`, `idx_order_created_at` |
Constraint | `{type}_{table}_{column}` | `pk_user_id`, `fk_order_user_id`, `uq_user_email` |
View | `vw_{description}` | `vw_active_users`, `vw_monthly_sales` |
Stored Procedure | `sp_{action}_{entity}` | `sp_get_user`, `sp_create_order` |
Function | `fn_{description}` | `fn_calculate_tax`, `fn_format_phone` |
Trigger | `trg_{table}_{event}` | `trg_user_after_insert`, `trg_order_before_update` |
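For example, the SQL naming conventions above applied to a pair of hypothetical tables (PostgreSQL syntax):

```sql
CREATE TABLE product (
    id         BIGINT GENERATED ALWAYS AS IDENTITY,
    name       TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    CONSTRAINT pk_product_id   PRIMARY KEY (id),
    CONSTRAINT uq_product_name UNIQUE (name)
);

CREATE TABLE order_item (
    id         BIGINT GENERATED ALWAYS AS IDENTITY,
    product_id BIGINT NOT NULL,
    quantity   INTEGER NOT NULL,
    CONSTRAINT pk_order_item_id         PRIMARY KEY (id),
    CONSTRAINT fk_order_item_product_id FOREIGN KEY (product_id) REFERENCES product (id)
);

CREATE INDEX idx_order_item_product_id ON order_item (product_id);

CREATE VIEW vw_product_order_totals AS
SELECT product_id, SUM(quantity) AS total_quantity
FROM order_item
GROUP BY product_id;
```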
Database Type | Collection/Table Convention | Field/Attribute Convention | Example |
---|---|---|---|
MongoDB | PascalCase, singular | camelCase | Collection: `User`, Field: `firstName` |
DynamoDB | PascalCase, singular | camelCase | Table: `Product`, Attribute: `productName` |
Cassandra | snake_case | snake_case | Table: `user_activity`, Column: `login_time` |
Redis | colon:separated keys | n/a | Key: `user:1000:session` |
- Use uppercase for SQL keywords
- Use table aliases for complex queries
- Format queries for readability
- Use explicit column names in SELECT statements (avoid SELECT *)
- Use parameterized queries to prevent SQL injection
```sql
-- Good example ("user" and "order" are quoted because they are reserved words)
SELECT
    u.id,
    u.first_name,
    u.last_name,
    o.order_date
FROM
    "user" u
JOIN
    "order" o ON u.id = o.user_id
WHERE
    u.status = 'active'
    AND o.order_date > ?
ORDER BY
    o.order_date DESC
LIMIT 10;
```
- Index columns used in WHERE, JOIN, and ORDER BY clauses
- Consider covering indexes for frequently used queries
- Avoid over-indexing (balance read vs. write performance)
- Monitor and maintain indexes regularly
- Consider partial indexes for filtered queries
- Use appropriate index types (B-tree, Hash, GiST, etc.)
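A few PostgreSQL-flavored sketches of these options (the `app_user` table and its columns are hypothetical; `INCLUDE` requires PostgreSQL 11+):

```sql
-- B-tree index on a column used in WHERE and JOIN clauses
CREATE INDEX idx_app_user_email ON app_user (email);

-- Covering index: frequently read columns are stored in the index itself
CREATE INDEX idx_app_user_status ON app_user (status) INCLUDE (first_name, last_name);

-- Partial index: only index the rows a common filtered query touches
CREATE INDEX idx_app_user_active_created_at
    ON app_user (created_at)
    WHERE status = 'active';
```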
- Keep transactions as short as possible
- Use appropriate isolation levels
- Handle deadlocks gracefully
- Consider using optimistic concurrency control
- Document transaction boundaries in code
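A minimal sketch of a short transaction with an explicit isolation level (PostgreSQL syntax; the `account` table is hypothetical):

```sql
-- Keep the transaction short: read, modify, commit, nothing else in between
BEGIN ISOLATION LEVEL REPEATABLE READ;

UPDATE account SET balance = balance - 100 WHERE id = 1;
UPDATE account SET balance = balance + 100 WHERE id = 2;

COMMIT;
-- On a serialization failure or deadlock, roll back and retry the whole transaction
```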
- Design documents for query efficiency
- Use consistent document structure
- Implement schema validation where supported
- Use appropriate indexing strategies
- Consider document versioning for schema evolution
- Use consistent key naming patterns
- Consider key expiration for temporal data
- Implement key namespacing to avoid collisions
- Document key structure and patterns
- Define appropriate retention policies
- Use tags/labels for efficient filtering
- Consider downsampling for historical data
- Optimize for time-range queries
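As one possible implementation, a TimescaleDB 2.x sketch (table name and retention interval are hypothetical):

```sql
-- Raw measurements; device_id acts as a tag for filtering
CREATE TABLE sensor_reading (
    time        TIMESTAMPTZ      NOT NULL,
    device_id   TEXT             NOT NULL,
    temperature DOUBLE PRECISION
);

SELECT create_hypertable('sensor_reading', 'time');

-- Retention policy: drop raw data older than 90 days
SELECT add_retention_policy('sensor_reading', INTERVAL '90 days');
```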
- Use incremental version numbers for schema changes
- Maintain a schema version history
- Document breaking vs. non-breaking changes
- Use database migration tools:
- SQL: Flyway, Liquibase, or framework-specific tools
- NoSQL: Custom migration scripts or framework tools
- Test migrations thoroughly before production
- Include rollback procedures for each migration
- Automate migration execution in CI/CD pipeline
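For example, a Flyway-style versioned migration is a plain SQL file whose name encodes the version (Flyway's default `V{version}__{description}.sql` pattern; the table and column are hypothetical):

```sql
-- V3__add_phone_number_to_app_user.sql
ALTER TABLE app_user
    ADD COLUMN phone_number VARCHAR(32);

-- Rollback (kept as an undo migration or a documented manual script):
-- ALTER TABLE app_user DROP COLUMN phone_number;
```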
- Prefer additive, non-breaking changes
- Implement multi-phase migrations for breaking changes:
- Add new structure
- Migrate data
- Update application code
- Remove old structure
- Coordinate schema changes with application deployments
- Consider database refactoring patterns
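A sketch of the multi-phase (expand/contract) approach for a breaking column change, assuming a hypothetical `app_user` table; each phase ships as its own migration and coordinated deployment:

```sql
-- Phase 1 (add new structure)
ALTER TABLE app_user ADD COLUMN full_name TEXT;

-- Phase 2 (migrate data): backfill the new column from the old ones
UPDATE app_user
SET    full_name = first_name || ' ' || last_name
WHERE  full_name IS NULL;

-- Phase 3 (update application code): deploy code that reads and writes only full_name

-- Phase 4 (remove old structure), once nothing references the old columns
ALTER TABLE app_user
    DROP COLUMN first_name,
    DROP COLUMN last_name;
```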
- Write efficient queries
- Use EXPLAIN/EXPLAIN ANALYZE to understand query plans
- Optimize JOIN operations
- Use appropriate filtering
- Implement pagination for large result sets
- Consider query caching where appropriate
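For example, on PostgreSQL the query plan can be inspected with EXPLAIN ANALYZE, and keyset pagination keeps large result sets cheap (table and column names are hypothetical):

```sql
-- Inspect the plan and actual execution time of a query
EXPLAIN ANALYZE
SELECT o.id, o.order_date, o.total
FROM   customer_order o
WHERE  o.customer_id = 42
ORDER BY o.order_date DESC
LIMIT  20;

-- Keyset pagination: fetch the next page relative to the last row already returned,
-- avoiding the growing cost of large OFFSET values
SELECT o.id, o.order_date, o.total
FROM   customer_order o
WHERE  o.customer_id = 42
  AND  o.order_date < '2025-01-01'  -- last order_date from the previous page
ORDER BY o.order_date DESC
LIMIT  20;
```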
- Index based on query patterns
- Monitor index usage and performance
- Rebuild/reorganize indexes regularly
- Consider specialized indexes (spatial, full-text, etc.)
- Use connection pooling
- Monitor connection usage
- Set appropriate connection timeouts
- Handle connection errors gracefully
- Implement appropriate caching layers:
- Application-level cache
- Database query cache
- Result cache
- Define cache invalidation strategy
- Monitor cache hit rates
- Use strong authentication mechanisms
- Implement principle of least privilege
- Create application-specific database users
- Avoid using root/admin accounts in applications
- Implement row-level security where appropriate (see the sketch below)
- Encrypt sensitive data at rest
- Use TLS/SSL for data in transit
- Implement column-level encryption for sensitive fields
- Consider using data masking for non-production environments
- Document database access requirements
- Regularly audit database permissions
- Implement role-based access control
- Use database proxies or connection poolers with authentication
- Enable database audit logging
- Track schema changes
- Monitor for suspicious activities
- Implement compliance-required auditing
- Use appropriate data masking and anonymization for non-production data
- Encrypt sensitive data at rest and in transit
- Follow the Data Protection Standards
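A PostgreSQL sketch of a least-privilege application role combined with row-level security (the role, table, and setting names are hypothetical, and the example assumes the application sets `app.tenant_id` on each connection):

```sql
-- Application-specific role with only the permissions it needs
CREATE ROLE orders_app LOGIN PASSWORD 'replace-with-a-managed-secret';
GRANT SELECT, INSERT, UPDATE ON customer_order TO orders_app;

-- Row-level security: each connection only sees rows for its own tenant
ALTER TABLE customer_order ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON customer_order
    USING (tenant_id = current_setting('app.tenant_id')::BIGINT);
```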
- Implement regular automated backups
- Use point-in-time recovery capabilities
- Test backup restoration regularly
- Store backups securely and redundantly
- Document backup procedures
- Define recovery time objectives (RTO)
- Define recovery point objectives (RPO)
- Document step-by-step recovery procedures
- Test disaster recovery scenarios regularly
- Implement automated recovery where possible
- Monitor key database metrics:
- Query performance
- Resource utilization (CPU, memory, disk, network)
- Connection counts
- Lock contention
- Cache hit rates
- Set up alerting for performance thresholds
- Implement query performance logging
- Monitor database availability
- Check replication status
- Verify backup success
- Monitor disk space usage
- Set up proactive alerts
- Use database-specific monitoring tools
- Implement centralized monitoring dashboards
- Retain historical performance data
- Document common performance patterns
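As one example, PostgreSQL's pg_stat_statements extension (assuming it is installed and preloaded; the column names shown are the PostgreSQL 13+ ones) exposes per-query statistics that can feed dashboards and alerts:

```sql
-- Top 10 queries by total execution time
SELECT query,
       calls,
       total_exec_time,
       mean_exec_time,
       rows
FROM   pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT  10;
```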
- Use consistent ORM frameworks
- Understand ORM-generated SQL
- Avoid N+1 query problems
- Use eager loading appropriately
- Consider performance implications of ORM abstractions
- Define clear data ownership boundaries
- Consider data duplication vs. shared database tradeoffs
- Implement appropriate consistency patterns
- Document cross-service data dependencies
- Implement data access through APIs when appropriate
- Define clear data access contracts
- Consider GraphQL for flexible data querying
- Implement proper API versioning for schema changes
- Document database technology choices
- Define clear boundaries between different database systems
- Implement appropriate synchronization mechanisms
- Consider eventual consistency implications
Version | Date | Description |
---|---|---|
1.0 | 2025-03-20 | Initial version |