This document outlines the deployment standards and practices for all Bayat projects. Following these guidelines ensures consistent, reliable, and secure deployments across all environments and project types.
- Deployment Principles
- Environments
- Deployment Process
- Versioning
- Artifacts and Registries
- Configuration Management
- Deployment Automation
- Deployment Strategies
- Rollback Procedures
- Release Notes
- Post-Deployment Verification
- Environment-Specific Guidelines
- Security Considerations
- Maintenance Windows
- Monitoring and Alerting
All deployments at Bayat should adhere to the following core principles:
- Automation: Automate deployment processes to ensure consistency and reduce human error
- Repeatability: Deployment processes should produce the same results when repeated
- Traceability: Track what was deployed, when, by whom, and with what configuration
- Security: Include security checks and validations at each stage
- Testability: Verify deployments against defined acceptance criteria
- Isolation: Changes to one environment should not affect others
- Rollback capability: Every deployment should have a defined rollback plan
- Minimal downtime: Strive for zero-downtime deployments where possible
Every project must have at least the following environments:
| Environment | Purpose | Access | Deployment Frequency | Data Sensitivity |
|---|---|---|---|---|
| Development | Feature development and integration | Developers | Continuous | Sanitized/Fake |
| Testing/QA | Formal testing and validation | QA, Developers | After dev approval | Sanitized/Fake |
| Staging | Pre-production verification | Limited team members | After QA approval | Production-like sanitized |
| Production | Live customer-facing environment | Highly restricted | Scheduled releases | Real data |
Maintain high parity between environments:
- Use the same operating systems, versions, and configurations
- Use the same deployment mechanisms across all environments
- Scale differences should be in resource allocation, not architecture
- Document any necessary differences between environments
All projects must implement a deployment pipeline with the following stages:
- Build: Compile code, run static analysis, create deployment artifacts
- Test: Run automated tests (unit, integration, etc.)
- Scan: Perform security scanning, dependency vulnerability checks
- Publish: Push artifacts to registries (container, package, etc.)
- Deploy: Release to target environment
- Verify: Run post-deployment checks and smoke tests
- Monitor: Track system health and metrics after deployment
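The stages above can be sketched as a single pipeline; the following GitHub Actions workflow is illustrative only (job names and the `make` targets are assumptions, not a mandated layout):

```yaml
# .github/workflows/deploy.yml — illustrative mapping of the standard stages
name: deploy
on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build            # compile, run static analysis, create artifacts
  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test             # unit and integration tests
  scan:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make scan             # security and dependency vulnerability checks
  publish:
    needs: scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make publish          # push artifacts to the registry
  deploy:
    needs: publish
    runs-on: ubuntu-latest
    environment: development       # environment protection rules gate the deploy
    steps:
      - uses: actions/checkout@v4
      - run: make deploy           # release to the target environment
      - run: make verify           # post-deployment checks and smoke tests
```

Keeping each stage as a separate job makes failures visible at the stage level and lets environment protection rules enforce the approval requirements listed below.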
Define approval requirements for each environment:
- Development: Automated checks only
- Testing/QA: Lead developer approval
- Staging: QA approval and product owner sign-off
- Production: Management approval and change management process
Every deployment must pass the following checks:
- All automated tests pass
- Security scan shows no critical or high vulnerabilities
- Infrastructure validation tests pass
- Required approvals have been obtained
- Release documentation is complete
Use Semantic Versioning (SemVer) for all deployable artifacts:
- MAJOR.MINOR.PATCH (e.g., 2.3.1)
- Increment MAJOR version for incompatible API changes
- Increment MINOR version for backward-compatible new features
- Increment PATCH version for backward-compatible bug fixes
- Tag all releases in the version control system with the version number
- Store the current version information in a dedicated location in each environment
- Include version information in logs and monitoring data
- Make the version visible in the application's admin interface or API
Standardize on the following artifact types:
- Docker containers for applications
- NPM/PyPI packages for libraries
- Helm charts for Kubernetes deployments
- Terraform modules for infrastructure
- OS-specific packages (RPM, DEB) for system components
All artifact registries must:
- Require authentication for uploads and restricted downloads
- Support artifact signing and verification
- Maintain artifact retention policies
- Be backed up regularly
- Support versioning and immutability
Use the following registries:
- Container images: Harbor or AWS ECR
- NPM packages: GitHub Packages or JFrog Artifactory
- Python packages: JFrog Artifactory
- Helm charts: Harbor or JFrog Artifactory
- Infrastructure modules: GitLab or GitHub
Manage configuration using the following hierarchy:
- Default values: Hard-coded in application
- Configuration files: Version-controlled with the application
- Environment variables: For environment-specific settings
- Configuration service: For dynamic runtime configuration
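As an illustration of this hierarchy, a version-controlled file can hold defaults that environment variables override at deploy time (the file name and keys here are hypothetical):

```yaml
# config/default.yaml — version-controlled defaults, sitting above
# hard-coded values in the hierarchy; each key notes what overrides it
database:
  url: postgres://localhost:5432/app   # overridden by the DATABASE_URL env var
  pool_size: 10
logging:
  level: INFO                          # overridden by the LOG_LEVEL env var
features:
  refresh_interval_seconds: 30         # may be superseded by the configuration
                                       # service at runtime
```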
Handle sensitive information appropriately:
- Never commit secrets to version control
- Use a dedicated secrets management service (HashiCorp Vault, AWS Secrets Manager)
- Encrypt secrets at rest and in transit
- Rotate secrets regularly
- Use different secrets for each environment
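In Kubernetes, for example, a secret synced from the secrets management service can be injected as an environment variable rather than baked into the image or the manifest (names here are illustrative):

```yaml
# Pod spec fragment: the Secret object is populated from the secrets
# manager (e.g. by a Vault agent), never committed to version control
containers:
  - name: myapp
    image: bayat/myapp:1.2.3
    env:
      - name: DB_PASSWORD
        valueFrom:
          secretKeyRef:
            name: myapp-db     # a separate Secret per environment
            key: password
```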
Validate configurations before deployment:
- Verify required configuration is present
- Check types and formats
- Validate connections to external services
- Test with the configuration before promoting to the next environment
Standardize on the following CI/CD tools:
- GitHub Actions or GitLab CI for pipeline orchestration
- Terraform or AWS CloudFormation for infrastructure as code
- Ansible or Puppet for configuration management
- ArgoCD or Flux for GitOps-based deployments
Store pipeline configurations as code:
- Keep pipeline definitions in the same repository as application code
- Version pipeline changes alongside application changes
- Review pipeline changes through the same process as code changes
- Test pipeline changes in development environments first
Include the following tests in deployment pipelines:
- Unit tests
- Integration tests
- End-to-end tests
- Performance tests (for staging/production)
- Security scans
- Infrastructure validation tests
Choose the appropriate deployment strategy based on application requirements:
Blue-green deployment, for applications requiring minimal downtime at the cost of temporarily running two full environments:
- Deploy new version (green) alongside existing version (blue)
- Test the green environment
- Switch traffic from blue to green
- Verify operation
- Decommission blue environment
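In Kubernetes, one minimal way to implement the traffic switch is a Service whose selector is flipped from the blue to the green Deployment (the `slot` label is illustrative):

```yaml
# Both Deployments run side by side; cutover is a one-line selector edit
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    slot: green        # was "blue"; revert this value to roll traffic back
  ports:
    - port: 80
      targetPort: 8080
```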
Canary deployment, for gradual rollouts with early feedback:
- Deploy new version to a small subset of infrastructure (e.g., 5%)
- Route a percentage of traffic to the new version
- Monitor for issues
- Gradually increase traffic percentage
- Proceed to full deployment or rollback based on monitoring
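With Argo Rollouts, for instance, these steps can be expressed declaratively; the fragment below is a sketch, not a complete manifest:

```yaml
# Argo Rollouts canary strategy fragment: weights mirror the steps above
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  strategy:
    canary:
      steps:
        - setWeight: 5               # small initial subset of traffic
        - pause: {duration: 10m}     # monitor before widening
        - setWeight: 25
        - pause: {duration: 10m}
        - setWeight: 50
        - pause: {}                  # indefinite pause: manual judgment
                                     # before full rollout or abort
```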
Rolling deployment, for resource-efficient updates:
- Deploy new version to a subset of instances/pods
- Verify proper operation
- Continue deploying to more instances in batches
- Complete when all instances are updated
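In Kubernetes this maps directly onto the Deployment rolling-update strategy:

```yaml
# Deployment strategy fragment: replace pods in small batches
spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod created during the update
      maxUnavailable: 1    # at most one pod unavailable at any time
```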
Use feature flags for controlled feature releases:
- Deploy code with new features behind disabled flags
- Enable features selectively for specific users or environments
- Roll out features gradually by increasing the percentage of users
- Monitor feature performance and issues
- Enable for all users or roll back as needed
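A flag definition file might look like the following; the schema is hypothetical for illustration, since real systems (LaunchDarkly, Unleash, etc.) each have their own formats:

```yaml
# feature-flags.yaml — hypothetical schema for illustration
flags:
  new-checkout-flow:
    enabled: true
    rollout_percentage: 10        # increase gradually toward 100
    allow_groups: [internal-qa]   # always on for these groups
  legacy-report-export:
    enabled: false                # dark-launched: deployed but off for everyone
```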
Define clear criteria for initiating a rollback:
- Error rates exceed defined thresholds
- Response times exceed SLA limits
- Critical functionality is unavailable
- Security vulnerability is discovered
- Stakeholder decision based on business impact
Document and test rollback procedures for each application:
- Determine the need for rollback based on triggers
- Follow the application-specific rollback procedure:
  - Revert to the previous version
  - Restore the database if necessary
  - Reset configuration as needed
- Verify application functionality after rollback
- Communicate rollback to stakeholders
- Document the rollback incident
Regularly test rollback procedures:
- Include rollback tests in deployment pipelines
- Simulate failures and practice recovery
- Time rollback operations to ensure they meet SLA requirements
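For Helm-based deployments, the rollback procedure can be captured as a manually triggered pipeline job so it is rehearsed rather than improvised (the release name and `make` target are illustrative):

```yaml
# .github/workflows/rollback.yml — manually triggered rollback job
name: rollback
on:
  workflow_dispatch:
    inputs:
      revision:
        description: Helm revision to roll back to (0 = previous release)
        default: "0"

jobs:
  rollback:
    runs-on: ubuntu-latest
    environment: production          # same approval gate as a deploy
    steps:
      - uses: actions/checkout@v4
      - run: helm rollback myapp ${{ github.event.inputs.revision }} --wait
      - run: make smoke-test         # verify functionality after rollback
```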
Create comprehensive release notes for each production deployment:
- Summary: Brief overview of the release
- Version: Clear version identifier
- Date: Deployment date
- Features: New features and enhancements
- Bug fixes: Issues resolved
- Known issues: Outstanding problems
- Dependencies: External system dependencies
- Configuration changes: Any new or modified configuration
- Migration steps: Required actions for users or administrators
- Rollback plan: Specific rollback instructions for this release
Make release notes available through:
- Internal documentation system for all environments
- Customer-facing documentation for production releases
- Email notifications to stakeholders
- Release management system (Jira, Azure DevOps, etc.)
Run automated smoke tests immediately after deployment:
- Verify basic functionality
- Check critical paths and workflows
- Validate integrations with external systems
- Confirm metrics and logging are working
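A minimal smoke-test step might probe the health endpoint and one critical path; the URLs and token below are illustrative:

```yaml
# Post-deployment smoke-test job fragment (GitHub Actions syntax)
steps:
  - name: Health check
    run: curl --fail --max-time 10 https://myapp.bayat.io/health
  - name: Critical path
    run: |
      # an authenticated request against a core workflow must succeed
      curl --fail --max-time 10 \
        -H "Authorization: Bearer $SMOKE_TEST_TOKEN" \
        https://myapp.bayat.io/api/users/me
```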
Define and verify acceptance criteria for each deployment:
- Functional requirements are met
- Non-functional requirements (performance, security) are satisfied
- No regression in existing functionality
- Documentation is complete and accurate
For complex applications, use progressive exposure:
- Deploy to internal users first
- Expand to beta/early adopters
- Gradually increase to full user base
- Monitor each expansion phase
For Kubernetes deployments:
- Use Helm charts for application packaging
- Implement namespace isolation between environments
- Apply resource quotas and limits
- Use network policies to restrict traffic
- Store Kubernetes manifests in version control
- Follow GitOps practices using ArgoCD or Flux
Example Deployment Configuration:
```yaml
# Helm values.yaml for a standard web application
replicaCount: 3

image:
  repository: bayat/myapp
  tag: 1.2.3
  pullPolicy: IfNotPresent

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 200m
    memory: 256Mi

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: myapp.bayat.io
      paths:
        - path: /
          pathType: Prefix

securityContext:
  runAsUser: 1000
  runAsGroup: 3000
  fsGroup: 2000
  readOnlyRootFilesystem: true
```
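The network-policy guideline above can be met with a default-deny posture plus explicit allowances, for example:

```yaml
# Deny all ingress in the namespace except traffic from the ingress controller
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-only
  namespace: myapp-prod
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
```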
For serverless deployments:
- Use infrastructure as code (AWS SAM, Serverless Framework)
- Implement separate AWS accounts for each environment
- Set appropriate resource limits and concurrency controls
- Use API Gateway for routing and authorization
- Configure monitoring and alerting
- Version Lambda functions and API configurations
Example Serverless Configuration:
```yaml
# serverless.yml for a typical serverless application
service: bayat-service

provider:
  name: aws
  runtime: nodejs18.x
  region: us-west-2
  stage: ${opt:stage, 'dev'}
  environment:
    STAGE: ${self:provider.stage}
    LOG_LEVEL: ${self:custom.logLevels.${self:provider.stage}}
  iamRoleStatements:
    - Effect: Allow
      Action:
        - dynamodb:Query
        - dynamodb:GetItem
      Resource: !GetAtt MyTable.Arn

custom:
  logLevels:
    dev: DEBUG
    test: INFO
    staging: INFO
    prod: WARN

functions:
  api:
    handler: src/handlers/api.handler
    events:
      - http:
          path: /users
          method: get
          authorizer:
            type: COGNITO_USER_POOLS
            authorizerId: !Ref ApiGatewayAuthorizer
      - http:
          path: /users/{id}
          method: get

resources:
  Resources:
    MyTable:
      Type: AWS::DynamoDB::Table
      Properties:
        BillingMode: PAY_PER_REQUEST
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
```
For traditional VM/server deployments:
- Use infrastructure as code (Terraform, CloudFormation)
- Implement configuration management (Ansible, Puppet)
- Create standard machine images (AMIs, Vagrant boxes)
- Apply host-based security measures
- Set up proper backup and recovery procedures
- Document manual recovery steps
Example Terraform Configuration:
```hcl
# main.tf for a typical web server
provider "aws" {
  region = var.region
}

module "vpc" {
  source      = "./modules/vpc"
  environment = var.environment
}

module "security_groups" {
  source      = "./modules/security_groups"
  vpc_id      = module.vpc.vpc_id
  environment = var.environment
}

resource "aws_instance" "web" {
  count                  = var.instance_count
  ami                    = var.ami_id
  instance_type          = var.instance_type
  subnet_id              = module.vpc.public_subnets[count.index % length(module.vpc.public_subnets)]
  vpc_security_group_ids = [module.security_groups.web_sg_id]
  key_name               = var.key_name

  tags = {
    Name        = "web-${var.environment}-${count.index}"
    Environment = var.environment
    Project     = var.project_name
    ManagedBy   = "terraform"
  }

  root_block_device {
    volume_size = 50
    volume_type = "gp2"
    encrypted   = true
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_elb" "web" {
  name            = "web-${var.environment}-elb"
  subnets         = module.vpc.public_subnets
  security_groups = [module.security_groups.elb_sg_id]

  listener {
    instance_port      = 80
    instance_protocol  = "http"
    lb_port            = 443
    lb_protocol        = "https"
    ssl_certificate_id = var.ssl_cert_arn
  }

  health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 3
    target              = "HTTP:80/health"
    interval            = 30
  }

  instances                 = aws_instance.web[*].id
  cross_zone_load_balancing = true
  idle_timeout              = 400

  tags = {
    Name        = "web-${var.environment}-elb"
    Environment = var.environment
    Project     = var.project_name
    ManagedBy   = "terraform"
  }
}
```
Verify security before deployment:
- Run SAST (Static Application Security Testing)
- Perform dependency vulnerability scanning
- Review infrastructure configurations for security issues
- Validate IAM/RBAC configurations
- Check for secrets in code
Implement runtime security measures:
- Deploy web application firewalls (WAF)
- Enable runtime application security protection (RASP)
- Configure network security groups and access controls
- Implement API rate limiting
- Enable audit logging
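As one example, API rate limiting can be enforced at the ingress layer; with the NGINX ingress controller this is a pair of annotations (the values shown are illustrative):

```yaml
# Ingress fragment: NGINX ingress controller rate-limit annotations
metadata:
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "20"              # requests/second per client IP
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "3"  # short bursts up to 3x allowed
```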
For regulated environments:
- Document compliance requirements
- Include compliance checks in deployment pipelines
- Generate compliance artifacts for auditing
- Require security sign-off for production deployments
Define standard maintenance windows:
- Production: Weekly, during lowest traffic periods (e.g., Sundays 2-5 AM)
- Staging: Bi-weekly, during business hours with notification
- Testing/QA: As needed with team coordination
- Development: No formal window required
For each maintenance window:
- Announce maintenance period to stakeholders
- Prepare rollback plans
- Execute planned changes
- Verify functionality after changes
- Communicate completion of maintenance
- Document actions taken
For unscheduled urgent changes:
- Assess impact and urgency
- Obtain expedited approvals
- Communicate to critical stakeholders
- Implement changes with heightened monitoring
- Document incident and follow-up actions
Monitor deployments in progress:
- Track deployment progress and status
- Monitor system health during deployment
- Compare performance metrics before and after deployment
- Set up alerting for deployment failures
After deployment, monitor:
- Error rates and exceptions
- Response times and latency
- Resource utilization (CPU, memory, disk)
- Business metrics (transactions, user activity)
- Security events
Configure appropriate alerting:
- Define clear thresholds based on SLAs and normal behavior
- Set different alerting levels (info, warning, critical)
- Assign alerts to the right teams
- Implement alert aggregation to prevent alert fatigue
- Document response procedures for each alert type
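With Prometheus, for example, thresholds and severity levels translate into alerting rules such as the following (the metric names and the 5% threshold are illustrative):

```yaml
# Prometheus alerting rule: error-rate threshold with a severity label
groups:
  - name: myapp-deployment
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m                    # sustained breach, not a single spike
        labels:
          severity: critical        # routes the alert to the on-call team
        annotations:
          summary: "5xx error rate above 5% for 10 minutes"
```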