The SecureBootDashboard API has been enhanced with enterprise-grade performance and scalability features to support high-throughput scenarios of up to 5000 requests per second.
Purpose: Protect the API from being overwhelmed by too many requests and ensure fair resource distribution.
Configuration (appsettings.json):
"Performance": {
"RateLimiting": {
"Enabled": true,
"PermitLimit": 1000,
"WindowSeconds": 60,
"ConcurrencyLimit": 500,
"QueueLimit": 1000
}
}
Parameters:
- `Enabled`: Enable/disable rate limiting
- `PermitLimit`: Maximum requests allowed per time window (default: 1000 requests per 60 seconds)
- `WindowSeconds`: Time window in seconds for the sliding window limiter
- `ConcurrencyLimit`: Maximum number of concurrent requests being processed
- `QueueLimit`: Maximum number of requests that can be queued when limits are reached
Policies:
- `api`: Sliding window limiter for general API endpoints (1000 req/min)
- `concurrent`: Concurrency limiter for expensive operations (500 concurrent requests)
- `health`: Fixed window limiter for health checks (100 req/sec)
HTTP Response:
- Status Code: `429 Too Many Requests` when the limit is exceeded
- Clients should implement exponential backoff retry logic
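For illustration, policies of this shape can be registered with ASP.NET Core's built-in rate limiting middleware. The sketch below reuses the policy names and values from the configuration above; everything else (segment count, endpoint mapping) is an assumption, not the project's actual wiring.

```csharp
// Program.cs (sketch) - rate limiting policies matching the configuration above.
// Assumes .NET 7+ with Microsoft.AspNetCore.RateLimiting.
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Return 429 (instead of the default 503) when a limit is exceeded.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // "api": sliding window, 1000 requests per 60-second window.
    options.AddSlidingWindowLimiter("api", o =>
    {
        o.PermitLimit = 1000;
        o.Window = TimeSpan.FromSeconds(60);
        o.SegmentsPerWindow = 6;            // assumption: six 10-second segments
        o.QueueLimit = 1000;
        o.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });

    // "concurrent": at most 500 requests in flight for expensive operations.
    options.AddConcurrencyLimiter("concurrent", o =>
    {
        o.PermitLimit = 500;
        o.QueueLimit = 1000;
    });

    // "health": fixed window, 100 requests per second.
    options.AddFixedWindowLimiter("health", o =>
    {
        o.PermitLimit = 100;
        o.Window = TimeSpan.FromSeconds(1);
    });
});

var app = builder.Build();
app.UseRateLimiter();

// Policies are attached per endpoint, for example:
// app.MapControllers().RequireRateLimiting("api");
app.Run();
```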
Purpose: Cache responses for frequently accessed endpoints to reduce database load and improve response times.
Configuration (appsettings.json):
"Performance": {
"OutputCaching": {
"Enabled": true,
"DeviceListCacheDuration": 30,
"DeviceDetailsCacheDuration": 60,
"StatisticsCacheDuration": 30,
"UseRedis": false,
"RedisConnectionString": null
}
}
Parameters:
- `Enabled`: Enable/disable output caching
- `DeviceListCacheDuration`: Cache duration for the device list endpoint (in seconds)
- `DeviceDetailsCacheDuration`: Cache duration for the device details endpoint (in seconds)
- `StatisticsCacheDuration`: Cache duration for the statistics endpoint (in seconds)
- `UseRedis`: Use Redis for distributed caching across multiple instances
- `RedisConnectionString`: Connection string for Redis (required if `UseRedis` is true)
Cache Policies:
- `DeviceList`: Caches device list responses, varies by query parameters
- `DeviceDetails`: Caches individual device details, varies by device ID
- `Statistics`: Caches dashboard statistics
Cache Invalidation:
- Caches automatically expire based on configured durations
- POST operations (new reports) do not invalidate caches immediately
- For real-time updates, use SignalR instead of polling cached endpoints
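For illustration, cache policies of this shape can be defined with ASP.NET Core's built-in output caching. The sketch below uses the durations and policy names from the configuration above; the query parameter names and endpoint mapping are assumptions, not the project's actual code.

```csharp
// Program.cs (sketch) - output cache policies matching the configuration above.
// Assumes .NET 7+ built-in output caching.
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddOutputCache(options =>
{
    // DeviceList: 30 s, varied by query parameters (names here are hypothetical).
    options.AddPolicy("DeviceList", b => b
        .Expire(TimeSpan.FromSeconds(30))
        .SetVaryByQuery("page", "pageSize", "status"));

    // DeviceDetails: 60 s; the cache key already includes the request path,
    // so each device ID gets its own entry.
    options.AddPolicy("DeviceDetails", b => b
        .Expire(TimeSpan.FromSeconds(60)));

    // Statistics: 30 s, one shared entry.
    options.AddPolicy("Statistics", b => b
        .Expire(TimeSpan.FromSeconds(30)));
});

var app = builder.Build();
app.UseOutputCache();

// A policy is attached per endpoint, for example:
// app.MapGet("/api/Devices", GetDevices).CacheOutput("DeviceList");
app.Run();
```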
Purpose: Reduce bandwidth usage and improve response times by compressing API responses.
Configuration (appsettings.json):
"Performance": {
"Compression": {
"Enabled": true,
"Level": "Optimal"
}
}
Parameters:
- `Enabled`: Enable/disable response compression
- `Level`: Compression level: `Fastest`, `Optimal`, or `SmallestSize`
Supported Encodings:
- Brotli (preferred, better compression)
- Gzip (fallback)
Client Support:
- Modern browsers automatically support compression
- API clients should include the `Accept-Encoding: br, gzip` header
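For reference, a compression setup matching this configuration might look roughly like the sketch below, using the standard ASP.NET Core response compression middleware; how the `Level` setting is mapped onto the providers is an assumption here.

```csharp
// Program.cs (sketch) - Brotli preferred, Gzip as fallback.
// Assumes Microsoft.AspNetCore.ResponseCompression.
using System.IO.Compression;
using Microsoft.AspNetCore.ResponseCompression;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddResponseCompression(options =>
{
    // Compression over HTTPS carries the BREACH caveat noted under security considerations.
    options.EnableForHttps = true;
    options.Providers.Add<BrotliCompressionProvider>(); // preferred encoding
    options.Providers.Add<GzipCompressionProvider>();   // fallback encoding
});

// Map the "Level" setting (Fastest / Optimal / SmallestSize) to the providers.
builder.Services.Configure<BrotliCompressionProviderOptions>(o => o.Level = CompressionLevel.Optimal);
builder.Services.Configure<GzipCompressionProviderOptions>(o => o.Level = CompressionLevel.Optimal);

var app = builder.Build();
app.UseResponseCompression();
app.Run();
```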
Purpose: Optimize database connections for high-throughput scenarios.
Configuration (appsettings.json):
"Performance": {
"Database": {
"MaxPoolSize": 200,
"MinPoolSize": 10,
"CommandTimeout": 30,
"EnableQuerySplitting": true,
"EnableCompiledQueries": true
}
}
Parameters:
- `MaxPoolSize`: Maximum number of database connections in the pool (default: 200)
- `MinPoolSize`: Minimum number of database connections to maintain (default: 10)
- `CommandTimeout`: Command timeout in seconds (default: 30)
- `EnableQuerySplitting`: Split complex queries for better performance
- `EnableCompiledQueries`: Use compiled queries for frequently executed operations
Best Practices:
- Connection pool is shared across all requests
- Connections are automatically returned to the pool when disposed
- Monitor pool usage to avoid connection exhaustion
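As an illustration, these settings typically map onto the ADO.NET connection string and the EF Core provider options roughly as sketched below. `SecureBootDbContext`, the server name, and the database name are placeholders, not the project's actual values.

```csharp
// Program.cs (sketch) - applying the pool and query settings above.
// Assumes SQL Server + Entity Framework Core.
using Microsoft.EntityFrameworkCore;

var builder = WebApplication.CreateBuilder(args);

// MaxPoolSize / MinPoolSize are connection-string settings handled by ADO.NET pooling.
var connectionString =
    "Server=your-sql-server;Database=SecureBootDashboard;" +
    "Max Pool Size=200;Min Pool Size=10;Encrypt=True;";

builder.Services.AddDbContextPool<SecureBootDbContext>(options =>
    options.UseSqlServer(connectionString, sql =>
    {
        sql.CommandTimeout(30);                                            // CommandTimeout
        sql.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery);  // EnableQuerySplitting
    }));

// Hypothetical context type, shown only to make the sketch self-contained.
public class SecureBootDbContext : DbContext
{
    public SecureBootDbContext(DbContextOptions<SecureBootDbContext> options) : base(options) { }
}
```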
Purpose: Optimize real-time communication for high-load scenarios.
Configuration (in Program.cs):
builder.Services.AddSignalR(options =>
{
options.EnableDetailedErrors = false; // Production
options.KeepAliveInterval = TimeSpan.FromSeconds(10);
options.ClientTimeoutInterval = TimeSpan.FromMinutes(2);
options.HandshakeTimeout = TimeSpan.FromSeconds(30);
options.MaximumReceiveMessageSize = null; // Unlimited
});
Parameters:
- `KeepAliveInterval`: How often the server sends a ping to the client (10 seconds)
- `ClientTimeoutInterval`: Timeout before a client is considered disconnected (2 minutes)
- `HandshakeTimeout`: Timeout for the initial connection handshake (30 seconds)
The API provides enhanced health check endpoints with detailed status information:
Endpoint: GET /health
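How such an endpoint is typically registered is sketched below; this assumes ASP.NET Core health checks plus the EF Core check package, and `SecureBootDbContext` is a hypothetical context name. Producing the detailed JSON shown in the response format below usually requires a custom response writer, since the default writer returns only the plain-text overall status.

```csharp
// Program.cs (sketch) - /health endpoint with a "database" entry.
// Assumes Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore.
builder.Services.AddHealthChecks()
    .AddDbContextCheck<SecureBootDbContext>("database"); // hypothetical DbContext type

var app = builder.Build();
app.MapHealthChecks("/health"); // add a custom ResponseWriter for detailed JSON output
```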
Response Format:
{
"status": "Healthy",
"totalDuration": "00:00:00.0234567",
"entries": {
"database": {
"status": "Healthy",
"duration": "00:00:00.0123456"
}
}
}
Key metrics to monitor:
- Request Rate: Requests per second across all endpoints
- Response Time: P50, P95, P99 latency percentiles
- Error Rate: 4xx and 5xx response rates
- Cache Hit Ratio: Percentage of requests served from cache
- Database Connection Pool: Active connections, waiting requests
- Rate Limit Rejections: 429 response count
- SignalR Connections: Active WebSocket connections
With the configured settings:
- Maximum Throughput: 5000+ requests/second
- Average Response Time: < 100ms (cached), < 500ms (uncached)
- Concurrent Connections: 500 simultaneous requests
- Database Connections: 200 maximum pool size
Horizontal Scaling (Multiple Instances):
- Enable Redis for distributed caching:
"OutputCaching": { "UseRedis": true, "RedisConnectionString": "your-redis-connection-string" }
- Use Azure App Service scale-out or Kubernetes
- Configure sticky sessions for SignalR or use a Redis backplane (see the sketch after this list)
- Use Azure Load Balancer or Application Gateway
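For multi-instance deployments, the Redis backplane and distributed output cache mentioned above could be wired up roughly as follows. This is a sketch only: it assumes the Microsoft.AspNetCore.SignalR.StackExchangeRedis package and .NET 8's Microsoft.AspNetCore.OutputCaching.StackExchangeRedis package, and it reads the connection string from the OutputCaching configuration shown earlier.

```csharp
// Program.cs (sketch) - Redis for SignalR backplane and shared output cache.
var redisConnection =
    builder.Configuration["Performance:OutputCaching:RedisConnectionString"];

// SignalR backplane so hub messages reach clients connected to any instance.
builder.Services.AddSignalR(/* existing options */)
    .AddStackExchangeRedis(redisConnection);

// Distributed output cache shared by all instances (requires "UseRedis": true).
builder.Services.AddStackExchangeRedisOutputCache(options =>
{
    options.Configuration = redisConnection;
});
```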
Vertical Scaling (Larger Instance):
- Increase `MaxPoolSize` based on CPU cores (formula: cores × 2 + effective_spindle_count)
- Increase `ConcurrencyLimit` for higher throughput
- Allocate more memory for in-memory caching
Database Scaling:
- Use Azure SQL Database with appropriate service tier
- Enable read replicas for read-heavy workloads
- Implement database sharding for extreme scale
- Consider Azure SQL Hyperscale for unlimited storage
- Apache JMeter: Full-featured load testing
- k6: Modern load testing with JavaScript
- Azure Load Testing: Cloud-based load testing service
- wrk: High-performance HTTP benchmarking
import http from 'k6/http';
import { check, sleep } from 'k6';
export let options = {
stages: [
{ duration: '2m', target: 100 }, // Ramp up to 100 users
{ duration: '5m', target: 100 }, // Stay at 100 users
{ duration: '2m', target: 1000 }, // Ramp up to 1000 users
{ duration: '5m', target: 1000 }, // Stay at 1000 users
{ duration: '2m', target: 0 }, // Ramp down to 0 users
],
};
export default function () {
// Test device list endpoint
let response = http.get('https://your-api/api/Devices', {
headers: { 'Accept-Encoding': 'br, gzip' }
});
check(response, {
'status is 200': (r) => r.status === 200,
'response time < 500ms': (r) => r.timings.duration < 500,
});
sleep(1);
}
Target metrics for 5000 req/sec:
- P50 latency: < 50ms
- P95 latency: < 200ms
- P99 latency: < 500ms
- Error rate: < 0.1%
- Cache hit ratio: > 80%
Symptoms: Slow API responses, timeouts
Diagnosis:
- Check database query performance (enable query logging)
- Verify cache hit ratio
- Monitor database connection pool exhaustion
- Check for N+1 query problems
Solutions:
- Increase cache duration for stable data
- Optimize database queries
- Increase `MaxPoolSize` if the pool is exhausted
- Enable query splitting for complex queries (see the sketch after this list)
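For example, a read-heavy endpoint query could combine no-tracking and split queries along the lines of the sketch below; the entity and navigation names (`Device`, `Reports`, `Components`) are hypothetical.

```csharp
// Sketch: read-only query using AsNoTracking and split queries.
using Microsoft.EntityFrameworkCore;

async Task<List<Device>> GetDevicesAsync(SecureBootDbContext db, CancellationToken ct) =>
    await db.Devices
        .AsNoTracking()      // results are only serialized, never updated
        .Include(d => d.Reports)
        .Include(d => d.Components)
        .AsSplitQuery()      // one SQL statement per included collection instead of one large join
        .ToListAsync(ct);
```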
Symptoms: Clients receiving 429 responses
Diagnosis:
- Check rate limiter configuration
- Monitor request patterns
- Identify clients sending excessive requests
Solutions:
- Increase `PermitLimit` or `WindowSeconds`
- Implement client-side retry with exponential backoff (see the sketch after this list)
- Use separate rate limit policies for different client types
- Implement API key-based rate limiting
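On the client side, exponential backoff for 429 responses can be as simple as the sketch below; production clients would typically use a resilience library (for example Polly) and honor a Retry-After header when the server sends one.

```csharp
// Sketch: retry a GET with exponential backoff when the API answers 429.
using System.Net;

async Task<HttpResponseMessage> GetWithBackoffAsync(HttpClient client, string url, int maxAttempts = 5)
{
    for (var attempt = 0; ; attempt++)
    {
        var response = await client.GetAsync(url);
        if (response.StatusCode != HttpStatusCode.TooManyRequests || attempt >= maxAttempts - 1)
            return response;

        // Prefer the server's Retry-After hint if present; otherwise back off 1s, 2s, 4s, ...
        var delay = response.Headers.RetryAfter?.Delta
                    ?? TimeSpan.FromSeconds(Math.Pow(2, attempt));
        await Task.Delay(delay);
    }
}
```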
Symptoms: Stale data, outdated responses
Diagnosis:
- Check cache duration settings
- Verify cache invalidation strategy
- Monitor cache memory usage
Solutions:
- Reduce cache duration for frequently changing data
- Implement cache invalidation on data updates
- Use Redis for distributed caching across instances
- Add cache tags for selective invalidation
Symptoms: Connection timeout errors, slow responses
Diagnosis:
- Monitor active connections in pool
- Check for long-running queries
- Verify `using` statements dispose connections properly
Solutions:
- Increase `MaxPoolSize`
- Optimize long-running queries
- Implement connection retry logic
- Use asynchronous database operations
- Rate Limiting: Protects against DDoS and brute-force attacks
- Compression: Can be exploited for compression-based attacks (BREACH)
- Mitigation: Use HTTPS, avoid including secrets in compressed responses
- Caching: May expose sensitive data if not configured properly
- Mitigation: Don't cache responses with user-specific data
- Connection Pooling: Ensure proper connection string security
- Always Monitor: Set up Application Insights or similar monitoring
- Load Test Regularly: Test before major releases and traffic spikes
- Optimize Queries: Use AsNoTracking() for read-only operations
- Async All The Way: Use async/await throughout the stack
- Cache Wisely: Cache stable, expensive operations; invalidate when necessary
- Scale Horizontally: Design for multiple instances from the start
- Use CDN: Offload static assets and reduce API load
- Implement Circuit Breakers: Protect against cascading failures
- Log Appropriately: Balance between observability and performance
- Plan for Failures: Implement retry logic, graceful degradation
- Configure appropriate rate limits for your workload
- Set cache durations based on data freshness requirements
- Enable response compression in production
- Configure database connection pool for expected load
- Set up monitoring and alerting
- Perform load testing at expected peak load
- Configure auto-scaling rules
- Implement health check monitoring
- Set up distributed caching if using multiple instances
- Document performance SLAs and monitor compliance