Connection Health Monitoring

This example demonstrates connection health monitoring in Python libp2p: connection health tracking, proactive monitoring, health-aware load balancing, and metrics collection.

Overview

Note

The code snippets throughout this document are excerpts demonstrating key concepts. They cannot be copy-pasted and run directly as they require an async context. For complete, runnable examples, see examples/health_monitoring/basic_example.py.

Connection health monitoring enhances the existing multiple connections per peer support by adding:

Health Metrics Tracking: Latency, success rates, stream counts, and more
Proactive Health Checks: Periodic monitoring and automatic connection replacement
Health-Aware Load Balancing: Route traffic to the healthiest connections
Automatic Recovery: Replace unhealthy connections automatically

Basic Setup

To enable connection health monitoring, configure the ConnectionConfig with health monitoring parameters and pass it to new_host(). The following is a snippet; the full runnable script is in examples/health_monitoring/basic_example.py.

from libp2p import new_host
from libp2p.network.config import ConnectionConfig
from libp2p.crypto.rsa import create_new_key_pair

# Enable health monitoring
connection_config = ConnectionConfig(
    enable_health_monitoring=True,
    health_check_interval=30.0,  # Check every 30 seconds
    ping_timeout=3.0,  # 3 second ping timeout
    min_health_threshold=0.4,  # Minimum health score
    min_connections_per_peer=2,  # Maintain at least 2 connections
    load_balancing_strategy="health_based"  # Use health-based selection
)

# Create host with health monitoring
host = new_host(
    key_pair=create_new_key_pair(),
    connection_config=connection_config
)

Configuration Options

Health Monitoring Settings

enable_health_monitoring: Enable/disable health monitoring (default: False)
health_check_interval: Interval between health checks in seconds (default: 60.0)
ping_timeout: Timeout for ping operations in seconds (default: 5.0)
min_health_threshold: Minimum health score (0.0-1.0) for connections (default: 0.3)
min_connections_per_peer: Minimum connections to maintain per peer (default: 1)
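As a rough illustration of how min_health_threshold and min_connections_per_peer might interact, the sketch below decides which connections to replace for one peer. It is not the library's implementation: plan_replacements and the score dictionary are hypothetical names, and the rule itself (close every connection below the threshold, then open enough new ones to reach the minimum) is an assumption.

```python
def plan_replacements(
    scores: dict[str, float],
    min_health_threshold: float = 0.3,
    min_connections_per_peer: int = 1,
) -> tuple[list[str], int]:
    """Return (connection ids to close, number of new connections to open).

    Hypothetical helper: `scores` maps a connection id to its health score.
    """
    # Connections scoring below the threshold are considered unhealthy.
    unhealthy = sorted(
        cid for cid, score in scores.items() if score < min_health_threshold
    )
    healthy_count = len(scores) - len(unhealthy)
    # Open only as many new connections as needed to reach the minimum.
    to_open = max(0, min_connections_per_peer - healthy_count)
    return unhealthy, to_open


to_close, to_open = plan_replacements(
    {"conn-a": 0.9, "conn-b": 0.2, "conn-c": 0.35},
    min_health_threshold=0.4,
    min_connections_per_peer=2,
)
print(to_close, to_open)
```

With the sample scores, two connections fall below the 0.4 threshold, leaving one healthy connection, so one new connection is needed to reach the minimum of two.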

Load Balancing Strategies

round_robin: Simple round-robin selection (default)
least_loaded: Select connection with fewest streams
health_based: Select connection with highest health score
latency_based: Select connection with lowest latency
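The four strategies can be pictured as different selection rules over the same set of connections. The following is a minimal standalone sketch, not the library's code: ConnInfo, select_connection, and the module-level counter are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class ConnInfo:
    """Hypothetical per-connection snapshot used only in this sketch."""

    name: str
    stream_count: int
    health_score: float  # 0.0 (worst) to 1.0 (best)
    latency_ms: float


_rr_counter = count()  # shared position for the round-robin strategy


def select_connection(conns: list[ConnInfo], strategy: str = "round_robin") -> ConnInfo:
    """Pick one connection according to the named strategy."""
    if not conns:
        raise ValueError("no connections available")
    if strategy == "round_robin":
        return conns[next(_rr_counter) % len(conns)]
    if strategy == "least_loaded":
        return min(conns, key=lambda c: c.stream_count)
    if strategy == "health_based":
        return max(conns, key=lambda c: c.health_score)
    if strategy == "latency_based":
        return min(conns, key=lambda c: c.latency_ms)
    raise ValueError(f"unknown strategy: {strategy}")


conns = [
    ConnInfo("a", stream_count=4, health_score=0.9, latency_ms=80.0),
    ConnInfo("b", stream_count=1, health_score=0.6, latency_ms=20.0),
    ConnInfo("c", stream_count=2, health_score=0.7, latency_ms=45.0),
]
picks = {
    s: select_connection(conns, s).name
    for s in ("least_loaded", "health_based", "latency_based")
}
print(picks)
```

Note that the strategies can disagree: in this sample the healthiest connection is also the busiest and the slowest, so health_based picks "a" while least_loaded and latency_based both pick "b".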

Health Metrics

The system tracks various connection health metrics:

Basic Metrics:

Ping Latency: Response time for health checks
Success Rate: Percentage of successful operations
Stream Count: Number of active streams
Connection Age: How long the connection has been established
Health Score: Overall health rating (0.0 to 1.0)

Advanced Metrics:

Bandwidth Usage: Real-time bandwidth tracking with time windows
Error History: Detailed error tracking with timestamps
Connection Events: Lifecycle event logging (establishment, closure, etc.)
Connection Stability: Error rate-based stability scoring
Peak/Average Bandwidth: Performance trend analysis
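To make metrics such as ping latency and success rate concrete, here is a small sketch of a rolling window over recent ping results. PingStats is a hypothetical class written for this illustration; the window size and the choice to treat an empty window as fully successful are assumptions, not documented library behavior.

```python
from collections import deque
from typing import Optional


class PingStats:
    """Hypothetical rolling window of ping results for one connection."""

    def __init__(self, window: int = 20) -> None:
        # Each entry is a latency in milliseconds, or None for a failed ping.
        self._results: deque = deque(maxlen=window)

    def record(self, latency_ms: Optional[float]) -> None:
        """Store one ping result; the oldest entry is dropped when full."""
        self._results.append(latency_ms)

    @property
    def success_rate(self) -> float:
        if not self._results:
            return 1.0  # assumption: no data yet is treated as healthy
        ok = sum(1 for r in self._results if r is not None)
        return ok / len(self._results)

    @property
    def average_latency_ms(self) -> float:
        ok = [r for r in self._results if r is not None]
        return sum(ok) / len(ok) if ok else float("inf")


stats = PingStats(window=4)
for sample in (30.0, None, 50.0, 40.0, 20.0):
    stats.record(sample)
print(stats.success_rate, stats.average_latency_ms)
```

Because the window holds only the last four results, the first sample (30.0) has been evicted by the time the statistics are read, so old measurements stop influencing the result.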

Host-Level Health Monitoring API

The health monitoring features are accessible through the high-level host API:

# Access health information through the host interface

# Get health summary for a specific peer
peer_health = host.get_connection_health(peer_id)
print(f"Peer health: {peer_health}")

# Get global network health summary
network_health = host.get_network_health_summary()
print(f"Total peers: {network_health.get('total_peers', 0)}")
print(f"Total connections: {network_health.get('total_connections', 0)}")
print(f"Average health: {network_health.get('average_peer_health', 0.0)}")

# Export metrics in different formats
json_metrics = host.export_health_metrics("json")
prometheus_metrics = host.export_health_metrics("prometheus")
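One way to use the network summary is to turn it into a one-line status for logs or alerts. The sketch below works on a plain dictionary and relies only on the three keys shown above; summarize and its alert_below parameter are hypothetical, and any other fields the real summary may contain are ignored.

```python
def summarize(network_health: dict, alert_below: float = 0.5) -> str:
    """Build a one-line status string from a network health summary dict."""
    peers = network_health.get("total_peers", 0)
    conns = network_health.get("total_connections", 0)
    avg = network_health.get("average_peer_health", 0.0)
    # Only flag degradation when there is at least one peer to judge.
    status = "DEGRADED" if peers and avg < alert_below else "OK"
    return f"{status}: {peers} peers, {conns} connections, average health {avg:.2f}"


sample = {"total_peers": 3, "total_connections": 7, "average_peer_health": 0.42}
line = summarize(sample)
print(line)
```

In a running application, the dictionary would come from host.get_network_health_summary() instead of the hard-coded sample.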

Example: Health-Based Load Balancing

from libp2p import new_host
from libp2p.network.config import ConnectionConfig
from libp2p.crypto.rsa import create_new_key_pair

# Configure for production use with health-based load balancing
connection_config = ConnectionConfig(
    enable_health_monitoring=True,
    max_connections_per_peer=5,  # More connections for redundancy
    health_check_interval=120.0,  # Less frequent checks in production
    ping_timeout=10.0,  # Longer timeout for slow networks
    min_health_threshold=0.6,  # Higher threshold for production
    min_connections_per_peer=3,  # Maintain more connections
    load_balancing_strategy="health_based"  # Prioritize healthy connections
)

host = new_host(
    key_pair=create_new_key_pair(),
    connection_config=connection_config
)

# Use host as normal - health monitoring works transparently
# In your async main() function (requires: from multiaddr import Multiaddr):
#   listen_addr = Multiaddr("/ip4/127.0.0.1/tcp/0")
#   async with host.run(listen_addrs=[listen_addr]):
#       # Health monitoring and load balancing happen automatically
#       stream = await host.new_stream(peer_id, ["/echo/1.0.0"])

Example: Advanced Health Monitoring

Beyond the basic settings, ConnectionConfig accepts weights and thresholds that tune health scoring:

from libp2p import new_host
from libp2p.network.config import ConnectionConfig
from libp2p.crypto.rsa import create_new_key_pair

# Advanced health monitoring with comprehensive tracking
connection_config = ConnectionConfig(
    enable_health_monitoring=True,
    health_check_interval=15.0,  # More frequent checks
    ping_timeout=2.0,  # Faster ping timeout
    min_health_threshold=0.5,  # Higher threshold
    min_connections_per_peer=2,
    load_balancing_strategy="health_based",
    # Advanced health scoring configuration
    latency_weight=0.4,
    success_rate_weight=0.4,
    stability_weight=0.2,
    max_ping_latency=1000.0,  # ms
    min_ping_success_rate=0.7,
    max_failed_streams=5
)

host = new_host(
    key_pair=create_new_key_pair(),
    connection_config=connection_config
)

# Access advanced health metrics through host API
# In your async main() function (requires: from multiaddr import Multiaddr):
#   listen_addr = Multiaddr("/ip4/127.0.0.1/tcp/0")
#   async with host.run(listen_addrs=[listen_addr]):
#       peer_health = host.get_connection_health(peer_id)
#       global_health = host.get_network_health_summary()
#       json_metrics = host.export_health_metrics("json")
#       prometheus_metrics = host.export_health_metrics("prometheus")
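The weight parameters suggest that the health score is a weighted combination of latency, success rate, and stability. The sketch below shows one plausible way such a score could be formed. It is an assumption made for illustration, not the library's formula: the linear latency mapping, the clamping, and the health_score function itself are hypothetical; only the parameter names and default values are taken from the configuration above.

```python
def health_score(
    ping_latency_ms: float,
    success_rate: float,
    stability: float,
    latency_weight: float = 0.4,
    success_rate_weight: float = 0.4,
    stability_weight: float = 0.2,
    max_ping_latency: float = 1000.0,
) -> float:
    """Combine three components, each in [0, 1], into one score in [0, 1]."""
    # Map latency to [0, 1]: 0 ms scores 1.0, max_ping_latency or worse scores 0.0.
    latency_score = max(0.0, 1.0 - ping_latency_ms / max_ping_latency)
    score = (
        latency_weight * latency_score
        + success_rate_weight * success_rate
        + stability_weight * stability
    )
    # Clamp in case the weights do not sum to exactly 1.0.
    return min(1.0, max(0.0, score))


score = health_score(ping_latency_ms=250.0, success_rate=0.9, stability=1.0)
print(round(score, 2))
```

Under these assumptions, a connection with 250 ms latency, a 90% success rate, and full stability scores roughly 0.86, comfortably above the min_health_threshold of 0.5 used in this example.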

Example: Latency-Based Load Balancing

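from libp2p import new_host
from libp2p.network.config import ConnectionConfig
from libp2p.crypto.rsa import create_new_key_pair

# Optimize for lowest latency connections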
connection_config = ConnectionConfig(
    enable_health_monitoring=True,
    load_balancing_strategy="latency_based",  # Route to lowest latency
    health_check_interval=30.0,
    ping_timeout=5.0,
    max_connections_per_peer=3
)

host = new_host(
    key_pair=create_new_key_pair(),
    connection_config=connection_config
)

# Streams will automatically route to lowest latency connections

Example: Disabling Health Monitoring

For performance-critical scenarios, health monitoring can be disabled:

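from libp2p import new_host
from libp2p.network.config import ConnectionConfig
from libp2p.crypto.rsa import create_new_key_pair

# Disable health monitoring for maximum performance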
connection_config = ConnectionConfig(
    enable_health_monitoring=False,
    load_balancing_strategy="round_robin"  # Fall back to simple strategy
)

host = new_host(
    key_pair=create_new_key_pair(),
    connection_config=connection_config
)

# Host operates with minimal overhead, no health monitoring

Backwards Compatibility

Health monitoring is fully backwards compatible:

from libp2p import new_host
from libp2p.network.config import ConnectionConfig

# Existing code continues to work unchanged
host = new_host()  # Uses default configuration (health monitoring disabled)

# Only when you explicitly enable it does health monitoring activate
config = ConnectionConfig(enable_health_monitoring=True)
host_with_health = new_host(connection_config=config)

Running the Example

To run the connection health monitoring example:

python examples/health_monitoring/basic_example.py

This will demonstrate:

Basic health monitoring setup through host API
Different load balancing strategies
Health metrics access and export
API consistency with existing examples

Benefits

API Consistency: Health monitoring now works with the same high-level new_host() API used in all examples
Production Reliability: Prevent silent failures by detecting unhealthy connections early
Performance Optimization: Route traffic to healthiest connections, reduce latency
Operational Visibility: Monitor connection quality in real-time through host interface
Automatic Recovery: Replace degraded connections automatically
Standard Compliance: Match capabilities of Go and JavaScript libp2p implementations

Integration with Existing Code

Health monitoring integrates seamlessly with existing host-based code:

All new features are optional and do not break existing code
Health monitoring can be enabled or disabled per host instance
Existing examples work unchanged; to opt in, pass a connection_config argument to new_host()
No need to switch from new_host() to low-level swarm APIs

host = new_host()  # Basic usage
host = new_host(connection_config=config)  # Health monitoring - same API

For more information, see the :doc:`../libp2p.network` module documentation.
