Commit f9ccc0d

Enhance K8s deployment health checks by adding "HealthyRecovery" to the default log pattern categories in runbook.robot. In k8s_log.py, relax the Azure Service Bus link lifecycle patterns by dropping the trailing idle-timeout condition, and add a new exclusion pattern for Azure Cosmos DB connection establishment. Together these improve log analysis and make monitoring output for Azure services clearer. (#538)
1 parent 603bd00 commit f9ccc0d

2 files changed

Lines changed: 9 additions & 3 deletions

codebundles/k8s-deployment-healthcheck/runbook.robot

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ Suite Initialization
     ...    description=Comma-separated list of log pattern categories to scan for.
     ...    pattern=.*
     ...    example=GenericError,AppFailure,StackTrace,Connection
-    ...    default=GenericError,AppFailure,StackTrace,Connection,Timeout,Auth,Exceptions,Resource
+    ...    default=GenericError,AppFailure,StackTrace,Connection,Timeout,Auth,Exceptions,Resource,HealthyRecovery
     ${ANOMALY_THRESHOLD}=    RW.Core.Import User Variable    ANOMALY_THRESHOLD
     ...    type=string
     ...    description=The threshold for detecting event anomalies based on events per minute.
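
To make the effect of the new default concrete, here is a minimal Python sketch that splits the old and new comma-separated defaults and reports which category was added. The helper `parse_categories` is hypothetical and written only for illustration; the diff does not show how the codebundle actually parses this variable.

```python
from typing import List

# Default values copied from the runbook.robot hunk above.
OLD_DEFAULT = "GenericError,AppFailure,StackTrace,Connection,Timeout,Auth,Exceptions,Resource"
NEW_DEFAULT = "GenericError,AppFailure,StackTrace,Connection,Timeout,Auth,Exceptions,Resource,HealthyRecovery"


def parse_categories(raw: str) -> List[str]:
    """Hypothetical helper: split a comma-separated string, ignoring blanks."""
    return [part.strip() for part in raw.split(",") if part.strip()]


old_categories = parse_categories(OLD_DEFAULT)
new_categories = parse_categories(NEW_DEFAULT)

# Categories present in the new default but not in the old one.
added = [c for c in new_categories if c not in old_categories]
print(added)
```

Under these assumptions, a deployment that relies on the default value would start scanning for the HealthyRecovery category without any configuration change.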

libraries/RW/K8sLog/k8s_log.py

Lines changed: 8 additions & 2 deletions
@@ -211,7 +211,7 @@ def _load_error_patterns(self) -> Dict[str, Any]:
                 },
                 {
                     "name": "azure_servicebus_link_lifecycle",
-                    "pattern": r"(?i).*(?:Freeing resources due to error|link.*is force detached).*(?:IdleTimerExpired|Idle timeout)",
+                    "pattern": r"(?i).*(?:Freeing resources due to error|link.*is force detached)",
                     "description": "Azure Service Bus normal link lifecycle and cleanup",
                     "exclude": True
                 },
@@ -220,6 +220,12 @@ def _load_error_patterns(self) -> Dict[str, Any]:
                     "pattern": r"(?i).*Reactor selectable is being disposed.*connectionId",
                     "description": "Azure Service Bus normal reactor cleanup",
                     "exclude": True
+                },
+                {
+                    "name": "azure_cosmosdb_connection_establishment",
+                    "pattern": r"(?i).*Getting database account endpoint from.*\.documents\.azure\.com",
+                    "description": "Azure Cosmos DB normal connection establishment",
+                    "exclude": True
                 }
             ],
             "patterns": {
@@ -308,7 +314,7 @@ def _load_error_patterns(self) -> Dict[str, Any]:
                 },
                 {
                     "name": "azure_servicebus_link_lifecycle",
-                    "pattern": r"(?i).*(?:Freeing resources due to error|link.*is force detached).*(?:IdleTimerExpired|Idle timeout)",
+                    "pattern": r"(?i).*(?:Freeing resources due to error|link.*is force detached)",
                     "severity": 5,
                     "next_steps": ["This is normal Azure Service Bus link lifecycle management", "Links are cleaned up after idle timeout and recreated as needed", "No action required - this indicates healthy connection management"]
                 },
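
The behavioral difference between the old and new patterns can be checked with a short Python sketch. The three regular expressions are copied from the diff; the sample log lines are invented for illustration, and the use of `re.search` is an assumption, since the diff does not show how k8s_log.py applies its patterns.

```python
import re

# Patterns copied verbatim from the k8s_log.py hunks.
OLD_LINK_LIFECYCLE = re.compile(
    r"(?i).*(?:Freeing resources due to error|link.*is force detached).*(?:IdleTimerExpired|Idle timeout)"
)
NEW_LINK_LIFECYCLE = re.compile(
    r"(?i).*(?:Freeing resources due to error|link.*is force detached)"
)
COSMOS_CONNECTION = re.compile(
    r"(?i).*Getting database account endpoint from.*\.documents\.azure\.com"
)

# Hypothetical log lines, not taken from any real service output.
SAMPLE_LINES = [
    "Freeing resources due to error: IdleTimerExpired on receive link",
    "Freeing resources due to error on send link",
    "Getting database account endpoint from https://example.documents.azure.com:443/",
    "NullPointerException in OrderService",
]


def matches(pattern, lines):
    """Return one boolean per line: does the pattern match anywhere in it?"""
    return [bool(pattern.search(line)) for line in lines]


old_hits = matches(OLD_LINK_LIFECYCLE, SAMPLE_LINES)
new_hits = matches(NEW_LINK_LIFECYCLE, SAMPLE_LINES)
cosmos_hits = matches(COSMOS_CONNECTION, SAMPLE_LINES)

for line, old, new, cosmos in zip(SAMPLE_LINES, old_hits, new_hits, cosmos_hits):
    print(f"old={old!s:5} new={new!s:5} cosmos={cosmos!s:5} {line}")
```

On these samples the old pattern matches only the line that mentions IdleTimerExpired, while the relaxed pattern also matches the second line, which lacks any idle-timeout text. That is the trade-off of the change: more Service Bus lifecycle messages are treated as normal, including ones whose cause is not an idle timeout.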
