Commit 03b6eaa

Refactor HANA DB restore tasks for improved validation and clarity
1 parent b4f42cd commit 03b6eaa

16 files changed

Lines changed: 1652 additions & 503 deletions

docs/AZURE_BACKUP.md

Lines changed: 21 additions & 16 deletions
@@ -1,18 +1,20 @@
-# Azure Backup Functional Testing for SAP HANA
+# Functional Test for Azure Backup for SAP HANA
 
-The SAP Testing Automation Framework includes an Azure Backup testing component that validates backup and restore operations for SAP HANA databases deployed on Azure. It exercises the [Azure Backup for SAP HANA](https://learn.microsoft.com/azure/backup/sap-hana-database-about) service through the Python SDK (`azure-mgmt-recoveryservicesbackup`) and native HANA recovery commands.
+The SAP Testing Automation Framework includes an Azure Backup testing component that validates the configuration of Azure Backup infrastructure and functionality of restore operations by performing actual restore for SAP HANA databases deployed on Azure.
+
+> **Important:** This is a **testing and validation tool only**. It is designed to verify that Azure Backup is correctly configured and that restore operations function as expected. It should **not** be used as a substitute for actual SAP HANA database restore procedures in any scenario.
 
 ## Supported Scenarios
 
 The framework supports both **HA (two-node cluster)** and **Non-HA (single-node)** HANA deployments. Five test cases cover the end-to-end backup-restore lifecycle:
 
 | # | Test Case | Task Name | Description |
 |---|-----------|-----------|-------------|
-| 1 | Azure Backup Setup Verification | `backup-setup-verification` | Discovers all protected HANA databases in the Recovery Services vault, verifies backup configuration health, and checks that recent restore points exist. |
+| 1 | Azure Backup Configuration Validation | `backup-setup-verification` | Discovers all protected HANA databases in the Recovery Services vault, verifies backup configuration health, and checks that recent restore points exist. |
 | 2 | Restore Backup to HANA DB | `restore-to-db` | Triggers a full or point-in-time restore to the original HANA database via Azure Backup, monitors the restore job, then validates HANA is running. |
 | 3 | Restore Backup to FileSystem | `restore-to-filesystem` | Restores the HANA backup as files to a filesystem path, verifies the files are present, then recovers the HANA DB from those files and validates it is operational. |
 | 4 | Recover DB using Database Commands | `recover-db-commands` | Tests native HANA recovery using `recoverSys.py` / `RECOVER DATA`. Queries the backup catalog, stops HANA, performs recovery, restarts, and validates consistency. |
-| 5 | Cross-VM Restore | `restore-cross-vm` | Restores a HANA backup from VM-1 to VM-2 (AlternateWorkloadRestore). Validates the target HANA instance starts and the database is consistent. Requires ≥ 2 HANA nodes. |
+| 5 | Cross-VM Restore | `restore-cross-vm` | Restores **tenant databases only** from VM-1 to VM-2 (AlternateWorkloadRestore). SYSTEMDB is not restored in cross-VM scenarios. Validates the target HANA instance starts and the databases are consistent. |
 
 ## Prerequisites
 
@@ -40,7 +42,7 @@ For identity setup, see [Setup Guide — Identity and Authorization](./SETUP.MD#
 - The management server must have SSH connectivity to all HANA DB hosts.
 - The `<sid>adm` user must be able to run `HDB stop`, `HDB start`, `sapcontrol`, and `hdbsql` commands.
 - For test case 3 (restore-to-filesystem), the target filesystem path must be writable.
-- For test case 5 (cross-VM restore), at least 2 HANA nodes must be in the inventory.
+- For test case 5 (cross-VM restore), target VM information must be provided.
 
 ## Configuration
 
@@ -58,22 +60,25 @@ SAP_FUNCTIONAL_TEST_TYPE: AzureBackupDatabase
 Add the following variables to your system's `sap-parameters.yaml` file (under `WORKSPACES/SYSTEM/<SYSTEM_CONFIG_NAME>/`):
 
 ```yaml
-# Required: Recovery Services vault resource ID
-backup_vault_resource_id: "/subscriptions/xxxx/resourceGroups/my-backup-rg/providers/Microsoft.RecoveryServices/vaults/my-rsv-vault"
+# Recovery Services vault ARM resource ID
+backup_vault_resource_id: "/subscriptions/xxxx/resourceGroups/my-rg/providers/Microsoft.RecoveryServices/vaults/my-vault"
 
-# Required for restore test cases (2-5)
-backup_container_name: "VMAppContainer;Compute;my-rg;hanavm01"
-backup_item_name: "saphanadatabase;h05;systemdb"
+# Whether to restore SYSTEMDB (set false to skip)
+backup_restore_systemdb: true
+# Restrict tenant restore to specific DBs (empty = all tenants)
+backup_restore_tenants: [] # e.g. ["HDB", "H01"]
 
-# Required for filesystem restore (test case 3)
-backup_target_filesystem_path: "/hana/backup/restore"
+# Target path for file-based restore; must be writable
+backup_target_filesystem_path: "/hana/backup/restore/"
 
-# Required for cross-VM restore (test case 5)
-backup_target_container_name: "VMAppContainer;Compute;my-rg;hanavm02"
-backup_target_database_name: "SYSTEMDB"
+# Target VM hostname (source VM should be able to resolve this hostname)
+backup_target_vm_name: ""
 
-# Optional: point-in-time restore (ISO 8601 UTC timestamp)
+# Point-in-time restore (optional, ISO 8601 UTC)
 backup_restore_point_time: ""
+
+# HANA Key (created as part of pre-registration for Azure Backup)
+hana_userstore_key: "SYSTEM"
 ```
 
 ### 3. User-Assigned Managed Identity (Optional)

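The `backup_vault_resource_id` value in the documentation diff is a full ARM resource ID, while `BackupDiscovery` takes the vault name and resource group separately. The following is a minimal sketch of how such an ID could be split, assuming the standard eight-segment layout shown in the example value. The helper name `parse_vault_resource_id` is hypothetical and is not part of this commit.

```python
from typing import Tuple


def parse_vault_resource_id(resource_id: str) -> Tuple[str, str, str]:
    """Split a vault ARM ID into (subscription_id, resource_group, vault_name).

    Hypothetical helper: assumes the layout
    /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.RecoveryServices/vaults/<name>
    """
    parts = [p for p in resource_id.strip().split("/") if p]
    if len(parts) != 8:
        raise ValueError(f"Unexpected vault resource ID: {resource_id!r}")

    # Every even-indexed segment is a fixed keyword; compare case-insensitively.
    keywords = [p.lower() for p in parts[0::2]]
    if keywords != ["subscriptions", "resourcegroups", "providers", "vaults"]:
        raise ValueError(f"Unexpected vault resource ID: {resource_id!r}")
    if parts[5].lower() != "microsoft.recoveryservices":
        raise ValueError(f"Not a Recovery Services vault ID: {resource_id!r}")

    return parts[1], parts[3], parts[7]


if __name__ == "__main__":
    example = (
        "/subscriptions/xxxx/resourceGroups/my-rg/providers/"
        "Microsoft.RecoveryServices/vaults/my-vault"
    )
    print(parse_vault_resource_id(example))
```

Validating the fixed segments up front means a malformed or mistyped ID fails at configuration time rather than later as an opaque SDK error.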
src/module_utils/backup_discovery.py

Lines changed: 121 additions & 41 deletions
@@ -31,10 +31,10 @@
 class BackupDiscovery:
     """Discovery and validation for Azure Backup HANA items.
 
-
     :param client: Authenticated Recovery Services Backup client.
     :param vault_name: Name of the Recovery Services vault.
     :param vault_resource_group: Resource group of the vault.
+    :param source_vm_name: Azure VM name to scope results to.
     :param parameter_definitions: YAML-loaded parameter defs for HTML report generation.
     :param log_fn: Optional callback ``(level, message)``.
     """
@@ -49,17 +49,29 @@ def __init__(
         client: RecoveryServicesBackupClient,
         vault_name: str,
         vault_resource_group: str,
+        source_vm_name: str = "",
         parameter_definitions: Optional[List[Dict[str, str]]] = None,
         log_fn: Optional[Callable[[int, str], None]] = None,
     ) -> None:
         self._client = client
         self._vault_name = vault_name
         self._vault_rg = vault_resource_group
+        self._source_vm = (source_vm_name or "").lower().strip()
         self._param_defs: List[Dict[str, str]] = (
             parameter_definitions if parameter_definitions else []
         )
         self._log = log_fn or (lambda _lvl, _msg: None)
 
+    def _matches_source_vm(self, container_name: str) -> bool:
+        """Check whether a container belongs to the source VM.
+
+        :param container_name: Backup container name.
+        :returns: ``True`` when no filter is set or the container contains the configured VM name.
+        """
+        if not self._source_vm:
+            return True
+        return self._source_vm in (container_name or "").lower()
+
     @staticmethod
     def get_props(
         item: ProtectedItemResource,
@@ -169,12 +181,11 @@ def list_recovery_points(
     def fetch_recent_jobs(
         self,
     ) -> Dict[str, Dict[str, Optional[AzureWorkloadJob]]]:
-        """Fetch recent backup jobs and index by DB name.
+        """Fetch recent backup jobs and index by container+DB name.
 
         :returns: Per-DB dict mapping to
             ``{"last_job": ..., "last_full_backup": ...}``
-            where values are SDK ``AzureWorkloadJob`` or
-            ``None``.
+            where values are SDK ``AzureWorkloadJob`` or ``None``.
         """
         job_filter = f"backupManagementType eq '{BackupManagementType.AZURE_WORKLOAD}'"
         result: Dict[str, Dict[str, Optional[AzureWorkloadJob]]] = {}
@@ -196,47 +207,103 @@ def fetch_recent_jobs(
                 if not raw_name:
                     continue
 
+                op = (props.operation or "").lower()
+                if not (op.startswith("backup") or op.startswith("restore")):
+                    continue
+
+                vm_hint = ""
                 bracket_idx = raw_name.find("[")
                 if bracket_idx > 0:
+                    vm_hint = raw_name[bracket_idx + 1 :].rstrip("] ").strip()
                     raw_name = raw_name[:bracket_idx].rstrip()
 
-                keys = [raw_name]
+                short_name = raw_name
                 for sep in (":", ";"):
                     if sep in raw_name:
-                        short = raw_name.rsplit(sep, 1)[-1]
-                        if short and short != raw_name:
-                            keys.append(short)
+                        short_name = raw_name.rsplit(sep, 1)[-1]
                         break
 
-                for db_key in keys:
-                    entry = result.setdefault(
-                        db_key,
-                        dict(_empty),
-                    )
-                    if entry["last_job"] is None:
-                        entry["last_job"] = props
-                    if entry["last_full_backup"] is None and (
-                        props.operation or ""
-                    ).lower().startswith("backup"):
-                        entry["last_full_backup"] = props
+                if vm_hint:
+                    db_key = f"{vm_hint}::{short_name}"
+                else:
+                    db_key = f"::{short_name}"
+
+                entry = result.setdefault(
+                    db_key,
+                    dict(_empty),
+                )
+                if entry["last_job"] is None:
+                    entry["last_job"] = props
+                if entry["last_full_backup"] is None and (props.operation or "").lower().startswith(
+                    "backup"
+                ):
+                    entry["last_full_backup"] = props
         except Exception as exc:
             self._log(
                 logging.WARNING,
                 f"Could not fetch backup jobs: {exc}",
             )
         return result
 
+    @staticmethod
+    def has_usable_restore_point(
+        rp_list: List[RecoveryPointResource],
+    ) -> bool:
+        """Check whether at least one RP has a real recovery time.
+
+        :param rp_list: Recovery point resources from the SDK.
+        :returns: ``True`` when a real restore point exists.
+        """
+        for rp_resource in rp_list:
+            if rp_resource.properties is None:
+                continue
+            rp = cast(AzureWorkloadSAPHanaRecoveryPoint, rp_resource.properties)
+            if rp.recovery_point_time_in_utc is not None:
+                return True
+            if getattr(rp, "time_ranges", None):
+                return True
+        return False
+
+    @staticmethod
+    def _match_jobs_for_item(
+        job_index: Dict[str, Dict[str, Optional[AzureWorkloadJob]]],
+        container_name: str,
+        friendly_name: str,
+    ) -> Dict[str, Optional[AzureWorkloadJob]]:
+        """Find the best matching job entry for a protected item.
+
+        :param job_index: Index returned by ``fetch_recent_jobs``.
+        :param container_name: Backup container name of the item.
+        :param friendly_name: DB friendly name (e.g. ``hdb``).
+        :returns: Dict with ``last_job`` and ``last_full_backup``.
+        """
+        db_name = friendly_name.lower()
+        empty: Dict[str, Optional[AzureWorkloadJob]] = {
+            "last_job": None,
+            "last_full_backup": None,
+        }
+        for key, entry in job_index.items():
+            sep_idx = key.find("::")
+            if sep_idx < 0:
+                continue
+            vm_hint = key[:sep_idx]
+            key_db = key[sep_idx + 2 :]
+            if key_db != db_name:
+                continue
+            if vm_hint and vm_hint in container_name.lower():
+                return entry
+        return job_index.get(f"::{db_name}", empty)
+
     def discover(self) -> Dict[str, Any]:
-        """Discover and validate all protected HANA databases.
+        """Discover and validate protected HANA databases.
 
-        :returns: Dict with ``protected_items``,
-            ``restore_points``, ``details``, ``status``,
-            and ``message`` keys.
+        :returns: Dict with ``protected_items``
         :raises Exception: Propagated from SDK calls.
         """
+        vm_label = f" for VM '{self._source_vm}'" if self._source_vm else ""
         self._log(
             logging.INFO,
-            "Discovering protected HANA items in " f"vault '{self._vault_name}'",
+            f"Discovering protected HANA items in vault " f"'{self._vault_name}'{vm_label}",
         )
         protected: List[Dict[str, Any]] = []
         restore_pts: List[Dict[str, Any]] = []
@@ -247,33 +314,36 @@ def discover(self) -> Dict[str, Any]:
         }
         job_index = self.fetch_recent_jobs()
         parameters: List[Dict[str, Any]] = []
+        skipped = 0
 
         for item in self.list_protected_items():
             props = self.get_props(item)
             container = props.container_name or ""
             item_name = item.name or ""
-            is_hsr = self.is_hsr_container(container)
+
+            if not self._matches_source_vm(container):
+                skipped += 1
+                continue
+
+            is_hsr = self.is_hsr_container(container_name=container)
 
             rp_list = self.list_recovery_points(
-                container,
-                item_name,
+                container_name=container,
+                item_name=item_name,
             )
             rp_time, rp_type = self.latest_rp_summary(
                 rp_list,
             )
-
-            db_jobs = job_index.get(
-                (props.friendly_name or "").lower(),
-                {},
-            )
+            has_rp = self.has_usable_restore_point(rp_list)
+            db_jobs = self._match_jobs_for_item(job_index, container, props.friendly_name or "")
             last_job: Optional[AzureWorkloadJob] = db_jobs.get("last_job")
             last_full: Optional[AzureWorkloadJob] = db_jobs.get("last_full_backup")
 
             db_status = self.evaluate_db_status(
-                props,
-                len(rp_list) > 0,
-                is_hsr,
-                last_job,
+                props=props,
+                has_restore_point=has_rp,
+                is_hsr=is_hsr,
+                last_job=last_job,
             )
             status_counts[db_status] = status_counts.get(db_status, 0) + 1
 
@@ -290,7 +360,9 @@ def discover(self) -> Dict[str, Any]:
                     "health_status": (props.protected_item_health_status or "Unknown"),
                     "protection_status": (props.protection_status or "Unknown"),
                     "last_backup_time": (
-                        props.last_backup_time.isoformat() if props.last_backup_time else ""
+                        last_full.start_time.isoformat()
+                        if last_full and last_full.start_time
+                        else ""
                     ),
                     "latest_restore_point": rp_time,
                     "backup_type": rp_type,
@@ -339,6 +411,12 @@ def discover(self) -> Dict[str, Any]:
                 )
             )
 
+        if skipped:
+            self._log(
+                logging.INFO,
+                f"Skipped {skipped} item(s) not matching " f"source VM '{self._source_vm}'.",
+            )
+
         if status_counts.get(TestStatus.ERROR.value, 0):
             status = TestStatus.ERROR.value
         elif status_counts.get(TestStatus.WARNING.value, 0):
@@ -373,16 +451,18 @@ def check_restore_points(self) -> Dict[str, Any]:
         """
         self._log(
             logging.INFO,
-            "Checking restore points for " "protected items.",
+            "Checking restore points for protected items.",
         )
         all_points: List[Dict[str, Any]] = []
         items_without_rp = 0
         item_count = 0
 
         for item in self.list_protected_items():
-            item_count += 1
             props = self.get_props(item)
             container = props.container_name or ""
+            if not self._matches_source_vm(container):
+                continue
+            item_count += 1
             item_name = item.name or ""
 
             rp_list = self.list_recovery_points(
@@ -408,10 +488,10 @@ def check_restore_points(self) -> Dict[str, Any]:
 
         if items_without_rp:
             status = TestStatus.WARNING.value
-            message = f"{items_without_rp} of {item_count} " f"item(s) have no recovery points."
+            message = f"{items_without_rp} of {item_count} item(s) have no recovery points."
         else:
             status = TestStatus.SUCCESS.value
-            message = f"All {item_count} item(s) have " f"recovery points."
+            message = f"All {item_count} item(s) have recovery points."
 
         return {
             "restore_points": all_points,

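The change to `fetch_recent_jobs` and the new `_match_jobs_for_item` replace a plain DB-name index with `<vm_hint>::<db_name>` keys, so that identically named databases on different VMs no longer share one job entry. The sketch below reproduces that keying and lookup as standalone functions with plain strings standing in for SDK job objects. The function names, the entity-name formats in the example, and the explicit lower-casing of the entity name are assumptions made for illustration, not code from this commit.

```python
from typing import Dict, Optional


def build_job_key(entity_name: str) -> str:
    """Build a '<vm_hint>::<db_name>' key from a job entity name.

    Assumes names look like 'systemdb [hanavm01]' or 'saphanadatabase;h05;hdb'.
    """
    raw = entity_name.strip().lower()  # lower-casing here is an assumption
    vm_hint = ""
    bracket_idx = raw.find("[")
    if bracket_idx > 0:
        # Text inside the trailing brackets is treated as the VM hint.
        vm_hint = raw[bracket_idx + 1:].rstrip("] ").strip()
        raw = raw[:bracket_idx].rstrip()

    short_name = raw
    for sep in (":", ";"):
        if sep in raw:
            # Keep only the last segment, e.g. 'hdb' from 'saphanadatabase;h05;hdb'.
            short_name = raw.rsplit(sep, 1)[-1]
            break
    return f"{vm_hint}::{short_name}"


def match_job_entry(
    job_index: Dict[str, str],
    container_name: str,
    friendly_name: str,
) -> Optional[str]:
    """Prefer an entry whose VM hint appears in the container name,
    then fall back to the entry that has no VM hint."""
    db_name = friendly_name.lower()
    container = container_name.lower()
    for key, entry in job_index.items():
        vm_hint, sep, key_db = key.partition("::")
        if not sep or key_db != db_name:
            continue
        if vm_hint and vm_hint in container:
            return entry
    return job_index.get(f"::{db_name}")


if __name__ == "__main__":
    index = {
        build_job_key("systemdb [hanavm01]"): "job-a",
        build_job_key("systemdb [hanavm02]"): "job-b",
        build_job_key("saphanadatabase;h05;hdb"): "job-c",
    }
    print(match_job_entry(index, "VMAppContainer;Compute;my-rg;hanavm02", "SYSTEMDB"))
```

Because the VM hint is matched as a substring of the container name, a hint such as `hanavm0` would also match `hanavm01`; callers that need exact matching would have to compare the final container segment instead.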
src/module_utils/backup_parameters.py

Lines changed: 5 additions & 1 deletion
@@ -39,7 +39,11 @@ def compute_param_values(
     :param db_status: Computed PASSED/WARNING/FAILED status.
     :returns: Dict mapping parameter key to ``{"value": ..., "status": ...}``.
     """
-    last_backup_time = props.last_backup_time.isoformat() if props.last_backup_time else ""
+    last_backup_time = (
+        last_full.start_time.isoformat()
+        if last_full and last_full.start_time
+        else ""
+    )
     policy_name = props.policy_name or ""
     if not policy_name and getattr(props, "policy_id", None):
         policy_name = (props.policy_id or "").rsplit("/", 1)[-1]