# Configuration for generating Source of Truth (SOT) CSV files from EPS data.
[sot_columns]
# Defines the exact column names expected in the *final* Cluster Intent and template SOT CSV files.
# These are the target column names after any renaming specified in [rename_columns].
# If a column is NOT renamed below, its name here MUST match the corresponding column name in EPS; otherwise that column is ignored during CSV generation.
# The order of columns in the lists below dictates the exact column order in the generated CSV file.

# Column names of the Cluster Intent SOT to be generated
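The key/value lines that these comments describe fall outside this excerpt. As a minimal sketch of how such a file might look and be parsed with Python's `configparser` (assuming comma-separated column lists and section keys named after the two SOTs; every column name below is invented for illustration):

```python
import configparser

# Hypothetical excerpt only -- all keys and column names are illustrative;
# the real values live in config/sot_csv_config.ini.
SAMPLE_CONFIG = """
[sot_columns]
cluster_intent_sot = cluster_name, zone, recreate_on_delete
cluster_data_sot = cluster_name, cluster_group, tags

[rename_columns]
name = cluster_name
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONFIG)

# The list order in the config becomes the column order in the generated CSV.
intent_columns = [col.strip() for col in parser["sot_columns"]["cluster_intent_sot"].split(",")]
rename_map = dict(parser["rename_columns"])
print(intent_columns)  # ['cluster_name', 'zone', 'recreate_on_delete']
print(rename_map)      # {'name': 'cluster_name'}
```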
examples/eps_to_csv/resources/README.md: 8 additions & 5 deletions
@@ -1,10 +1,10 @@
# EPS to CSV Converter

- This set of script and configuration is designed to fetch cluster data from an EPS API and generate CSV files suitable for use as Source of Truth (SoT) data, specifically for cluster intent and cluster template data.
+ This set of scripts and the CSV configuration defined in the [config directory](../config) is designed to fetch cluster data from an EPS API and generate CSV files suitable for use as Source of Truth (SoT) data, specifically for cluster intent and cluster template data.
## Purpose
- The primary goal is to automate the retrieval and transformation of cluster information from an EPS system into standardized CSV formats. These CSV files can then be used for potential integrations with HRM, Cluster Provisioner and other tools that use GitOps workflows.
+ The primary goal is to automate the retrieval and transformation of cluster information from the EPS system into standardized CSV formats. These CSV files can then be used for potential integrations with HRM, Cluster Provisioner, and other tools that use GitOps workflows.
## Components
@@ -14,7 +14,7 @@ The script requires Google Cloud [Application Default Credentials](https://cloud
* A service account credentials file (the `GOOGLE_APPLICATION_CREDENTIALS` env variable set to the path of the service account credentials)
* Run from an environment with Workload Identity Federation configured (which sets up `GOOGLE_APPLICATION_CREDENTIALS` for you, e.g. GitHub Actions)
* If the gcloud SDK is installed, the credentials generated by `gcloud auth application-default login --impersonate-service-account`
- * The attached service account returned by the metadata server if run from Compute Engine, Cloud Run
+ * The attached service account returned by the metadata server when run from Compute Engine, Cloud Run, GKE, etc.
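Whichever of the options above supplies the credentials, the lookup itself is typically delegated to the google-auth library. A minimal sketch of resolving ADC and calling the clusters endpoint, assuming an `EPS_API_URL` variable name and access-token authentication (neither is confirmed by this README):

```python
import os

import google.auth
from google.auth.transport.requests import AuthorizedSession

# google-auth walks the ADC chain for us: GOOGLE_APPLICATION_CREDENTIALS,
# Workload Identity Federation, gcloud user credentials, or the metadata server.
credentials, _project = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)

# AuthorizedSession attaches and refreshes the access token on each request.
session = AuthorizedSession(credentials)

# "EPS_API_URL" is an assumed variable name used here for illustration only.
base_url = os.environ.get("EPS_API_URL", "https://eps.example.com")
response = session.get(f"{base_url}/api/v1/clusters", timeout=30)
response.raise_for_status()
clusters = response.json()
```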
### Required Environment Variables
@@ -39,6 +39,9 @@ This file defines the structure and renaming rules for the output CSV files.
* **`[sot_columns]`**: Specifies the exact column names expected in the final CSVs for both `cluster_intent_sot` and `cluster_data_sot`. These names should align with the data structure in the EPS system. If the columns are to be generated with names different from those in EPS, specify them explicitly in `rename_columns` below.
* **`[rename_columns]`**: Defines rules for renaming columns fetched from the EPS API to the desired names in the processed DataFrame before CSV generation (e.g., `name = cluster_name` fetches the cluster attribute `name` from EPS and writes the data under the CSV column `cluster_name`).
+ * Please [check here](../config/sot_csv_config.ini) for a sample config file.
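To make the two sections concrete, here is a small pandas sketch of how a rename followed by a column reorder could be applied. The mapping `name = cluster_name` comes from the example above; the other column names are invented and do not come from the actual config:

```python
import pandas as pd

# Illustrative values only; in the script these would come from
# [rename_columns] and [sot_columns] in sot_csv_config.ini.
rename_map = {"name": "cluster_name"}
sot_columns = ["cluster_name", "zone", "cluster_group"]  # invented names

df = pd.DataFrame([
    {"name": "edge-cluster-01", "zone": "us-east1-a", "cluster_group": "retail"},
])

# Rename EPS attribute names to the desired CSV headers...
df = df.rename(columns=rename_map)
# ...then keep only the configured columns, in the configured order.
df = df.reindex(columns=sot_columns)
df.to_csv("cluster_intent_sot.csv", index=False)
```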
### Main Conversion Script (`eps_to_csv_converter.py`)
This Python script orchestrates the entire process:
@@ -50,7 +53,7 @@ This Python script orchestrates the entire process:
* Makes a GET request to the EPS API endpoint (`/api/v1/clusters`) to retrieve cluster data in JSON format.
* **Processes Data**:
* Flattens the nested JSON response from the API into a Pandas DataFrame.
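One common way to do this flattening is `pandas.json_normalize`; the payload shape below is invented for illustration and is not the actual EPS response schema:

```python
import pandas as pd

# Invented payload shape; the real EPS response schema is not shown here.
payload = {
    "clusters": [
        {
            "name": "edge-cluster-01",
            "labels": {"environment": "prod"},
            "location": {"zone": "us-east1-a"},
        }
    ]
}

# Nested objects are expanded into dotted column names,
# e.g. "labels.environment" and "location.zone".
df = pd.json_normalize(payload["clusters"], sep=".")
print(df.columns.tolist())
```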