= Project Axon: Bank Branch Performance Analytics
:author: Yash Gulati
:revdate: 2025-06-20
:toc:
:toclevels: 2

== Introduction

*Project Axon* is a comprehensive end-to-end demonstration of Cloudera’s capabilities across the full data lifecycle, from data ingestion to dashboarding.

The goal of this project is to help partners:

- Understand how to use Cloudera Public Cloud for real-time and batch analytics in practice.
- Identify a relevant and easy-to-explain use case.
- Showcase a ready-to-deploy demo to customers after initial discovery conversations.

**Use Case Chosen:** *Bank Branch Performance Analytics*

This use case simulates and analyzes the operational performance of various bank branches using dummy data, providing visual insights via dashboards.

== Prerequisites

=== 1. Linux Server for Running the Dummy Data Generator App

Ensure you have access to a running **Linux server** for hosting the dummy data generator application.
A minimal cloud instance such as a **t3.small** (2 vCPUs, 2 GB RAM) is sufficient for running the application.

==== Required Open Ports
Make sure the following ports are open on the server's firewall or cloud security group (a sketch for opening them with `firewalld` follows this list):

- 8000
- 8085
- 5001
- 5003
- 5400
- 5500

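If your server uses `firewalld`, the commands below are a minimal sketch for opening these ports locally; adapt them to your distribution's firewall, and remember that on a cloud instance the same ports must also be allowed in the security group.

[source,shell]
----
# Open the data generator ports with firewalld (adjust to your environment)
for port in 8000 8085 5001 5003 5400 5500; do
  sudo firewall-cmd --permanent --add-port=${port}/tcp
done
sudo firewall-cmd --reload
----
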
=== 2. Cloudera Platform Requirements

Ensure you have a running **Cloudera Public Cloud environment** with the following components:

- Data Lake
- Cloudera Data Flow
- Cloudera Data Warehouse
- Cloudera Data Visualization

This project was developed and tested on the following component versions:

- **Data Lake**: 7.2.18
- **Cloudera Data Flow**: 2.10.0-h3-b3
- **Cloudera Data Warehouse**: 1.10.3-b8
- **Cloudera Data Visualization**: 7.2.9-b41

== Technology Stack

- **Data Generator**: Python (Flask + Faker)
- **Data Ingestion**: Cloudera Data Flow
- **Storage**: S3 (Parquet format)
- **Data Query Layer**: Cloudera Data Warehouse via Hue
- **Visualization**: Cloudera Data Visualization

== Project Workflow

image::../images/project_flow_cloud.png[project_flow]

== Steps to Run

=== 1. Clone the Dummy Data Generator Repository

Clone the repository containing the dummy data generators and check out the project branch:

[source,shell]
----
git clone https://github.com/cloudera/cloudera-partners.git
cd cloudera-partners
git checkout project-axon
cd Project-Axon/On-cloud
----

=== 2. Set Up Python Virtual Environment and Install Dependencies

[source,shell]
----
# Install Python 3 and Git (yum-based distributions)
sudo yum install -y python3 git
python3 -m ensurepip --upgrade

# Create and activate an isolated virtual environment
python3 -m venv venv
source venv/bin/activate

# Install the application dependencies
pip3 install -r ../assets/requirements.txt

# Verify Flask version
python3 -m flask --version
----

=== 3. Run the Application

[source,shell]
----
bash ../assets/run_all.sh
----

- After running the script, verify that the dummy data endpoints are active using a `curl` command.
- Replace `<your-server-ip>` with the public IP of the node where you ran the script.

Example:
[source,shell]
----
curl http://<your-server-ip>:5400/footfall/summary
curl http://<your-server-ip>:8000/campaign-details
----

Sample JSON response from the campaign API:
[source,json]
----
{
  "Budget": 351527.55,
  "CampaignID": 17,
  "CampaignName": "Mclean-Tran Loan Offer",
  "Channel": "Bank Website",
  "EndDate": "2025-07-21",
  "SeasonID": 3,
  "StartDate": "2025-07-14",
  "Status": "Active"
}
----

You should see a JSON response similar to the one above.

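As an additional check (a minimal sketch, not part of the original setup), you can confirm from the generator server itself that every service is listening on its expected port:

[source,shell]
----
# Run on the generator server: probe each dummy-data service port
for port in 8000 8085 5001 5003 5400 5500; do
  if curl -s -o /dev/null "http://localhost:${port}"; then
    echo "port ${port}: reachable"
  else
    echo "port ${port}: NOT reachable"
  fi
done
----
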
=== 4. Generate the CDP Workload Password for Your Profile

- Log in to the Cloudera Public Cloud Console using your credentials.
- Click your login name at the lower-left corner → *Profile*.
+
image::../images/profile_name.png[profile name]
+
- Click *Set Workload Password*.
- Enter `Changeme123!` (note the capital C) or your desired password in both fields and click *Set Workload Password*.
+
image::../images/set_workload.png[set_workload]
+
- A confirmation message will appear once your password is set successfully. **Remember this password, as it will be used in later steps.**

=== 5. Import the NiFi Flow into the Cloudera DataFlow Catalog

. Navigate to the **Cloudera DataFlow** service and open the **Catalog**.
+
image::../images/cloudera_data_flow.png[cloudera data flow, width=300, height=300]
+
. Click *Import Flow Definition*.
+
image::../images/import_catalog.png[import catalog]
+
. Enter a descriptive name for your flow (for example, `Project-Axon`) and choose the desired collection.
. Upload the `Project-Axon` flow file as the *NiFi Flow Configuration File*, then click *Import*.
+
image::../images/import_wizard.png[import wizard, width=400, height=500]
+
. Once the flow appears in the Catalog, click to open it, then select *Deploy* to create a NiFi flow deployment.
+
image::../images/deploy_flow.png[deploy flow, width=500, height=800]

==== Deployment Steps

. In the deployment wizard:
.. Select the target workspace (your Cloudera Public Cloud environment) and click *Continue*.
+
image::../images/deploy_target_env.png[deploy target env, width=450, height=600]
+
.. Provide a name for your deployment, choose the target project, and click *Next*.
.. Under *NiFi Configuration*, keep the default settings and click *Next*.
+
image::../images/nifi_configuration.png[NiFi Configuration, width=700, height=900]
+
.. In the *Parameters* section:
* Enter your **CDP Workload Username** and **CDP Workload User Password** for your tenant.
* In the `http url` parameter, update only the IP address portion with the *Public IP address* of the server running your dummy data generator app.
* Click *Next*.
+
image::../images/update_parameters.png[Update Parameters, width=600, height=900]
+
.. Under *Sizing and Scaling*, keep the default settings and click *Next*.
.. Leave *Key Performance Indicators (KPIs)* empty unless you wish to define them.
.. Review the configuration and click *Deploy*.
+
image::../images/review_wizard.png[review wizard, width=600, height=700]
+
. To open and view the deployed flow, go to *Actions* and select *View in NiFi*.
+
image::../images/view_in_nifi.png[view in nifi, width=500, height=900]
+
. After starting the flow, run it for no more than **5 minutes** to generate about **50–80 flow files**, then right-click the process group and select *Stop* to prevent it from running indefinitely.
+
image::../images/stop_flow.png[stop flow]

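Optionally, you can confirm that the flow actually landed data in object storage. The sketch below is not part of the original walkthrough: it assumes you have the AWS CLI configured, and the bucket and prefix are placeholders; the real location depends on how the flow's target path is parameterized in your environment.

[source,shell]
----
# List recently written files in the target S3 location (placeholder path)
aws s3 ls --recursive s3://<your-datalake-bucket>/<project-axon-prefix>/ | tail -n 20
----
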
=== 6. Create Hive Tables via Hue

Go to **Cloudera Data Warehouse** and, under *Virtual Warehouses*, click `Hue` on the Hive Virtual Warehouse for your environment.

To create all the required databases and tables at once:

- Open the https://github.com/cloudera/cloudera-partners/blob/project-axon/Project-Axon/create_queries.txt[create_queries.txt] file from the cloned folder.
- Copy the entire content.
- Paste it into the Hue Query Editor.
- Select all and click the **Run** button.
+
image::../images/hive_queries.png[hive_queries, width=800, height=500]

This will create all the necessary Hive tables and databases for the project in one go.

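If you prefer the command line to Hue, the same file can also be run with `beeline` against the Hive Virtual Warehouse. This is a sketch, not part of the original instructions: the JDBC URL below is a placeholder that must be replaced with the URL copied from the CDW UI, and it authenticates with the workload username and password from step 4. The same approach works for `verify_tables.txt` in the next step.

[source,shell]
----
# Run the DDL file directly against the Hive Virtual Warehouse
# (JDBC URL placeholder; file path is relative to the repository root)
beeline -u "<jdbc-url-copied-from-cdw-ui>" \
        -n <cdp-workload-username> \
        -p '<cdp-workload-password>' \
        -f Project-Axon/create_queries.txt
----
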
==== 6.1. Verify Table Creation & Data Load

To verify that all tables were successfully created and contain data:

- Copy the content of the `verify_tables.txt` file, which includes a Hive query to count rows across all expected tables.
- Paste it into the *Hue Query Editor*.
- Click *Run*.

You should see a list of table names with their row counts.

image::../images/table_verify.png[tables verify]

If any table shows a count of `0`, you may need to revisit the data ingestion step for that table.

=== 7. Connect Data Visualization to Impala

To enable Data Visualization to read data from Impala, you need to create a connection in the Data Visualization UI.

While Hive is supported, it is *recommended to use Impala* for this connection, as Impala is a high-performance, distributed SQL engine optimized for fast, interactive analytics on large-scale datasets.

- Go to *Cloudera Data Warehouse*, click *Data Visualization*, and then click your environment name.
+
image::../images/cloudera_data_warehouse.png[cloudera data warehouse, width=300, height=300]
- Click `Open Data Visualization` and navigate to the *Data* tab.
+
image::../images/cdw_dataviz.png[cdw data visualization]
+
- Click *+ New Connection* → *CDW Impala*.
+
image::../images/connection.png[make connection, width=500, height=300]
+
[width="90%",cols="40%,50%",options="header"]
|===
|**Parameter** |**Value**
|*Connection Name* |Impala-Axon (or any name you prefer)
|*Connection type* |CDW Impala
|*CDW Warehouse* |Select the name of your Impala Virtual Warehouse
|*Hostname* |Auto-populated when you select the CDW Warehouse
|*Port* |28000 (for Impala)
|*Credentials* |Leave empty
|===
+
- Click *Test Connection* to verify.
+
image::../images/connection_cdw_impala.png[verify connection, width=450, height=500]
+
- Once successful, click *Save*.
- You can now use this connection to create/import datasets and build/import dashboards from Impala tables.

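If you would like an independent check that the Impala Virtual Warehouse accepts your workload credentials, a hedged sketch with `impala-shell` is shown below. It is not part of the original walkthrough: it assumes `impala-shell` is installed on your workstation (for example via `pip install impala-shell`) and that you connect to the warehouse endpoint shown in the CDW UI over HTTPS; adjust the host, port, and protocol to match your warehouse's connection details.

[source,shell]
----
# Hypothetical connectivity check against the Impala Virtual Warehouse
# (host, port, and protocol come from your warehouse's connection info in CDW)
impala-shell --ssl --protocol='hs2-http' \
  -i "<impala-vw-hostname>:443" \
  -l -u <cdp-workload-username> \
  -q "SHOW DATABASES;"
----
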
=== 8. Import Dashboard into Cloudera Data Visualization

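The import in this step happens through your browser, so the dashboard JSON file needs to be on the machine you are browsing from. If your clone of the repository lives only on the remote Linux server, the sketch below is one way to fetch the file locally; it assumes GitHub's standard raw-content URL scheme for the repository path referenced in the steps.

[source,shell]
----
# Download the dashboard definition to your local machine (raw GitHub URL assumed)
curl -L -o project_axon_dashboard.json \
  https://raw.githubusercontent.com/cloudera/cloudera-partners/project-axon/Project-Axon/project_axon_dashboard.json
----
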
- Go to *Cloudera Data Visualization*.

- Navigate to the *Data* tab, then click *Import visual artifacts*.
+
image::../images/import_visual.png[Import Visual]
+
- Upload the dashboard JSON file: https://github.com/cloudera/cloudera-partners/blob/project-axon/Project-Axon/project_axon_dashboard.json[project_axon_dashboard.json].
- After uploading, click *Accept and Import*. You will see an *Import Successful* message along with the list of datasets that were imported as part of the dashboard.
+
image::../images/dashboard_import_verify.png[import success, width=800, height=450]
+
- Once imported, navigate to the *Visuals* tab and click the dashboard to open and view it.
+
image::../images/dashboard.png[dashboard]