diff --git a/LICENSE b/LICENSE
index da7e8f9fa1..367155fc74 100644
--- a/LICENSE
+++ b/LICENSE
@@ -1,6 +1,6 @@
-The MIT License (MIT)
+MIT License
 
-Copyright (c) 2015 EkStep
+Copyright (c) 2021 Project Sunbird
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
@@ -19,4 +19,3 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
-
diff --git a/README.md b/README.md
index d9860997d2..d111ea54e6 100644
--- a/README.md
+++ b/README.md
@@ -12,3 +12,148 @@ The platform contains of the following projects:
 * platform-modules - Functional/Pedagogy modules to support game based learning and API
 
+## learning-service local setup
+This readme file contains the instructions to set up and run the learning-service on a local machine.
+
+### System Requirements:
+
+### Prerequisites:
+* Java 11 is required for the docker container setups
+* Java 8 and Tomcat v9 are required for the local service setup via IntelliJ
+
+### Prepare folders for database data and logs
+
+```shell
+mkdir -p ~/sunbird-dbs/neo4j ~/sunbird-dbs/cassandra ~/sunbird-dbs/redis ~/sunbird-dbs/es ~/sunbird-dbs/kafka
+export sunbird_dbs_path=~/sunbird-dbs
+```
+
+### Elasticsearch database setup in docker:
+```shell
+docker run --name sunbird_es -d -p 9200:9200 -p 9300:9300 \
+-v $sunbird_dbs_path/es/data:/usr/share/elasticsearch/data \
+-v $sunbird_dbs_path/es/logs:/usr/share/elasticsearch/logs \
+-v $sunbird_dbs_path/es/backups:/opt/elasticsearch/backup \
+-e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:6.8.22
+```
+> --name - Name your container (avoids generic id)
+>
+> -p - Specify container ports to expose
+>
+> Using the -p option with ports 9200 and 9300 allows us to expose and listen for traffic on both the HTTP and transport ports. The HTTP port (9200) serves REST requests from clients, and the transport port (9300) is used for internal node-to-node communication.
+>
+> -d - This detaches the container to run in the background, meaning we can access the container separately and see into all of its processes.
+>
+> -v - The next several lines start with the -v option. These lines define volumes we want to bind in our local directory structure so we can access certain files locally.
+>
+> -e - Set config as environment variables for the Elasticsearch container
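+
+To quickly verify that Elasticsearch is up, query the cluster health endpoint (a minimal sanity check, assuming the port mapping above):
+```shell
+curl http://localhost:9200/_cluster/health?pretty
+```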
+
+### Neo4j database setup in docker:
+1. First, pull the neo4j image from docker hub using the following command.
+```shell
+docker pull neo4j:3.3.0
+```
+2. Create the neo4j instance and run it in a container using the below command.
+```shell
+docker run --name sunbird_neo4j -p7474:7474 -p7687:7687 -d \
+-v $sunbird_dbs_path/neo4j/data:/var/lib/neo4j/data \
+-v $sunbird_dbs_path/neo4j/logs:/var/lib/neo4j/logs \
+-v $sunbird_dbs_path/neo4j/plugins:/var/lib/neo4j/plugins \
+--env NEO4J_dbms_connector_https_advertised__address="localhost:7473" \
+--env NEO4J_dbms_connector_http_advertised__address="localhost:7474" \
+--env NEO4J_dbms_connector_bolt_advertised__address="localhost:7687" \
+--env NEO4J_AUTH=none \
+neo4j:3.3.0
+```
+> --name - Name your container (avoids generic id)
+>
+> -p - Specify container ports to expose
+>
+> Using the -p option with ports 7474 and 7687 allows us to expose and listen for traffic on both the HTTP and Bolt ports. Having the HTTP port means we can connect to our database with Neo4j Browser, and the Bolt port means efficient and type-safe communication between other layers and the database.
+>
+> -d - This detaches the container to run in the background, meaning we can access the container separately and see into all of its processes.
+>
+> -v - The next several lines start with the -v option. These lines define volumes we want to bind in our local directory structure so we can access certain files locally.
+>
+> --env - Set config as environment variables for the Neo4j database
+>
+> Using Docker on Windows needs a couple of additional configurations, because the default 0.0.0.0 address that is resolved with the above command does not translate to localhost on Windows. The --env flags above set the advertised addresses accordingly.
+>
+> By default, Neo4j requires authentication and asks us to first log in with neo4j/neo4j and set a new password. We skip this password reset by disabling authentication with --env NEO4J_AUTH=none when we create the Docker container.
+
+3. Load seed data to neo4j using the instructions provided in the [link](master-data/loading-seed-data.md#loading-seed-data-to-neo4j-database)
+
+4. Verify whether neo4j is running by accessing the neo4j browser (http://localhost:7474/browser).
+
+5. To SSH into the neo4j docker container, run the below command.
+```shell
+docker exec -it sunbird_neo4j bash
+```
+
+### Redis database setup in docker:
+1. Pull the redis image from docker hub using the below command.
+```shell
+docker pull redis:6.0.8
+```
+2. Create the redis instance and run it in a container using the below command.
+```shell
+docker run --name sunbird_redis -d -p 6379:6379 redis:6.0.8
+```
+3. To SSH into the redis docker container, run the below command. A quick connectivity check is shown after this step.
+```shell
+docker exec -it sunbird_redis bash
+```
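+
+To confirm Redis is responding, run a PING through redis-cli inside the container (a minimal sanity check, assuming the container name used above):
+```shell
+docker exec -it sunbird_redis redis-cli ping
+# expected reply: PONG
+```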
+
+### Cassandra database setup in docker:
+1. Pull the cassandra image using the below command.
+```shell
+docker pull cassandra:3.11.8
+```
+2. Create the cassandra instance and run it in a container using the below command.
+```shell
+docker run --name sunbird_cassandra -d -p 9042:9042 \
+-v $sunbird_dbs_path/cassandra/data:/var/lib/cassandra \
+-v $sunbird_dbs_path/cassandra/logs:/opt/cassandra/logs \
+-v $sunbird_dbs_path/cassandra/backups:/mnt/backups \
+--network bridge cassandra:3.11.8
+```
+For the network, we can use the existing bridge network, or create a dedicated network with the following command and use that instead.
+```shell
+docker network create sunbird_db_network
+```
+3. To start the cassandra cypher shell, run the below command.
+```shell
+docker exec -it sunbird_cassandra cqlsh
+```
+4. To SSH into the cassandra docker container, run the below command.
+```shell
+docker exec -it sunbird_cassandra /bin/bash
+```
+5. Load seed data to cassandra using the instructions provided in the [link](master-data/loading-seed-data.md#loading-seed-data-to-cassandra-database)
+
+
+### Steps to start learning-service in debug or development mode using IntelliJ:
+1. Navigate to the downloaded repository folder and run the below commands in a terminal.
+```shell
+export JAVA_HOME=<path to the JDK 1.8 folder>
+mvn clean install -DskipTests
+```
+2. Open the project in IntelliJ.
+3. Add the 'Smart Tomcat' plugin. (File -> Settings -> Plugins -> Search for 'Smart Tomcat' -> Install)
+4. Configure the tomcat server using the 'Smart Tomcat' plugin under 'Add configuration'.
+![img.png](img.png)
+5. Give the 'name' as 'learning-service'.
+6. Specify the Tomcat v9 server location in 'Tomcat Server'.
+7. Specify the absolute path of the 'webapp' folder (under ../sunbird-learning-platform/platform-modules/service/src/main/webapp).
+8. Mention '/learning-service' as the context path. Click on 'Apply'.
+9. Click on 'File' -> 'Project Structure'. Check if the Java version is set to jdk1.8 in both 'Project' and 'Modules'.
+![img_1.png](img_1.png)
+![img_2.png](img_2.png)
+10. Click on 'File' -> Settings -> Build, Execution, Deployment -> Compiler -> Scala Compiler -> Scala Compiler Server. Check if the Java version is set to jdk1.8.
+![img_3.png](img_3.png)
+11. The service configuration file is available at '../sunbird-learning-platform/platform-modules/service/src/main/resources/application.conf'. Update the config file to connect to the local databases.
+12. Start the Redis, neo4j, cassandra and Elasticsearch docker containers.
+13. Start the 'learning-service' configuration in IntelliJ. Verify the health of the learning service by connecting to 'http://localhost:8080/learning-service/health', as shown in the curl check below.
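+
+A minimal health check from the terminal (assuming the Smart Tomcat configuration above serves the context on port 8080):
+```shell
+curl http://localhost:8080/learning-service/health
+```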
diff --git a/ansible/artifacts-download.yml b/ansible/artifacts-download.yml
index bf675c8bd3..2fb5a11ffa 100644
--- a/ansible/artifacts-download.yml
+++ b/ansible/artifacts-download.yml
@@ -3,8 +3,53 @@
   become: yes
   vars_files:
     - "{{inventory_dir}}/secrets.yml"
-  environment:
-    AZURE_STORAGE_ACCOUNT: "{{sunbird_artifact_storage_account_name}}"
-    AZURE_STORAGE_SAS_TOKEN: "{{sunbird_artifact_storage_account_sas}}"
-  roles:
-    - artifacts-download-azure
\ No newline at end of file
+  tasks:
+  - name: download artifact from azure storage
+    include_role:
+      name: azure-cloud-storage
+      tasks_from: blob-download.yml
+    vars:
+      blob_container_name: "{{ cloud_storage_artifacts_bucketname }}"
+      blob_file_name: "{{ artifact }}"
+      local_file_or_folder_path: "{{ artifact_path }}"
+      storage_account_name: "{{ cloud_artifact_storage_accountname }}"
+      storage_account_key: "{{ cloud_artifact_storage_secret }}"
+    when: cloud_service_provider == "azure"
+
+  - name: download artifact from gcloud storage
+    include_role:
+      name: gcp-cloud-storage
+      tasks_from: download.yml
+    vars:
+      gcp_storage_service_account_name: "{{ cloud_artifact_storage_accountname }}"
+      gcp_storage_key_file: "{{ cloud_artifact_storage_secret }}"
+      gcp_bucket_name: "{{ cloud_storage_artifacts_bucketname }}"
+      gcp_path: "{{ artifact }}"
+      local_file_or_folder_path: "{{ artifact_path }}"
+    when: cloud_service_provider == "gcloud"
+
+  - name: download artifact from aws s3
+    include_role:
+      name: aws-cloud-storage
+      tasks_from: download.yml
+    vars:
+      local_file_or_folder_path: "{{ artifact_path }}"
+      s3_bucket_name: "{{ cloud_storage_artifacts_bucketname }}"
+      s3_path: "{{ artifact }}"
+      aws_default_region: "{{ cloud_public_storage_region }}"
+      aws_access_key_id: "{{ cloud_artifact_storage_accountname }}"
+      aws_secret_access_key: "{{ cloud_artifact_storage_secret }}"
+    when: cloud_service_provider == "aws"
+
+  - name: download artifact from oci oss storage
+    include_role:
+      name: oci-cloud-storage
+      apply:
+        environment:
+          OCI_CLI_AUTH: "instance_principal"
+      tasks_from: download.yml
+    vars:
+      oss_bucket_name: "{{ cloud_storage_artifacts_bucketname }}"
+      oss_object_name: "{{ artifact }}"
+      local_file_or_folder_path: "{{ artifact_path }}"
+    when: cloud_service_provider == "oci"
\ No newline at end of file
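For illustration, a hypothetical invocation of the refactored download playbook; the `artifact`, `artifact_path`, and `cloud_service_provider` variables referenced by the tasks above are assumed to be supplied as extra vars:

```shell
ansible-playbook ansible/artifacts-download.yml -i <inventory> \
  -e artifact=<blob-or-object-name> \
  -e artifact_path=<local-destination-path> \
  -e cloud_service_provider=azure
```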
diff --git a/ansible/artifacts-upload.yml b/ansible/artifacts-upload.yml
index 4b651f6dd0..cd296004e4 100644
--- a/ansible/artifacts-upload.yml
+++ b/ansible/artifacts-upload.yml
@@ -3,8 +3,55 @@
   become: yes
   vars_files:
     - "{{inventory_dir}}/secrets.yml"
-  environment:
-    AZURE_STORAGE_ACCOUNT: "{{sunbird_artifact_storage_account_name}}"
-    AZURE_STORAGE_SAS_TOKEN: "{{sunbird_artifact_storage_account_sas}}"
-  roles:
-    - artifacts-upload-azure
\ No newline at end of file
+  tasks:
+  - name: upload artifact to azure storage
+    include_role:
+      name: azure-cloud-storage
+      tasks_from: blob-upload.yml
+    vars:
+      blob_container_name: "{{ cloud_storage_artifacts_bucketname }}"
+      container_public_access: "off"
+      blob_file_name: "{{ artifact }}"
+      local_file_or_folder_path: "{{ artifact_path }}"
+      storage_account_name: "{{ cloud_artifact_storage_accountname }}"
+      storage_account_key: "{{ cloud_artifact_storage_secret }}"
+    when: cloud_service_provider == "azure"
+
+  - name: upload artifact to gcloud storage
+    include_role:
+      name: gcp-cloud-storage
+      tasks_from: upload.yml
+    vars:
+      gcp_storage_service_account_name: "{{ cloud_artifact_storage_accountname }}"
+      gcp_storage_key_file: "{{ cloud_artifact_storage_secret }}"
+      gcp_bucket_name: "{{ cloud_storage_artifacts_bucketname }}"
+      gcp_path: "{{ artifact }}"
+      local_file_or_folder_path: "{{ artifact_path }}"
+    when: cloud_service_provider == "gcloud"
+
+  - name: upload artifact to aws s3
+    include_role:
+      name: aws-cloud-storage
+      tasks_from: upload.yml
+    vars:
+      local_file_or_folder_path: "{{ artifact_path }}"
+      s3_bucket_name: "{{ cloud_storage_artifacts_bucketname }}"
+      s3_path: "{{ artifact }}"
+      aws_default_region: "{{ cloud_public_storage_region }}"
+      aws_access_key_id: "{{ cloud_artifact_storage_accountname }}"
+      aws_secret_access_key: "{{ cloud_artifact_storage_secret }}"
+    when: cloud_service_provider == "aws"
+
+  - name: upload artifact to oci oss
+    include_role:
+      name: oci-cloud-storage
+      apply:
+        environment:
+          OCI_CLI_AUTH: "instance_principal"
+      tasks_from: upload.yml
+    vars:
+      local_file_or_folder_path: "{{ artifact_path }}"
+      oss_bucket_name: "{{ cloud_storage_artifacts_bucketname }}"
+      oss_path: "{{ artifact }}"
+    when: cloud_service_provider == "oci"
\ No newline at end of file
diff --git a/ansible/cassandra-backup.yml b/ansible/cassandra-backup.yml
index 204653b815..c9e4d37d22 100644
--- a/ansible/cassandra-backup.yml
+++ b/ansible/cassandra-backup.yml
@@ -2,8 +2,5 @@
   become: yes
   vars_files:
     - ['{{inventory_dir}}/secrets.yml']
-  environment:
-    AZURE_STORAGE_ACCOUNT: "{{sunbird_management_storage_account_name}}"
-    AZURE_STORAGE_KEY: "{{sunbird_management_storage_account_key}}"
   roles:
     - cassandra-backup
diff --git a/ansible/cassandra-restore.yml b/ansible/cassandra-restore.yml
index 2504153192..ad2ccfd44e 100644
--- a/ansible/cassandra-restore.yml
+++ b/ansible/cassandra-restore.yml
@@ -3,8 +3,5 @@
   gather_facts: no
   vars_files:
     - ['{{inventory_dir}}/secrets.yml']
-  environment:
-    AZURE_STORAGE_ACCOUNT: "{{sunbird_management_storage_account_name}}"
-    AZURE_STORAGE_KEY: "{{sunbird_management_storage_account_key}}"
   roles:
     - cassandra-restore
diff --git a/ansible/es_backup.yml b/ansible/es_backup.yml
index 44e6c3d0b6..8134759173 100644
--- a/ansible/es_backup.yml
+++ b/ansible/es_backup.yml
@@ -4,19 +4,29 @@
   vars_files:
     - ['{{inventory_dir}}/secrets.yml']
   tasks:
-    - name: Create a container for storing backups
-      command: az storage container create --name elasticsearch-snapshots --public-access blob
-      environment:
-        AZURE_STORAGE_ACCOUNT: "{{sunbird_management_storage_account_name}}"
-        AZURE_STORAGE_KEY: "{{sunbird_management_storage_account_key}}"
-
+    - name: Ensure azure blob storage container exists
+      include_role:
+        name: azure-cloud-storage
+        tasks_from: container-create.yml
+      vars:
+        blob_container_name: "elasticsearch-snapshots"
+        container_public_access: "off"
+        storage_account_name: "{{ cloud_management_storage_accountname }}"
+        storage_account_key: "{{ cloud_management_storage_secret }}"
+      when: cloud_service_provider == "azure"
 
 - hosts: composite-search-cluster
   become: yes
   vars_files:
     - ['{{inventory_dir}}/secrets.yml']
   roles:
-    - es-azure-snapshot
+    - role: es-azure-snapshot
+      when: cloud_service_provider == "azure"
+    - role: es-s3-snapshot
+      when: cloud_service_provider == "aws"
+    - role: es-gcs-snapshot
+      when: cloud_service_provider == "gcloud"
+
     - role: es5-snapshot-purge
       tags:
         - es_backup
   run_once: true
diff --git a/ansible/inventory/env/group_vars/all.yml b/ansible/inventory/env/group_vars/all.yml
index 42fb4d1385..781983feef 100644
--- a/ansible/inventory/env/group_vars/all.yml
+++ b/ansible/inventory/env/group_vars/all.yml
@@ -85,8 +85,8 @@ search_es7_host: "{{ groups['es7']|join(':9200,')}}:9200"
 mlworkbench: "{{ groups['mlworkbench'][0]}}"
-azure_account: "{{ sunbird_public_storage_account_name }}" -azure_secret: "{{ sunbird_public_storage_account_key }}" +azure_account: "{{ cloud_public_storage_accountname }}" +azure_secret: "{{ cloud_public_storage_secret }}" dedup_redis_host: "{{ dp_redis_host }}" kp_redis_host: "{{ groups['redisall'][0] }}" neo4j_route_path: "bolt://{{ groups['learning-neo4j-node1'][0] }}:7687" @@ -97,6 +97,7 @@ middleware_hierarchy_keyspace: "{{ instance }}_hierarchy_store" kp_print_service_base_url: "http://{{private_ingressgateway_ip}}/print" cert_reg_service_base_url: "http://{{private_ingressgateway_ip}}/certreg" kp_search_service_base_url: "http://{{private_ingressgateway_ip}}/search" +kp_dial_service_base_url: "http://{{private_ingressgateway_ip}}/dial" lms_service_base_url: "http://{{private_ingressgateway_ip}}/lms" learner_service_base_url: "http://{{private_ingressgateway_ip}}/learner" sourcing_content_service_base_url: "http://{{dock_private_ingressgateway_ip | default('localhost') }}/content" @@ -116,3 +117,13 @@ cert_azure_storage_key: "{{sunbird_private_storage_account_name}}" default_channel: "org.sunbird" download_neo4j: true neo4j_upstream_download: false +enable_suppress_exception: false +enable_rc_certificate: true + +# SB-31155 +plugin_storage: "{{ plugin_container_name }}" + +cloud_storage_endpoint: "{{ cloud_public_storage_endpoint }}" +cloudstorage_relative_path_prefix_content: "CONTENT_STORAGE_BASE_PATH" +cloudstorage_relative_path_prefix_dial: "DIAL_STORAGE_BASE_PATH" +cloudstorage_metadata_list: '["appIcon", "artifactUrl", "posterImage", "previewUrl", "thumbnail", "assetsMap", "certTemplate", "itemSetPreviewUrl", "grayScaleAppIcon", "sourceURL", "variants", "downloadUrl", "streamingUrl", "toc_url", "data", "question", "solutions", "editorState", "media", "pdfUrl", "transcripts"]' \ No newline at end of file diff --git a/ansible/lp_neo4j-backup.yml b/ansible/lp_neo4j-backup.yml index 33a6d7bded..35e3c63c1b 100644 --- a/ansible/lp_neo4j-backup.yml +++ b/ansible/lp_neo4j-backup.yml @@ -1,9 +1,6 @@ - hosts: learning-neo4j-node1 #if it is a cluster learning-neo4j-node1 should be always master node vars_files: - "{{inventory_dir}}/secrets.yml" - environment: - AZURE_STORAGE_ACCOUNT: "{{sunbird_management_storage_account_name}}" - AZURE_STORAGE_KEY: "{{sunbird_management_storage_account_key}}" become: yes become_user: "{{ learner_user }}" roles: diff --git a/ansible/lp_samza_deploy.yml b/ansible/lp_samza_deploy.yml deleted file mode 100644 index f66377a98d..0000000000 --- a/ansible/lp_samza_deploy.yml +++ /dev/null @@ -1,37 +0,0 @@ ---- -- name: "Start Nodemanager on Slaves" - hosts: "yarn-slave" - vars: - hadoop_version: 2.7.2 - vars_files: - - "{{inventory_dir}}/secrets.yml" - become: yes - tasks: - - name: Ensure yarn nodemanager is running - become_user: hduser - shell: | - (ps aux | grep yarn-hduser-nodemanager | grep -v grep) || /usr/local/hadoop/sbin/yarn-daemon.sh --config /usr/local/hadoop-{{hadoop_version}}/conf/ start nodemanager || sleep 10 - - - name: Install mysql client - apt: name=mysql-client state=present - - - name: install imagemagick - apt: name=imagemagick state=present update_cache=yes - -- name: "Copy Samza jobs additional configuration to slaves" - become: yes - hosts: "yarn-slave" - vars_files: - - "{{inventory_dir}}/secrets.yml" - roles: - - samza-jobs-additional-config - -- name: "Deploy Samza jobs" - hosts: "yarn-master" - become: yes - vars_files: - - "{{inventory_dir}}/secrets.yml" - vars: - deploy_jobs: true - roles: - - samza-jobs diff --git 
a/ansible/lp_samza_telemetry_schemas.yml b/ansible/lp_samza_telemetry_schemas.yml deleted file mode 100644 index c3b56eae4d..0000000000 --- a/ansible/lp_samza_telemetry_schemas.yml +++ /dev/null @@ -1,28 +0,0 @@ ---- -- name: "Copy validation schema files to Yarn Slaves" - hosts: "yarn-slave" - vars_files: - - "{{inventory_dir}}/secrets.yml" - become: yes - tasks: - - name: cloning the telemetry schema repo - git: - repo: "{{schema_repo_url}}" - dest: "{{telemetry_schema_directory}}" - version: "{{version}}" - force: yes - - name: Create schema directory - file: path={{telemetry_schema_directory}} owner=hduser group=hadoop recurse=yes state=directory - - - name: get schema dir names - raw: find {{telemetry_schema_path}} -type f -name "*.*" - register: schemas - - - name: change internal schema file reference - replace: - dest: "{{item}}" - regexp: "http://localhost:7070/schemas/" - replace: "file://{{telemetry_schema_path}}/" - owner: hduser - group: hadoop - with_items: "{{ schemas.stdout_lines }}" diff --git a/ansible/lp_yarn_provision.yml b/ansible/lp_yarn_provision.yml deleted file mode 100644 index b7866e0fa3..0000000000 --- a/ansible/lp_yarn_provision.yml +++ /dev/null @@ -1,47 +0,0 @@ ---- -- hosts: yarn - become: yes - vars_files: - - "{{inventory_dir}}/secrets.yml" - tasks: - - name: Create group - group: - name: hadoop - state: present - - name: Create user - user: - name: hduser - comment: "hduser" - group: hadoop - groups: sudo - shell: /bin/bash - -- name: Install samza job server - hosts: "yarn-master" - vars_files: - - "{{inventory_dir}}/secrets.yml" - become: yes - roles: - - jdk-1.8.0_121 - - yarn - - samza-job-server - -# # As of ansible2.5 delegation and inlcude tasks file won't work -# # with looping. So this is a possible workaround -# - name: Running common tasks in yarn slaves -# vars_files: -# - ./roles/yarn/defaults/main.yml -# hosts: "lp-yarn-slave" -# become: yes -# tasks: -# - include: ./roles/yarn/tasks/common.yml -# - include: ./roles/yarn/templates - -- name: Install java on all yarn slaves - hosts: "yarn-slave" - vars_files: - - "{{inventory_dir}}/secrets.yml" - become: yes - remote_user: hduser - roles: - - jdk-1.8.0_121 diff --git a/ansible/redis-backup.yml b/ansible/redis-backup.yml index 9d3a8df1aa..114cec8627 100644 --- a/ansible/redis-backup.yml +++ b/ansible/redis-backup.yml @@ -3,9 +3,6 @@ gather_facts: true vars_files: - ['{{inventory_dir}}/secrets.yml'] - environment: - AZURE_STORAGE_ACCOUNT: "{{sunbird_management_storage_account_name}}" - AZURE_STORAGE_KEY: "{{sunbird_management_storage_account_key}}" roles: - redis-backup run_once: true diff --git a/ansible/roles/aws-cli/defaults/main.yml b/ansible/roles/aws-cli/defaults/main.yml new file mode 100644 index 0000000000..53d866eafa --- /dev/null +++ b/ansible/roles/aws-cli/defaults/main.yml @@ -0,0 +1 @@ +aws_cli_url: https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip \ No newline at end of file diff --git a/ansible/roles/aws-cli/tasks/main.yml b/ansible/roles/aws-cli/tasks/main.yml new file mode 100644 index 0000000000..5907fb1aaf --- /dev/null +++ b/ansible/roles/aws-cli/tasks/main.yml @@ -0,0 +1,24 @@ +--- +- name: Download the installation file + get_url: + url: "{{ aws_cli_url }}" + dest: /tmp/awscliv2.zip + +- name: Installing unzip + apt: + name: "{{item}}" + state: latest + with_items: + - zip + - unzip + +- name: Unzip the installer + unarchive: + src: /tmp/awscliv2.zip + dest: /tmp/ + remote_src: yes + +- name: install aws cli + shell: ./aws/install + args: + chdir: /tmp/ diff --git 
a/ansible/roles/aws-cloud-storage/defaults/main.yml b/ansible/roles/aws-cloud-storage/defaults/main.yml
new file mode 100644
index 0000000000..6f3f6f86d6
--- /dev/null
+++ b/ansible/roles/aws-cloud-storage/defaults/main.yml
@@ -0,0 +1,3 @@
+s3_bucket_name: ""
+s3_path: ""
+local_file_or_folder_path: ""
diff --git a/ansible/roles/aws-cloud-storage/tasks/delete-folder.yml b/ansible/roles/aws-cloud-storage/tasks/delete-folder.yml
new file mode 100644
index 0000000000..c912b14edb
--- /dev/null
+++ b/ansible/roles/aws-cloud-storage/tasks/delete-folder.yml
@@ -0,0 +1,9 @@
+---
+- name: delete files and folders recursively
+  environment:
+    AWS_DEFAULT_REGION: "{{ aws_default_region }}"
+    AWS_ACCESS_KEY_ID: "{{ aws_access_key_id }}"
+    AWS_SECRET_ACCESS_KEY: "{{ aws_secret_access_key }}"
+  shell: "aws s3 rm s3://{{ s3_bucket_name }}/{{ s3_path }} --recursive"
+  async: 3600
+  poll: 10
diff --git a/ansible/roles/aws-cloud-storage/tasks/delete.yml b/ansible/roles/aws-cloud-storage/tasks/delete.yml
new file mode 100644
index 0000000000..414ea52e6b
--- /dev/null
+++ b/ansible/roles/aws-cloud-storage/tasks/delete.yml
@@ -0,0 +1,9 @@
+---
+- name: delete files from s3
+  environment:
+    AWS_DEFAULT_REGION: "{{ aws_default_region }}"
+    AWS_ACCESS_KEY_ID: "{{ aws_access_key_id }}"
+    AWS_SECRET_ACCESS_KEY: "{{ aws_secret_access_key }}"
+  shell: "aws s3 rm s3://{{ s3_bucket_name }}/{{ s3_path }}"
+  async: 3600
+  poll: 10
diff --git a/ansible/roles/aws-cloud-storage/tasks/download.yml b/ansible/roles/aws-cloud-storage/tasks/download.yml
new file mode 100644
index 0000000000..138024af78
--- /dev/null
+++ b/ansible/roles/aws-cloud-storage/tasks/download.yml
@@ -0,0 +1,9 @@
+---
+- name: download files from s3
+  environment:
+    AWS_DEFAULT_REGION: "{{ aws_default_region }}"
+    AWS_ACCESS_KEY_ID: "{{ aws_access_key_id }}"
+    AWS_SECRET_ACCESS_KEY: "{{ aws_secret_access_key }}"
+  shell: "aws s3 cp s3://{{ s3_bucket_name }}/{{ s3_path }} {{ local_file_or_folder_path }}"
+  async: 3600
+  poll: 10
diff --git a/ansible/roles/aws-cloud-storage/tasks/main.yml b/ansible/roles/aws-cloud-storage/tasks/main.yml
new file mode 100644
index 0000000000..62f204a9d2
--- /dev/null
+++ b/ansible/roles/aws-cloud-storage/tasks/main.yml
@@ -0,0 +1,18 @@
+---
+- name: delete files from aws S3 bucket
+  include: delete.yml
+
+- name: delete folders from aws S3 bucket recursively
+  include: delete-folder.yml
+
+- name: download file from S3
+  include: download.yml
+
+- name: upload files from local to aws S3
+  include: upload.yml
+
+- name: upload files and folders from a local directory to aws S3
+  include: upload-folder.yml
diff --git a/ansible/roles/aws-cloud-storage/tasks/upload-folder.yml b/ansible/roles/aws-cloud-storage/tasks/upload-folder.yml
new file mode 100644
index 0000000000..3e03b068b7
--- /dev/null
+++ b/ansible/roles/aws-cloud-storage/tasks/upload-folder.yml
@@ -0,0 +1,9 @@
+---
+- name: upload folder to s3
+  environment:
+    AWS_DEFAULT_REGION: "{{ aws_default_region }}"
+    AWS_ACCESS_KEY_ID: "{{ aws_access_key_id }}"
+    AWS_SECRET_ACCESS_KEY: "{{ aws_secret_access_key }}"
+  shell: "aws s3 cp {{ local_file_or_folder_path }} s3://{{ s3_bucket_name }}/{{ s3_path }} --recursive"
+  async: 3600
+  poll: 10
diff --git a/ansible/roles/aws-cloud-storage/tasks/upload.yml b/ansible/roles/aws-cloud-storage/tasks/upload.yml
new file mode 100644
index 0000000000..af8de990e2
--- /dev/null
+++ b/ansible/roles/aws-cloud-storage/tasks/upload.yml
@@ -0,0 +1,9 @@
+---
+- name: upload files to s3
+  environment:
+    AWS_DEFAULT_REGION: "{{ aws_default_region }}"
+    AWS_ACCESS_KEY_ID: "{{ aws_access_key_id }}"
+    AWS_SECRET_ACCESS_KEY: "{{ aws_secret_access_key }}"
+  shell: "aws s3 cp {{ local_file_or_folder_path }} s3://{{ s3_bucket_name }}/{{ s3_path }}"
+  async: 3600
+  poll: 10
diff --git a/ansible/roles/azure-cli/tasks/main.yml b/ansible/roles/azure-cli/tasks/main.yml
index 4bc0c13b3b..abcad5025b 100644
--- a/ansible/roles/azure-cli/tasks/main.yml
+++ b/ansible/roles/azure-cli/tasks/main.yml
@@ -1,24 +1,36 @@
-- name: Import Azure signing key
-  #become: yes
-  shell: curl -L https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
+---
+- name: Add Microsoft signing key
+  become: yes
+  become_user: root
+  apt_key:
+    url: https://packages.microsoft.com/keys/microsoft.asc
+    state: present
 
-- name: Add Azure apt repository
-  apt_repository: repo='deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ {{ ansible_distribution_release }} main' state=present
+- name: Add Microsoft repository into sources list
+  become: yes
+  become_user: root
+  apt_repository:
+    repo: "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ {{ ansible_distribution_release | lower }} main"
+    state: present
 
-- name: Add distribution release security apt repository
-  apt_repository: repo='deb http://security.ubuntu.com/ubuntu bionic-security main' state=present
-
-- name: install azure cli dependency
-  apt: name={{ item }} state=present update_cache=yes
-  #allow_unauthenticated: yes
-  with_items:
-    - libssl1.0-dev
-  when: ansible_distribution_release == "focal"
-
-- name: ensure azure-cli and apt-transport-https is installed
-  apt: name={{ item }} state=present update_cache=yes
-  #allow_unauthenticated: yes
-  with_items:
-    - apt-transport-https
-    - azure-cli
+- name: Install azure cli and dependent packages
+  become: yes
+  become_user: root
+  apt:
+    pkg:
+      - ca-certificates
+      - curl
+      - apt-transport-https
+      - lsb-release
+      - gnupg
+      - "azure-cli=2.33.1-1~{{ ansible_distribution_release | lower }}"
+
+- name: Install azcopy
+  become: yes
+  become_user: root
+  shell: |
+    which azcopy || ( \
+    mkdir /tmp/azcopy && cd /tmp/azcopy && \
+    wget -O azcopy_v10.tar.gz https://aka.ms/downloadazcopy-v10-linux && tar -xf azcopy_v10.tar.gz --strip-components=1 \
+    && mv azcopy /usr/local/bin \
+    && rm -rf /tmp/azcopy )
diff --git a/ansible/roles/azure-cloud-storage/defaults/main.yml b/ansible/roles/azure-cloud-storage/defaults/main.yml
new file mode 100644
index 0000000000..ceef7bd46e
--- /dev/null
+++ b/ansible/roles/azure-cloud-storage/defaults/main.yml
@@ -0,0 +1,72 @@
+# The name of the blob container in the azure storage account
+# Example -
+# blob_container_name: "my-container"
+blob_container_name: ""
+
+# The delete pattern to delete files and folders
+# Example -
+# blob_delete_pattern: "my-directory/*"
+# blob_delete_pattern: "my-directory/another-directory/*"
+# blob_delete_pattern: "*"
+blob_delete_pattern: ""
+
+# The storage account name
+# Example -
+# storage_account_name: "sunbird-dev-public"
+storage_account_name: ""
+
+# The storage account key
+# Example -
+# storage_account_key: "cmFuZG9tcmFuZG9tcmFuZG9tcmFuZG9tCg=="
+storage_account_key: ""
+
+# The path to the local file which has to be uploaded to azure storage
+# The local path to store the file after downloading from azure storage
+# Example -
+# local_file_or_folder_path: "/workspace/my-folder/myfile.json"
+# local_file_or_folder_path: "/workspace/my-folder"
+local_file_or_folder_path: ""
+# The name of the file in azure storage after uploading from local
+# The name of the file in azure storage that has to be downloaded
+# Example -
+# blob_file_name: "myfile-blob.json"
+# You can also pass a folder path in order to upload / download the file from a specific folder
+# blob_file_name: "my-folder/my-file.json"
+blob_file_name: ""
+
+# The storage account sas token
+# Example -
+# storage_account_sas_token: "?sv=2022-01-01&ss=abc&srt=rws%3D"
+storage_account_sas_token: ""
+
+# The folder path in azure storage to upload the files, starting from the root of the container
+# This path should always start with a slash / as we are going to append this value as shown in the below example
+# Example -
+# blob_container_name: "my-container"
+# blob_container_folder_path: "/my-folder-path"
+# {{ blob_container_name }}{{ blob_container_folder_path }}
+# The above translates to "my-container/my-folder-path"
+
+# The variable can also be empty as shown below, which means we will upload directly at the root path of the container
+# Example -
+# blob_container_name: "my-container"
+# blob_container_folder_path: ""
+# The above translates to "my-container"
+blob_container_folder_path: ""
+
+# The access level at which the container should be created
+# Example -
+# container_public_access: "off"
+# container_public_access: "blob"
+# container_public_access: "container"
+# Allowed values are - off, blob, container
+# This variable affects only new containers and has no effect on a container that already exists
+# If the container already exists, the access level will not be changed
+# You will need to change the access level from the Azure portal or using the az storage container set-permission command
+container_public_access: ""
+
+# Creates the container by default before running the specific azure blob tasks
+# If you would like to skip container creation (in case of a looped execution),
+# you can set this value to False in order to skip the container creation task for every iteration
+create_container: True
diff --git a/ansible/roles/azure-cloud-storage/tasks/blob-delete-batch.yml b/ansible/roles/azure-cloud-storage/tasks/blob-delete-batch.yml
new file mode 100644
index 0000000000..4e8ad68a2d
--- /dev/null
+++ b/ansible/roles/azure-cloud-storage/tasks/blob-delete-batch.yml
@@ -0,0 +1,5 @@
+---
+- name: delete files and folders from a blob container recursively
+  shell: "az storage blob delete-batch --source {{ blob_container_name }} --pattern '{{ blob_delete_pattern }}' --account-name {{ storage_account_name }} --account-key {{ storage_account_key }}"
+  async: 3600
+  poll: 10
\ No newline at end of file
diff --git a/ansible/roles/azure-cloud-storage/tasks/blob-download.yml b/ansible/roles/azure-cloud-storage/tasks/blob-download.yml
new file mode 100644
index 0000000000..3bbf4b607a
--- /dev/null
+++ b/ansible/roles/azure-cloud-storage/tasks/blob-download.yml
@@ -0,0 +1,5 @@
+---
+- name: download a file from azure storage
+  shell: "az storage blob download --container-name {{ blob_container_name }} --file {{ local_file_or_folder_path }} --name {{ blob_file_name }} --account-name {{ storage_account_name }} --account-key {{ storage_account_key }}"
+  async: 3600
+  poll: 10
\ No newline at end of file
diff --git a/ansible/roles/azure-cloud-storage/tasks/blob-upload-batch.yml b/ansible/roles/azure-cloud-storage/tasks/blob-upload-batch.yml
new file mode 100644
index 0000000000..3043da46cc
--- /dev/null
+++ b/ansible/roles/azure-cloud-storage/tasks/blob-upload-batch.yml
@@ -0,0 +1,10 @@
+---
+- name: create container in azure storage if it doesn't exist
+  include_role:
+    name: 
azure-cloud-storage + tasks_from: container-create.yml + +- name: upload files and folders from a local directory to azure storage container + shell: "az storage blob upload-batch --destination {{ blob_container_name }}{{ blob_container_folder_path }} --source {{ local_file_or_folder_path }} --account-name {{ storage_account_name }} --account-key {{ storage_account_key }}" + async: 3600 + poll: 10 \ No newline at end of file diff --git a/ansible/roles/azure-cloud-storage/tasks/blob-upload.yml b/ansible/roles/azure-cloud-storage/tasks/blob-upload.yml new file mode 100644 index 0000000000..4b493ffb73 --- /dev/null +++ b/ansible/roles/azure-cloud-storage/tasks/blob-upload.yml @@ -0,0 +1,10 @@ +--- +- name: create container in azure storage if it doesn't exist + include_role: + name: azure-cloud-storage + tasks_from: container-create.yml + +- name: upload file to azure storage container + shell: "az storage blob upload --container-name {{ blob_container_name }} --file {{ local_file_or_folder_path }} --name {{ blob_file_name }} --account-name {{ storage_account_name }} --account-key {{ storage_account_key }}" + async: 3600 + poll: 10 \ No newline at end of file diff --git a/ansible/roles/azure-cloud-storage/tasks/container-create.yml b/ansible/roles/azure-cloud-storage/tasks/container-create.yml new file mode 100644 index 0000000000..419510cc19 --- /dev/null +++ b/ansible/roles/azure-cloud-storage/tasks/container-create.yml @@ -0,0 +1,8 @@ +--- +- name: create container in azure storage if it doesn't exist + shell: "az storage container create --name {{ blob_container_name }} --public-access {{ container_public_access }} --account-name {{ storage_account_name }} --account-key {{ storage_account_key }}" + when: storage_account_key | length > 0 + +- name: create container in azure storage if it doesn't exist + shell: "az storage container create --name {{ blob_container_name }} --public-access {{ container_public_access }} --account-name {{ storage_account_name }} --sas-token '{{ storage_account_sas_token }}'" + when: storage_account_sas_token | length > 0 \ No newline at end of file diff --git a/ansible/roles/azure-cloud-storage/tasks/delete-using-azcopy.yml b/ansible/roles/azure-cloud-storage/tasks/delete-using-azcopy.yml new file mode 100644 index 0000000000..196de9c9b3 --- /dev/null +++ b/ansible/roles/azure-cloud-storage/tasks/delete-using-azcopy.yml @@ -0,0 +1,17 @@ +--- +- name: generate SAS token for azcopy + shell: | + sas_expiry=`date -u -d "1 hour" '+%Y-%m-%dT%H:%MZ'` + sas_token=?`az storage container generate-sas -n {{ blob_container_name }} --account-name {{ storage_account_name }} --account-key {{ storage_account_key }} --https-only --permissions dlrw --expiry $sas_expiry -o tsv` + echo $sas_token + register: sas_token + +- set_fact: + container_sas_token: "{{ sas_token.stdout}}" + +- name: delete files and folders from azure storage using azcopy + shell: "azcopy rm 'https://{{ storage_account_name }}.blob.core.windows.net/{{ blob_container_name }}{{ blob_container_folder_path }}{{ container_sas_token }}' --recursive" + environment: + AZCOPY_CONCURRENT_FILES: "10" + async: 10800 + poll: 10 diff --git a/ansible/roles/azure-cloud-storage/tasks/main.yml b/ansible/roles/azure-cloud-storage/tasks/main.yml new file mode 100644 index 0000000000..eb435ecfe2 --- /dev/null +++ b/ansible/roles/azure-cloud-storage/tasks/main.yml @@ -0,0 +1,21 @@ +--- +- name: delete files and folders from azure storage container recursively + include: blob-delete-batch.yml + +- name: download a file from azure 
storage + include: blob-download.yml + +- name: upload files and folders from a local directory to azure storage container + include: blob-upload-batch.yml + +- name: upload file to azure storage container + include: blob-upload.yml + +- name: create container in azure storage if it doesn't exist + include: container-create.yml + +- name: delete files and folders from azure storage using azcopy + include: delete-using-azcopy.yml + +- name: upload files and folders to azure storage using azcopy + include: upload-using-azcopy.yml diff --git a/ansible/roles/azure-cloud-storage/tasks/upload-using-azcopy.yml b/ansible/roles/azure-cloud-storage/tasks/upload-using-azcopy.yml new file mode 100644 index 0000000000..95da584c9b --- /dev/null +++ b/ansible/roles/azure-cloud-storage/tasks/upload-using-azcopy.yml @@ -0,0 +1,23 @@ +--- +- name: generate SAS token for azcopy + shell: | + sas_expiry=`date -u -d "1 hour" '+%Y-%m-%dT%H:%MZ'` + sas_token=?`az storage container generate-sas -n {{ blob_container_name }} --account-name {{ storage_account_name }} --account-key {{ storage_account_key }} --https-only --permissions dlrw --expiry $sas_expiry -o tsv` + echo $sas_token + register: sas_token + +- set_fact: + container_sas_token: "{{ sas_token.stdout}}" + +- name: create container in azure storage if it doesn't exist + include_role: + name: azure-cloud-storage + tasks_from: container-create.yml + when: create_container == True + +- name: upload files and folders to azure storage using azcopy + shell: "azcopy copy {{ local_file_or_folder_path }} 'https://{{ storage_account_name }}.blob.core.windows.net/{{ blob_container_name }}{{ blob_container_folder_path }}{{ container_sas_token }}' --recursive" + environment: + AZCOPY_CONCURRENT_FILES: "10" + async: 10800 + poll: 10 diff --git a/ansible/roles/cassandra-backup/defaults/main.yml b/ansible/roles/cassandra-backup/defaults/main.yml index f523c62620..ac9d48a629 100644 --- a/ansible/roles/cassandra-backup/defaults/main.yml +++ b/ansible/roles/cassandra-backup/defaults/main.yml @@ -1,4 +1,7 @@ cassandra_root_dir: /etc/cassandra cassandra_backup_dir: /data/cassandra/backup -cassandra_backup_azure_container_name: lp-cassandra-backup +data_dir: '/var/lib/cassandra/data' + +cloud_storage_cassandrabackup_bucketname: "{{cloud_storage_management_bucketname}}" +cloud_storage_cassandrabackup_foldername: lp-cassandra-backup diff --git a/ansible/roles/cassandra-backup/meta/main.yml b/ansible/roles/cassandra-backup/meta/main.yml deleted file mode 100644 index 23b18a800a..0000000000 --- a/ansible/roles/cassandra-backup/meta/main.yml +++ /dev/null @@ -1,2 +0,0 @@ -dependencies: - - azure-cli \ No newline at end of file diff --git a/ansible/roles/cassandra-backup/tasks/main.yml b/ansible/roles/cassandra-backup/tasks/main.yml index 83a8bb263d..9b0e0875ff 100755 --- a/ansible/roles/cassandra-backup/tasks/main.yml +++ b/ansible/roles/cassandra-backup/tasks/main.yml @@ -1,37 +1,74 @@ +- name: Make sure backup dir is empty + file: path="{{ cassandra_backup_dir }}" state=absent + ignore_errors: true + - name: Create the directory - file: path={{ cassandra_backup_dir }} state=directory recurse=yes - -- name: copy the backup script - template: src=cassandra_backup.j2 dest={{ cassandra_backup_dir }}/cassandra_backup.py mode=0755 + become: true + file: path=/data/cassandra/backup state=directory recurse=yes + +- name: copy the backup script + become: true + template: + src: cassandra_backup.j2 + dest: /data/cassandra/backup/cassandra_backup.py + mode: 0755 - set_fact: - 
cassandra_backup_gzip_file_name: "cassandra-backup-{{ lookup('pipe', 'date +%Y%m%d') }}.tar.gz"
-
-- set_fact:
-    cassandra_backup_gzip_file_path: "{{ cassandra_backup_dir }}/{{ cassandra_backup_gzip_file_name }}.tar.gz"
+    cassandra_backup_folder_name: "cassandra-backup-{{ lookup('pipe', 'date +%Y%m%d') }}-{{ ansible_hostname }}-new"
 
 - name: run the backup script
-  shell: python cassandra_backup.py {{ cassandra_backup_gzip_file_name }} -d {{ data_dir }}
+  become: true
+  shell: python3 cassandra_backup.py --snapshotname "{{ cassandra_backup_folder_name }}" --snapshotdirectory "{{ cassandra_backup_folder_name }}"
   args:
-    chdir: "{{ cassandra_backup_dir }}"
-  async: 3600
-  poll: 10
-
+    chdir: /data/cassandra/backup
+  async: 14400
+  poll: 30
+
 - name: Check doc_root path
-  shell: ls -all {{ cassandra_backup_dir }}
+  shell: ls -all /data/cassandra/backup/
   register: doc_data
 
 - name: print doc_root to console
   debug:
-    var: doc_data
+    var: doc_data
 
-- name: upload to azure
+- name: upload file to azure storage using azcopy
+  include_role:
+    name: azure-cloud-storage
+    tasks_from: upload-using-azcopy.yml
+  vars:
+    blob_container_name: "{{ cloud_storage_cassandrabackup_foldername }}"
+    container_public_access: "off"
+    blob_container_folder_path: ""
+    local_file_or_folder_path: "/data/cassandra/backup/{{ cassandra_backup_folder_name }}"
+    storage_account_name: "{{ cloud_management_storage_accountname }}"
+    storage_account_key: "{{ cloud_management_storage_secret }}"
+  when: cloud_service_provider == "azure"
+
+- name: upload backup to S3
+  include_role:
+    name: aws-cloud-storage
+    tasks_from: upload-folder.yml
+  vars:
+    local_file_or_folder_path: "/data/cassandra/backup/{{ cassandra_backup_folder_name }}"
+    s3_bucket_name: "{{ cloud_storage_cassandrabackup_bucketname }}"
+    s3_path: "{{ cloud_storage_cassandrabackup_foldername }}"
+    aws_default_region: "{{ cloud_public_storage_region }}"
+    aws_access_key_id: "{{ cloud_management_storage_accountname }}"
+    aws_secret_access_key: "{{ cloud_management_storage_secret }}"
+  when: cloud_service_provider == "aws"
+
+- name: upload file to gcloud storage
   include_role:
-    name: artifacts-upload-azure
+    name: gcp-cloud-storage
+    tasks_from: upload-batch.yml
   vars:
-    artifact: "{{ cassandra_backup_gzip_file_name }}"
-    artifact_path: "{{ cassandra_backup_gzip_file_path }}"
-    artifacts_container: "{{ cassandra_backup_azure_container_name }}"
+    gcp_storage_service_account_name: "{{ cloud_management_storage_accountname }}"
+    gcp_storage_key_file: "{{ cloud_management_storage_secret }}"
+    gcp_bucket_name: "{{ cloud_storage_cassandrabackup_bucketname }}"
+    gcp_path: "{{ cloud_storage_cassandrabackup_foldername }}"
+    local_file_or_folder_path: "/data/cassandra/backup/{{ cassandra_backup_folder_name }}"
+  when: cloud_service_provider == "gcloud"
 
 - name: clean up backup dir after upload
-  file: path={{ cassandra_backup_dir }} state=absent
+  file: path="{{ cassandra_backup_dir }}" state=absent
diff --git a/ansible/roles/cassandra-backup/templates/cassandra_backup.j2 b/ansible/roles/cassandra-backup/templates/cassandra_backup.j2
index b077bc9ac2..dc581d042a 100644
--- a/ansible/roles/cassandra-backup/templates/cassandra_backup.j2
+++ b/ansible/roles/cassandra-backup/templates/cassandra_backup.j2
@@ -1,68 +1,200 @@
+#!/usr/bin/env python3
 # Author: Rajesh Rajendran
 '''
-Create a snapshot and create tar ball in targetdirectory name
+Create a cassandra snapshot with the specified name,
+and create a tarball in the target directory
+
+By default
+
+Cassandra data directory : /var/lib/cassandra/data
+Snapshot name : cassandra_backup-YYYY-MM-DD
+Backup name : cassandra_backup-YYYY-MM-DD.tar.gz
 
 usage: script snapshot_name
-eg: ./cassandra_backup.py my_snapshot
+eg: ./cassandra_backup.py
+
+for help ./cassandra_backup.py -h
 '''
-from os import path, walk, sep, system, getcwd, makedirs
-from argparse import ArgumentParser
-from shutil import rmtree, ignore_patterns, copytree
-from re import match, compile
+from argparse import ArgumentParser
+import concurrent.futures
+from os import cpu_count, getcwd, link, makedirs, sep, system, walk, path
+from re import compile, match
+from shutil import copytree, ignore_patterns, rmtree
+import socket
+from subprocess import check_output
 from sys import exit
-from tempfile import mkdtemp
+from time import strftime
+
+
+'''
+Returns the ip address of the current host machine
+'''
+def get_ip():
+    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+    try:
+        # doesn't even have to be reachable
+        s.connect(('10.255.255.255', 1))
+        IP = s.getsockname()[0]
+    except Exception:
+        print("Couldn't get the correct IP address; please pass the current node's cassandra ip address using the flag '--host '")
+        raise
+    finally:
+        s.close()
+    return str(IP)
+
+# Create temporary directory to copy data
+default_snapshot_name = "cassandra_backup" + strftime("%Y-%m-%d-%H%M%S")
+tmpdir = getcwd()+sep+default_snapshot_name
 
 parser = ArgumentParser(description="Create a snapshot and create tar ball inside tardirectory")
-parser.add_argument("-d","--datadirectory", metavar="datadir", default='/var/lib/cassandra/data', help="path to create the tarball. Default /var/lib/cassadra/data")
-parser.add_argument("snapshotname", help="name in which you want to take the snapshot")
-parser.add_argument("-t","--tardirectory", metavar="tardir", default=getcwd(), help="path to create the tarball. Default {}".format(getcwd()))
+parser.add_argument("-d", "--datadirectory", metavar="datadir", default='/var/lib/cassandra/data',
+                    help="Path to the cassandra keyspaces. Default /var/lib/cassandra/data")
+parser.add_argument("-s", "--snapshotdirectory", metavar="snapdir", default=tmpdir,
+                    help="Path to take the cassandra snapshot. Default {}".format(tmpdir))
+parser.add_argument("-n", "--snapshotname", metavar="snapshotname",
+                    default="cassandra_backup-"+strftime("%Y-%m-%d"),
+                    help="Name with which the snapshot is to be taken. Default {}".format(default_snapshot_name))
+parser.add_argument("-t", "--tardirectory", metavar="tardir",
+                    default='', help="Path to create the tarball. Disabled by default")
+parser.add_argument("-w", "--workers", metavar="workers",
+                    default=cpu_count(), help="Number of workers to use. Default same as cpu cores {}".format(cpu_count()))
+parser.add_argument("--disablesnapshot", action="store_true",
+                    help="disable taking a snapshot; the snapshot name can be given via the -s flag")
+parser.add_argument("--host", default=get_ip(), metavar="< Default: "+get_ip()+" >", help="ip address of the cassandra instance, used to take the token ring info. If the ip address is not correct, please update it, else your token ring won't be correct.")
 args = parser.parse_args()
 
-# Create temporary directory to copy data
-tmpdir=mkdtemp()
-makedirs(tmpdir+sep+"cassandra_backup")
+tmpdir = args.snapshotdirectory
+# Trying to create the directory if it does not exist
+try:
+    makedirs(tmpdir+sep+"cassandra_backup")
+except OSError as e:
+    raise
+
+def customCopy(root, root_target_dir):
+    print("copying {} to {}".format(root, root_target_dir))
+    copytree(src=root, dst=root_target_dir, copy_function=link, ignore=ignore_patterns('.*'))
+
+# Names of the keyspaces to take schema backup
+ignore_keyspace_names = []
 
 def copy():
     '''
-    Copying the data sanpshots to the target directory
+    Copy the data snapshots to the target directory
     '''
-    root_levels = args.datadirectory.count(sep)
-    ignore_list = compile(tmpdir+sep+"cassandra_backup"+sep+'(system|system|systemtauth|system_traces|system_schema|system_distributed)')
-
+    root_levels = args.datadirectory.rstrip('/').count(sep)
+    # List of system keyspaces, which we don't need.
+    # We do need the system_schema keyspace, as it has all the keyspace, trigger and type information.
+    ignore_keyspaces = ["system", "system_auth", "system_traces", "system_distributed", "lock_db"]
+    # ignore_list = compile('^'+tmpdir+sep+"cassandra_backup"+sep+'(system|system_auth|system_traces|system_distributed|lock_db)/.*$')
+    ignore_list = compile('^'+tmpdir+sep+"cassandra_backup"+sep+"("+"|".join(ignore_keyspaces)+')/.*$')
+    # List of the threads running in the background
+    futures = []
     try:
-        for root, dirs, files in walk(args.datadirectory):
-            root_target_dir=tmpdir+sep+"cassandra_backup"+sep+sep.join(root.split(sep)[root_levels+1:-2])
-            if match(ignore_list, root_target_dir):
-                continue
-            if root.split(sep)[-1] == args.snapshotname:
-                copytree(src=root, dst=root_target_dir, ignore=ignore_patterns('.*'))
+        with concurrent.futures.ThreadPoolExecutor(max_workers=args.workers) as executor:
+            for root, _, _ in walk(args.datadirectory):
+                keyspace = sep+sep.join(root.split(sep)[root_levels+1:-2])
+                # We don't need tables and other inner directories for the keyspace.
+                if len(keyspace.split('/')) != 3:
+                    continue
+                root_target_dir = tmpdir+sep+"cassandra_backup"+keyspace
+                if match(ignore_list, root_target_dir):
+                    continue
+                if root.split(sep)[-1] == args.snapshotname:
+                    # Keeping the copy operation in the background with threads
+                    tmp_arr = [root, root_target_dir]
+                    futures.append(executor.submit(lambda p: customCopy(*p), tmp_arr))
     except Exception as e:
         print(e)
+    # Checking the status of the copy operations
+    for future in concurrent.futures.as_completed(futures):
+        try:
+            print("Task completed. Result: {}".format(future.result()))
+        except Exception as e:
+            print(e)
+
+keyspaces_schema_dict = {}
+def create_schema(schema_file):
+    cmd = "cqlsh -e 'SELECT * from system_schema.keyspaces;' | tail -n +4 | head -n -2"
+    output = check_output('{}'.format(cmd), shell=True).decode().strip()
+    for line in output.split('\n'):
+        tmpline = line.split("|")
+        keyspaces_schema_dict[tmpline[0].strip()] = {"durable_writes": tmpline[1].strip(), "replication": tmpline[2].strip()}
 
-# Creating schema
-command = "cqlsh -e 'DESC SCHEMA' > {}/cassandra_backup/db_schema.cql".format(tmpdir)
+    # Creating table schema
+    for root, _, files in walk(tmpdir):
+        for file in files:
+            if file.endswith(".cql"):
+                with open(path.join(root, file), 'r') as f:
+                    with open(schema_file, 'a') as w:
+                        w.write(f.read())
+                        w.write('\n')
+
+# Creating the complete schema
+# For `ALTER DROP COLUMN`,
+# this schema will have issues.
+# So you'll have to create and drop the column.
+# For details about that table/column, look at snapshot_table_schema.sql
+command = "cqlsh -e 'DESC SCHEMA' > {}/cassandra_backup/complete_db_schema.cql".format(tmpdir)
 rc = system(command)
 if rc != 0:
     print("Couldn't backup schema, exiting...")
     exit(1)
-print("Schema backup completed. saved in {}/cassandra_backup/db_schema.sql".format(tmpdir))
-# Cleaning all old snapshots
-command = "nodetool clearsnapshot"
-system(command)
-# Creating snapshots
-command = "nodetool snapshot -t {}".format(args.snapshotname)
+print("Schema backup completed. Saved in {}/cassandra_backup/complete_db_schema.cql".format(tmpdir))
+
+# Backing up the token ring
+command = """ nodetool ring | grep ^""" + get_ip() + """ | awk '{print $NF ","}' | xargs | tee -a """ + tmpdir + """/cassandra_backup/tokenring.txt """
+print(command)
 rc = system(command)
-if rc == 0:
+if rc != 0:
+    print("Couldn't backup the token ring, exiting...")
+    exit(1)
+print("Token ring backup completed. Saved in {}/cassandra_backup/tokenring.txt".format(tmpdir))
+
+# Creating snapshots
+if not args.disablesnapshot:
+    # Cleaning all old snapshots
+    command = "nodetool clearsnapshot"
+    system(command)
+    # Taking a new snapshot
+    command = "nodetool snapshot -t {}".format(args.snapshotname)
+    rc = system(command)
+    if rc != 0:
+        print("Backup failed")
+        exit(1)
     print("Snapshot taken.")
-    copy()
+
+# Copying the snapshot into the proper folder structure; this is not a plain copy but a hard link.
+copy()
+
+# Dropping unwanted keyspace schemas
+# Including system schemas
+## Deduplicating the ignore keyspace list
+ignore_keyspace_names = list(dict.fromkeys(ignore_keyspace_names))
+# Creating schema for keyspaces.
+create_schema("{}/cassandra_backup/snapshot_table_schema.cql".format(tmpdir))
+
+# Clearing the snapshot.
+# We now have the data available in the copied directory.
+command = "nodetool clearsnapshot -t {}".format(args.snapshotname) +print("Clearing snapshot {} ...".format(args.snapshotname)) +rc = system(command) +if rc != 0: + print("Clearing snapshot {} failed".format(args.snapshotname)) + exit(1) + +# Creating tarball +if args.tardirectory: print("Making a tarball: {}.tar.gz".format(args.snapshotname)) - command = "cd {} && tar -czvf {}/{}.tar.gz *".format(tmpdir, args.tardirectory, args.snapshotname) - system(command) + command = "cd {} && tar --remove-files -czvf {}/{}.tar.gz *".format(tmpdir, args.tardirectory, args.snapshotname) + rc = system(command) + if rc != 0: + print("Creation of tar failed") + exit(1) # Cleaning up backup directory rmtree(tmpdir) - print("Cassandra backup completed and stored in {}/{}.tar.gz".format(args.tardirectory,args.snapshotname)) \ No newline at end of file + print("Cassandra backup completed and stored in {}/{}.tar.gz".format(args.tardirectory, args.snapshotname)) \ No newline at end of file diff --git a/ansible/roles/cassandra-db-update/templates/data.cql.j2 b/ansible/roles/cassandra-db-update/templates/data.cql.j2 index 6c1b623494..de964cfcbf 100644 --- a/ansible/roles/cassandra-db-update/templates/data.cql.j2 +++ b/ansible/roles/cassandra-db-update/templates/data.cql.j2 @@ -51,11 +51,9 @@ ALTER KEYSPACE {{ hierarchy_keyspace_name }} WITH replication = { 'class': 'NetworkTopologyStrategy', 'datacenter1' : 2 }; - -ALTER TABLE {{ hierarchy_keyspace_name }}.content_hierarchy ADD relational_metadata text; - {% endif %} +ALTER TABLE {{ hierarchy_keyspace_name }}.content_hierarchy ADD relational_metadata text; CREATE TRIGGER IF NOT EXISTS content_data_trigger ON {{ content_keyspace_name }}.content_data USING 'org.sunbird.cassandra.triggers.TransactionEventTrigger'; CREATE TRIGGER IF NOT EXISTS question_data_trigger ON {{ content_keyspace_name }}.question_data USING 'org.sunbird.cassandra.triggers.TransactionEventTrigger'; diff --git a/ansible/roles/cassandra-restore/defaults/main.yml b/ansible/roles/cassandra-restore/defaults/main.yml index db4bea795c..45a9017f2e 100644 --- a/ansible/roles/cassandra-restore/defaults/main.yml +++ b/ansible/roles/cassandra-restore/defaults/main.yml @@ -1,6 +1,7 @@ -cassandra_backup_azure_container_name: lp-cassandra-backup - user: "{{ ansible_ssh_user }}" restore_path: /home/{{user}} backup_folder_name: cassandra_backup backup_dir: "{{restore_path}}/cassandra_backup" + +cloud_storage_cassandrabackup_bucketname: "{{cloud_storage_management_bucketname}}" +cloud_storage_cassandrabackup_foldername: lp-cassandra-backup diff --git a/ansible/roles/cassandra-restore/meta/main.yml b/ansible/roles/cassandra-restore/meta/main.yml deleted file mode 100644 index 23b18a800a..0000000000 --- a/ansible/roles/cassandra-restore/meta/main.yml +++ /dev/null @@ -1,2 +0,0 @@ -dependencies: - - azure-cli \ No newline at end of file diff --git a/ansible/roles/cassandra-restore/tasks/main.yml b/ansible/roles/cassandra-restore/tasks/main.yml index 22e5f41614..95a193be8f 100755 --- a/ansible/roles/cassandra-restore/tasks/main.yml +++ b/ansible/roles/cassandra-restore/tasks/main.yml @@ -5,11 +5,44 @@ - set_fact: cassandra_restore_gzip_file_path: "{{ restore_path }}/{{ cassandra_restore_file_name }}" -- name: Download backup from azure - command: az storage blob download -c {{ cassandra_backup_azure_container_name }} --name {{ cassandra_restore_file_name }} -f {{ cassandra_restore_file_name }} - args: - chdir: "{{ restore_path }}" - become_user: "{{user}}" +- name: download a file from azure storage + become: true 
+ include_role: + name: azure-cloud-storage + tasks_from: blob-download.yml + vars: + blob_container_name: "{{ cloud_storage_cassandrabackup_foldername }}" + blob_file_name: "{{ cassandra_restore_file_name }}" + local_file_or_folder_path: "{{restore_path}}/{{ cassandra_restore_file_name }}" + storage_account_name: "{{ cloud_management_storage_accountname }}" + storage_account_key: "{{ cloud_management_storage_secret }}" + when: cloud_service_provider == "azure" + +- name: download a file from aws s3 + become: true + include_role: + name: aws-cloud-storage + tasks_from: download.yml + vars: + s3_bucket_name: "{{ cloud_storage_cassandrabackup_bucketname }}" + aws_access_key_id: "{{ cloud_management_storage_accountname }}" + aws_secret_access_key: "{{ cloud_management_storage_secret }}" + aws_default_region: "{{ cloud_public_storage_region }}" + local_file_or_folder_path: "{{restore_path}}/{{ cassandra_restore_file_name }}" + s3_path: "{{ cloud_storage_cassandrabackup_foldername }}/{{ cassandra_restore_file_name }}" + when: cloud_service_provider == "aws" + +- name: download file from gcloud storage + include_role: + name: gcp-cloud-storage + tasks_from: download.yml + vars: + gcp_storage_service_account_name: "{{ cloud_management_storage_accountname }}" + gcp_storage_key_file: "{{ cloud_management_storage_secret }}" + gcp_bucket_name: "{{ cloud_storage_cassandrabackup_bucketname }}" + gcp_path: "{{ cloud_storage_cassandrabackup_foldername }}/{{ cassandra_restore_file_name }}" + local_file_or_folder_path: "{{restore_path}}/{{ cassandra_restore_file_name }}" + when: cloud_service_provider == "gcloud" - name: unarchieve backup file unarchive: src={{restore_path}}/{{ cassandra_restore_file_name }} dest={{restore_path}}/ copy=no diff --git a/ansible/roles/es-azure-restore/tasks/main.yml b/ansible/roles/es-azure-restore/tasks/main.yml index f45a97587b..ff71c19a11 100644 --- a/ansible/roles/es-azure-restore/tasks/main.yml +++ b/ansible/roles/es-azure-restore/tasks/main.yml @@ -1,7 +1,7 @@ --- - name: Set azure snapshot for the first time uri: - url: "http://{{ es_restore_host }}:9200/_snapshot/azurebackup" + url: "http://{{ es_restore_host }}:9200/_snapshot/{{ snapshot_base_path }}" method: PUT body: "{{ snapshot_create_request_body | to_json }}" headers: @@ -9,12 +9,12 @@ - name: Restore ES from Azure backup uri: - url: "http://{{ es_restore_host }}:9200/_snapshot/azurebackup/{{snapshot_number}}/_restore" + url: "http://{{ es_restore_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{ snapshot_number }}/_restore" method: POST - name: "Wait for restore to be completed" uri: - url: "http://{{ es_restore_host }}:9200/_snapshot/azurebackup/{{snapshot_number}}/_status" + url: "http://{{ es_restore_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{ snapshot_number }}/_status" method: GET return_content: yes status_code: 200 diff --git a/ansible/roles/es-azure-snapshot/defaults/main.yml b/ansible/roles/es-azure-snapshot/defaults/main.yml index 0c6d6d90ff..df52870977 100644 --- a/ansible/roles/es-azure-snapshot/defaults/main.yml +++ b/ansible/roles/es-azure-snapshot/defaults/main.yml @@ -1,7 +1,14 @@ snapshot_create_request_body: { type: azure, settings: { - container: "elasticsearch-snapshots", + container: "{{ cloud_storage_esbackup_foldername }}", base_path: "{{ snapshot_base_path }}_{{ base_path_date }}" } } + +# Override these values +es_snapshot_host: "localhost" +snapshot_base_path: "default" + +cloud_storage_esbackup_bucketname: "{{ cloud_storage_management_bucketname }}" 
+cloud_storage_esbackup_foldername: "elasticsearch-snapshots" diff --git a/ansible/roles/es-azure-snapshot/tasks/main.yml b/ansible/roles/es-azure-snapshot/tasks/main.yml index 03986ff04d..c3e2ba3343 100644 --- a/ansible/roles/es-azure-snapshot/tasks/main.yml +++ b/ansible/roles/es-azure-snapshot/tasks/main.yml @@ -1,9 +1,9 @@ --- - set_fact: base_path_date="{{ lookup('pipe','date +%Y-%m') }}" -- name: Create azure snapshot +- name: Create Azure Repository uri: - url: "http://{{ es_snapshot_host }}:9200/_snapshot/azurebackup" + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}" method: PUT body: "{{ snapshot_create_request_body | to_json }}" headers: @@ -13,7 +13,7 @@ - name: Take new snapshot uri: - url: "http://{{ es_snapshot_host }}:9200/_snapshot/azurebackup/{{snapshot_number}}" + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{snapshot_number}}" method: PUT body: > {"indices":"*","include_global_state":false} @@ -22,17 +22,17 @@ - name: Print all snapshots uri: - url: "http://{{ es_snapshot_host }}:9200/_snapshot/azurebackup/_all" + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/_all" method: GET - name: Print status of current snapshot uri: - url: "http://{{ es_snapshot_host }}:9200/_snapshot/azurebackup/{{snapshot_number}}" + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{snapshot_number}}" method: GET - name: "Wait for backup to be completed" uri: - url: "http://{{ es_snapshot_host }}:9200/_snapshot/azurebackup/{{snapshot_number}}" + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{snapshot_number}}" method: GET return_content: yes status_code: 200 diff --git a/ansible/roles/es-curator/defaults/main.yml b/ansible/roles/es-curator/defaults/main.yml new file mode 100644 index 0000000000..9fd4efe2c8 --- /dev/null +++ b/ansible/roles/es-curator/defaults/main.yml @@ -0,0 +1,2 @@ +es_curator_major_version: 5 +es_curator_version: 5.8.4 \ No newline at end of file diff --git a/ansible/roles/es-curator/tasks/main.yml b/ansible/roles/es-curator/tasks/main.yml new file mode 100644 index 0000000000..c4a8bacee7 --- /dev/null +++ b/ansible/roles/es-curator/tasks/main.yml @@ -0,0 +1,13 @@ +- name: Debian - Add Elasticsearch repository key + apt_key: url="https://artifacts.elastic.co/GPG-KEY-elasticsearch" state=present + +- name: Add curator {{ es_curator_major_version }} repo + apt_repository: repo='deb [arch=amd64] http://packages.elastic.co/curator/{{ es_curator_major_version }}/debian stable main' state=present update_cache=yes + +- debug: + msg: "{{ es_curator_version }}" + +- name: Install elasticsearch curator + apt: + name: elasticsearch-curator={{ es_curator_version }} + force: yes diff --git a/ansible/roles/es-gcs-snapshot/defaults/main.yml b/ansible/roles/es-gcs-snapshot/defaults/main.yml new file mode 100644 index 0000000000..7222b0c06b --- /dev/null +++ b/ansible/roles/es-gcs-snapshot/defaults/main.yml @@ -0,0 +1,14 @@ +snapshot_create_request_body: { + type: gcs, + settings: { + bucket: "{{ cloud_storage_management_bucketname }}", + base_path: "{{ cloud_storage_esbackup_foldername }}/{{ snapshot_base_path }}_{{ base_path_date }}" + } +} + +# Override these values +es_snapshot_host: "localhost" +snapshot_base_path: "default" + +cloud_storage_esbackup_bucketname: "{{ cloud_storage_management_bucketname }}" +cloud_storage_esbackup_foldername: "elasticsearch-snapshots" diff --git a/ansible/roles/es-gcs-snapshot/tasks/main.yml 
b/ansible/roles/es-gcs-snapshot/tasks/main.yml new file mode 100644 index 0000000000..55f50b17ad --- /dev/null +++ b/ansible/roles/es-gcs-snapshot/tasks/main.yml @@ -0,0 +1,42 @@ +--- + +- set_fact: base_path_date="{{ lookup('pipe','date +%Y-%m') }}" + +- set_fact: snapshot_number="snapshot_{{ lookup('pipe','date +%s') }}" + +- name: Create GCS Repository + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}" + method: PUT + body: "{{ snapshot_create_request_body | to_json }}" + headers: + Content-Type: "application/json" + +- name: Take new snapshot + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{ snapshot_number }}" + method: PUT + headers: + Content-Type: "application/json" + +- name: Print all snapshots + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/_all" + method: GET + +- name: Print status of current snapshot + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{ snapshot_number }}" + method: GET + +- name: "Wait for backup to be completed" + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{ snapshot_number }}" + method: GET + return_content: yes + status_code: 200 + body_format: json + register: result + until: result.json.snapshots[0].state == 'SUCCESS' + retries: 120 + delay: 10 diff --git a/ansible/roles/es-s3-snapshot/defaults/main.yml b/ansible/roles/es-s3-snapshot/defaults/main.yml new file mode 100644 index 0000000000..316ae512fb --- /dev/null +++ b/ansible/roles/es-s3-snapshot/defaults/main.yml @@ -0,0 +1,14 @@ +snapshot_create_request_body: { + type: s3, + settings: { + bucket: "{{ cloud_storage_esbackup_bucketname }}", + base_path: "{{ cloud_storage_esbackup_foldername }}/{{ snapshot_base_path }}_{{ base_path_date }}" + } +} + +# Override these values +es_snapshot_host: "localhost" +snapshot_base_path: "default" + +cloud_storage_esbackup_bucketname: "{{ cloud_storage_management_bucketname }}" +cloud_storage_esbackup_foldername: "elasticsearch-snapshots" diff --git a/ansible/roles/es-s3-snapshot/tasks/main.yml b/ansible/roles/es-s3-snapshot/tasks/main.yml new file mode 100644 index 0000000000..aee768626c --- /dev/null +++ b/ansible/roles/es-s3-snapshot/tasks/main.yml @@ -0,0 +1,42 @@ +--- + +- set_fact: base_path_date="{{ lookup('pipe','date +%Y-%m') }}" + +- set_fact: snapshot_number="snapshot_{{ lookup('pipe','date +%s') }}" + +- name: Create S3 Repository + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}" + method: PUT + body: "{{ snapshot_create_request_body | to_json }}" + headers: + Content-Type: "application/json" + +- name: Take new snapshot + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{ snapshot_number }}" + method: PUT + headers: + Content-Type: "application/json" + +- name: Print all snapshots + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/_all" + method: GET + +- name: Print status of current snapshot + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{ snapshot_number }}" + method: GET + +- name: "Wait for backup to be completed" + uri: + url: "http://{{ es_snapshot_host }}:9200/_snapshot/{{ snapshot_base_path }}/{{ snapshot_number }}" + method: GET + return_content: yes + status_code: 200 + body_format: json + register: result + until: result.json.snapshots[0].state == 'SUCCESS' + retries: 120 + delay: 10 diff --git 
a/ansible/roles/es5-snapshot-purge/defaults/main.yml b/ansible/roles/es5-snapshot-purge/defaults/main.yml new file mode 100644 index 0000000000..cab1182a3d --- /dev/null +++ b/ansible/roles/es5-snapshot-purge/defaults/main.yml @@ -0,0 +1,5 @@ +es_snapshot_host: localhost +es_snapshot_repository: "{{ snapshot_base_path }}" +es_snapshot_retention_days: 30 +es_curator_config_dir: /etc/curator +es_curator_config_file: "{{ es_curator_config_dir }}/curator.yml" diff --git a/ansible/roles/es5-snapshot-purge/meta/main.yml b/ansible/roles/es5-snapshot-purge/meta/main.yml new file mode 100644 index 0000000000..8b4e268b5d --- /dev/null +++ b/ansible/roles/es5-snapshot-purge/meta/main.yml @@ -0,0 +1,3 @@ +--- +dependencies: + - { role: es-curator, es_curator_major_version: 5, es_curator_version: 5.8.4 } diff --git a/ansible/roles/es5-snapshot-purge/tasks/main.yml b/ansible/roles/es5-snapshot-purge/tasks/main.yml new file mode 100644 index 0000000000..21364440ce --- /dev/null +++ b/ansible/roles/es5-snapshot-purge/tasks/main.yml @@ -0,0 +1,14 @@ +# See meta folder for curator installation +- name: Ensure curator config dir exists + file: dest="{{es_curator_config_dir}}" state=directory + +- name: Create curator.yml + template: src=curator.yml dest="{{es_curator_config_file}}" + +- name: Create snapshot-purge-action.yml + template: src=snapshot-purge-action.yml dest="{{es_curator_config_dir}}/snapshot-purge-action.yml" + +- name: Delete snapshots older than {{ es_snapshot_retention_days }} days + shell: "curator --config {{ es_curator_config_file }} {{es_curator_config_dir}}/snapshot-purge-action.yml" + async: 800 + poll: 20 diff --git a/ansible/roles/es5-snapshot-purge/templates/curator.yml b/ansible/roles/es5-snapshot-purge/templates/curator.yml new file mode 100644 index 0000000000..6c9c6c9976 --- /dev/null +++ b/ansible/roles/es5-snapshot-purge/templates/curator.yml @@ -0,0 +1,10 @@ +client: + hosts: + - {{ es_snapshot_host }} + port: 9200 + timeout: 100 +logging: + loglevel: INFO + logfile: + logformat: default + blacklist: ['elasticsearch', 'urllib3'] \ No newline at end of file diff --git a/ansible/roles/es5-snapshot-purge/templates/snapshot-purge-action.yml b/ansible/roles/es5-snapshot-purge/templates/snapshot-purge-action.yml new file mode 100644 index 0000000000..bd6cc05b06 --- /dev/null +++ b/ansible/roles/es5-snapshot-purge/templates/snapshot-purge-action.yml @@ -0,0 +1,13 @@ +actions: + 1: + action: delete_snapshots + description: Delete snapshots older than {{ es_snapshot_retention_days }} days + options: + repository: {{ es_snapshot_repository }} + ignore_empty_list: True + filters: + - filtertype: age + source: creation_date + direction: older + unit: days + unit_count: {{ es_snapshot_retention_days }} diff --git a/ansible/roles/gcloud-cli/tasks/main.yml b/ansible/roles/gcloud-cli/tasks/main.yml new file mode 100644 index 0000000000..4e39b7ceaf --- /dev/null +++ b/ansible/roles/gcloud-cli/tasks/main.yml @@ -0,0 +1,19 @@ +--- +- name: Add gcloud signing key + apt_key: + url: https://packages.cloud.google.com/apt/doc/apt-key.gpg + state: present + +- name: Add gcloud repository into sources list + apt_repository: + repo: "deb https://packages.cloud.google.com/apt cloud-sdk main" + state: present + +- name: Install google cloud cli with specific version and dependent packages + apt: + pkg: + - ca-certificates + - curl + - apt-transport-https + - gnupg + - google-cloud-cli=406.0.0-0 diff --git a/ansible/roles/gcp-cloud-storage/defaults/main.yml 
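The es5-snapshot-purge role that follows only templates two curator files and shells out to curator, so the whole purge can be reproduced by hand. A sketch using the role's default paths; the `--dry-run` preview is optional but worth running first:

```shell
# Preview which snapshots would be deleted, then run the purge for real.
curator --dry-run --config /etc/curator/curator.yml /etc/curator/snapshot-purge-action.yml
curator --config /etc/curator/curator.yml /etc/curator/snapshot-purge-action.yml
```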
b/ansible/roles/gcp-cloud-storage/defaults/main.yml new file mode 100644 index 0000000000..a9f4247d42 --- /dev/null +++ b/ansible/roles/gcp-cloud-storage/defaults/main.yml @@ -0,0 +1,54 @@ +# GCP service account name +# Example - +# gcp_storage_service_account_name: test@sunbird.iam.gserviceaccount.com +gcp_storage_service_account_name: "" + +# GCP bucket name +# Example - +# gcp_bucket_name: "sunbird-dev-public" +gcp_bucket_name: "" + +# The service account key file +# Example - +# gcp_storage_key_file: "/tmp/gcp.json" +gcp_storage_key_file: "" + +# Folder name in GCP bucket +# Example - +# gcp_path: "my-destination-folder" +gcp_path: "" + +# The delete pattern to delete files and folders +# Example - +# file_delete_pattern: "my-directory/*" +# file_delete_pattern: "my-directory/another-directory/*" +# file_delete_pattern: "*" +file_delete_pattern: "" + +# The path to local file which has to be uploaded to gcloud storage +# The local path to store the file after downloading from gcloud storage +# Example - +# local_file_or_folder_path: "/workspace/my-folder/myfile.json" +# local_file_or_folder_path: "/workspace/my-folder" +local_file_or_folder_path: "" + +# The name of the file in gcloud storage after uploading from local path +# The name of the file in gcloud storage that has to be downloaded +# Example - +# dest_file_name: "/myfile-blob.json" +dest_file_name: "" + + +# The folder path in gcloud storage to upload the files starting from the root of the bucket +# This path should start with / if we provide a value for this variable since we are going to append this path as below +# {{ gcp_bucket_name }}{{ dest_folder_path }} +# The above translates to "my-bucket/my-folder-path" +# Example - +# dest_folder_path: "/my-folder/json-files-folder" +# This variable can also be empty as shown below, which means we will upload directly at the root path of the bucket +dest_folder_path: "" + +# The local folder path which has to be uploaded to gcloud storage +# Example - +# local_source_folder: "/workspace/my-folder/json-files-folder" +local_source_folder: "" diff --git a/ansible/roles/gcp-cloud-storage/tasks/delete-batch.yml b/ansible/roles/gcp-cloud-storage/tasks/delete-batch.yml new file mode 100644 index 0000000000..17fe952b16 --- /dev/null +++ b/ansible/roles/gcp-cloud-storage/tasks/delete-batch.yml @@ -0,0 +1,11 @@ +--- +- name: Authenticate to gcloud + include_tasks: gcloud-auth.yml + +- name: Delete folder recursively in gcp storage + shell: gsutil rm -r "gs://{{ gcp_bucket_name }}/{{ file_delete_pattern }}" + async: 3600 + poll: 10 + +- name: Revoke gcloud access + include_tasks: gcloud-revoke.yml diff --git a/ansible/roles/gcp-cloud-storage/tasks/download.yml b/ansible/roles/gcp-cloud-storage/tasks/download.yml new file mode 100644 index 0000000000..73bf76bb04 --- /dev/null +++ b/ansible/roles/gcp-cloud-storage/tasks/download.yml @@ -0,0 +1,11 @@ +--- +- name: Authenticate to gcloud + include_tasks: gcloud-auth.yml + +- name: Download from gcloud storage + shell: gsutil cp "gs://{{ gcp_bucket_name }}/{{ gcp_path }}" "{{ local_file_or_folder_path }}" + async: 3600 + poll: 10 + +- name: Revoke gcloud access + include_tasks: gcloud-revoke.yml diff --git a/ansible/roles/gcp-cloud-storage/tasks/gcloud-auth.yml b/ansible/roles/gcp-cloud-storage/tasks/gcloud-auth.yml new file mode 100644 index 0000000000..a480bdc275 --- /dev/null +++ b/ansible/roles/gcp-cloud-storage/tasks/gcloud-auth.yml @@ -0,0 +1,14 @@ +--- +- name: create tmp gcp service key file + tempfile: + state: file + suffix: gcp + register: config_key + 
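The gcloud-auth.yml/gcloud-revoke.yml pair that these tasks form wraps each gsutil call in a short-lived service-account session: write the key to a temp file, activate, act, revoke, delete the key. A condensed shell sketch of that cycle, using the illustrative account and bucket names from the role's own example comments (the GCP_KEY_JSON variable is an assumption here):

```shell
KEY=$(mktemp --suffix=gcp)               # matches the tempfile task above
echo "$GCP_KEY_JSON" > "$KEY"            # key material, assumed to be in the environment

gcloud auth activate-service-account test@sunbird.iam.gserviceaccount.com --key-file="$KEY"
gsutil cp ./myfile.json gs://sunbird-dev-public/my-destination-folder/   # any of the role's gsutil calls
gcloud auth revoke test@sunbird.iam.gserviceaccount.com
rm -f "$KEY"                             # mirror the cleanup in gcloud-revoke.yml
```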
+- name: Copy service account key file + copy: + content: "{{ gcp_storage_key_file }}" + dest: "{{ config_key.path }}" + +- name: Configure gcloud service account + shell: gcloud auth activate-service-account "{{ gcp_storage_service_account_name }}" --key-file="{{ config_key.path }}" diff --git a/ansible/roles/gcp-cloud-storage/tasks/gcloud-revoke.yml b/ansible/roles/gcp-cloud-storage/tasks/gcloud-revoke.yml new file mode 100644 index 0000000000..8c26cd0ef0 --- /dev/null +++ b/ansible/roles/gcp-cloud-storage/tasks/gcloud-revoke.yml @@ -0,0 +1,8 @@ +- name: Revoke gcloud service account access + shell: gcloud auth revoke "{{ gcp_storage_service_account_name }}" + +- name: Remove key file + file: + path: "{{ config_key.path }}" + state: absent + when: config_key.path is defined diff --git a/ansible/roles/gcp-cloud-storage/tasks/main.yml b/ansible/roles/gcp-cloud-storage/tasks/main.yml new file mode 100644 index 0000000000..aa41c090ed --- /dev/null +++ b/ansible/roles/gcp-cloud-storage/tasks/main.yml @@ -0,0 +1,20 @@ +--- +- name: upload file to gcloud storage + include: upload.yml + tags: + - file-upload + +- name: upload batch of files to gcloud storage + include: upload-batch.yml + tags: + - upload-batch + +- name: delete batch of files from gcloud storage + include: delete-batch.yml + tags: + - delete-batch + +- name: download a file from gcloud storage + include: download.yml + tags: + - file-download \ No newline at end of file diff --git a/ansible/roles/gcp-cloud-storage/tasks/upload-batch.yml b/ansible/roles/gcp-cloud-storage/tasks/upload-batch.yml new file mode 100644 index 0000000000..dc103969aa --- /dev/null +++ b/ansible/roles/gcp-cloud-storage/tasks/upload-batch.yml @@ -0,0 +1,11 @@ +--- +- name: Authenticate to gcloud + include_tasks: gcloud-auth.yml + +- name: Upload files from a local directory to gcp storage + shell: gsutil -m cp -r "{{ local_file_or_folder_path }}" "gs://{{ gcp_bucket_name }}/{{ gcp_path }}" + async: 3600 + poll: 10 + +- name: Revoke gcloud access + include_tasks: gcloud-revoke.yml diff --git a/ansible/roles/gcp-cloud-storage/tasks/upload.yml b/ansible/roles/gcp-cloud-storage/tasks/upload.yml new file mode 100644 index 0000000000..de766a94c7 --- /dev/null +++ b/ansible/roles/gcp-cloud-storage/tasks/upload.yml @@ -0,0 +1,11 @@ +--- +- name: Authenticate to gcloud + include_tasks: gcloud-auth.yml + +- name: Upload to gcloud storage + shell: gsutil cp "{{ local_file_or_folder_path }}" "gs://{{ gcp_bucket_name }}/{{ gcp_path }}" + async: 3600 + poll: 10 + +- name: Revoke gcloud access + include_tasks: gcloud-revoke.yml diff --git a/ansible/roles/learning-service/defaults/main.yml b/ansible/roles/learning-service/defaults/main.yml index ced3460453..4787dcb15a 100644 --- a/ansible/roles/learning-service/defaults/main.yml +++ b/ansible/roles/learning-service/defaults/main.yml @@ -50,4 +50,4 @@ cloud_storage_config_environment: "{{env}}" tomcat_init_mem: -Xms1024m tomcat_max_mem: -Xmx4096m search_index_host: "{{ groups['composite-search-cluster']|join(':9200,')}}:9200" -compositesearch_index_name: "compositesearch" +compositesearch_index_name: "compositesearch" \ No newline at end of file diff --git a/ansible/roles/learning-service/tasks/main.yml b/ansible/roles/learning-service/tasks/main.yml index 00feb5ba6c..ca5e06ffa9 100644 --- a/ansible/roles/learning-service/tasks/main.yml +++ b/ansible/roles/learning-service/tasks/main.yml @@ -1,5 +1,5 @@ -- name: checking the list of installed services - service_facts: +# - name: checking the list of installed services +# 
service_facts: - name: Stop the monit service: name=monit state=stopped diff --git a/ansible/roles/learning-service/templates/application.conf.j2 b/ansible/roles/learning-service/templates/application.conf.j2 index 5ba385b4e1..4590e18bb7 100644 --- a/ansible/roles/learning-service/templates/application.conf.j2 +++ b/ansible/roles/learning-service/templates/application.conf.j2 @@ -165,37 +165,61 @@ kafka.topics.instruction="{{ kafka_topics_instruction }}" kafka.publish.request.topic="{{ kafka_publish_request_topic }}" kafka.urls="{{ graphevent_kafka_url }}" kafka.topic.system.command="{{ kafka_topic_system_command }}" -job.request.event.mimetype=["application/pdf", "video/avi", "video/mpeg", "video/quicktime", "video/3gpp", "video/mpeg", "video/mp4", "video/ogg", "video/webm", "application/vnd.ekstep.html-archive","application/vnd.ekstep.ecml-archive","application/vnd.ekstep.content-collection" - "application/vnd.ekstep.ecml-archive", - "application/vnd.ekstep.html-archive", - "application/vnd.android.package-archive", - "application/vnd.ekstep.content-archive", - "application/octet-stream", - "application/json", - "application/javascript", - "application/xml", - "text/plain", - "text/html", - "text/javascript", - "text/xml", - "text/css", - "image/jpeg", "image/jpg", "image/png", "image/tiff", "image/bmp", "image/gif", "image/svg+xml", - "image/x-quicktime", - "video/avi", "video/mpeg", "video/quicktime", "video/3gpp", "video/mpeg", "video/mp4", "video/ogg", "video/webm", - "video/msvideo", - "video/x-msvideo", - "video/x-qtc", - "video/x-mpeg", - "audio/mp3", "audio/mp4", "audio/mpeg", "audio/ogg", "audio/webm", "audio/x-wav", "audio/wav", - "audio/mpeg3", - "audio/x-mpeg-3", - "audio/vorbis", - "application/x-font-ttf", - "application/pdf", "application/epub", "application/msword", - "application/vnd.ekstep.h5p-archive", - "application/vnd.ekstep.plugin-archive", - "video/x-youtube", "video/youtube", - "text/x-url"] +job.request.event.mimetype=["application/pdf", + "application/vnd.ekstep.ecml-archive", + "application/vnd.ekstep.html-archive", + "application/vnd.android.package-archive", + "application/vnd.ekstep.content-archive", + "application/epub", + "application/msword", + "application/vnd.ekstep.h5p-archive", + "video/webm", + "video/mp4", + "application/vnd.ekstep.content-collection", + "video/quicktime", + "application/octet-stream", + "application/json", + "application/javascript", + "application/xml", + "text/plain", + "text/html", + "text/javascript", + "text/xml", + "text/css", + "image/jpeg", + "image/jpg", + "image/png", + "image/tiff", + "image/bmp", + "image/gif", + "image/svg+xml", + "image/x-quicktime", + "video/avi", + "video/mpeg", + "video/quicktime", + "video/3gpp", + "video/mp4", + "video/ogg", + "video/webm", + "video/msvideo", + "video/x-msvideo", + "video/x-qtc", + "video/x-mpeg", + "audio/mp3", + "audio/mp4", + "audio/mpeg", + "audio/ogg", + "audio/webm", + "audio/x-wav", + "audio/wav", + "audio/mpeg3", + "audio/x-mpeg-3", + "audio/vorbis", + "application/x-font-ttf", + "application/vnd.ekstep.plugin-archive", + "video/x-youtube", + "video/youtube", + "text/x-url"] #Youtube Standard Licence Validation learning.content.youtube.validate.license=true @@ -224,11 +248,12 @@ learning.content.type.not.copied.list=["Asset"] #Youtube License Validation Regex Pattern youtube.license.regex.pattern=["\\?vi?=([^&]*)", "watch\\?.*v=([^&]*)", "(?:embed|vi?)/([^/?]*)","^([A-Za-z0-9\\-\\_]*)"] -#Azure Storage details -cloud_storage_type="{{ cloud_store }}" 
-azure_storage_key="{{sunbird_public_storage_account_name}}" -azure_storage_secret="{{sunbird_public_storage_account_key}}" -azure_storage_container="{{ azure_public_container }}" +#Cloud Storage details +cloud_storage_type="{{ cloud_service_provider }}" +cloud_storage_key="{{ cloud_public_storage_accountname }}" +cloud_storage_secret="{{ cloud_public_storage_secret }}" +cloud_storage_container="{{ cloud_storage_content_bucketname }}" +cloud_storage_endpoint="{{ cloud_public_storage_endpoint }}" installation.id="{{ instance_name }}" @@ -280,3 +305,12 @@ content.tagging.property="subject,medium" # Search Service Config kp.search_service.base_url="{{ kp_search_service_base_url }}" + +# CNAME migration variables +cloudstorage { + metadata.replace_absolute_path=true + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} +} \ No newline at end of file diff --git a/ansible/roles/logstash-deploy/tasks/main.yml b/ansible/roles/logstash-deploy/tasks/main.yml index fdeb62fe28..350c02f979 100644 --- a/ansible/roles/logstash-deploy/tasks/main.yml +++ b/ansible/roles/logstash-deploy/tasks/main.yml @@ -1,5 +1,5 @@ -- name: checking the list of installed services - service_facts: +# - name: checking the list of installed services +# service_facts: - name: Stop the monit service: name=monit state=stopped diff --git a/ansible/roles/lp-contenttool/templates/application.conf.j2 b/ansible/roles/lp-contenttool/templates/application.conf.j2 index 7444400ad2..7a990df718 100644 --- a/ansible/roles/lp-contenttool/templates/application.conf.j2 +++ b/ansible/roles/lp-contenttool/templates/application.conf.j2 @@ -16,7 +16,7 @@ content.extract_mimetype="application/vnd.ekstep.h5p-archive,application/vnd.eks cloud.src.baseurl="https://ekstep-public-{{ ekstep_env_name }}.s3-ap-south-1.amazonaws.com" -cloud.dest.baseurl="https://{{ sunbird_public_storage_account_name }}.blob.core.windows.net/{{ azure_public_container }}" +cloud.dest.baseurl="{{ cloud_storage_url }}/{{ plugin_storage }}" aws_storage_key="" aws_storage_secret="" diff --git a/ansible/roles/lp-synctool-deploy/defaults/main.yml b/ansible/roles/lp-synctool-deploy/defaults/main.yml index b9919c28d4..4eac4bf5ad 100644 --- a/ansible/roles/lp-synctool-deploy/defaults/main.yml +++ b/ansible/roles/lp-synctool-deploy/defaults/main.yml @@ -25,4 +25,11 @@ search_lms_index_host: "{{ groups['core-es']|join(':9200,')}}:9200" cloud_store: azure azure_public_container: -azure_account_name: \ No newline at end of file +azure_account_name: + +csp_migration_batch_size: 100 +csp_migration_topic_name: "{{ env }}.csp.migration.job.request" + +# Default Values For QuML Data Migration +quml_migration_batch_size: 50 +quml_migration_topic_name: "{{ env }}.quml.migration.job.request" \ No newline at end of file diff --git a/ansible/roles/lp-synctool-deploy/templates/application.conf.j2 b/ansible/roles/lp-synctool-deploy/templates/application.conf.j2 index 686acdda15..188f5e7211 100644 --- a/ansible/roles/lp-synctool-deploy/templates/application.conf.j2 +++ b/ansible/roles/lp-synctool-deploy/templates/application.conf.j2 @@ -46,7 +46,7 @@ search.batch.size=500 search.connection.timeout=30 search.index.name="{{ compositesearch_index_name }}" -nested.fields=["badgeAssertions","targets","badgeAssociations","plugins","me_totalTimeSpent","me_totalPlaySessionCount","me_totalTimeSpentInSec","batches","trackable","credentials"] 
+nested.fields=["badgeAssertions", "targets", "badgeAssociations", "plugins", "me_totalTimeSpent", "me_totalPlaySessionCount", "me_totalTimeSpentInSec", "batches", "trackable", "credentials", "discussionForum", "provider", "osMetadata", "actions", "transcripts", "accessibility"] channel.default="in.ekstep" # Cassandra Configurations @@ -76,11 +76,11 @@ content.postpublish.topic="{{ env }}.content.postpublish.request" search.lms_es_conn_info="{{ search_lms_index_host }}" -#Azure Storage details -cloud_storage_type="{{ cloud_store }}" -azure_storage_key="{{sunbird_public_storage_account_name}}" -azure_storage_secret="{{sunbird_public_storage_account_key}}" -azure_storage_container="{{ azure_public_container }}" +#Cloud Storage details +cloud_storage_type="{{ cloud_service_provider }}" +cloud_storage_key="{{ cloud_public_storage_accountname }}" +cloud_storage_secret="{{ cloud_public_storage_secret }}" +cloud_storage_container="{{ cloud_storage_content_bucketname }}" contentTypeToPrimaryCategory { ClassroomTeachingVideo: "Explanation Content" @@ -115,4 +115,16 @@ contentTypeToPrimaryCategory { LessonPlanUnit: "Lesson Plan Unit" CourseUnit: "Course Unit" TextBookUnit: "Textbook Unit" -} \ No newline at end of file +} +csp.migration.request.topic="{{ csp_migration_topic_name }}" +csp.migration.batch.size={{ csp_migration_batch_size }} +is_replace_string={{ sync_tool_is_replace_string | default('false') }} +replace_src_string= "{{ sync_tool_replace_src_string | default('CONTENT_STORAGE_BASE_PATH') }}" +replace_dest_string="{{ sync_tool_replace_dest_string | default('https://sunbirddevbbpublic.blob.core.windows.net/sunbird-content-dev') }}" + +replace_src_string_DIAL_store= "{{ sync_tool_replace_src_string_DIAL_store | default('DIAL_STORAGE_BASE_PATH') }}" +replace_dest_string_DIAL_store="{{ sync_tool_replace_dest_string_DIAL_store | default('https://sunbirddevbbpublic.blob.core.windows.net/dial') }}" + +# Config For QuML Data Migration +quml.migration.request.topic="{{ quml_migration_topic_name }}" +quml.migration.batch.size={{ quml_migration_batch_size }} \ No newline at end of file diff --git a/ansible/roles/neo4j-backup/defaults/main.yml b/ansible/roles/neo4j-backup/defaults/main.yml index b2bb981a3a..ecc58634ee 100644 --- a/ansible/roles/neo4j-backup/defaults/main.yml +++ b/ansible/roles/neo4j-backup/defaults/main.yml @@ -6,4 +6,9 @@ backup_add: "127.0.0.1:7362" var1: "_graph" service: learning graph_machine: "{{service}}{{var1}}" -neo4j_backup_azure_container_name: neo4j-backup + +neo4j_backup_dir: "{{ learner_user_home }}/backup" + +cloud_storage_neo4jbackup_bucketname: "{{ cloud_storage_management_bucketname }}" +cloud_storage_neo4jbackup_foldername: neo4j-backup + diff --git a/ansible/roles/neo4j-backup/tasks/main.yml b/ansible/roles/neo4j-backup/tasks/main.yml index 59f4a29954..fa230974d7 100755 --- a/ansible/roles/neo4j-backup/tasks/main.yml +++ b/ansible/roles/neo4j-backup/tasks/main.yml @@ -1,6 +1,3 @@ -- name: clean up backup dir after upload - file: path={{learner_user_home}}/backup state=absent - - name: delete learning_graph or language_graph #become: yes file: path={{learner_user_home}}/{{graph_machine}} state=absent @@ -21,30 +18,52 @@ - name: ls backup directory #become: yes #become_user: "{{ learner_user }}" - command: ls {{learner_user_home}}/backup/ + command: ls {{ neo4j_backup_dir }} register: var1 - name: debugging variable debug: - var: var1.stdout + var: var1.stdout -- name: Ensure azure blob storage container exists - command: az storage container create --name {{ 
neo4j_backup_azure_container_name }} - ignore_errors: true +- name: upload file to azure storage using azcopy + include_role: + name: azure-cloud-storage + tasks_from: blob-upload.yml + vars: + blob_container_name: "{{ cloud_storage_neo4jbackup_foldername }}" + blob_file_name: "{{ var1.stdout }}" + container_public_access: "off" + local_file_or_folder_path: "/home/learning/backup/{{ var1.stdout }}" + storage_account_name: "{{ cloud_management_storage_accountname }}" + storage_account_key: "{{ cloud_management_storage_secret }}" + when: cloud_service_provider == "azure" - #environment: - # AZURE_STORAGE_ACCOUNT: "{{ backup_azure_storage_account_name }}" - # AZURE_STORAGE_KEY: "{{ backup_azure_storage_access_key }}" +- name: upload file to aws s3 + include_role: + name: aws-cloud-storage + tasks_from: upload.yml + vars: + s3_bucket_name: "{{ cloud_storage_neo4jbackup_bucketname }}" + aws_access_key_id: "{{ cloud_management_storage_accountname }}" + aws_secret_access_key: "{{ cloud_management_storage_secret }}" + aws_default_region: "{{ cloud_public_storage_region }}" + local_file_or_folder_path: "/home/learning/backup/{{ var1.stdout }}" + s3_path: "{{ cloud_storage_neo4jbackup_foldername }}/{{ var1.stdout }}" + when: cloud_service_provider == "aws" -- name: Upload to azure blob storage - command: "az storage blob upload --name {{ var1.stdout }} --file /home/learning/backup/{{ var1.stdout }} --container-name {{ neo4j_backup_azure_container_name }}" - #environment: - # AZURE_STORAGE_ACCOUNT: "{{ backup_azure_storage_account_name }}" - # AZURE_STORAGE_KEY: "{{ backup_azure_storage_access_key }}" - async: 3600 - poll: 10 +- name: upload file to gcloud storage + include_role: + name: gcp-cloud-storage + tasks_from: upload.yml + vars: + gcp_storage_service_account_name: "{{ cloud_management_storage_accountname }}" + gcp_storage_key_file: "{{ cloud_management_storage_secret }}" + gcp_bucket_name: "{{ cloud_storage_neo4jbackup_bucketname }}" + gcp_path: "{{ cloud_storage_neo4jbackup_foldername }}/{{ var1.stdout }}" + local_file_or_folder_path: "/home/learning/backup/{{ var1.stdout }}" + when: cloud_service_provider == "gcloud" - name: clean up backup dir after upload - file: path={{learner_user_home}}/backup state=absent + file: path={{ neo4j_backup_dir }} state=absent diff --git a/ansible/roles/neo4j-community/tasks/main.yml b/ansible/roles/neo4j-community/tasks/main.yml index d2460ebd13..ca450efe78 100644 --- a/ansible/roles/neo4j-community/tasks/main.yml +++ b/ansible/roles/neo4j-community/tasks/main.yml @@ -62,3 +62,9 @@ template: src=neo4j-wrapper.conf.j2 dest={{ neo4j_home }}/conf/neo4j-wrapper.conf group={{learner_user}} owner={{learner_user}} when: dbms_mode != "ARBITER" +- name: Start neo4j + become: yes + become_user: "{{ learner_user }}" + shell: bin/neo4j start + args: + chdir: "{{ neo4j_home }}" diff --git a/ansible/roles/neo4j-deploy/tasks/main.yml b/ansible/roles/neo4j-deploy/tasks/main.yml index 02d94a9858..3281c49a2f 100644 --- a/ansible/roles/neo4j-deploy/tasks/main.yml +++ b/ansible/roles/neo4j-deploy/tasks/main.yml @@ -1,5 +1,5 @@ -- name: checking the list of installed services - service_facts: +#- name: checking the list of installed services +# service_facts: - name: Stop the monit service: name=monit state=stopped diff --git a/ansible/roles/neo4j-restore/defaults/main.yml b/ansible/roles/neo4j-restore/defaults/main.yml index 4676400443..e338cee980 100644 --- a/ansible/roles/neo4j-restore/defaults/main.yml +++ b/ansible/roles/neo4j-restore/defaults/main.yml @@ -1,6 +1,5 @@ 
neo4j_restore_dir: /home/{{learner_user}}/restore learner_user: learning -neo4j_backup_azure_container_name: neo4j-backup neo4j_home: "{{learner_user_home}}/neo4j-learning/neo4j-enterprise-3.3.0" learner_user_home: /home/{{learner_user}} path_to_neo4j_db: "{{neo4j_home}}/data/databases" @@ -8,4 +7,7 @@ path_to_neo4j_db: "{{neo4j_home}}/data/databases" ##### #neo4j_backup_file_name: input from jenkins job #backup_azure_storage_account_name: defined in private repo -#backup_azure_storage_access_key: defined in secrets.yml \ No newline at end of file +#backup_azure_storage_access_key: defined in secrets.yml + +cloud_storage_neo4jbackup_bucketname: "{{ cloud_storage_management_bucketname }}" +cloud_storage_neo4jbackup_foldername: neo4j-backup diff --git a/ansible/roles/neo4j-restore/meta/main.yml b/ansible/roles/neo4j-restore/meta/main.yml deleted file mode 100644 index 23b18a800a..0000000000 --- a/ansible/roles/neo4j-restore/meta/main.yml +++ /dev/null @@ -1,2 +0,0 @@ -dependencies: - - azure-cli \ No newline at end of file diff --git a/ansible/roles/neo4j-restore/tasks/main.yml b/ansible/roles/neo4j-restore/tasks/main.yml index 5d3edcfd47..1e79c759b1 100644 --- a/ansible/roles/neo4j-restore/tasks/main.yml +++ b/ansible/roles/neo4j-restore/tasks/main.yml @@ -4,13 +4,43 @@ - set_fact: neo4j_backup_file_path: "{{ neo4j_restore_dir }}/{{ neo4j_backup_file_name }}" -- name: Download restore file from azure - command: az storage blob download --container-name {{ neo4j_backup_azure_container_name }} --name {{ neo4j_backup_file_name }} --file {{ neo4j_backup_file_path }} - environment: - AZURE_STORAGE_ACCOUNT: "{{sunbird_management_storage_account_name}}" - AZURE_STORAGE_KEY: "{{sunbird_management_storage_account_key}}" - async: 3600 - poll: 10 +- name: download a file from azure storage + become: true + include_role: + name: azure-cloud-storage + tasks_from: blob-download.yml + vars: + blob_container_name: "{{ cloud_storage_neo4jbackup_foldername }}" + blob_file_name: "{{ neo4j_backup_file_name }}" + local_file_or_folder_path: "{{ neo4j_backup_file_path }}" + storage_account_name: "{{ cloud_management_storage_accountname }}" + storage_account_key: "{{ cloud_management_storage_secret }}" + when: cloud_service_provider == "azure" + +- name: download file from aws s3 + include_role: + name: aws-cloud-storage + tasks_from: download.yml + vars: + s3_bucket_name: "{{ cloud_storage_neo4jbackup_bucketname }}" + aws_access_key_id: "{{ cloud_management_storage_accountname }}" + aws_secret_access_key: "{{ cloud_management_storage_secret }}" + aws_default_region: "{{ cloud_public_storage_region }}" + local_file_or_folder_path: "{{ neo4j_backup_file_path }}" + s3_path: "{{ cloud_storage_neo4jbackup_foldername }}/{{ neo4j_backup_file_name }}" + when: cloud_service_provider == "aws" + +- name: download file from gcloud storage + include_role: + name: gcp-cloud-storage + tasks_from: download.yml + vars: + gcp_storage_service_account_name: "{{ cloud_management_storage_accountname }}" + gcp_storage_key_file: "{{ cloud_management_storage_secret }}" + gcp_bucket_name: "{{ cloud_storage_neo4jbackup_bucketname }}" + gcp_path: "{{ cloud_storage_neo4jbackup_foldername }}/{{ neo4j_backup_file_name }}" + local_file_or_folder_path: "{{ neo4j_backup_file_path }}" + when: cloud_service_provider == "gcloud" - name: Check if neo4j is running become_user: "{{ learner_user }}" diff --git a/ansible/roles/oci-cloud-storage/defaults/main.yml b/ansible/roles/oci-cloud-storage/defaults/main.yml new file mode 100644 index 
0000000000..72727de167 --- /dev/null +++ b/ansible/roles/oci-cloud-storage/defaults/main.yml @@ -0,0 +1,3 @@ +oss_bucket_name: "" +oss_path: "" +local_file_or_folder_path: "" diff --git a/ansible/roles/oci-cloud-storage/tasks/delete-folder.yml b/ansible/roles/oci-cloud-storage/tasks/delete-folder.yml new file mode 100644 index 0000000000..6ed4e6b8b4 --- /dev/null +++ b/ansible/roles/oci-cloud-storage/tasks/delete-folder.yml @@ -0,0 +1,5 @@ +--- +- name: delete files and folders recursively + shell: "oci os object bulk-delete -ns {{oss_namespace}} -bn {{oss_bucket_name}} --prefix {{oss_path}} --force" + async: 3600 + poll: 10 diff --git a/ansible/roles/oci-cloud-storage/tasks/delete.yml b/ansible/roles/oci-cloud-storage/tasks/delete.yml new file mode 100644 index 0000000000..65d18843ca --- /dev/null +++ b/ansible/roles/oci-cloud-storage/tasks/delete.yml @@ -0,0 +1,7 @@ +- name: Ensure oci oss bucket exists + command: oci os bucket get --name {{ oss_bucket_name }} + +- name: Delete object from oci oss bucket + command: oci os object delete -bn {{ oss_bucket_name }} --name {{ oss_path }} --force + async: 3600 + poll: 10 \ No newline at end of file diff --git a/ansible/roles/oci-cloud-storage/tasks/download.yml b/ansible/roles/oci-cloud-storage/tasks/download.yml new file mode 100644 index 0000000000..838ecd544e --- /dev/null +++ b/ansible/roles/oci-cloud-storage/tasks/download.yml @@ -0,0 +1,7 @@ +- name: Ensure oci oss bucket exists + command: oci os bucket get --name {{ oss_bucket_name }} + +- name: download files from oci oss bucket + command: oci os object get -bn {{ oss_bucket_name }} --name {{ oss_object_name }} --file {{ local_file_or_folder_path }} + async: 3600 + poll: 10 \ No newline at end of file diff --git a/ansible/roles/oci-cloud-storage/tasks/main.yml b/ansible/roles/oci-cloud-storage/tasks/main.yml new file mode 100644 index 0000000000..6f9dca6b63 --- /dev/null +++ b/ansible/roles/oci-cloud-storage/tasks/main.yml @@ -0,0 +1,18 @@ +--- +- name: delete files from oci oss bucket + include: delete.yml + +- name: delete folders from oci oss bucket recursively + include: delete-folder.yml + + +- name: download file from oss + include: download.yml + +- name: upload files from a local path to oci oss + include: upload.yml + +- name: upload files and folders from a local directory to oci oss + include: upload-folder.yml + + diff --git a/ansible/roles/oci-cloud-storage/tasks/upload-folder.yml b/ansible/roles/oci-cloud-storage/tasks/upload-folder.yml new file mode 100644 index 0000000000..6e4d06562c --- /dev/null +++ b/ansible/roles/oci-cloud-storage/tasks/upload-folder.yml @@ -0,0 +1,8 @@ +--- +- name: Ensure oci oss bucket exists + command: oci os bucket get --name {{ oss_bucket_name }} + +- name: Upload folder to oci oss bucket + command: oci os object bulk-upload -bn {{ oss_bucket_name }} --prefix {{ oss_path }} --src-dir {{ local_file_or_folder_path }} --content-type auto + async: 3600 + poll: 10 diff --git a/ansible/roles/oci-cloud-storage/tasks/upload.yml b/ansible/roles/oci-cloud-storage/tasks/upload.yml new file mode 100644 index 0000000000..9e1ceb4289 --- /dev/null +++ b/ansible/roles/oci-cloud-storage/tasks/upload.yml @@ -0,0 +1,8 @@ +--- +- name: Ensure oci oss bucket exists + command: oci os bucket get --name {{ oss_bucket_name }} + +- name: Upload to oci oss bucket + command: oci os object put -bn {{ oss_bucket_name }} --name {{ oss_path }} --file {{ local_file_or_folder_path }} --content-type auto --force + async: 3600 + poll: 10 diff --git a/ansible/roles/redis-backup/defaults/main.yml 
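The oci-cloud-storage tasks above are thin wrappers over the OCI CLI. A hand-run sketch of the same object lifecycle, assuming an already configured oci CLI and an illustrative bucket and object name:

```shell
oci os bucket get --name my-bucket                                   # existence check before each call
oci os object put -bn my-bucket --name backups/dump.rdb --file ./dump.rdb --content-type auto --force
oci os object get -bn my-bucket --name backups/dump.rdb --file ./restored-dump.rdb
oci os object delete -bn my-bucket --name backups/dump.rdb --force
```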
b/ansible/roles/redis-backup/defaults/main.yml index 2bb4e0e31c..234379b85c 100644 --- a/ansible/roles/redis-backup/defaults/main.yml +++ b/ansible/roles/redis-backup/defaults/main.yml @@ -1,6 +1,8 @@ redis_backup_dir: /tmp/redis-backup -redis_backup_azure_container_name: redis-backup learner_user: learning redis_data_dir: /data redis_version: 6.2.5 redis_dir: "/home/{{ learner_user }}/redis-{{ redis_version }}" + +cloud_storage_redisbackup_bucketname: "{{ cloud_storage_management_bucketname }}" +cloud_storage_redisbackup_foldername: redis-backup diff --git a/ansible/roles/redis-backup/meta/main.yml b/ansible/roles/redis-backup/meta/main.yml deleted file mode 100644 index a124d4f7cb..0000000000 --- a/ansible/roles/redis-backup/meta/main.yml +++ /dev/null @@ -1,2 +0,0 @@ -dependencies: - - azure-cli diff --git a/ansible/roles/redis-backup/tasks/main.yml b/ansible/roles/redis-backup/tasks/main.yml index b18cf1412f..c73c524221 100644 --- a/ansible/roles/redis-backup/tasks/main.yml +++ b/ansible/roles/redis-backup/tasks/main.yml @@ -14,15 +14,44 @@ src: "{{ redis_data_dir }}/dump.rdb" dest: "{{ redis_backup_dir }}/{{ redis_backup_file_name }}" remote_src: yes - -- name: upload to azure +- name: upload file to azure storage include_role: - name: artifacts-upload-azure + name: azure-cloud-storage + tasks_from: blob-upload.yml vars: - artifact: "{{ redis_backup_file_name }}" - artifact_path: "{{ redis_backup_file_path }}" - artifacts_container: "{{ redis_backup_azure_container_name }}" + blob_container_name: "{{ cloud_storage_redisbackup_foldername }}" + container_public_access: "off" + blob_file_name: "{{ redis_backup_file_name }}" + local_file_or_folder_path: "{{ redis_backup_file_path }}" + storage_account_name: "{{ cloud_management_storage_accountname }}" + storage_account_key: "{{ cloud_management_storage_secret }}" + when: cloud_service_provider == "azure" + +- name: upload file to aws s3 + include_role: + name: aws-cloud-storage + tasks_from: upload.yml + vars: + s3_bucket_name: "{{ cloud_storage_redisbackup_bucketname }}" + aws_access_key_id: "{{ cloud_management_storage_accountname }}" + aws_secret_access_key: "{{ cloud_management_storage_secret }}" + aws_default_region: "{{ cloud_public_storage_region }}" + local_file_or_folder_path: "{{ redis_backup_file_path }}" + s3_path: "{{ cloud_storage_redisbackup_foldername }}/{{ redis_backup_file_name }}" + when: cloud_service_provider == "aws" + +- name: upload file to gcloud storage + include_role: + name: gcp-cloud-storage + tasks_from: upload.yml + vars: + gcp_storage_service_account_name: "{{ cloud_management_storage_accountname }}" + gcp_storage_key_file: "{{ cloud_management_storage_secret }}" + gcp_bucket_name: "{{ cloud_storage_redisbackup_bucketname }}" + gcp_path: "{{ cloud_storage_redisbackup_foldername }}/{{ redis_backup_file_name }}" + local_file_or_folder_path: "{{ redis_backup_file_path }}" + when: cloud_service_provider == "gcloud" - name: clean up backup dir after upload file: path={{ redis_backup_dir }} state=absent diff --git a/ansible/roles/redis-restore/defaults/main.yml b/ansible/roles/redis-restore/defaults/main.yml index fec539e0d6..5861b60c55 100644 --- a/ansible/roles/redis-restore/defaults/main.yml +++ b/ansible/roles/redis-restore/defaults/main.yml @@ -1,2 +1,4 @@ -redis_backup_azure_container_name: redis-backup learning_user_home: /home/learning + +cloud_storage_redisbackup_bucketname: "{{ cloud_storage_management_bucketname }}" +cloud_storage_redisbackup_foldername: redis-backup diff --git 
a/ansible/roles/redis-restore/meta/main.yml b/ansible/roles/redis-restore/meta/main.yml deleted file mode 100644 index a124d4f7cb..0000000000 --- a/ansible/roles/redis-restore/meta/main.yml +++ /dev/null @@ -1,2 +0,0 @@ -dependencies: - - azure-cli diff --git a/ansible/roles/redis-restore/tasks/main.yml b/ansible/roles/redis-restore/tasks/main.yml index e435883030..e073c1d178 100644 --- a/ansible/roles/redis-restore/tasks/main.yml +++ b/ansible/roles/redis-restore/tasks/main.yml @@ -1,9 +1,41 @@ --- -- name: Download backup file - shell: "az storage blob download --container-name {{ redis_backup_azure_container_name }} --file {{ redis_restore_file_name }} --name {{ redis_restore_file_name }} --account-name {{sunbird_management_storage_account_name}} --account-key {{sunbird_management_storage_account_key}}" - args: - chdir: /tmp/ +- name: download a file from azure storage + include_role: + name: azure-cloud-storage + tasks_from: blob-download.yml + vars: + blob_container_name: "{{ cloud_storage_redisbackup_foldername }}" + blob_file_name: "{{ redis_restore_file_name }}" + local_file_or_folder_path: "/tmp/{{ redis_restore_file_name }}" + storage_account_name: "{{ cloud_management_storage_accountname }}" + storage_account_key: "{{ cloud_management_storage_secret }}" + when: cloud_service_provider == "azure" + +- name: download file from aws s3 + include_role: + name: aws-cloud-storage + tasks_from: download.yml + vars: + s3_bucket_name: "{{ cloud_storage_redisbackup_bucketname }}" + aws_access_key_id: "{{ cloud_management_storage_accountname }}" + aws_secret_access_key: "{{ cloud_management_storage_secret }}" + aws_default_region: "{{ cloud_public_storage_region }}" + local_file_or_folder_path: "/tmp/{{ redis_restore_file_name }}" + s3_path: "{{ cloud_storage_redisbackup_foldername }}/{{ redis_restore_file_name }}" + when: cloud_service_provider == "aws" + +- name: download file from gcloud storage + include_role: + name: gcp-cloud-storage + tasks_from: download.yml + vars: + gcp_storage_service_account_name: "{{ cloud_management_storage_accountname }}" + gcp_storage_key_file: "{{ cloud_management_storage_secret }}" + gcp_bucket_name: "{{ cloud_storage_redisbackup_bucketname }}" + gcp_path: "{{ cloud_storage_redisbackup_foldername }}/{{ redis_restore_file_name }}" + local_file_or_folder_path: "/tmp/{{ redis_restore_file_name }}" + when: cloud_service_provider == "gcloud" - name: stop redis to take backup become: yes diff --git a/ansible/roles/samza-job-monitor/files/samza_alerts.zip b/ansible/roles/samza-job-monitor/files/samza_alerts.zip deleted file mode 100644 index 8a690455df..0000000000 Binary files a/ansible/roles/samza-job-monitor/files/samza_alerts.zip and /dev/null differ diff --git a/ansible/roles/samza-job-monitor/tasks/main.yml b/ansible/roles/samza-job-monitor/tasks/main.yml deleted file mode 100644 index 066a80e58f..0000000000 --- a/ansible/roles/samza-job-monitor/tasks/main.yml +++ /dev/null @@ -1,62 +0,0 @@ ---- -- include_vars: "{{inventory_dir}}/secrets.yml" - -- name: Install unzip - apt: name=unzip state=present - become: yes - -- name: Copy and unarchive monitor code - unarchive: src=samza_alerts.zip dest=/opt/ - -- name: Install bundler - apt: name=bundler state=present - -- name: Install bundler - command: bash -lc "bundle install" - args: - chdir: "/opt/samza_alerts" - -- name: change dir permisson - file: path=/opt/samza_alerts owner=hduser group=hadoop recurse=yes mode=0755 - -- name: make job_alerts.rb executable - file: path=/opt/samza_alerts/job_alerts.rb 
state=touch mode=744 - -- name: Copy file - template: src=samza-monitor dest=/etc/init.d/samza-monitor mode=755 - -- name: Detect if this is a systemd based system - command: cat /proc/1/comm - register: init - -- set_fact: use_systemd=True - when: init.stdout == 'systemd' - -- set_fact: use_systemd=False - when: init.stdout != 'systemd' - -- name: Copy file - template: src=samza-monitor-systemd dest=/etc/systemd/system/samza-monitor.service mode=755 - sudo: yes - when: use_systemd - -- command: systemctl enable samza-monitor.service - sudo: yes - ignore_errors: true - when: use_systemd - -- name: Create log directory - file: path=/var/log/samza-monitor state=directory owner=hduser group=hadoop recurse=yes mode=0755 - -- name: Create log file - file: path=/var/log/samza-monitor/samza-monitor.log state=touch owner=hduser group=hadoop mode=0644 - -- name: Restart samza-monitor - service: name=samza-monitor state=restarted - sudo: yes - when: init.stdout != 'systemd' - -- name: Restart samza-monitor - systemd: name=samza-monitor state=restarted - sudo: yes - when: use_systemd diff --git a/ansible/roles/samza-job-monitor/templates/samza-monitor b/ansible/roles/samza-job-monitor/templates/samza-monitor deleted file mode 100644 index 176fcf244f..0000000000 --- a/ansible/roles/samza-job-monitor/templates/samza-monitor +++ /dev/null @@ -1,109 +0,0 @@ -#! /bin/sh -### BEGIN INIT INFO -# Provides: samza monitor -# Default-Start: 2 3 4 5 -# Default-Stop: S 0 1 6 -# Short-Description: Samza Monitor -# Description: Starts Samza Monitor as a daemon. -### END INIT INFO - -DESC="Samaza Job Monitor Daemon" -NAME=/opt/samza_alerts/job_alerts.rb -SCRIPTNAME=/etc/init.d/samza-monitor -PID="/var/run/samza-monitor.pid" - -ARGS="" -JOBS_COUNT="{{ monitor_jobs_count }}" -YARN_URL="{{ monitor_yarn_url }}" -SAMZA_ENV="{{ env }}" -MONITORINGING_ENV_NAME="{{ monitor_jobs_env_name }}" -SLACK_CHANNEL="{{ slack_channel }}" -SLACK_URL="{{ monitor_slack_url }}" -CHECK_DELAY="60" - -# Exit if the package is not installed -if [ ! -x "$NAME" ]; then -{ - echo "Couldn't find $NAME" - exit 99 -} -fi - -# Define LSB log_* functions. -# Depend on lsb-base (>= 3.0-6) to ensure that this file is present. -. /lib/lsb/init-functions - -# -# Function that starts the daemon/service -# -do_start() -{ - - start-stop-daemon --start --pidfile $PID --quiet --exec $NAME --test > /dev/null \ - || return 1 - - start-stop-daemon −−chdir=/opt/samza_alerts --start --make-pidfile --pidfile $PID --quiet --background --exec /usr/bin/env JOBS_COUNT=$JOBS_COUNT YARN_URL=$YARN_URL SAMZA_ENV=$SAMZA_ENV MONITORINGING_ENV_NAME=$MONITORINGING_ENV_NAME SLACK_CHANNEL=$SLACK_CHANNEL SLACK_URL=$SLACK_URL CHECK_DELAY=$CHECK_DELAY $NAME -- $ARGS \ - || return 2 -} - -# -# Function that stops the daemon/service -# -do_stop() -{ - # Return - # 0 if daemon has been stopped - # 1 if daemon was already stopped - # 2 if daemon could not be stopped - # other if a failure occurred - start-stop-daemon --stop --pidfile $PID --quiet --oknodo - RETVAL="$?" - rm -f $PID - return "$RETVAL" -} - -case "$1" in - start) - log_daemon_msg "Starting $DESC" - do_start - case "$?" in - 0|1) log_end_msg 0 ;; - 2) log_end_msg 1 ;; - esac - ;; - stop) - log_daemon_msg "Stopping $DESC" - do_stop - case "$?" in - 0|1) log_end_msg 0 ;; - 2) log_end_msg 1 ;; - esac - ;; - status) - status_of_proc -p $PID $NAME $DESC && exit 0 || exit $? - ;; - restart) - log_daemon_msg "Restarting $DESC" - do_stop - case "$?" in - 0|1) - do_start - case "$?" 
in - 0) log_end_msg 0 ;; - 1) log_end_msg 1 ;; # Old process is still running - *) log_end_msg 1 ;; # Failed to start - esac - ;; - *) - # Failed to stop - log_end_msg 1 - ;; - esac - ;; - *) - echo "Usage: $SCRIPTNAME {start|stop|restart}" >&2 - exit 3 - ;; -esac - -exit 0 diff --git a/ansible/roles/samza-job-monitor/templates/samza-monitor-systemd b/ansible/roles/samza-job-monitor/templates/samza-monitor-systemd deleted file mode 100644 index bd8dce7efd..0000000000 --- a/ansible/roles/samza-job-monitor/templates/samza-monitor-systemd +++ /dev/null @@ -1,14 +0,0 @@ -[Unit] -Description=samza-monitor deamon - -[Service] -Type=forking -User=root -Group=root -LimitNOFILE=32768 -Restart=on-failure -ExecStart=/etc/init.d/samza-monitor start -ExecStop=/etc/init.d/samza-monitor stop - -[Install] -WantedBy=multi-user.target diff --git a/ansible/roles/samza-job-server/tasks/main.yml b/ansible/roles/samza-job-server/tasks/main.yml deleted file mode 100644 index ce685a8d5d..0000000000 --- a/ansible/roles/samza-job-server/tasks/main.yml +++ /dev/null @@ -1,14 +0,0 @@ -- name: Create Directory for Jobs - file: path={{item}} owner=hduser group=hadoop recurse=yes state=directory - with_items: - - /home/hduser/samza-jobs - become: yes -- name: Install python - apt: name=python state=present - become: yes -- name: Copy init file - template: src=samza-job-server.sh dest=/etc/init.d/samza-job-server mode=755 - become: yes -- name: Start samza job server - service: name=samza-job-server state=restarted enabled=yes - become: yes \ No newline at end of file diff --git a/ansible/roles/samza-job-server/templates/samza-job-server.sh b/ansible/roles/samza-job-server/templates/samza-job-server.sh deleted file mode 100644 index c0535ef699..0000000000 --- a/ansible/roles/samza-job-server/templates/samza-job-server.sh +++ /dev/null @@ -1,104 +0,0 @@ -#! /bin/sh -### BEGIN INIT INFO -# Provides: Ecosystem-Platform-API -# Default-Start: 2 3 4 5 -# Default-Stop: S 0 1 6 -# Short-Description: Ecosystem-Platform-API -# Description: Starts samza-job-server as a daemon. -### END INIT INFO - -DESC="Samza-Job-Server Daemon" -NAME=/usr/bin/python -LOGFILE="/var/log/samza-job-server/" -SCRIPTNAME=/etc/init.d/samza-job-server -PID="/var/run/samza-job-server.pid" - -ARGS="-m SimpleHTTPServer" - -SERVER_PATH="/home/hduser/samza-jobs" -# Exit if the package is not installed -if [ ! -x "$NAME" ]; then -{ - echo "Couldn't find $NAME" - exit 99 -} -fi - -# Define LSB log_* functions. -# Depend on lsb-base (>= 3.0-6) to ensure that this file is present. -. /lib/lsb/init-functions - -# -# Function that starts the daemon/service -# -do_start() -{ - - start-stop-daemon --start --pidfile $PID --quiet --exec $NAME --test > /dev/null \ - || return 1 - - start-stop-daemon --start --make-pidfile --pidfile $PID --quiet --background -d $SERVER_PATH --exec $NAME -- $ARGS \ - || return 2 -} - -# -# Function that stops the daemon/service -# -do_stop() -{ - # Return - # 0 if daemon has been stopped - # 1 if daemon was already stopped - # 2 if daemon could not be stopped - # other if a failure occurred - start-stop-daemon --stop --pidfile $PID --quiet --oknodo - RETVAL="$?" - rm -f $PID - return "$RETVAL" -} - -case "$1" in - start) - log_daemon_msg "Starting $DESC" - do_start - case "$?" in - 0|1) log_end_msg 0 ;; - 2) log_end_msg 1 ;; - esac - ;; - stop) - log_daemon_msg "Stopping $DESC" - do_stop - case "$?" in - 0|1) log_end_msg 0 ;; - 2) log_end_msg 1 ;; - esac - ;; - status) - status_of_proc -p $PID $NAME $DESC && exit 0 || exit $? 
- ;; - restart) - log_daemon_msg "Restarting $DESC" - do_stop - case "$?" in - 0|1) - do_start - case "$?" in - 0) log_end_msg 0 ;; - 1) log_end_msg 1 ;; # Old process is still running - *) log_end_msg 1 ;; # Failed to start - esac - ;; - *) - # Failed to stop - log_end_msg 1 - ;; - esac - ;; - *) - echo "Usage: $SCRIPTNAME {start|stop|restart}" >&2 - exit 3 - ;; -esac - -exit 0 diff --git a/ansible/roles/samza-jobs-additional-config/defaults/main.yml b/ansible/roles/samza-jobs-additional-config/defaults/main.yml deleted file mode 100644 index 626b7f0b2d..0000000000 --- a/ansible/roles/samza-jobs-additional-config/defaults/main.yml +++ /dev/null @@ -1,5 +0,0 @@ ---- -object_denormalization_additional_config_dir: /etc/samza-jobs/{{env}} -object_denormalization_additional_config: "{{object_denormalization_additional_config_dir}}/object-denormalization-additional-config.json" -es_router_additional_config: "{{object_denormalization_additional_config_dir}}/es-router-additional-config.json" -es_router_additional_secondary_config: "{{object_denormalization_additional_config_dir}}/es-router-additional-secondary-config.json" diff --git a/ansible/roles/samza-jobs-additional-config/tasks/main.yml b/ansible/roles/samza-jobs-additional-config/tasks/main.yml deleted file mode 100644 index 11e1535f12..0000000000 --- a/ansible/roles/samza-jobs-additional-config/tasks/main.yml +++ /dev/null @@ -1,12 +0,0 @@ ---- -- name: Create directory for additional config - file: path={{object_denormalization_additional_config_dir}} owner=hduser group=hadoop recurse=yes state=directory - -- name: Copy Object denormalization additional config - template: src=object-denormalization-additional-config.json dest={{object_denormalization_additional_config}} owner=hduser group=hadoop - -- name: Copy Primary es router additional config - template: src=es-router-additional-config.json dest={{es_router_additional_config}} owner=hduser group=hadoop - -- name: Copy Secondary es router additional config - template: src=es-router-additional-secondary-config.json dest={{es_router_additional_secondary_config}} owner=hduser group=hadoop diff --git a/ansible/roles/samza-jobs-additional-config/templates/es-router-additional-config.json b/ansible/roles/samza-jobs-additional-config/templates/es-router-additional-config.json deleted file mode 100644 index 2c47f45f49..0000000000 --- a/ansible/roles/samza-jobs-additional-config/templates/es-router-additional-config.json +++ /dev/null @@ -1,140 +0,0 @@ -{ - "topicConfigs":[ - { - "names":["{{env}}.telemetry.objects.de_normalized"], - "eventConfigs":[ - { - "rules":[ - { - "idPath":"eid", - "idValue":"OE.*" - } - ], - "esIndexValue":"telemetry", - "esIndexType":"events_v1", - "weight":3, - "cumulative":false, - "esIndexDate": { - "primary" : "ts", - "primaryFormat": "string", - "secondary" : "ets", - "secondaryFormat": "epoch", - "updatePrimary": true - } - - }, - { - "rules":[ - { - "idPath":"eid", - "idValue":"GE.*" - } - ], - "esIndexValue":"telemetry", - "esIndexType":"events_v1", - "weight":3, - "cumulative":false, - "esIndexDate": { - "primary" : "ts", - "primaryFormat": "string", - "secondary" : "ets", - "secondaryFormat": "epoch", - "updatePrimary": true - } - }, - { - "rules":[ - { - "idPath":"eid", - "idValue":"BE_ACCESS|BE_JOB_START|BE_JOB_LOG|BE_JOB_END|BE_SERVICE_LOG|BE_SERVICE_LIFECYCLE|BE_SERVICE_METRIC" - } - ], - "esIndexValue":"infra", - "esIndexType":"infra", - "weight":4, - "cumulative":false, - "esIndexDate": { - "primary" : "ts", - "primaryFormat": "string", - "secondary" 
: "ets", - "secondaryFormat": "epoch", - "updatePrimary": true - } - }, - { - "rules":[ - { - "idPath":"eid", - "idValue":"CP_.*|CE_.*|BE_.*" - } - ], - "esIndexValue":"backend", - "esIndexType":"backend", - "weight":3, - "cumulative":false, - "esIndexDate": { - "primary" : "ts", - "primaryFormat": "string", - "secondary" : "ets", - "secondaryFormat": "epoch", - "updatePrimary": true - } - }, - { - "rules":[ - { - "idPath":"context.granularity", - "idValue":"CUMULATIVE" - }, - { - "idPath":"learning", - "idValue": "true" - } - ], - - "esIndexValue":"learning-cumulative", - "esIndexType":"events_v1", - "weight":3, - "cumulative":true - }, - { - "rules":[ - { - "idPath":"learning", - "idValue": "true" - } - ], - "esIndexValue":"learning", - "esIndexType":"events_v1", - "weight":2, - "cumulative":false, - "esIndexDate": { - "primary" : "context.date_range.to", - "primaryFormat": "epoch", - "updatePrimary": false - } - }, - { - "rules":[ - { - "idPath":"eid", - "idValue":".*" - } - ], - "esIndexValue":"telemetry", - "esIndexType":"events_v1", - "weight":1, - "cumulative":false, - "esIndexDate": { - "primary" : "ts", - "primaryFormat": "string", - "secondary" : "ets", - "secondaryFormat": "epoch", - "updatePrimary": true - } - } - ] - - } - ] -} diff --git a/ansible/roles/samza-jobs-additional-config/templates/es-router-additional-secondary-config.json b/ansible/roles/samza-jobs-additional-config/templates/es-router-additional-secondary-config.json deleted file mode 100644 index f5c15c46ed..0000000000 --- a/ansible/roles/samza-jobs-additional-config/templates/es-router-additional-secondary-config.json +++ /dev/null @@ -1,29 +0,0 @@ -{ - "topicConfigs":[ - { - "names":["{{env}}.telemetry.valid.fail","{{env}}.telemetry.duplicate","{{env}}.telemetry.with_location.fail","{{env}}.telemetry.objects.de_normalized.fail","{{env}}.telemetry.es_router_primary.fail","{{env}}.telemetry.es_indexer_primary.fail"], - "eventConfigs":[ - { - "rules":[ - { - "idPath":"eid", - "idValue":".*" - } - ], - "esIndexValue":"failed-telemetry", - "esIndexType":"events", - "weight":3, - "cumulative":false, - "esIndexDate": { - "primary" : "ts", - "primaryFormat": "string", - "secondary" : "ets", - "secondaryFormat": "epoch", - "updatePrimary": false - } - - } - ] - } - ] -} diff --git a/ansible/roles/samza-jobs-additional-config/templates/object-denormalization-additional-config.json b/ansible/roles/samza-jobs-additional-config/templates/object-denormalization-additional-config.json deleted file mode 100644 index 874a18a337..0000000000 --- a/ansible/roles/samza-jobs-additional-config/templates/object-denormalization-additional-config.json +++ /dev/null @@ -1,84 +0,0 @@ -{ - "eventConfigs": [ - { - "name": "Portal User for all CP & CE events", - "eidPattern": "C[PE]\\_.*", - "denormalizationConfigs": [ - { - "idFieldPath": "uid", - "denormalizedFieldPath": "portaluserdata" - } - ] - }, - { - "name": "Partner data for ME_CONTENT_SNAPSHOT_SUMMARY", - "eidPattern": "ME_CONTENT_SNAPSHOT_SUMMARY", - "denormalizationConfigs": [ - { - "idFieldPath": "dimensions.partner_id", - "denormalizedFieldPath": "partnerdata" - } - ] - }, - { - "name": "Partner data for ME_ASSET_SNAPSHOT_SUMMARY", - "eidPattern": "ME_ASSET_SNAPSHOT_SUMMARY", - "denormalizationConfigs": [ - { - "idFieldPath": "dimensions.partner_id", - "denormalizedFieldPath": "partnerdata" - } - ] - }, - { - "name": "Portal user data for ME_APP_SESSION_SUMMARY", - "eidPattern": "ME_APP_SESSION_SUMMARY", - "denormalizationConfigs": [ - { - "idFieldPath": "uid", - 
"denormalizedFieldPath": "portaluserdata" - } - ] - }, - { - "name": "Portal user data for ME_APP_USAGE_SUMMARY", - "eidPattern": "ME_APP_USAGE_SUMMARY", - "denormalizationConfigs": [ - { - "idFieldPath": "dimensions.author_id", - "denormalizedFieldPath": "portaluserdata" - } - ] - }, - { - "name": "Portal user data for ME_CE_SESSION_SUMMARY", - "eidPattern": "ME_CE_SESSION_SUMMARY", - "denormalizationConfigs": [ - { - "idFieldPath": "uid", - "denormalizedFieldPath": "portaluserdata" - } - ] - }, - { - "name": "Portal user data for ME_AUTHOR_USAGE_SUMMARY", - "eidPattern": "ME_AUTHOR_USAGE_SUMMARY", - "denormalizationConfigs": [ - { - "idFieldPath": "uid", - "denormalizedFieldPath": "portaluserdata" - } - ] - }, - { - "name": "Portal user data for ME_TEXTBOOK_SESSION_SUMMARY", - "eidPattern": "ME_TEXTBOOK_SESSION_SUMMARY", - "denormalizationConfigs": [ - { - "idFieldPath": "uid", - "denormalizedFieldPath": "portaluserdata" - } - ] - } - ] -} diff --git a/ansible/roles/samza-jobs-telemetry-schemas/tasks/main.yml b/ansible/roles/samza-jobs-telemetry-schemas/tasks/main.yml deleted file mode 100644 index e7cee03b36..0000000000 --- a/ansible/roles/samza-jobs-telemetry-schemas/tasks/main.yml +++ /dev/null @@ -1,17 +0,0 @@ ---- -- name: Create schema directory - file: path={{telemetry_schema_directory}} owner=hduser group=hadoop recurse=yes state=directory - become: yes - -- name: Copy schemas folder - copy: src=schemas dest={{telemetry_schema_directory}} owner=hduser group=hadoop - become: yes - -- name: get schema dir names - raw: find {{telemetry_schema_path}} -type f -name "*.*" - register: schemas - -- name: change internal schema file reference - replace: dest={{item}} regexp="http://localhost:7070/schemas/" replace="file://{{telemetry_schema_path}}/" owner=hduser group=hadoop - with_items: "{{ schemas.stdout_lines }}" - become: yes \ No newline at end of file diff --git a/ansible/roles/samza-jobs/defaults/main.yml b/ansible/roles/samza-jobs/defaults/main.yml deleted file mode 100644 index 63f806a721..0000000000 --- a/ansible/roles/samza-jobs/defaults/main.yml +++ /dev/null @@ -1,51 +0,0 @@ ---- -samza_jobs_dir: /home/hduser/samza-jobs/{{env}} -job_status_file: /home/hduser/samza-jobs/{{env}}/extract/job_status -yarn_path: /usr/local/hadoop/bin -object_denormalization_additional_config_dir: /etc/samza-jobs/{{env}} -object_denormalization_additional_config: "{{object_denormalization_additional_config_dir}}/object-denormalization-additional-config.json" -es_router_additional_config: "{{object_denormalization_additional_config_dir}}/es-router-additional-config.json" -es_router_additional_secondary_config: "{{object_denormalization_additional_config_dir}}/es-router-additional-secondary-config.json" -# lpdeploy: no -# dpdeploy: no -hierarchy_keyspace_name: "{{env}}_hierarchy_store" -cloud_upload_retry_count: 3 -streaming_mime_type: "video/mp4,video/webm" -__yarn_port__: 8000 -delayInMilliSeconds: 60000 -retryTimeInMilliSeconds: 10000 -retry_backoff_base_in_seconds: 10 -bypass_reverse_search: true -retry_limit: 4 -retry_limit_enable: true -publish_pipeline_container_count: 1 -publish_yarn_container_memory_mb: 1024 -publish_pipeline_task_opts: "-Dfile.encoding=UTF8 -XX:-UseG1GC -Xmx800m" -mw_shard_id: 1 -google_vision_tagging: false -es_port: 9200 -cassandra_port: 9042 -content_keyspace_table: content_data -collection_fullecar_disable: true -max_iteration_count_for_samza_job: 2 -composite_search_indexer_container_count: 1 -compositesearch_index_name: "compositesearch" -cloud_store: azure 
-hadoop_version: 2.7.2 -redis_port: 6379 -google_api_key: "123" -sunbird_installation: "{{env}}" -dial_base_url: "https://{{domain_name}}/dial/" -samza_coordinator_replication_factor: 1 -samza_checkpoint_replication_factor: 1 -course_batch_updater_container_count: 1 -course_certificate_generator_container_count: 1 -course_progress_batch_size: 100 -itemset_generate_pdf: true -auto_creator_container_count: 1 -content_streaming_enabled: false -mvc_search_indexer_container_count: 1 -auto_creator_artifact_allowed_sources: "" -auto_creator_gservice_acct_cred: "" -certificate_pre_processor_container_count: 1 -master_category_validation_enabled: "Yes" \ No newline at end of file diff --git a/ansible/roles/samza-jobs/files/find_job_name.sh b/ansible/roles/samza-jobs/files/find_job_name.sh deleted file mode 100644 index 05f0605223..0000000000 --- a/ansible/roles/samza-jobs/files/find_job_name.sh +++ /dev/null @@ -1 +0,0 @@ -sed -n "/job\.name.*$/ p" $1 | sed -n "s/=/\\t/g p" | cut -f 2 \ No newline at end of file diff --git a/ansible/roles/samza-jobs/files/get_all_job_name.sh b/ansible/roles/samza-jobs/files/get_all_job_name.sh deleted file mode 100644 index 7975c8a34a..0000000000 --- a/ansible/roles/samza-jobs/files/get_all_job_name.sh +++ /dev/null @@ -1,7 +0,0 @@ -#!/usr/bin/env bash -find . -name "*.properties" | while read fname; do - job_name=`sed -n "/^job\.name.*$/ p" $fname | sed -n "s/=/\\t/g p" | cut -f 2` - folder_path=$(dirname `dirname "$fname"`) - folder_name=`basename $folder_path` - echo "$folder_name:$job_name:---:stopped" -done > $1 diff --git a/ansible/roles/samza-jobs/files/get_all_running_app_id.sh b/ansible/roles/samza-jobs/files/get_all_running_app_id.sh deleted file mode 100644 index 74aa7c0491..0000000000 --- a/ansible/roles/samza-jobs/files/get_all_running_app_id.sh +++ /dev/null @@ -1,2 +0,0 @@ -#!/usr/bin/env bash -./yarn application -list | cut -f 2 | sed 1,'/Application-Name/'d \ No newline at end of file diff --git a/ansible/roles/samza-jobs/files/get_all_running_app_name.sh b/ansible/roles/samza-jobs/files/get_all_running_app_name.sh deleted file mode 100644 index b3b1b9dff2..0000000000 --- a/ansible/roles/samza-jobs/files/get_all_running_app_name.sh +++ /dev/null @@ -1,11 +0,0 @@ -#!/usr/bin/env bash -job_names=(`./yarn application -list | cut -f 2 | sed 1,'/Application-Name/'d | sed 's/_1$//'`) -job_ids=(`./yarn application -list | cut -f 1 | sed 1,'/Application-Id/'d`) -count=${#job_names[@]} -for (( i=0; i<${count}; i++ )); -do - job_name=${job_names[i]} - job_id=${job_ids[i]} - `sed -i /$job_name/s/stopped/started/g $1` - `sed -i /$job_name/s/---/$job_id/g $1` -done diff --git a/ansible/roles/samza-jobs/files/kill_all_app.sh b/ansible/roles/samza-jobs/files/kill_all_app.sh deleted file mode 100644 index 55f7341e25..0000000000 --- a/ansible/roles/samza-jobs/files/kill_all_app.sh +++ /dev/null @@ -1,9 +0,0 @@ -#!/usr/bin/env bash -./yarn application -list > applist.txt -sed -n "/$1.*$/ p" applist.txt | cut -f 1 > temp.txt -while read in; -do -./yarn application -kill "$in"; -done < temp.txt -rm temp.txt -rm applist.txt \ No newline at end of file diff --git a/ansible/roles/samza-jobs/files/kill_jobs.sh b/ansible/roles/samza-jobs/files/kill_jobs.sh deleted file mode 100644 index 267515cdea..0000000000 --- a/ansible/roles/samza-jobs/files/kill_jobs.sh +++ /dev/null @@ -1,11 +0,0 @@ -#!/usr/bin/env bash -cat $1 | while read LINE -do - application_id=`echo $LINE | awk -F':' '{print $3}'`; - status=`echo $LINE | awk -F':' '{print $4}'`; - - if [ "$status" == "restart" 
]
-   then
-       ./yarn application -kill $application_id
-   fi
-done
\ No newline at end of file
diff --git a/ansible/roles/samza-jobs/files/remove_old_tar.sh b/ansible/roles/samza-jobs/files/remove_old_tar.sh
deleted file mode 100644
index 13d0547b89..0000000000
--- a/ansible/roles/samza-jobs/files/remove_old_tar.sh
+++ /dev/null
@@ -1,12 +0,0 @@
-#!/usr/bin/env bash
-cat $1 | awk -F':' '{print $1}' > tmp.txt
-DIRS=`ls -l $2/extract/ | egrep '^d'| awk '{print $9}'`
-for dir in $DIRS
-do
-   if ! grep -Fxq $dir tmp.txt
-   then
-       rm -rf $dir
-       rm $2/$dir
-   fi
-done
-rm tmp.txt
\ No newline at end of file
diff --git a/ansible/roles/samza-jobs/files/start_jobs.sh b/ansible/roles/samza-jobs/files/start_jobs.sh
deleted file mode 100644
index 4d048a58a8..0000000000
--- a/ansible/roles/samza-jobs/files/start_jobs.sh
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/usr/bin/env bash
-folder_path=$2
-cat $1 | while read LINE
-do
-   dir_name=`echo $LINE | awk -F':' '{print $1}'`;
-   job_name=`echo $LINE | awk -F':' '{print $2}'`;
-   application_id=`echo $LINE | awk -F':' '{print $3}'`;
-   status=`echo $LINE | awk -F':' '{print $4}'`;
-   properties_path="$folder_path/$dir_name/config/*.properties"
-   config_file_path=`ls -d $properties_path`
-   if [ "$status" == "stopped" ] || [ "$status" == "restart" ]
-   then
-       ./$dir_name/bin/run-job.sh --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file:///$config_file_path
-   fi
-done
\ No newline at end of file
diff --git a/ansible/roles/samza-jobs/files/update_new_job_name.sh b/ansible/roles/samza-jobs/files/update_new_job_name.sh
deleted file mode 100644
index 24e174ce54..0000000000
--- a/ansible/roles/samza-jobs/files/update_new_job_name.sh
+++ /dev/null
@@ -1,14 +0,0 @@
-#!/usr/bin/env bash
-find $2 -name "*.properties" | while read fname; do
-   job_name=`sed -n "/^job\.name.*$/ p" $fname | sed -n "s/=/\\t/g p" | cut -f 2`
-   folder_path=$(dirname `dirname "$fname"`)
-   folder_name=`basename $folder_path`
-   if grep -Fwq $job_name $1
-   then
-       `sed -i /$job_name/s/^.*\.gz/$folder_name/ $1`;
-       `sed -i /$job_name/s/started/restart/ $1`;
-   else
-       echo "adding"
-       echo "$folder_name:$job_name:---:stopped" >> $1
-   fi
-done
\ No newline at end of file
diff --git a/ansible/roles/samza-jobs/tasks/deploy.yml b/ansible/roles/samza-jobs/tasks/deploy.yml
deleted file mode 100644
index 2b9598eb02..0000000000
--- a/ansible/roles/samza-jobs/tasks/deploy.yml
+++ /dev/null
@@ -1,104 +0,0 @@
----
-- name: Create Directory for Jobs
-  file: path={{item}} owner=hduser group=hadoop recurse=yes state=directory
-  with_items:
-    - "{{samza_jobs_dir}}"
-    - "{{samza_jobs_dir}}/extract"
-
-- name: Copy script to get all running jobs
-  copy: src=get_all_running_app_name.sh dest=/usr/local/hadoop/bin owner=hduser group=hadoop mode="u=rwx,g=rx,o=r"
-
-- name: Copy script to get all job names
-  copy: src=get_all_job_name.sh dest="{{samza_jobs_dir}}/extract" owner=hduser group=hadoop mode="u=rwx,g=rx,o=r"
-
-- name: Copy script to get updated job names from extracted tar
-  copy: src=update_new_job_name.sh dest="{{samza_jobs_dir}}/extract" owner=hduser group=hadoop mode="u=rwx,g=rx,o=r"
-
-- name: Copy script to start jobs based on the status
-  copy: src=start_jobs.sh dest="{{samza_jobs_dir}}/extract" owner=hduser group=hadoop mode="u=rwx,g=rx,o=r"
-
-- name: Copy script to remove old job tar
-  copy: src=remove_old_tar.sh dest="{{samza_jobs_dir}}/extract" owner=hduser group=hadoop mode="u=rwx,g=rx,o=r"
-
-- name: Copy script to kill jobs based on the status
-  copy: src=kill_jobs.sh
dest=/usr/local/hadoop/bin owner=hduser group=hadoop mode="u=rwx,g=rx,o=r" - -- name: Remove file of job status - file: path="{{job_status_file}}" state=absent - -- name: Get job names from folder - command: bash -lc "./get_all_job_name.sh {{job_status_file}}" - args: - chdir: "{{samza_jobs_dir}}/extract" - -- name: Ensure yarn resource manager is running - command: bash -lc "(ps aux | grep yarn-hduser-resourcemanager | grep -v grep) || /usr/local/hadoop/sbin/yarn-daemon.sh --config /usr/local/hadoop-{{hadoop_version}}/conf/ start resourcemanager" - become: yes - become_user: hduser - -- name: Update status of running job in file - command: bash -lc "./get_all_running_app_name.sh {{job_status_file}}" - args: - chdir: /usr/local/hadoop/bin - -- name: copy new jobs tar ball - copy: src={{ item }} dest={{samza_jobs_dir}}/ force=no owner=hduser group=hadoop - with_fileglob: - - ./jobs/* - register: new_jobs - -- name: Create Directory to extract new jobs - file: path={{samza_jobs_dir}}/extract/{{item.item | basename }} owner=hduser group=hadoop recurse=yes state=directory - register: extract_dir - when: "{{item|changed}}" - with_items: "{{ (new_jobs|default({})).results|default([]) }}" - -- name: extract new jobs - command: tar -xvf "{{samza_jobs_dir}}/{{item.item | basename}}" -C "{{samza_jobs_dir}}/extract/{{item.item | basename }}" - when: "{{item|changed}}" - with_items: "{{ (new_jobs|default({})).results|default([]) }}" - -- name: Create Directory to extract new jobs - file: path={{samza_jobs_dir}}/extract/ owner=hduser group=hadoop recurse=yes - -- name: Get all new job configs - shell: "ls -d -1 {{item.path}}/config/*.properties" - register: config_files - when: "{{item|changed}}" - with_items: "{{ (extract_dir|default({})).results|default([]) }}" - -- name: update environment specific details in new job configs - replace: dest="{{item[1].stdout}}" regexp="{{item[0].key}}" replace="{{item[0].value}}" - when: "{{item[1]|changed}}" - with_nested: - - [{key: "__yarn_host__", value: "{{__yarn_host__}}"}, {key: "__yarn_port__", value: "{{__yarn_port__}}"}, {key: "__env__", value: "{{env}}" }, {key: "__env_name__", value: "{{env_name}}" }, {key: "__zookeepers__", value: "{{zookeepers}}"}, {key: "__kafka_brokers__", value: "{{kafka_brokers}}"}, {key: "__delayInMilliSeconds__", value: "{{delayInMilliSeconds}}" }, {key: "__retryTimeInMilliSeconds__", value: "{{retryTimeInMilliSeconds}}" }, {key: "__bypass_reverse_search__", value: "{{bypass_reverse_search}}" }, {key: "__retryBackoffBaseInSeconds__", value: "{{retry_backoff_base_in_seconds}}" }, {key: "__retryLimit__", value: "{{retry_limit}}" }, {key: "__retryLimitEnable__", value: "{{retry_limit_enable}}" }, {key: "__google_api_key__", value: "{{google_api_key}}" }, {key: "__searchServiceEndpoint__", value: "{{search_service_endpoint}}" }, {key: "__objectDenormalizationAdditionalConfig__", value: "{{object_denormalization_additional_config}}" },{key: "__audit_es_host__", value: "{{audit_es_host}}"}, {key: "__search_es_host__", value: "{{search_es_host}}"}, {key: "__redis_host__", value: "{{redis_host}}"}, {key: "__dp_redis_host__", value: "{{dp_redis_host}}"}, {key: "__redis_port__", value: "{{redis_port}}"}, {key: "__environment_id__", value: "{{environment_id}}"}, {key: "__graph_passport_key__", value: "{{graph_passport_key}}"}, {key: "__lp_bolt_url__", value: "{{lp_bolt_url}}"}, {key: "__lp_bolt_read_url__", value: "{{lp_bolt_read_url}}"}, {key: "__lp_bolt_write_url__", value: "{{lp_bolt_write_url}}"}, {key: "__other_bolt_url__", value: 
"{{other_bolt_url}}"}, {key: "__other_bolt_read_url__", value: "{{other_bolt_read_url}}"}, {key: "__other_bolt_write_url__", value: "{{other_bolt_write_url}}"}, {key: "__mw_shard_id__", value: "{{mw_shard_id}}"}, {key: "__lp_url__", value: "{{lp_url}}"}, {key: "__cloud_storage_config_environment__", value: "{{cloud_storage_config_environment}}"}, {key: "__google_vision_tagging__", value: "{{google_vision_tagging}}"}, {key: "__lp_tmpfile_location__", value: "{{lp_tmpfile_location}}"}, {key: "__esRouterAdditionalConfig__", value: "{{es_router_additional_config}}"},{key: "__esRouterSecondaryAdditionalConfig__", value: "{{es_router_additional_secondary_config}}"},{key: "__es_port__", value: "{{es_port}}"}, {key: "__keyspace_name__", value: "{{content_keyspace_name}}"}, {key: "__collection_fullecar_disable__", value: "{{collection_fullecar_disable}}"},{key: "__max_iteration_count_for_samza_job__", value: "{{max_iteration_count_for_samza_job}}"},{key: "__cloud_storage_type__", value: "{{cloud_store}}"},{key: "__azure_storage_key__", value: "{{sunbird_public_storage_account_name}}"},{key: "__azure_storage_secret__", value: "{{sunbird_public_storage_account_key}}"},{key: "__azure_storage_container__", value: "{{azure_public_container}}"},{key: "__content_media_base_url__", value: "{{content_media_base_url}}"}, {key: "__plugin_media_base_url__", value: "{{plugin_media_base_url}}"}, {key: "__installation_id__", value: "{{instance_name}}"}, {key: "__content_media_base_url__", value: "{{content_media_base_url}}"}, {key: "__hierarchy_keyspace_name__", value: "{{hierarchy_keyspace_name}}"}, {key: "__composite_search_indexer_container_count__", value: "{{composite_search_indexer_container_count}}"},{key: "__cassandra_lp_connection__", value: "{{lp_cassandra_connection}}"}, {key: "__cassandra_lpa_connection__", value: "{{dp_cassandra_connection}}"}, {key: "__streaming_mime_type__", value: "{{streaming_mime_type}}"}, {key: "__cassandra_sunbird_connection__", value: "{{core_cassandra_connection}}"}, {key: "__cloud_upload_retry_count__", value: "{{cloud_upload_retry_count}}"}, {key: "__compositesearch_index_name__", value: "{{compositesearch_index_name}}"},{key: "__publish_pipeline_container_count__", value: "{{publish_pipeline_container_count}}"},{key: "__yarn_container_memory_mb__", value: "{{publish_yarn_container_memory_mb}}"},{key: "__youtube_api_key__", value: "{{youtube_api_key}}"},{key: "__kp_learning_service_base_url__", value: "{{kp_learning_service_base_url}}"},{key: "__sunbird_installation__", value: "{{sunbird_platform_installation}}"}, {key: "__search_lms_es_host__", value: "{{search_lms_es_host}}"},{key: "__dial_image_storage_container__", value: "{{dial_image_storage_container}}"},{key: "__dial_base_url__", value: "{{dial_base_url}}"},{key: "__learner_service_base_url__", value: "{{learner_service_base_url}}"},{key: "__cert_service_base_url__", value: "{{cert_service_base_url}}"},{key: "__certificate_base_path__", value: "{{certificate_base_path}}"},{key: "__kp_content_service_base_url__", value: "{{kp_content_service_base_url}}"},{key: "__kp_print_service_base_url__", value: "{{kp_print_service_base_url}}"},{key: "__cert_reg_service_base_url__", value: "{{cert_reg_service_base_url}}"},{key: "__kp_search_service_base_url__", value: "{{kp_search_service_base_url}}"},{key: "__samza_coordinator_replication_factor__", value: "{{samza_coordinator_replication_factor}}"},{key: "__samza_checkpoint_replication_factor__", value: "{{samza_checkpoint_replication_factor}}"},{key: 
"__course_batch_updater_container_count__", value: "{{course_batch_updater_container_count}}"},{key: "__course_certificate_generator_container_count__", value: "{{course_certificate_generator_container_count}}"},{key: "__course_progress_batch_size__", value: "{{course_progress_batch_size}}"},{key: "__itemset_generate_pdf__", value: "{{itemset_generate_pdf}}"},{key: "__auto_creator_container_count__", value: "{{auto_creator_container_count}}"},{key: "__content_streaming_enabled__", value: "{{content_streaming_enabled}}"},{key: "__lms_service_base_url__", value: "{{lms_service_base_url}}"},{key: "__mvc_search_indexer_container_count__", value: "{{mvc_search_indexer_container_count}}"}, {key: "__search_es7_host__", value: "{{search_es7_host}}"} , {key: "__ml-keywordapi__", value: "{{mlworkbench}}"},{key: "__auto_creator_artifact_allowed_sources__", value: "{{auto_creator_artifact_allowed_sources}}"},{key: "__publish_pipeline_task_opts__", value: "{{publish_pipeline_task_opts}}"},{key: "__auto_creator_g_service_acct_cred__", value: "{{auto_creator_gservice_acct_cred}}"},{key: "__certificate_pre_processor_container_count__", value: "{{certificate_pre_processor_container_count}}"},{key: "__master_category_validation_enabled__", value: "{{master_category_validation_enabled}}"}] - - "{{ (config_files|default({})).results|default([]) }}" - - -- name: Create directory for additional config - file: path={{object_denormalization_additional_config_dir}} owner=hduser group=hadoop recurse=yes state=directory - -- name: Update status of new jobs in file - command: bash -lc "./update_new_job_name.sh {{job_status_file}} {{samza_jobs_dir}}/extract/{{item.item | basename}}" - args: - chdir: "{{samza_jobs_dir}}/extract/" - when: "{{item|changed}}" - with_items: "{{ (new_jobs|default({})).results|default([]) }}" - -- name: Kill jobs - command: bash -lc "./kill_jobs.sh {{job_status_file}}" - args: - chdir: /usr/local/hadoop/bin - -- name: Start jobs - command: bash -lc "./start_jobs.sh {{job_status_file}} {{samza_jobs_dir}}/extract" - args: - chdir: "{{samza_jobs_dir}}/extract/" - become_user: hduser - -- name: Remove all old tar - command: bash -lc "./remove_old_tar.sh {{job_status_file}} {{samza_jobs_dir}}" - args: - chdir: "{{samza_jobs_dir}}/extract/" - -- file: path={{samza_jobs_dir}} owner=hduser group=hadoop state=directory recurse=yes diff --git a/ansible/roles/samza-jobs/tasks/main.yml b/ansible/roles/samza-jobs/tasks/main.yml deleted file mode 100644 index 0feb5dcd99..0000000000 --- a/ansible/roles/samza-jobs/tasks/main.yml +++ /dev/null @@ -1,9 +0,0 @@ ---- -- include: deploy.yml - when: deploy_jobs | default(false) - -- include: stop_jobs.yml - when: stop_jobs | default(false) - -- include: start_jobs.yml - when: start_jobs | default(false) diff --git a/ansible/roles/samza-jobs/tasks/start_jobs.yml b/ansible/roles/samza-jobs/tasks/start_jobs.yml deleted file mode 100644 index 4bb0c65c9c..0000000000 --- a/ansible/roles/samza-jobs/tasks/start_jobs.yml +++ /dev/null @@ -1,21 +0,0 @@ ---- -- name: Remove file of job status - file: path="{{job_status_file}}" state=absent - become: yes - -- name: Get job names from folder - command: bash -lc "./get_all_job_name.sh {{job_status_file}}" - args: - chdir: "{{samza_jobs_dir}}/extract" - become: yes - -- name: Ensure yarn resource manager is running - command: bash -lc "(ps aux | grep yarn-hduser-resourcemanager | grep -v grep) || /usr/local/hadoop/sbin/yarn-daemon.sh --config /usr/local/hadoop-{{hadoop_version}}/conf/ start resourcemanager" - become: yes - 
become_user: hduser - -- name: Start jobs - command: bash -lc "./start_jobs.sh {{job_status_file}} {{samza_jobs_dir}}/extract" - args: - chdir: "{{samza_jobs_dir}}/extract/" - become: yes diff --git a/ansible/roles/samza-jobs/tasks/stop_jobs.yml b/ansible/roles/samza-jobs/tasks/stop_jobs.yml deleted file mode 100644 index 1ef2f7b748..0000000000 --- a/ansible/roles/samza-jobs/tasks/stop_jobs.yml +++ /dev/null @@ -1,16 +0,0 @@ ---- -- name: Remove file of job status - file: path="{{job_status_file}}" state=absent - become: yes - -- name: Get job names from folder - command: bash -lc "./get_all_job_name.sh {{job_status_file}}" - args: - chdir: "{{samza_jobs_dir}}/extract" - become: yes - -- name: Kill jobs - command: bash -lc "./kill_jobs.sh {{job_status_file}}" - args: - chdir: /usr/local/hadoop/bin - become: yes diff --git a/ansible/roles/setup-kafka/defaults/main.yml b/ansible/roles/setup-kafka/defaults/main.yml index 5a2934a8b5..6c5d3e848a 100644 --- a/ansible/roles/setup-kafka/defaults/main.yml +++ b/ansible/roles/setup-kafka/defaults/main.yml @@ -2,6 +2,9 @@ env: dev ingestion_kafka_topics: "" ingestion_kafka_overriden_topics: "" +ingestion_zookeeper_ip: "{{ groups['ingestion-cluster-zookeeper'][0] }}" +processing_zookeeper_ip: "{{ groups['processing-cluster-zookeepers'][0] }}" + processing_kafka_topics: - name: telemetry.raw num_of_partitions: 4 @@ -129,6 +132,43 @@ processing_kafka_topics: - name: object.import.request num_of_partitions: 1 replication_factor: 1 + - name: dialcode.context.job.request + num_of_partitions: 1 + replication_factor: 1 + - name: dialcode.context.job.request.failed + num_of_partitions: 1 + replication_factor: 1 + - name: csp.migration.job.request + num_of_partitions: 1 + replication_factor: 1 + - name: live.video.stream.request + num_of_partitions: 1 + replication_factor: 1 + - name: republish.job.request + num_of_partitions: 4 + replication_factor: 1 + - name: cassandra.data.migration.request + num_of_partitions: 1 + replication_factor: 1 + - name: cassandra.data.migration.job.request.failed + num_of_partitions: 1 + replication_factor: 1 + - name: assessment.republish.request + num_of_partitions: 1 + replication_factor: 1 + - name: assessment.postpublish.request + num_of_partitions: 1 + replication_factor: 1 + - name: republish.events.failed + num_of_partitions: 1 + replication_factor: 1 + - name: republish.events.skipped + num_of_partitions: 1 + replication_factor: 1 + - name: qrimage.request + num_of_partitions: 1 + replication_factor: 1 + processing_kafka_overriden_topics: - name: telemetry.raw @@ -252,3 +292,39 @@ processing_kafka_overriden_topics: - name: object.import.request retention_time: 1209600000 replication_factor: 1 + - name: dialcode.context.job.request + retention_time: 1209600000 + replication_factor: 1 + - name: dialcode.context.job.request.failed + retention_time: 1209600000 + replication_factor: 1 + - name: csp.migration.job.request + retention_time: 1209600000 + replication_factor: 1 + - name: live.video.stream.request + retention_time: 1209600000 + replication_factor: 1 + - name: republish.job.request + retention_time: 1209600000 + replication_factor: 1 + - name: cassandra.data.migration.request + retention_time: 1209600000 + replication_factor: 1 + - name: cassandra.data.migration.job.request.failed + retention_time: 1209600000 + replication_factor: 1 + - name: assessment.republish.request + retention_time: 1209600000 + replication_factor: 1 + - name: assessment.postpublish.request + retention_time: 1209600000 + replication_factor: 1 + 
- name: republish.events.failed + retention_time: 604800000 + replication_factor: 1 + - name: republish.events.skipped + retention_time: 604800000 + replication_factor: 1 + - name: qrimage.request + retention_time: 604800000 + replication_factor: 1 \ No newline at end of file diff --git a/ansible/roles/setup-kafka/tasks/main.yml b/ansible/roles/setup-kafka/tasks/main.yml index 5080f951e9..0b817e0a5e 100644 --- a/ansible/roles/setup-kafka/tasks/main.yml +++ b/ansible/roles/setup-kafka/tasks/main.yml @@ -1,29 +1,45 @@ - name: create topics - command: /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic {{env}}.{{item.name}} --partitions {{ item.num_of_partitions }} --replication-factor {{ item.replication_factor }} + command: /opt/kafka/bin/kafka-topics.sh --zookeeper {{ingestion_zookeeper_ip}}:2181 --create --topic {{env}}.{{item.name}} --partitions {{ item.num_of_partitions }} --replication-factor {{ item.replication_factor }} with_items: "{{ingestion_kafka_topics}}" ignore_errors: true - when: kafka_id=="1" + when: kafka_id=="1" and ingestion_kafka_topics | length > 0 tags: - ingestion-kafka - name: override retention time - command: /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic {{env}}.{{item.name}} --config retention.ms={{ item.retention_time }} + command: /opt/kafka/bin/kafka-topics.sh --zookeeper {{ingestion_zookeeper_ip}}:2181 --alter --topic {{env}}.{{item.name}} --config retention.ms={{ item.retention_time }} with_items: "{{ingestion_kafka_overriden_topics}}" when: kafka_id=="1" and item.retention_time is defined tags: - ingestion-kafka + +- name: override partition count + command: /opt/kafka/bin/kafka-topics.sh --zookeeper {{ingestion_zookeeper_ip}}:2181 --alter --topic {{env}}.{{item.name}} --partitions {{ item.num_of_partitions }} + with_items: "{{ingestion_kafka_overriden_topics}}" + when: kafka_id=="1" and item.num_of_partitions is defined + tags: + - ingestion-kafka + - name: create topics - command: /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic {{env}}.{{item.name}} --partitions {{ item.num_of_partitions }} --replication-factor {{ item.replication_factor }} + command: /opt/kafka/bin/kafka-topics.sh --zookeeper {{processing_zookeeper_ip}}:2181 --create --topic {{env}}.{{item.name}} --partitions {{ item.num_of_partitions }} --replication-factor {{ item.replication_factor }} with_items: "{{processing_kafka_topics}}" ignore_errors: true - when: kafka_id=="1" + when: kafka_id=="1" and processing_kafka_topics | length > 0 tags: - processing-kafka - name: override retention time - command: /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic {{env}}.{{item.name}} --config retention.ms={{ item.retention_time }} + command: /opt/kafka/bin/kafka-topics.sh --zookeeper {{processing_zookeeper_ip}}:2181 --alter --topic {{env}}.{{item.name}} --config retention.ms={{ item.retention_time }} with_items: "{{processing_kafka_overriden_topics}}" when: kafka_id=="1" and item.retention_time is defined tags: - processing-kafka + + +- name: override partition count + command: /opt/kafka/bin/kafka-topics.sh --zookeeper {{processing_zookeeper_ip}}:2181 --alter --topic {{env}}.{{item.name}} --partitions {{ item.num_of_partitions }} + with_items: "{{processing_kafka_overriden_topics}}" + when: kafka_id=="1" and item.num_of_partitions is defined + tags: + - processing-kafka \ No newline at end of file diff --git a/ansible/roles/yarn/defaults/main.yml b/ansible/roles/yarn/defaults/main.yml deleted file mode 
100644
index 4060075e8e..0000000000
--- a/ansible/roles/yarn/defaults/main.yml
+++ /dev/null
@@ -1,16 +0,0 @@
----
-yarn_deploy_dir: /home/ecosystem/.deploy
-repo_folder: /home/hduser/Ecosystem-Platform
-hadoop_tarball: hadoop-{{hadoop_version}}.tar.gz
-hadoop_download_url: https://archive.apache.org/dist/hadoop/common/hadoop-{{hadoop_version}}/{{hadoop_tarball}}
-scala_tarball: scala-{{scala_version}}.tgz
-scala_download_url: http://www.scala-lang.org/files/archive/{{scala_tarball}}
-hadoop_yarn_home: /usr/local/hadoop-{{hadoop_version}}
-hadoop_version: 2.7.2
-scala_version: 2.10.4
-
-yarn_config_override: true
-yarn_vmem_check_enabled: false
-yarn_vmem_pmem_ratio: 2.1
-yarn_vcores: 16
-yarn_resource_memory: 20000
diff --git a/ansible/roles/yarn/files/truncate_logs.sh b/ansible/roles/yarn/files/truncate_logs.sh
deleted file mode 100644
index 7dddd9d702..0000000000
--- a/ansible/roles/yarn/files/truncate_logs.sh
+++ /dev/null
@@ -1,21 +0,0 @@
-#!/bin/bash
-
-# Truncate hadoop/yarn userlogs and keep the last 100 lines
-
-HADOOP_LOGS_HOME=/usr/local/hadoop/logs/userlogs
-
-for d in $HADOOP_LOGS_HOME/*/*/ ; do (cd $d && tail -n 100 stdout > stdout.tmp && cat stdout.tmp > stdout && rm stdout.tmp); done
-
-LOGSTASH_LOGS=/var/log/logstash
-tail -n 100 $LOGSTASH_LOGS/logstash.stdout > $LOGSTASH_LOGS/logstash.stdout.tmp
-cat $LOGSTASH_LOGS/logstash.stdout.tmp > $LOGSTASH_LOGS/logstash.stdout
-rm $LOGSTASH_LOGS/logstash.stdout.tmp
-
-HADOOP_TMP_USERLOGS=/usr/local/hadoop/logs/userlogs
-
-for g in $HADOOP_TMP_USERLOGS/*/*/ ; do
-   cd $g
-   tail -n 100 stdout > stdout.tmp
-   cat stdout.tmp > stdout
-   rm stdout.tmp
-done
diff --git a/ansible/roles/yarn/tasks/common.yml b/ansible/roles/yarn/tasks/common.yml
deleted file mode 100644
index 75ceb4b69a..0000000000
--- a/ansible/roles/yarn/tasks/common.yml
+++ /dev/null
@@ -1,78 +0,0 @@
-- name: Common tasks for yarn master and slave
-  block:
-  - name: Download and extract hadoop tarball
-    unarchive:
-      src: "{{hadoop_download_url}}"
-      dest: "/usr/local/"
-      owner: hduser
-      group: hadoop
-      creates: "{{hadoop_yarn_home}}"
-      remote_src: yes
-
-  - name: Creates symlink
-    file:
-      src: /usr/local/hadoop-{{hadoop_version}}
-      dest: /usr/local/hadoop
-      owner: hduser
-      group: hadoop
-      state: link
-
-  - name: creating conf dir
-    file:
-      path: "{{hadoop_yarn_home}}/conf"
-      owner: hduser
-      group: hadoop
-      recurse: yes
-      state: directory
-
-  - name: Templating configs
-    template:
-      src: "{{item}}"
-      dest: "{{hadoop_yarn_home}}/conf/{{item}}"
-      owner: hduser
-      group: hadoop
-    with_items:
-    - yarn-site.xml
-    - capacity-scheduler.xml
-    - core-site.xml
-    - log4j.properties
-    - hadoop-env.sh
-
-  - name: Downloading artifacts
-    get_url:
-      url: "http://search.maven.org/remotecontent?filepath=org/{{item}}"
-      dest: "{{hadoop_yarn_home}}/share/hadoop/hdfs/lib/"
-      owner: hduser
-      group: hadoop
-    with_items:
-    - clapper/grizzled-slf4j_2.10/1.0.1/grizzled-slf4j_2.10-1.0.1.jar
-    - apache/samza/samza-yarn_2.10/0.8.0/samza-yarn_2.10-0.8.0.jar
-    - apache/samza/samza-core_2.10/0.8.0/samza-core_2.10-0.8.0.jar
-
-  - name: Download and extract scala
-    unarchive:
-      src: "{{scala_download_url}}"
-      dest: "/usr/local/"
-      owner: hduser
-      group: hadoop
-      remote_src: yes
-
-  - name: Creates symlink
-    file:
-      src: "/usr/local/scala-{{scala_version}}"
-      dest: /usr/local/scala
-      owner: hduser
-      group: hadoop
-      state: link
-
-  - name: copying scala files
-    copy:
-      src: "/usr/local/scala-{{scala_version}}/lib/{{item}}"
-      dest: "{{hadoop_yarn_home}}/share/hadoop/hdfs/lib/"
-      owner: hduser
-      group: hadoop
-
remote_src: true - with_items: - - scala-compiler.jar - - scala-library.jar - delegate_to: "{{slave|default(inventory_hostname)}}" diff --git a/ansible/roles/yarn/tasks/main.yml b/ansible/roles/yarn/tasks/main.yml deleted file mode 100644 index a297e9a951..0000000000 --- a/ansible/roles/yarn/tasks/main.yml +++ /dev/null @@ -1,55 +0,0 @@ -- name: Debian | Install Maven - apt: - pkg: "{{item}}" - update_cache: yes - state: latest - install_recommends: yes - with_items: - - maven - - git - -# Running common tasks in master -- include: common.yml - -- lineinfile: - dest: /home/hduser/.bashrc - state: present - regexp: '^HADOOP_YARN_HOME' - line: 'HADOOP_YARN_HOME={{hadoop_yarn_home}}' - -- lineinfile: - dest: /home/hduser/.bashrc - state: present - regexp: '^HADOOP_CONF_DIR' - line: 'HADOOP_CONF_DIR=$HADOOP_YARN_HOME/conf' - -- file: - path: "/{{hadoop_yarn_home}}/conf/slaves" - state: touch - -- lineinfile: - dest: "/{{hadoop_yarn_home}}/conf/slaves" - state: present - regexp: "^{{item}}" - line: "{{item}}" - with_items: "{{yarn_slaves}}" - -# Running common tasks in slaves -- name: Running common tasks in slaves - include: common.yml - with_items: "{{yarn_slaves}}" - loop_control: - loop_var: slave - -- name: Copy truncate_files.sh - copy: - src: truncate_logs.sh - dest: /usr/local/bin - mode: 755 - -- name: Add truncate logs to cron - cron: - name: "Truncate yarn logs" - minute: "0" - job: "/usr/local/bin/truncate_logs.sh" - backup: yes diff --git a/ansible/roles/yarn/tasks/truncate-logs.yml b/ansible/roles/yarn/tasks/truncate-logs.yml deleted file mode 100644 index e0d550a8ec..0000000000 --- a/ansible/roles/yarn/tasks/truncate-logs.yml +++ /dev/null @@ -1,8 +0,0 @@ ---- -- name: Copy truncate_files.sh - copy: src=truncate_logs.sh dest=/ mode=755 - become: yes - -- name: Add truncate logs to cron - cron: name="Truncate yarn logs" minute="0" job="/truncate_logs.sh" backup=yes - become: yes \ No newline at end of file diff --git a/ansible/roles/yarn/templates/capacity-scheduler.xml b/ansible/roles/yarn/templates/capacity-scheduler.xml deleted file mode 100644 index 4161b7ac03..0000000000 --- a/ansible/roles/yarn/templates/capacity-scheduler.xml +++ /dev/null @@ -1,111 +0,0 @@ - - - - - yarn.scheduler.capacity.maximum-applications - 10000 - - Maximum number of applications that can be pending and running. - - - - - yarn.scheduler.capacity.maximum-am-resource-percent - 0.5 - - Maximum percent of resources in the cluster which can be used to run - application masters i.e. controls number of concurrent running - applications. - - - - - yarn.scheduler.capacity.resource-calculator - org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator - - The ResourceCalculator implementation to be used to compare - Resources in the scheduler. - The default i.e. DefaultResourceCalculator only uses Memory while - DominantResourceCalculator uses dominant-resource to compare - multi-dimensional resources such as Memory, CPU etc. - - - - - yarn.scheduler.capacity.root.queues - default - - The queues at the this level (root is the root queue). - - - - - yarn.scheduler.capacity.root.default.capacity - 100 - Default queue target capacity. - - - - yarn.scheduler.capacity.root.default.user-limit-factor - 1 - - Default queue user limit a percentage from 0.0 to 1.0. - - - - - yarn.scheduler.capacity.root.default.maximum-capacity - 100 - - The maximum capacity of the default queue. - - - - - yarn.scheduler.capacity.root.default.state - RUNNING - - The state of the default queue. 
State can be one of RUNNING or STOPPED. - - - - - yarn.scheduler.capacity.root.default.acl_submit_applications - * - - The ACL of who can submit jobs to the default queue. - - - - - yarn.scheduler.capacity.root.default.acl_administer_queue - * - - The ACL of who can administer jobs on the default queue. - - - - - yarn.scheduler.capacity.node-locality-delay - -1 - - Number of missed scheduling opportunities after which the CapacityScheduler - attempts to schedule rack-local containers. - Typically this should be set to number of racks in the cluster, this - feature is disabled by default, set to -1. - - - - diff --git a/ansible/roles/yarn/templates/config.j2 b/ansible/roles/yarn/templates/config.j2 deleted file mode 100644 index f1b411ed5f..0000000000 --- a/ansible/roles/yarn/templates/config.j2 +++ /dev/null @@ -1,3 +0,0 @@ -Host {{item}} - IdentityFile ~/.ssh/hadoop_rsa - StrictHostKeyChecking no diff --git a/ansible/roles/yarn/templates/core-site.xml b/ansible/roles/yarn/templates/core-site.xml deleted file mode 100644 index d5a0752d52..0000000000 --- a/ansible/roles/yarn/templates/core-site.xml +++ /dev/null @@ -1,7 +0,0 @@ - - - - fs.http.impl - org.apache.samza.util.hadoop.HttpFileSystem - - diff --git a/ansible/roles/yarn/templates/hadoop-env.sh b/ansible/roles/yarn/templates/hadoop-env.sh deleted file mode 100644 index fd200c373c..0000000000 --- a/ansible/roles/yarn/templates/hadoop-env.sh +++ /dev/null @@ -1,2 +0,0 @@ -#export JAVA_HOME=/usr/lib/jvm/java-8-oracle -export JAVA_HOME=/opt/jdk1.8.0_121 diff --git a/ansible/roles/yarn/templates/log4j.properties b/ansible/roles/yarn/templates/log4j.properties deleted file mode 100644 index 4181560fcc..0000000000 --- a/ansible/roles/yarn/templates/log4j.properties +++ /dev/null @@ -1,267 +0,0 @@ -# Copyright 2011 The Apache Software Foundation -# -# Licensed to the Apache Software Foundation (ASF) under one -# or more contributor license agreements. See the NOTICE file -# distributed with this work for additional information -# regarding copyright ownership. The ASF licenses this file -# to you under the Apache License, Version 2.0 (the -# "License"); you may not use this file except in compliance -# with the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -# Define some default values that can be overridden by system properties -hadoop.root.logger=INFO,console -hadoop.log.dir=. -hadoop.log.file=hadoop.log - -# Define the root logger to the system property "hadoop.root.logger". -log4j.rootLogger=${hadoop.root.logger}, EventCounter - -# Logging Threshold -log4j.threshold=ALL - -# Null Appender -log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender - -# -# Rolling File Appender - cap space usage at 5gb. 
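-# (with hadoop.log.maxfilesize=25MB and hadoop.log.maxbackupindex=1 as set below, the effective cap is ~50MB per log file rather than 5GB)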
-# -hadoop.log.maxfilesize=25MB -hadoop.log.maxbackupindex=1 -log4j.appender.RFA=org.apache.log4j.RollingFileAppender -log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} - -log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize} -log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex} - -log4j.appender.RFA.layout=org.apache.log4j.PatternLayout - -# Pattern format: Date LogLevel LoggerName LogMessage -log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n -# Debugging Pattern format -#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n - - -# -# Daily Rolling File Appender -# - -log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender -log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file} - -# Rollver at midnight -log4j.appender.DRFA.DatePattern=.yyyy-MM-dd - -# 30-day backup -log4j.appender.DRFA.MaxBackupIndex=1 -log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout - -# Pattern format: Date LogLevel LoggerName LogMessage -log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n -# Debugging Pattern format -#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n - - -# -# console -# Add "console" to rootlogger above if you want to use this -# - -log4j.appender.console=org.apache.log4j.ConsoleAppender -log4j.appender.console.target=System.err -log4j.appender.console.layout=org.apache.log4j.PatternLayout -log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n - -# -# TaskLog Appender -# - -#Default values -hadoop.tasklog.taskid=null -hadoop.tasklog.iscleanup=false -hadoop.tasklog.noKeepSplits=4 -hadoop.tasklog.totalLogFileSize=100 -hadoop.tasklog.purgeLogSplits=true -hadoop.tasklog.logsRetainHours=1 - -log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender -log4j.appender.TLA.taskId=${hadoop.tasklog.taskid} -log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup} -log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize} - -log4j.appender.TLA.layout=org.apache.log4j.PatternLayout -log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n - -# -# HDFS block state change log from block manager -# -# Uncomment the following to suppress normal block state change -# messages from BlockManager in NameNode. -#log4j.logger.BlockStateChange=WARN - -# -#Security appender -# -hadoop.security.logger=INFO,NullAppender -hadoop.security.log.maxfilesize=256MB -hadoop.security.log.maxbackupindex=20 -log4j.category.SecurityLogger=${hadoop.security.logger} -hadoop.security.log.file=SecurityAuth-${user.name}.audit -log4j.appender.RFAS=org.apache.log4j.RollingFileAppender -log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} -log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout -log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n -log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize} -log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex} - -# -# Daily Rolling Security appender -# -log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender -log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} -log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout -log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n -log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd - -# -# hadoop configuration logging -# - -# Uncomment the following line to turn off configuration deprecation warnings. 
-# log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=WARN - -# -# hdfs audit logging -# -hdfs.audit.logger=INFO,NullAppender -hdfs.audit.log.maxfilesize=25MB -hdfs.audit.log.maxbackupindex=1 -log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger} -log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false -log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender -log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log -log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout -log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n -log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize} -log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex} - -# -# mapred audit logging -# -mapred.audit.logger=INFO,NullAppender -mapred.audit.log.maxfilesize=256MB -mapred.audit.log.maxbackupindex=20 -log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger} -log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false -log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender -log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log -log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout -log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n -log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize} -log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex} - -# Custom Logging levels - -#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG -#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG -#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG - -# Jets3t library -log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR - -# -# Event Counter Appender -# Sends counts of logging messages at different severity levels to Hadoop Metrics. 
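-# (the rootLogger defined above includes EventCounter, so this appender must remain defined)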
-# -log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter - -# -# Job Summary Appender -# -# Use following logger to send summary to separate file defined by -# hadoop.mapreduce.jobsummary.log.file : -# hadoop.mapreduce.jobsummary.logger=INFO,JSA -# -hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger} -hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log -hadoop.mapreduce.jobsummary.log.maxfilesize=256MB -hadoop.mapreduce.jobsummary.log.maxbackupindex=20 -log4j.appender.JSA=org.apache.log4j.RollingFileAppender -log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file} -log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize} -log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex} -log4j.appender.JSA.layout=org.apache.log4j.PatternLayout -log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n -log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger} -log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false - -# -# Yarn ResourceManager Application Summary Log -# -# Set the ResourceManager summary log filename -yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log -# Set the ResourceManager summary log level and appender -yarn.server.resourcemanager.appsummary.logger=${hadoop.root.logger} -#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY - -# To enable AppSummaryLogging for the RM, -# set yarn.server.resourcemanager.appsummary.logger to -# ,RMSUMMARY in hadoop-env.sh - -# Appender for ResourceManager Application Summary Log -# Requires the following properties to be set -# - hadoop.log.dir (Hadoop Log directory) -# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename) -# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender) - -log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger} -log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false -log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender -log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} -log4j.appender.RMSUMMARY.MaxFileSize=25MB -log4j.appender.RMSUMMARY.MaxBackupIndex=1 -log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout -log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n - -# HS audit log configs -#mapreduce.hs.audit.logger=INFO,HSAUDIT -#log4j.logger.org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger=${mapreduce.hs.audit.logger} -#log4j.additivity.org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger=false -#log4j.appender.HSAUDIT=org.apache.log4j.DailyRollingFileAppender -#log4j.appender.HSAUDIT.File=${hadoop.log.dir}/hs-audit.log -#log4j.appender.HSAUDIT.layout=org.apache.log4j.PatternLayout -#log4j.appender.HSAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n -#log4j.appender.HSAUDIT.DatePattern=.yyyy-MM-dd - -# Http Server Request Logs -#log4j.logger.http.requests.namenode=INFO,namenoderequestlog -#log4j.appender.namenoderequestlog=org.apache.hadoop.http.HttpRequestLogAppender -#log4j.appender.namenoderequestlog.Filename=${hadoop.log.dir}/jetty-namenode-yyyy_mm_dd.log -#log4j.appender.namenoderequestlog.RetainDays=1 - -#log4j.logger.http.requests.datanode=INFO,datanoderequestlog 
-#log4j.appender.datanoderequestlog=org.apache.hadoop.http.HttpRequestLogAppender -#log4j.appender.datanoderequestlog.Filename=${hadoop.log.dir}/jetty-datanode-yyyy_mm_dd.log -#log4j.appender.datanoderequestlog.RetainDays=3 - -#log4j.logger.http.requests.resourcemanager=INFO,resourcemanagerrequestlog -#log4j.appender.resourcemanagerrequestlog=org.apache.hadoop.http.HttpRequestLogAppender -#log4j.appender.resourcemanagerrequestlog.Filename=${hadoop.log.dir}/jetty-resourcemanager-yyyy_mm_dd.log -#log4j.appender.resourcemanagerrequestlog.RetainDays=3 - -#log4j.logger.http.requests.jobhistory=INFO,jobhistoryrequestlog -#log4j.appender.jobhistoryrequestlog=org.apache.hadoop.http.HttpRequestLogAppender -#log4j.appender.jobhistoryrequestlog.Filename=${hadoop.log.dir}/jetty-jobhistory-yyyy_mm_dd.log -#log4j.appender.jobhistoryrequestlog.RetainDays=3 - -#log4j.logger.http.requests.nodemanager=INFO,nodemanagerrequestlog -#log4j.appender.nodemanagerrequestlog=org.apache.hadoop.http.HttpRequestLogAppender -#log4j.appender.nodemanagerrequestlog.Filename=${hadoop.log.dir}/jetty-nodemanager-yyyy_mm_dd.log -#log4j.appender.nodemanagerrequestlog.RetainDays=3 diff --git a/ansible/roles/yarn/templates/yarn-site.xml b/ansible/roles/yarn/templates/yarn-site.xml deleted file mode 100644 index 796f6450c3..0000000000 --- a/ansible/roles/yarn/templates/yarn-site.xml +++ /dev/null @@ -1,64 +0,0 @@ - - - - - yarn.resourcemanager.hostname - - {{resourcemanager}} - - - yarn.nodemanager.resource.memory-mb - {{yarn_resource_memory}} - - - yarn.scheduler.minimum-allocation-mb - 128 - - - mapreduce.job.userlog.retain.hours - 240 - - - yarn.log-aggregation-enable - false - - - yarn.nodemanager.log.retain-seconds - 3600 - - - yarn.nodemanager.recovery.enabled - true - - - yarn.nodemanager.address - 0.0.0.0:45454 - - - yarn.nodemanager.resource.cpu-vcores - {{yarn_vcores}} - - {% if yarn_config_override is defined %} - - yarn.nodemanager.vmem-check-enabled - {{yarn_vmem_check_enabled}} - - - yarn.nodemanager.vmem-pmem-ratio - {{yarn_vmem_pmem_ratio}} - - {% endif %} - - diff --git a/ansible/samza_jobs_alert.yml b/ansible/samza_jobs_alert.yml deleted file mode 100644 index f225d1d5bc..0000000000 --- a/ansible/samza_jobs_alert.yml +++ /dev/null @@ -1,9 +0,0 @@ ---- -- hosts: lp-yarn-master - vars_files: - - "{{inventory_dir}}/secrets.yml" - tasks: - - command: ./samza_alerts.sh - args: - chdir: /home/hduser - become: yes diff --git a/ansible/samza_logs_provision.yml b/ansible/samza_logs_provision.yml deleted file mode 100644 index 9f2a952bc2..0000000000 --- a/ansible/samza_logs_provision.yml +++ /dev/null @@ -1,8 +0,0 @@ ---- -- hosts: dp-yarn-slave - vars_files: - - "{{inventory_dir}}/secrets.yml" - become: yes - tasks: - - name: copy the backup script to yarn slaves - copy: src=resources/upload_samza_logs dest="{{script_path}}/upload_samza_logs.sh" mode=755 diff --git a/img.png b/img.png new file mode 100644 index 0000000000..b0fc37c93b Binary files /dev/null and b/img.png differ diff --git a/img_1.png b/img_1.png new file mode 100644 index 0000000000..8e9a965892 Binary files /dev/null and b/img_1.png differ diff --git a/img_2.png b/img_2.png new file mode 100644 index 0000000000..0a9047cd10 Binary files /dev/null and b/img_2.png differ diff --git a/img_3.png b/img_3.png new file mode 100644 index 0000000000..411332d99a Binary files /dev/null and b/img_3.png differ diff --git a/kubernetes/ansible/roles/flink-jobs-deploy/defaults/main.yml b/kubernetes/ansible/roles/flink-jobs-deploy/defaults/main.yml index 
1d83067c28..ef3ad36b9c 100644 --- a/kubernetes/ansible/roles/flink-jobs-deploy/defaults/main.yml +++ b/kubernetes/ansible/roles/flink-jobs-deploy/defaults/main.yml @@ -70,6 +70,7 @@ post_publish_event_router_parallelism: 1 post_publish_shallow_copy_parallelism: 1 post_publish_link_dialcode_parallelism: 1 post_publish_batch_create_parallelism: 1 +post_publish_dialcode_context_parallelism: 1 ### Certificate Job related Vars certificate_generator_consumer_parallelism: 1 @@ -78,13 +79,6 @@ certificate_generator_parallelism: 1 ### Video stream generator related vars ### IMPORTANT: The media-service configuration values should be updated in respective environment. -video_stream_generator_azure_tenant: "{{ media_service_azure_tenant | default('') }}" -video_stream_generator_azure_subscription_id: "{{ media_service_azure_subscription_id | default('') }}" -video_stream_generator_azure_account_name: "{{ media_service_azure_account_name | default('') }}" -video_stream_generator_azure_resource_group_name: "{{ media_service_azure_resource_group_name | default('') }}" -video_stream_generator_azure_token_client_key: "{{ media_service_azure_token_client_key | default('') }}" -video_stream_generator_azure_token_client_secret: "{{ media_service_azure_token_client_secret | default('') }}" -video_stream_generator_azure_stream_base_url: "{{ stream_base_url | default('') }}" video_stream_generator_consumer_parallelism: 1 video_stream_generator_parallelism: 1 video_stream_generator_timer_duration: 1800 @@ -118,11 +112,21 @@ middleware_assessment_aggregator_table: "assessment_aggregator" ### Collection Generator Job related Vars collection_certificate_generator_consumer_parallelism: 1 collection_certificate_generator_parallelism: 1 -collection_certificate_generator_enable_suppress_exception: false -collection_certificate_generator_enable_rc_certificate: true +collection_certificate_generator_enable_suppress_exception: "{{ enable_suppress_exception | lower }}" +collection_certificate_generator_enable_rc_certificate: "{{ enable_rc_certificate | lower }}" +collection_certificate_pre_processor_enable_suppress_exception: "{{ enable_suppress_exception | lower }}" +collection_certificate_generator_rc_badcharlist: "{{ rc_bad_char_list | default('\"\\\\x00,\\\\\\\\aaa,\\\\aaa,Ø,Ý,\\\\\"') }}" registry_sunbird_keyspace: "sunbird" cert_registry_table: "cert_registry" +### transaction event processor related vars +transaction_event_processor_consumer_parallelism: 1 +transaction_event_processor_parallelism: 1 +transaction_event_processor_producer_parallelism: 1 +transaction_event_processor_default_channel: "{{ default_channel | default('org.sunbird') }}" +enable_audit_event_generator: "true" +enable_audit_history_indexer: "true" +enable_obsrv_metadata_generator: "false" ### to be removed job_classname: "" @@ -169,7 +173,7 @@ flink_job_names: replica: 1 jobmanager_memory: 1024m taskmanager_memory: 1024m - taskslots: 1 + taskslots: "{{search_indexer_taskslots | default('1') }}" cpu_requests: 0.3 enrolment-reconciliation: job_class_name: 'org.sunbird.job.recounciliation.task.EnrolmentReconciliationStreamTask' @@ -255,6 +259,48 @@ flink_job_names: taskmanager_memory: 1024m taskslots: 1 cpu_requests: 0.3 + dialcode-context-updater: + job_class_name: 'org.sunbird.job.dialcodecontextupdater.task.DialcodeContextUpdaterStreamTask' + replica: 1 + jobmanager_memory: 1024m + taskmanager_memory: 1024m + taskslots: 1 + cpu_requests: 0.3 + csp-migrator: + job_class_name: 'org.sunbird.job.cspmigrator.task.CSPMigratorStreamTask' + replica: 1 
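+    # sizing below appears to mirror the defaults used by the other flink jobs (1 GB managers, 1 task slot, 0.3 CPU)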
+ jobmanager_memory: 1024m + taskmanager_memory: 1024m + taskslots: 1 + cpu_requests: 0.3 + live-node-publisher: + job_class_name: 'org.sunbird.job.livenodepublisher.task.LiveNodePublisherStreamTask' + replica: 1 + jobmanager_memory: "{{live_node_publisher_job_memory | default('2048m') }}" + taskmanager_memory: "{{live_node_publisher_task_memory | default('2048m') }}" + taskslots: "{{live_node_publisher_taskslots | default('1') }}" + cpu_requests: "{{live_node_publisher_cpu_requests | default('0.7') }}" + live-video-stream-generator: + job_class_name: 'org.sunbird.job.livevideostream.task.LiveVideoStreamGeneratorStreamTask' + replica: 1 + jobmanager_memory: 1024m + taskmanager_memory: 1024m + taskslots: 1 + cpu_requests: 0.3 + cassandra-data-migration: + job_class_name: 'org.sunbird.job.task.CassandraDataMigrationStreamTask' + replica: 1 + jobmanager_memory: 1024m + taskmanager_memory: 1024m + taskslots: 1 + cpu_requests: 0.3 + transaction-event-processor: + job_class_name: 'org.sunbird.job.transaction.task.TransactionEventProcessorStreamTask' + replica: 1 + jobmanager_memory: 1024m + taskmanager_memory: 1024m + taskslots: 1 + cpu_requests: 0.3 ### Global vars middleware_course_keyspace: "sunbird_courses" @@ -268,6 +314,11 @@ composite_search_indexer_parallelism: 1 dialcode_external_indexer_parallelism: 1 dialcode_metric_indexer_parallelism: 1 schema_definition_cache_expiry_in_sec: 14400 +search_indexer_topic_name: "{{ env_name }}.learning.graph.events" +search_indexer_failed_topic_name: "{{ env_name }}.learning.events.failed" +search_indexer_group_name: "{{ env_name }}-search-indexer-group" +search_indexer_es_index_name: "{{ compositesearch_index_name }}" +dialcode_es_index_name: "dialcode" search_indexer_ignored_fields: ["responseDeclaration", "body", "options", "lastStatusChangedOn", "SYS_INTERNAL_LAST_UPDATED_ON", "sYS_INTERNAL_LAST_UPDATED_ON","branchingLogic"] search_indexer_restrict_object_types: ["EventSet", "EventSetImage", "Event", "EventImage", "Questionnaire", "Misconception", "FrameworkType", "Concept", "Misconception", "Language", "Reference", "Dimension", "Method", "Library", "Domain", "Api"] @@ -319,3 +370,51 @@ qrcode_image_generator_consumer_parallelism: 1 qrcode_image_generator_parallelism: 1 source_base_url: "{{proto}}://{{domain_name}}/api" + +### video-stream-generator related vars +media_service_provider_name: "azure" +## AWS media convert service vars +aws_mediaconvert_api_version: "2017-08-29" +aws_mediaconvert_region: "" +aws_content_bucket_name: "" +aws_mediaconvert_access_key: "" +aws_mediaconvert_access_secret: "" +aws_mediaconvert_api_endpoint: "" +aws_mediaconvert_queue_id: "" +aws_mediaconvert_role_name: "" + +#Cloud storage config +cloud_storage_key: "" +cloud_storage_secret: "" +cloud_storage_container: "" +cloud_storage_endpoint: "" +cloudstorage_sdk_endpoint: "" +cloud_storage_proxy_host: "" + + +### csp-migrator related vars +csp_migrator_parallelism: 1 +csp_migrator_timer_duration: 1800 +csp_migrator_max_retries: 10 +csp_migrator_consumer_parallelism: 1 +csp_migrator_cassandra_parallelism: 1 +csp_migrator_router_parallelism: 1 +csp_migration_topic_name: "{{ env_name }}.csp.migration.job.request" +csp_migrator_group_name: "{{ env_name }}-csp-migrator-group" +csp_migrator_failed_topic_name: "{{ env_name }}.csp.migration.job.request.failed" +question_republish_topic_name: "{{ env_name }}.assessment.republish.request" +content_republish_topic_name: "{{ env_name }}.republish.job.request" +video_stream_topic_name: "{{ env_name }}.live.video.stream.request" 
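+# both hierarchy keyspace vars below default to the same hierarchy_keyspace_name value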
+content_hierarchy_keyspace_name: "{{ hierarchy_keyspace_name }}" +questionset_hierarchy_keyspace_name: "{{ hierarchy_keyspace_name }}" +gdrive_application_name: "drive-download" + +### video-stream-generator azure_mediakind related vars +azure_mediakind_project_name: "{{ azure_mediakind_project_name | default('') }}" +azure_mediakind_auth_token: "{{ azure_mediakind_auth_token | default('') }}" +azure_mediakind_account_name: "{{ azure_mediakind_account_name | default('') }}" +azure_mediakind_transform_default: "media_transform_default" +azure_mediakind_stream_base_url: "{{ azure_mediakind_stream_base_url | default('') }}" +azure_mediakind_stream_endpoint_name: "default" +azure_mediakind_stream_protocol: "Hls" +azure_mediakind_stream_policy_name: "Predefined_ClearStreamingOnly" \ No newline at end of file diff --git a/kubernetes/helm_charts/datapipeline_jobs/templates/flink_job_deployment.yaml b/kubernetes/helm_charts/datapipeline_jobs/templates/flink_job_deployment.yaml index ebd851a454..18c4729df2 100644 --- a/kubernetes/helm_charts/datapipeline_jobs/templates/flink_job_deployment.yaml +++ b/kubernetes/helm_charts/datapipeline_jobs/templates/flink_job_deployment.yaml @@ -109,8 +109,16 @@ spec: workingDir: /opt/flink command: ["/opt/flink/bin/standalone-job.sh"] args: ["start-foreground", - "--job-classname={{ .Values.job_classname }}", + "--job-classname={{ .Values.job_classname }}", + {{- if eq .Values.csp "oci" }} + "-Dpresto.s3.access-key={{ .Values.s3_access_key }}", + "-Dpresto.s3.secret-key={{ .Values.s3_secret_key }}", + "-Dpresto.s3.endpoint={{ .Values.s3_endpoint }}", + "-Dpresto.s3.region={{ .Values.s3_region }}", + "-Dpresto.s3.path-style-access={{ .Values.s3_path_style_access }}", + {{- else}} "-Dfs.azure.account.key.{{ .Values.azure_account }}.blob.core.windows.net={{ .Values.azure_secret }}", + {{- end}} "-Dweb.submit.enable=false", "-Dmetrics.reporter.prom.class=org.apache.flink.metrics.prometheus.PrometheusReporter", "-Dmetrics.reporter.prom.port={{ .Values.jobmanager.prom_port }}", @@ -183,7 +191,15 @@ spec: workingDir: {{ .Values.taskmanager.flink_work_dir }} command: ["/opt/flink/bin/taskmanager.sh"] args: ["start-foreground", + {{- if eq .Values.csp "oci" }} + "-Dpresto.s3.access-key={{ .Values.s3_access_key }}", + "-Dpresto.s3.secret-key={{ .Values.s3_secret_key }}", + "-Dpresto.s3.endpoint={{ .Values.s3_endpoint }}", + "-Dpresto.s3.region={{ .Values.s3_region }}", + "-Dpresto.s3.path-style-access={{ .Values.s3_path_style_access }}", + {{- else}} "-Dfs.azure.account.key.{{ .Values.azure_account }}.blob.core.windows.net={{ .Values.azure_secret }}", + {{- end}} "-Dweb.submit.enable=false", "-Dmetrics.reporter.prom.class=org.apache.flink.metrics.prometheus.PrometheusReporter", "-Dmetrics.reporter.prom.host={{ .Release.Name }}-taskmanager", diff --git a/kubernetes/helm_charts/datapipeline_jobs/values.j2 b/kubernetes/helm_charts/datapipeline_jobs/values.j2 index 1b29a28fb5..d3c6b1cd93 100644 --- a/kubernetes/helm_charts/datapipeline_jobs/values.j2 +++ b/kubernetes/helm_charts/datapipeline_jobs/values.j2 @@ -3,8 +3,22 @@ imagepullsecrets: {{ imagepullsecrets }} dockerhub: {{ dockerhub }} repository: {{flink_repository|default('knowledge-platform-jobs')}} image_tag: {{ image_tag }} +csp: {{cloud_service_provider}} +cloud_storage_key: {{cloud_public_storage_accountname}} +cloud_storage_secret: {{cloud_public_storage_secret}} +cloud_storage_container: {{cloud_storage_content_bucketname}} +cloud_storage_endpoint: {{cloudstorage_sdk_endpoint}} azure_account: {{ azure_account
}} azure_secret: {{ azure_secret }} +s3_access_key: {{ cloud_public_storage_accountname }} +s3_secret_key: {{cloud_public_storage_secret}} +{% if cloud_service_provider == "oci" %} +s3_endpoint: {{oci_flink_s3_storage_endpoint}} +s3_region: {{s3_region}} +s3_path_style_access: true +{% else %} +s3_endpoint: {{cloud_public_storage_endpoint}} +{% endif %} serviceMonitor: enabled: {{ service_monitor_enabled | lower}} @@ -38,8 +52,8 @@ log4j_console_properties: | rootLogger.appenderRef.console.ref = ConsoleAppender # Uncomment this if you want to _only_ change Flink's logging - #logger.flink.name = org.apache.flink - #logger.flink.level = {{ flink_jobs_console_log_level | default(INFO) }} + # logger.flink.name = org.apache.flink + # logger.flink.level = {{ flink_jobs_console_log_level | default(INFO) }} # The following lines keep the log level of common libraries/connectors on # log level INFO. The root logger does not override this. You have to manually @@ -53,6 +67,8 @@ log4j_console_properties: | logger.zookeeper.name = org.apache.zookeeper logger.zookeeper.level = {{ flink_libraries_log_level | default(INFO) }} + + # Log all infos to the console appender.console.name = ConsoleAppender appender.console.type = CONSOLE @@ -73,7 +89,19 @@ base_config: | } job { env = "{{ env_name }}" - enable.distributed.checkpointing = true + enable.distributed.checkpointing = false +{% if cloud_service_provider == "oci" %} + statebackend { + s3 { + storage { + endpoint = "{{ oci_flink_s3_storage_endpoint }}" + container = "{{ flink_container_name }}" + checkpointing.dir = "checkpoint" + } + } + base.url = "s3://"${job.statebackend.s3.storage.container}"/"${job.statebackend.s3.storage.checkpointing.dir} + } +{% elif cloud_service_provider == "azure" %} statebackend { blob { storage { @@ -84,6 +112,19 @@ base_config: | } base.url = "wasbs://"${job.statebackend.blob.storage.container}"@"${job.statebackend.blob.storage.account}"/"${job.statebackend.blob.storage.checkpointing.dir} } +{% elif cloud_service_provider == "aws" %} + statebackend { + s3 { + storage { + endpoint = "{{ cloud_storage_endpoint }}" + container = "{{ cloud_storage_container }}" + checkpointing.dir = "checkpoint" + } + } + base.url = "s3://"${job.statebackend.s3.storage.container}"/"${job.statebackend.s3.storage.checkpointing.dir} + } +{% endif %} + } task { parallelism = 1 @@ -116,106 +157,15 @@ base_config: | } } -activity-aggregate-updater: - activity-aggregate-updater: |+ - include file("/data/flink/conf/base-config.conf") - kafka { - input.topic = {{ env_name }}.coursebatch.job.request - output.audit.topic = {{ env_name }}.telemetry.raw - output.failed.topic = {{ env_name }}.activity.agg.failed - output.certissue.topic = {{ env_name }}.issue.certificate.request - groupId = {{ env_name }}-activity-aggregate-group - } - task { - window.shards = {{ activity_agg_window_shards }} - checkpointing.interval = {{ activity_agg_checkpointing_interval }} - checkpointing.pause.between.seconds = {{ activity_agg_checkpointing_pause_interval }} - restart-strategy.attempts = {{ restart_attempts }} # max 3 restart attempts - restart-strategy.delay = 240000 # in milli-seconds # on max restarts (3) within 4 min the job will fail. 
- consumer.parallelism = {{ activity_agg_consumer_parallelism }} - dedup.parallelism = {{ activity_agg_dedup_parallelism }} - activity.agg.parallelism = {{ activity_agg_parallelism }} - enrolment.complete.parallelism = {{ enrolment_complete_parallelism }} - } - lms-cassandra { - keyspace = "{{ middleware_course_keyspace }}" - consumption.table = "{{ middleware_consumption_table }}" - user_activity_agg.table = "{{ middleware_user_activity_agg_table }}" - user_enrolments.table = "user_enrolments" - } - redis { - database { - relationCache.id = 10 - } - } - dedup-redis { - host = {{ dedup_redis_host }} - port = 6379 - database.index = {{ activity_agg_dedup_index }} - database.expiry = {{ activity_agg_dedup_expiry }} - } - threshold.batch.read.interval = {{ activity_agg_batch_interval }} - threshold.batch.read.size = {{ activity_agg_batch_read_size }} - threshold.batch.write.size = {{ activity_agg_batch_write_size }} - activity { - module.aggs.enabled = true - input.dedup.enabled = true - filter.processed.enrolments = {{ activity_agg_enrolment_filter_processe_enabled | lower }} - collection.status.cache.expiry = {{ activity_agg_collection_status_cache_expiry_time }} - } - service { - search.basePath = "{{ kp_search_service_base_url }}" - } - - - flink-conf: |+ - jobmanager.memory.flink.size: {{ flink_job_names['activity-aggregate-updater'].jobmanager_memory }} - taskmanager.memory.flink.size: {{ flink_job_names['activity-aggregate-updater'].taskmanager_memory }} - taskmanager.numberOfTaskSlots: {{ flink_job_names['activity-aggregate-updater'].taskslots }} - parallelism.default: 1 - jobmanager.execution.failover-strategy: region - taskmanager.memory.network.fraction: 0.1 - -relation-cache-updater: - relation-cache-updater: |+ - include file("/data/flink/conf/base-config.conf") - kafka { - input.topic = {{ env_name }}.content.postpublish.request - groupId = {{ env_name }}-relation-cache-updater-group - } - task { - consumer.parallelism = {{ relation_cache_updater_consumer_parallelism }} - parallelism = {{ relation_cache_updater_parallelism }} - } - lms-cassandra { - keyspace = "{{ middleware_hierarchy_keyspace }}" - table = "{{ middleware_content_hierarchy_table }}" - } - redis { - database.index = 10 - } - dp-redis { - host = {{ dp_redis_host }} - port = 6379 - database.index = 5 - } - - flink-conf: |+ - jobmanager.memory.flink.size: {{ flink_job_names['relation-cache-updater'].jobmanager_memory }} - taskmanager.memory.flink.size: {{ flink_job_names['relation-cache-updater'].taskmanager_memory }} - taskmanager.numberOfTaskSlots: {{ flink_job_names['relation-cache-updater'].taskslots }} - parallelism.default: 1 - jobmanager.execution.failover-strategy: region - taskmanager.memory.network.fraction: 0.1 - post-publish-processor: post-publish-processor: |+ include file("/data/flink/conf/base-config.conf") kafka { input.topic = {{ env_name }}.content.postpublish.request groupId = {{ env_name }}-post-publish-processor-group - publish.topic = {{ env_name }}.learning.job.request + publish.topic = {{ env_name }}.publish.job.request qrimage.topic = {{ env_name }}.qrimage.request + dialcode.context.topic = {{ env_name }}.dialcode.context.job.request } task { consumer.parallelism = {{ post_publish_processor_consumer_parallelism }} @@ -223,6 +173,7 @@ post-publish-processor: shallow_copy.parallelism = {{ post_publish_shallow_copy_parallelism }} link_dialcode.parallelism = {{ post_publish_link_dialcode_parallelism }} batch_create.parallelism = {{ post_publish_batch_create_parallelism }} + 
dialcode_context_updater.parallelism = {{ post_publish_dialcode_context_parallelism }} } lms-cassandra { keyspace = "{{ middleware_course_keyspace }}" @@ -237,6 +188,15 @@ post-publish-processor: lms.basePath = "{{ lms_service_base_url }}" learning_service.basePath = "{{ kp_learning_service_base_url }}" dial.basePath = "https://{{domain_name}}/dial/" + content.basePath = "{{ kp_content_service_base_url }}" + } + + cloudstorage { + metadata.replace_absolute_path={{ cloudstorage_replace_absolute_path | default('false') }} + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} } flink-conf: |+ @@ -247,43 +207,6 @@ post-publish-processor: jobmanager.execution.failover-strategy: region taskmanager.memory.network.fraction: 0.1 -questionset-publish: - questionset-publish: |+ - include file("/data/flink/conf/base-config.conf") - kafka { - input.topic = {{ env_name }}.assessment.publish.request - post_publish.topic = {{ env_name }}.content.postpublish.request - groupId = {{ env_name }}-questionset-publish-group - } - task { - consumer.parallelism = 1 - parallelism = 1 - router.parallelism = 1 - } - question { - keyspace = "{{ assessment_keyspace_name }}" - table = "question_data" - } - questionset { - keyspace = "{{ hierarchy_keyspace_name }}" - table = "questionset_hierarchy" - } - print_service.base_url = "{{ kp_print_service_base_url }}" - cloud_storage_type="{{ cloud_store }}" - azure_storage_key="{{ sunbird_public_storage_account_name }}" - azure_storage_secret="{{ sunbird_public_storage_account_key }}" - azure_storage_container="{{ azure_public_container }}" - - master.category.validation.enabled ="{{ master_category_validation_enabled }}" - - flink-conf: |+ - jobmanager.memory.flink.size: {{ flink_job_names['questionset-publish'].jobmanager_memory }} - taskmanager.memory.flink.size: {{ flink_job_names['questionset-publish'].taskmanager_memory }} - taskmanager.numberOfTaskSlots: {{ flink_job_names['questionset-publish'].taskslots }} - parallelism.default: 1 - jobmanager.execution.failover-strategy: region - taskmanager.memory.network.fraction: 0.1 - video-stream-generator: video-stream-generator: |+ include file("/data/flink/conf/base-config.conf") @@ -302,33 +225,66 @@ video-stream-generator: table = "job_request" } service.content.basePath="{{ kp_content_service_base_url }}" - azure { - location = "centralindia" - login { - endpoint="https://login.microsoftonline.com" - } + + azure_mediakind{ + project_name="{{ azure_mediakind_project_name }}" + auth_token="{{ azure_mediakind_auth_token }}" + account_name="{{ azure_mediakind_account_name }}" api { - endpoint="https://management.azure.com" - version = "2018-07-01" + endpoint="https://api.mk.io/api" } transform { - default = "media_transform_default" - hls = "media_transform_hls" + default = "{{ azure_mediakind_transform_default }}" } stream { - base_url="{{ video_stream_generator_azure_stream_base_url }}" - endpoint_name = "default" - protocol = "Hls" - policy_name = "Predefined_ClearStreamingOnly" + base_url = "{{azure_mediakind_stream_base_url}}" + endpoint_name = "{{azure_mediakind_stream_endpoint_name}}" + protocol = "{{azure_mediakind_stream_protocol}}" + policy_name = "{{azure_mediakind_stream_policy_name}}" } } - azure_tenant="{{ video_stream_generator_azure_tenant }}" - azure_subscription_id="{{ video_stream_generator_azure_subscription_id }}" - azure_account_name="{{ 
video_stream_generator_azure_account_name }}" - azure_resource_group_name="{{ video_stream_generator_azure_resource_group_name }}" - azure_token_client_key="{{ video_stream_generator_azure_token_client_key }}" - azure_token_client_secret="{{ video_stream_generator_azure_token_client_secret }}" - + ## CSP Name. e.g: aws or azure + media_service_type="{{ media_service_provider_name }}" + ## AWS Elemental Media Convert Config + aws { + region="{{ aws_mediaconvert_region }}" + content_bucket_name="{{ aws_content_bucket_name }}" + token { + access_key="{{ aws_mediaconvert_access_key }}" + access_secret="{{ aws_mediaconvert_access_secret }}" + } + api { + endpoint="{{ aws_mediaconvert_api_endpoint }}" + version="{{ aws_mediaconvert_api_version }}" + } + service { + name="mediaconvert" + queue="{{ aws_mediaconvert_queue_id }}" + role="{{ aws_mediaconvert_role_name }}" + } + stream { + protocol="Hls" + } + } +{% if cloud_service_provider == "oci" %} + #OCI Elemental Media Convert Config + oci { + region="{{oci_media_region}}" + compartment_id="{{oci_media_compartment}}" + namespace="{{oci_media_namespace}}" + bucket { + content_bucket_name="{{oci_media_source_bucket}}" + processed_bucket_name="{{oci_media_target_bucket}}" + } + stream { + prefix_input="{{oci_media_prefix_input}}" + distribution_channel_id="{{oci_media_dist_channel_id}}" + work_flow_id="{{ oci_media_work_flow_id }}" + stream_package_config_id="{{oci_media_stream_config_id}}" + gateway_domain="{{oci_media_gateway_domain}}" + } + } +{% endif %} flink-conf: |+ jobmanager.memory.flink.size: {{ flink_job_names['video-stream-generator'].jobmanager_memory }} @@ -342,9 +298,9 @@ search-indexer: search-indexer: |+ include file("/data/flink/conf/base-config.conf") kafka { - input.topic = "{{ env_name }}.learning.graph.events" - error.topic = "{{ env_name }}.learning.events.failed" - groupId = "{{ env_name }}-search-indexer-group" + input.topic = "{{ search_indexer_topic_name }}" + error.topic = "{{ search_indexer_failed_topic_name }}" + groupId = "{{ search_indexer_group_name }}" } task { consumer.parallelism = {{ search_indexer_consumer_parallelism }} @@ -353,14 +309,22 @@ search-indexer: dialcodeIndexer.parallelism = {{ dialcode_external_indexer_parallelism }} dialcodemetricsIndexer.parallelism = {{ dialcode_metric_indexer_parallelism }} } - compositesearch.index.name = "compositesearch" - dialcode.index.name = "dialcode" + compositesearch.index.name = "{{ search_indexer_es_index_name }}" + dialcode.index.name = "{{ dialcode_es_index_name }}" dailcodemetrics.index.name = "dialcodemetrics" restrict.metadata.objectTypes = [] nested.fields = ["badgeAssertions", "targets", "badgeAssociations", "plugins", "me_totalTimeSpent", "me_totalPlaySessionCount", "me_totalTimeSpentInSec", "batches", "trackable", "credentials", "discussionForum", "provider", "osMetadata", "actions", "transcripts", "accessibility"] schema.definition_cache.expiry = {{ schema_definition_cache_expiry_in_sec }} restrict.objectTypes = {{ search_indexer_restrict_object_types | to_json }} ignored.fields={{ search_indexer_ignored_fields | to_json }} + cloud_storage_container="{{ cloud_storage_content_bucketname }}" + + cloudstorage { + metadata.replace_absolute_path={{ cloudstorage_replace_absolute_path | default('false') }} + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + } flink-conf: |+ jobmanager.memory.flink.size: {{ 
flink_job_names['search-indexer'].jobmanager_memory }} @@ -370,52 +334,6 @@ search-indexer: jobmanager.execution.failover-strategy: region taskmanager.memory.network.fraction: 0.1 -enrolment-reconciliation: - enrolment-reconciliation: |+ - include file("/data/flink/conf/base-config.conf") - kafka { - input.topic = {{ env_name }}.batch.enrolment.sync.request - output.audit.topic = {{ env_name }}.telemetry.raw - output.failed.topic = {{ env_name }}.activity.agg.failed - output.certissue.topic = {{ env_name }}.issue.certificate.request - groupId = {{ env_name }}-enrolment-reconciliation-group - } - task { - restart-strategy.attempts = {{ restart_attempts }} # max 3 restart attempts - restart-strategy.delay = 240000 # in milli-seconds # on max restarts (3) within 4 min the job will fail. - consumer.parallelism = {{ enrolment_reconciliation_consumer_parallelism }} - enrolment.reconciliation.parallelism = {{ enrolment_reconciliation_parallelism }} - enrolment.complete.parallelism = {{ enrolment_complete_parallelism }} - } - lms-cassandra { - keyspace = "{{ middleware_course_keyspace }}" - consumption.table = "{{ middleware_consumption_table }}" - user_activity_agg.table = "{{ middleware_user_activity_agg_table }}" - user_enrolments.table = "user_enrolments" - } - redis { - database { - relationCache.id = 10 - } - } - threshold.batch.write.size = {{ enrolment_reconciliation_batch_write_size }} - activity { - module.aggs.enabled = true - collection.status.cache.expiry = {{ enrolment_reconciliation_collection_status_cache_expiry_time }} - } - service { - search.basePath = "{{ kp_search_service_base_url }}" - } - - - flink-conf: |+ - jobmanager.memory.flink.size: {{ flink_job_names['enrolment-reconciliation'].jobmanager_memory }} - taskmanager.memory.flink.size: {{ flink_job_names['enrolment-reconciliation'].taskmanager_memory }} - taskmanager.numberOfTaskSlots: {{ flink_job_names['enrolment-reconciliation'].taskslots }} - parallelism.default: 1 - jobmanager.execution.failover-strategy: region - taskmanager.memory.network.fraction: 0.1 - asset-enrichment: asset-enrichment: |+ include file("/data/flink/conf/base-config.conf") @@ -450,10 +368,19 @@ asset-enrichment: size.pixel = 150 } content_youtube_apikey="{{ youtube_api_key }}" - cloud_storage_type="{{ cloud_store }}" - azure_storage_key="{{ sunbird_public_storage_account_name }}" - azure_storage_secret="{{ sunbird_public_storage_account_key }}" - azure_storage_container="{{ azure_public_container }}" + cloud_storage_type="{{ cloud_service_provider }}" + cloud_storage_key="{{ cloud_public_storage_accountname }}" + cloud_storage_secret="{{ cloud_public_storage_secret }}" + cloud_storage_container="{{ cloud_storage_content_bucketname }}" + cloud_storage_endpoint="{{cloudstorage_sdk_endpoint}}" + + cloudstorage { + metadata.replace_absolute_path={{ cloudstorage_replace_absolute_path | default('false') }} + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} + } flink-conf: |+ jobmanager.memory.flink.size: {{ flink_job_names['asset-enrichment'].jobmanager_memory }} @@ -463,6 +390,7 @@ asset-enrichment: jobmanager.execution.failover-strategy: region taskmanager.memory.network.fraction: 0.1 + audit-history-indexer: audit-history-indexer: |+ include file("/data/flink/conf/base-config.conf") @@ -510,10 +438,11 @@ auto-creator-v2: service { content.basePath = "{{ kp_content_service_base_url 
}}" } - cloud_storage_type="{{ cloud_store }}" - azure_storage_key="{{ sunbird_public_storage_account_name }}" - azure_storage_secret="{{ sunbird_public_storage_account_key }}" - azure_storage_container="{{ azure_public_container }}" + cloud_storage_type="{{ cloud_service_provider }}" + cloud_storage_key="{{ cloud_public_storage_accountname }}" + cloud_storage_secret="{{ cloud_public_storage_secret }}" + cloud_storage_container="{{ cloud_storage_content_bucketname }}" + cloud_storage_endpoint="{{cloudstorage_sdk_endpoint}}" source { baseUrl="{{ source_base_url }}" @@ -541,6 +470,7 @@ content-auto-creator: consumer.parallelism = {{ content_auto_creator_consumer_parallelism }} parallelism = {{ content_auto_creator_parallelism }} window.time = 60 + checkpointing.timeout = 4200000 } redis { @@ -557,19 +487,28 @@ content-auto-creator: learning_service.basePath = "{{ kp_learning_service_base_url }}" } - cloud_storage_type="{{ cloud_store }}" - azure_storage_key="{{ sunbird_public_storage_account_name }}" - azure_storage_secret="{{ sunbird_public_storage_account_key }}" - azure_storage_container="{{ azure_public_container }}" + cloud_storage_type="{{ cloud_service_provider }}" + cloud_storage_key="{{ cloud_public_storage_accountname }}" + cloud_storage_secret="{{ cloud_public_storage_secret }}" + cloud_storage_container="{{ cloud_storage_content_bucketname }}" + cloud_storage_endpoint="{{cloudstorage_sdk_endpoint}}" + + cloudstorage { + metadata.replace_absolute_path={{ cloudstorage_replace_absolute_path | default('false') }} + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} + } content_auto_creator { actions=auto-create allowed_object_types=["Content"] allowed_content_stages=["create","upload","review","publish"] content_mandatory_fields=["name","code","mimeType","primaryCategory","artifactUrl","lastPublishedBy"] - content_props_to_removed=["identifier","downloadUrl","variants","createdOn","collections","children","lastUpdatedOn","SYS_INTERNAL_LAST_UPDATED_ON","versionKey","s3Key","status","pkgVersion","toc_url","mimeTypesCount","contentTypesCount","leafNodesCount","childNodes","prevState","lastPublishedOn","flagReasons","compatibilityLevel","size","publishChecklist","publishComment","lastPublishedBy","rejectReasons","rejectComment","badgeAssertions","leafNodes","sYS_INTERNAL_LAST_UPDATED_ON","previewUrl","channel","objectType","visibility","version","pragma","prevStatus","streamingUrl","idealScreenSize","contentDisposition","lastStatusChangedOn","idealScreenDensity","lastSubmittedOn","publishError","flaggedBy","flags","lastFlaggedOn","publisher","lastUpdatedBy","lastSubmittedBy","uploadError","lockKey","publish_type","reviewError","totalCompressedSize","origin","originData","importError","questions"] + 
content_props_to_removed=["identifier","downloadUrl","variants","createdOn","collections","children","lastUpdatedOn","SYS_INTERNAL_LAST_UPDATED_ON","versionKey","s3Key","status","pkgVersion","toc_url","mimeTypesCount","contentTypesCount","leafNodesCount","childNodes","prevState","lastPublishedOn","flagReasons","compatibilityLevel","size","publishChecklist","publishComment","lastPublishedBy","rejectReasons","rejectComment","badgeAssertions","leafNodes","sYS_INTERNAL_LAST_UPDATED_ON","previewUrl","channel","objectType","visibility","version","pragma","prevStatus","streamingUrl","idealScreenSize","contentDisposition","lastStatusChangedOn","idealScreenDensity","lastSubmittedOn","publishError","flaggedBy","flags","lastFlaggedOn","publisher","lastUpdatedBy","lastSubmittedBy","uploadError","lockKey","publish_type","reviewError","totalCompressedSize","origin","originData","importError","questions","posterImage"] bulk_upload_mime_types=["video/mp4"] - artifact_upload_max_size=52428800 + artifact_upload_max_size=157286400 content_create_props=["name","code","mimeType","contentType","framework","processId","primaryCategory"] artifact_upload_allowed_source=[] g_service_acct_cred="{{ auto_creator_g_service_acct_cred }}" @@ -654,100 +593,6 @@ metrics-data-transformer: jobmanager.execution.failover-strategy: region taskmanager.memory.network.fraction: 0.1 -collection-cert-pre-processor: - collection-cert-pre-processor: |+ - include file("/data/flink/conf/base-config.conf") - kafka { - input.topic = {{ env_name }}.issue.certificate.request - output.topic = {{ env_name }}.generate.certificate.request - output.failed.topic = {{ env_name }}.issue.certificate.failed - groupId = {{ env_name }}-collection-cert-pre-processor-group - } - task { - restart-strategy.attempts = {{ restart_attempts }} # max 3 restart attempts - restart-strategy.delay = 240000 # in milli-seconds # on max restarts (3) within 4 min the job will fail. 
- parallelism = {{collection_cert_pre_processor_consumer_parallelism}} - consumer.parallelism = {{ collection_cert_pre_processor_consumer_parallelism }} - generate_certificate.parallelism = {{generate_certificate_parallelism}} - } - lms-cassandra { - keyspace = "{{ middleware_course_keyspace }}" - consumption.table = "{{ middleware_consumption_table }}" - user_enrolments.table = "{{ middleware_user_enrolments_table }}" - course_batch.table = "{{ middleware_course_batch_table }}" - assessment_aggregator.table = "{{ middleware_assessment_aggregator_table }}" - user_activity_agg.table = "{{ middleware_user_activity_agg_table }}" - } - cert_domain_url = "{{ cert_domain_url }}" - user_read_api = "/private/user/v1/read" - content_read_api = "/content/v3/read" - service { - content.basePath = "{{ content_service_base_url }}" - learner.basePath = "{{ learner_service_base_url }}" - } - redis-meta { - {% if metadata2_redis_host is defined %} - host = {{ metadata2_redis_host }} - {% else %} - host = {{ redis_host }} - {% endif %} - port = 6379 - } - assessment.metrics.supported.contenttype = ["SelfAssess"] - - flink-conf: |+ - jobmanager.memory.flink.size: {{ flink_job_names['collection-cert-pre-processor'].jobmanager_memory }} - taskmanager.memory.flink.size: {{ flink_job_names['collection-cert-pre-processor'].taskmanager_memory }} - taskmanager.numberOfTaskSlots: {{ flink_job_names['collection-cert-pre-processor'].taskslots }} - parallelism.default: 1 - jobmanager.execution.failover-strategy: region - taskmanager.memory.network.fraction: 0.1 - -collection-certificate-generator: - collection-certificate-generator: |+ - include file("/data/flink/conf/base-config.conf") - kafka { - input.topic = {{ env_name }}.generate.certificate.request - output.audit.topic = {{ env_name }}.telemetry.raw - groupId = {{ env_name }}-certificate-generator-group - } - task { - restart-strategy.attempts = {{ restart_attempts }} # max 3 restart attempts - restart-strategy.delay = 240000 # in milli-seconds # on max restarts (3) within 4 min the job will fail. 
- consumer.parallelism = {{ collection_certificate_generator_consumer_parallelism }} - parallelism = {{ collection_certificate_generator_parallelism }} - } - lms-cassandra { - keyspace = "{{ middleware_course_keyspace }}" - user_enrolments.table = "{{ middleware_user_enrolments_table }}" - course_batch.table = "{{ middleware_course_batch_table }}" - sbkeyspace = "{{ registry_sunbird_keyspace }}" - certreg.table ="{{ cert_registry_table }}" - } - cert_domain_url = "{{ cert_domain_url }}" - cert_container_name = "{{ cert_container_name }}" - cert_cloud_storage_type = "{{ cert_cloud_storage_type }}" - cert_azure_storage_secret = "{{ cert_azure_storage_secret }}" - cert_azure_storage_key = "{{ cert_azure_storage_key }}" - service { - certreg.basePath = "{{ cert_reg_service_base_url }}" - learner.basePath = "{{ learner_service_base_url }}" - enc.basePath = "{{ enc_service_base_url }}" - rc.basePath = "{{ cert_rc_base_url }}" - rc.entity = "{{ cert_rc_entity }}" - } - enable.suppress.exception = {{ collection_certificate_generator_enable_suppress_exception | lower }} - enable.rc.certificate = {{ collection_certificate_generator_enable_rc_certificate | lower }} - - - flink-conf: |+ - jobmanager.memory.flink.size: {{ flink_job_names['collection-certificate-generator'].jobmanager_memory }} - taskmanager.memory.flink.size: {{ flink_job_names['collection-certificate-generator'].taskmanager_memory }} - taskmanager.numberOfTaskSlots: {{ flink_job_names['collection-certificate-generator'].taskslots }} - parallelism.default: 1 - jobmanager.execution.failover-strategy: region - taskmanager.memory.network.fraction: 0.1 - mvc-indexer: mvc-indexer: |+ include "base-config.conf" @@ -815,37 +660,61 @@ content-publish: table = "content_data" tmp_file_location = "/tmp" objectType = ["Content", "ContentImage","Collection","CollectionImage"] - mimeType = ["application/pdf", "video/avi", "video/mpeg", "video/quicktime", "video/3gpp", "video/mpeg", "video/mp4", "video/ogg", "video/webm", "application/vnd.ekstep.html-archive","application/vnd.ekstep.ecml-archive","application/vnd.ekstep.content-collection" - "application/vnd.ekstep.ecml-archive", - "application/vnd.ekstep.html-archive", - "application/vnd.android.package-archive", - "application/vnd.ekstep.content-archive", - "application/octet-stream", - "application/json", - "application/javascript", - "application/xml", - "text/plain", - "text/html", - "text/javascript", - "text/xml", - "text/css", - "image/jpeg", "image/jpg", "image/png", "image/tiff", "image/bmp", "image/gif", "image/svg+xml", - "image/x-quicktime", - "video/avi", "video/mpeg", "video/quicktime", "video/3gpp", "video/mpeg", "video/mp4", "video/ogg", "video/webm", - "video/msvideo", - "video/x-msvideo", - "video/x-qtc", - "video/x-mpeg", - "audio/mp3", "audio/mp4", "audio/mpeg", "audio/ogg", "audio/webm", "audio/x-wav", "audio/wav", - "audio/mpeg3", - "audio/x-mpeg-3", - "audio/vorbis", - "application/x-font-ttf", - "application/pdf", "application/epub", "application/msword", - "application/vnd.ekstep.h5p-archive", - "application/vnd.ekstep.plugin-archive", - "video/x-youtube", "video/youtube", - "text/x-url"] + mimeType = ["application/pdf", + "application/vnd.ekstep.ecml-archive", + "application/vnd.ekstep.html-archive", + "application/vnd.android.package-archive", + "application/vnd.ekstep.content-archive", + "application/epub", + "application/msword", + "application/vnd.ekstep.h5p-archive", + "video/webm", + "video/mp4", + "application/vnd.ekstep.content-collection", + "video/quicktime", + 
"application/octet-stream", + "application/json", + "application/javascript", + "application/xml", + "text/plain", + "text/html", + "text/javascript", + "text/xml", + "text/css", + "image/jpeg", + "image/jpg", + "image/png", + "image/tiff", + "image/bmp", + "image/gif", + "image/svg+xml", + "image/x-quicktime", + "video/avi", + "video/mpeg", + "video/quicktime", + "video/3gpp", + "video/mp4", + "video/ogg", + "video/webm", + "video/msvideo", + "video/x-msvideo", + "video/x-qtc", + "video/x-mpeg", + "audio/mp3", + "audio/mp4", + "audio/mpeg", + "audio/ogg", + "audio/webm", + "audio/x-wav", + "audio/wav", + "audio/mpeg3", + "audio/x-mpeg-3", + "audio/vorbis", + "application/x-font-ttf", + "application/vnd.ekstep.plugin-archive", + "video/x-youtube", + "video/youtube", + "text/x-url"] asset_download_duration = "60 seconds" stream { enabled = {{ content_stream_enabled | lower }} @@ -911,13 +780,23 @@ content-publish: Asset: "Certificate Template" } - compositesearch.index.name = "compositesearch" + compositesearch.index.name = "{{ compositesearch_index_name }}" search.document.type = "cs" + enableDIALContextUpdate = "Yes" + + cloud_storage_type="{{ cloud_service_provider }}" + cloud_storage_key="{{ cloud_public_storage_accountname }}" + cloud_storage_secret="{{ cloud_public_storage_secret }}" + cloud_storage_container="{{ cloud_storage_content_bucketname }}" + cloud_storage_endpoint="{{cloudstorage_sdk_endpoint}}" - cloud_storage_type="{{ cloud_store }}" - azure_storage_key="{{ sunbird_public_storage_account_name }}" - azure_storage_secret="{{ sunbird_public_storage_account_key }}" - azure_storage_container="{{ azure_public_container }}" + cloudstorage { + metadata.replace_absolute_path={{ cloudstorage_replace_absolute_path | default('false') }} + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} + } master.category.validation.enabled ="{{ master_category_validation_enabled }}" service { @@ -955,10 +834,20 @@ qrcode-image-generator: margin=1 } - cloud_storage_type="{{ cloud_store }}" - azure_storage_key="{{ sunbird_public_storage_account_name }}" - azure_storage_secret="{{ sunbird_public_storage_account_key }}" - azure_storage_container="{{ azure_public_container }}" + cloudstorage { + metadata.replace_absolute_path={{ cloudstorage_replace_absolute_path | default('false') }} + relative_path_prefix={{ cloudstorage_relative_path_prefix_dial }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} + } + + cloud_storage_type="{{ cloud_service_provider }}" + cloud_storage_key="{{ cloud_public_storage_accountname }}" + cloud_storage_secret="{{ cloud_public_storage_secret }}" + cloud_storage_container="{{ cloud_storage_dial_bucketname | default('dial') }}" + cloud_storage_endpoint="{{cloudstorage_sdk_endpoint}}" + cloud_storage_proxy_host="{{cloud_storage_proxy_host}}" lms-cassandra { keyspace = "dialcodes" @@ -975,3 +864,467 @@ qrcode-image-generator: parallelism.default: 1 jobmanager.execution.failover-strategy: region taskmanager.memory.network.fraction: 0.1 + +dialcode-context-updater: + dialcode-context-updater: |+ + include file("/data/flink/conf/base-config.conf") + kafka { + input.topic = "{{ env_name }}.dialcode.context.job.request" + failed.topic = "{{ env_name }}.dialcode.context.job.request.failed" + groupId = "{{ env_name 
}}-dialcode-group" + } + task { + consumer.parallelism = 1 + parallelism = 1 + dialcode-context-updater.parallelism = 1 + } + dialcode_context_updater { + actions="dialcode-context-update" + search_mode="Collection" + context_map_path = "https://raw.githubusercontent.com/project-sunbird/knowledge-platform-jobs/master/dialcode-context-updater/src/main/resources/contextMapping.json" + identifier_search_fields = ["identifier", "primaryCategory", "channel"] + dial_code_context_read_api_path = "/dialcode/v4/read/" + dial_code_context_update_api_path = "/dialcode/v4/update/" + } + service { + search.basePath = "{{ kp_search_service_base_url }}" + dial_service.basePath = "{{ kp_dial_service_base_url }}" + } + + es_sync_wait_time = 20000 + + flink-conf: |+ + jobmanager.memory.flink.size: {{ flink_job_names['dialcode-context-updater'].jobmanager_memory }} + taskmanager.memory.flink.size: {{ flink_job_names['dialcode-context-updater'].taskmanager_memory }} + taskmanager.numberOfTaskSlots: {{ flink_job_names['dialcode-context-updater'].taskslots }} + parallelism.default: 1 + jobmanager.execution.failover-strategy: region + taskmanager.memory.network.fraction: 0.1 + +live-node-publisher: + live-node-publisher: |+ + include file("/data/flink/conf/base-config.conf") + kafka { + input.topic = {{ env_name }}.republish.job.request + live_video_stream.topic = "{{ env_name }}.live.video.stream.request" + error.topic = "{{ env_name }}.republish.events.failed" + skipped.topic = "{{ env_name }}.republish.events.skipped" + groupId = {{ env_name }}-content-republish-group + } + task { + parallelism = 1 + consumer.parallelism = {{live_node_publisher_consumer_parallelism | default('2') }} + router.parallelism = {{live_node_publisher_router_parallelism | default('2') }} + window.time = 60 + checkpointing.timeout = {{live_node_publisher_checkpointing_timeout | default('900000') }} + } + redis { + host={{redis_host}} + port=6379 + database { + contentCache.id = 0 + } + } + content { + bundleLocation = "/tmp/contentBundle" + isECARExtractionEnabled = true + retry_asset_download_count = 1 + keyspace = "{{ content_keyspace_name }}" + table = "content_data" + tmp_file_location = "/tmp" + objectType = ["Content", "ContentImage","Collection","CollectionImage"] + mimeType = ["application/pdf", + "application/vnd.ekstep.ecml-archive", + "application/vnd.ekstep.html-archive", + "application/vnd.android.package-archive", + "application/vnd.ekstep.content-archive", + "application/epub", + "application/msword", + "application/vnd.ekstep.h5p-archive", + "video/webm", + "video/mp4", + "application/vnd.ekstep.content-collection", + "video/quicktime", + "application/octet-stream", + "application/json", + "application/javascript", + "application/xml", + "text/plain", + "text/html", + "text/javascript", + "text/xml", + "text/css", + "image/jpeg", + "image/jpg", + "image/png", + "image/tiff", + "image/bmp", + "image/gif", + "image/svg+xml", + "image/x-quicktime", + "video/avi", + "video/mpeg", + "video/quicktime", + "video/3gpp", + "video/mp4", + "video/ogg", + "video/webm", + "video/msvideo", + "video/x-msvideo", + "video/x-qtc", + "video/x-mpeg", + "audio/mp3", + "audio/mp4", + "audio/mpeg", + "audio/ogg", + "audio/webm", + "audio/x-wav", + "audio/wav", + "audio/mpeg3", + "audio/x-mpeg-3", + "audio/vorbis", + "application/x-font-ttf", + "application/vnd.ekstep.plugin-archive", + "video/x-youtube", + "video/youtube", + "text/x-url"] + asset_download_duration = "60 seconds" + stream { + enabled = {{ content_stream_enabled | lower }} + 
mimeType = ["video/mp4", "video/webm"] + } + artifact.size.for_online= {{ content_artifact_size_for_online }} + + downloadFiles { + spine = ["appIcon"] + full = ["appIcon", "grayScaleAppIcon", "artifactUrl", "itemSetPreviewUrl", "media"] + } + + nested.fields=["badgeAssertions", "targets", "badgeAssociations", "plugins", "me_totalTimeSpent", "me_totalPlaySessionCount", "me_totalTimeSpentInSec", "batches", "trackable", "credentials", "discussionForum", "provider", "osMetadata", "actions", "transcripts", "accessibility"] + + } + cloud_storage { + folder { + content = "content" + artifact = "artifact" + } + } + + hierarchy { + keyspace = "{{ hierarchy_keyspace_name }}" + table = "content_hierarchy" + } + + contentTypeToPrimaryCategory { + ClassroomTeachingVideo: "Explanation Content" + ConceptMap: "Learning Resource" + Course: "Course" + CuriosityQuestionSet: "Practice Question Set" + eTextBook: "eTextbook" + Event: "Event" + EventSet: "Event Set" + ExperientialResource: "Learning Resource" + ExplanationResource: "Explanation Content" + ExplanationVideo: "Explanation Content" + FocusSpot: "Teacher Resource" + LearningOutcomeDefinition: "Teacher Resource" + MarkingSchemeRubric: "Teacher Resource" + PedagogyFlow: "Teacher Resource" + PracticeQuestionSet: "Practice Question Set" + PracticeResource: "Practice Question Set" + SelfAssess: "Course Assessment" + TeachingMethod: "Teacher Resource" + TextBook: "Digital Textbook" + Collection: "Content Playlist" + ExplanationReadingMaterial: "Learning Resource" + LearningActivity: "Learning Resource" + LessonPlan: "Content Playlist" + LessonPlanResource: "Teacher Resource" + PreviousBoardExamPapers: "Learning Resource" + TVLesson: "Explanation Content" + OnboardingResource: "Learning Resource" + ReadingMaterial: "Learning Resource" + Template: "Template" + Asset: "Asset" + Plugin: "Plugin" + LessonPlanUnit: "Lesson Plan Unit" + CourseUnit: "Course Unit" + TextBookUnit: "Textbook Unit" + Asset: "Certificate Template" + } + + compositesearch.index.name = "compositesearch" + search.document.type = "cs" + + cloud_storage_type="{{ cloud_service_provider }}" + cloud_storage_key="{{ cloud_public_storage_accountname }}" + cloud_storage_secret="{{ cloud_public_storage_secret }}" + cloud_storage_container="{{ cloud_storage_content_bucketname }}" + cloud_storage_endpoint="{{cloudstorage_sdk_endpoint}}" + + master.category.validation.enabled ="{{ master_category_validation_enabled }}" + service { + print.basePath = "{{ kp_print_service_base_url }}" + search.basePath = "{{ kp_search_service_base_url }}" + } + + cloudstorage { + metadata.replace_absolute_path={{ cloudstorage_replace_absolute_path | default('false') }} + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} + } + + flink-conf: |+ + jobmanager.memory.flink.size: {{ flink_job_names['live-node-publisher'].jobmanager_memory }} + taskmanager.memory.flink.size: {{ flink_job_names['live-node-publisher'].taskmanager_memory }} + taskmanager.numberOfTaskSlots: {{ flink_job_names['live-node-publisher'].taskslots }} + parallelism.default: 1 + jobmanager.execution.failover-strategy: region + taskmanager.memory.network.fraction: 0.1 + +live-video-stream-generator: + live-video-stream-generator: |+ + include file("/data/flink/conf/base-config.conf") + kafka { + input.topic = "{{ env_name }}.live.video.stream.request" + groupId = "{{ env_name 
}}-live-video-stream-generator-group" + } + task { + timer.duration = {{ video_stream_generator_timer_duration }} + consumer.parallelism = {{ video_stream_generator_consumer_parallelism }} + parallelism = {{ video_stream_generator_parallelism }} + max.retries = {{ video_stream_generator_max_retries }} + } + lms-cassandra { + keyspace = {{ platform_keyspace_name }} + table = "job_request" + } + service.content.basePath="{{ kp_content_service_base_url }}" + ## CSP Name. e.g. aws or azure + media_service_type="{{ media_service_provider_name }}" + ## AWS Elemental Media Convert Config + aws { + region="{{ aws_mediaconvert_region }}" + content_bucket_name="{{ aws_content_bucket_name }}" + token { + access_key="{{ aws_mediaconvert_access_key }}" + access_secret="{{ aws_mediaconvert_access_secret }}" + } + api { + endpoint="{{ aws_mediaconvert_api_endpoint }}" + version="{{ aws_mediaconvert_api_version }}" + } + service { + name="mediaconvert" + queue="{{ aws_mediaconvert_queue_id }}" + role="{{ aws_mediaconvert_role_name }}" + } + stream { + protocol="Hls" + } + } + + + flink-conf: |+ + jobmanager.memory.flink.size: {{ flink_job_names['live-video-stream-generator'].jobmanager_memory }} + taskmanager.memory.flink.size: {{ flink_job_names['live-video-stream-generator'].taskmanager_memory }} + taskmanager.numberOfTaskSlots: {{ flink_job_names['live-video-stream-generator'].taskslots }} + parallelism.default: 1 + jobmanager.execution.failover-strategy: region + taskmanager.memory.network.fraction: 0.1 + +csp-migrator: + csp-migrator: |+ + include file("/data/flink/conf/base-config.conf") + kafka { + input.topic = "{{ csp_migration_topic_name }}" + groupId = "{{ csp_migrator_group_name }}" + failed.topic = "{{ csp_migrator_failed_topic_name }}" + live_video_stream.topic = "{{ video_stream_topic_name }}" + live_content_node_republish.topic = "{{ content_republish_topic_name }}" + live_question_node_republish.topic = "{{ question_republish_topic_name }}" + } + task { + timer.duration = {{ csp_migrator_timer_duration }} + consumer.parallelism = {{ csp_migrator_consumer_parallelism }} + router.parallelism = {{ csp_migrator_router_parallelism }} + parallelism = {{ csp_migrator_parallelism }} + max.retries = {{ csp_migrator_max_retries }} + cassandra-migrator.parallelism = {{ csp_migrator_cassandra_parallelism }} + } + redis { + database { + relationCache.id = 10 + collectionCache.id = 5 + } + } + + hierarchy { + keyspace = "{{ content_hierarchy_keyspace_name }}" + table = "content_hierarchy" + } + + content { + keyspace = "{{ content_keyspace_name }}" + content_table = "content_data" + assessment_table = "question_data" + } + + cloud_storage { + folder { + content = "content" + artifact = "artifact" + } + } + + gdrive.application_name="{{ gdrive_application_name }}" + g_service_acct_cred="{{ auto_creator_g_service_acct_cred }}" + + questionset.hierarchy.keyspace="{{ questionset_hierarchy_keyspace_name }}" + questionset.hierarchy.table="questionset_hierarchy" + + key_value_strings_to_migrate = { + "https://qa.ekstep.in/assets/public": "{{ cloudstorage_relative_path_prefix_content }}", + "https://dev.ekstep.in/assets/public": "{{ cloudstorage_relative_path_prefix_content }}", + "https://community.ekstep.in/assets/public": "{{ cloudstorage_relative_path_prefix_content }}", + "https://community.ekstep.in:443/assets/public": "{{ cloudstorage_relative_path_prefix_content }}", + "https://ekstep-public-qa.s3-ap-south-1.amazonaws.com": "{{ cloudstorage_relative_path_prefix_content }}", +
"https://ekstep-public-dev.s3-ap-south-1.amazonaws.com": "{{ cloudstorage_relative_path_prefix_content }}", + "https://ekstep-public-preprod.s3-ap-south-1.amazonaws.com": "{{ cloudstorage_relative_path_prefix_content }}", + "https://ekstep-public-prod.s3-ap-south-1.amazonaws.com": "{{ cloudstorage_relative_path_prefix_content }}", + "https://sunbirddev.blob.core.windows.net/sunbird-content-dev": "{{ cloudstorage_relative_path_prefix_content }}", + "https://sunbirdstagingpublic.blob.core.windows.net/sunbird-content-staging": "{{ cloudstorage_relative_path_prefix_content }}", + "https://preprodall.blob.core.windows.net/ntp-content-preprod": "{{ cloudstorage_relative_path_prefix_content }}", + "https://ntpproductionall.blob.core.windows.net/ntp-content-production": "{{ cloudstorage_relative_path_prefix_content }}", + "https://dockpreprodall.blob.core.windows.net/dock-content-preprod": "{{ cloudstorage_relative_path_prefix_content }}", + "https://dockprodall.blob.core.windows.net/dock-content-prod": "{{ cloudstorage_relative_path_prefix_content }}", + "CLOUD_STORAGE_BASE_PATH": "{{ cloudstorage_relative_path_prefix_content }}" + } + + neo4j_fields_to_migrate = { + "asset": ["artifactUrl", "thumbnail", "downloadUrl","variants"], + "content": ["appIcon", "artifactUrl", "posterImage", "previewUrl", "thumbnail", "assetsMap", "certTemplate", "itemSetPreviewUrl", "grayScaleAppIcon", "sourceURL", "variants", "downloadUrl","streamingUrl","transcripts"], + "contentimage": ["appIcon", "artifactUrl", "posterImage", "previewUrl", "thumbnail", "assetsMap", "certTemplate", "itemSetPreviewUrl", "grayScaleAppIcon", "sourceURL", "variants", "downloadUrl","streamingUrl","transcripts"], + "collection": ["appIcon", "artifactUrl", "posterImage", "previewUrl", "thumbnail", "toc_url", "grayScaleAppIcon", "variants", "downloadUrl"], + "collectionimage": ["appIcon", "artifactUrl", "posterImage", "previewUrl", "thumbnail", "toc_url", "grayScaleAppIcon", "variants", "downloadUrl"], + "plugins": ["artifactUrl"], + "itemset": ["previewUrl", "downloadUrl"], + "assessmentitem": ["data", "question", "solutions", "editorState", "media"], + "question": ["appIcon","artifactUrl", "posterImage", "previewUrl","downloadUrl", "variants","pdfUrl"], + "questionimage": ["appIcon","artifactUrl", "posterImage", "previewUrl","downloadUrl", "variants","pdfUrl"], + "questionset": ["appIcon","artifactUrl", "posterImage", "previewUrl","downloadUrl", "variants","pdfUrl"], + "questionsetimage": ["appIcon","artifactUrl", "posterImage", "previewUrl","downloadUrl", "variants","pdfUrl"] + } + + cassandra_fields_to_migrate = { + "assessmentitem": ["question", "editorstate", "solutions", "body"] + } + + cloudstorage { + metadata.replace_absolute_path=false + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} + } + + migrationVersion = 1 + video_stream_regeneration_enable = false + live_node_republish_enable = true + copy_missing_files_to_cloud = false + download_path = /tmp + + cloud_storage_type="{{ cloud_service_provider }}" + cloud_storage_key="{{ cloud_public_storage_accountname }}" + cloud_storage_secret="{{ cloud_public_storage_secret }}" + cloud_storage_container="{{ cloud_storage_content_bucketname }}" + cloud_storage_endpoint="{{cloudstorage_sdk_endpoint}}" + + flink-conf: |+ + jobmanager.memory.flink.size: {{ flink_job_names['csp-migrator'].jobmanager_memory }} +
taskmanager.memory.flink.size: {{ flink_job_names['csp-migrator'].taskmanager_memory }} + taskmanager.numberOfTaskSlots: {{ flink_job_names['csp-migrator'].taskslots }} + parallelism.default: 1 + jobmanager.execution.failover-strategy: region + taskmanager.memory.network.fraction: 0.1 + +cassandra-data-migration: + cassandra-data-migration: |+ + include file("/data/flink/conf/base-config.conf") + kafka { + input.topic = "{{ env_name }}.cassandra.data.migration.request" + failed.topic = "{{ env_name }}.cassandra.data.migration.job.request.failed" + groupId = "{{ env_name }}-cassandra-data-migration-group" + } + + task { + consumer.parallelism = 1 + parallelism = 1 + } + + migrate = { + key_value_strings_to_migrate = { + "https://sunbirdstagingpublic.blob.core.windows.net/dial": "{{ cloudstorage_relative_path_prefix_dial }}", + "https://preprodall.blob.core.windows.net/dial": "{{ cloudstorage_relative_path_prefix_dial }}", + "https://ntpproductionall.blob.core.windows.net/dial": "{{ cloudstorage_relative_path_prefix_dial }}" + } + } + + cloudstorage { + metadata.replace_absolute_path=false + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + write_base_path={{ valid_cloudstorage_base_urls }} + } + + flink-conf: |+ + jobmanager.memory.flink.size: {{ flink_job_names['cassandra-data-migration'].jobmanager_memory }} + taskmanager.memory.flink.size: {{ flink_job_names['cassandra-data-migration'].taskmanager_memory }} + taskmanager.numberOfTaskSlots: {{ flink_job_names['cassandra-data-migration'].taskslots }} + parallelism.default: 1 + jobmanager.execution.failover-strategy: region + taskmanager.memory.network.fraction: 0.1 + +transaction-event-processor: + transaction-event-processor: |+ + include file("/data/flink/conf/base-config.conf") + job { + env = "{{ env_name }}" + } + + kafka { + input.topic = "{{ env_name }}.learning.graph.events" + output.audit.topic = "{{ env_name }}.telemetry.raw" + output.obsrv.topic = "{{ env_name }}.transaction.meta" + groupId = "{{ env_name }}-transaction-event-processor-group" + } + + task { + consumer.parallelism = {{ transaction_event_processor_consumer_parallelism }} + parallelism = {{ transaction_event_processor_parallelism }} + producer.parallelism = {{ transaction_event_processor_producer_parallelism }} + window.time = 60 + } + + schema { + basePath = "{{ kp_schema_base_path }}" + } + + channel.default = "{{ transaction_event_processor_default_channel }}" + + job { + audit-event-generator = "{{ enable_audit_event_generator }}" + audit-history-indexer = "{{ enable_audit_history_indexer }}" + obsrv-metadata-generator = "{{ enable_obsrv_metadata_generator }}" + } + + flink-conf: |+ + jobmanager.memory.flink.size: {{ flink_job_names['transaction-event-processor'].jobmanager_memory }} + taskmanager.memory.flink.size: {{ flink_job_names['transaction-event-processor'].taskmanager_memory }} + taskmanager.numberOfTaskSlots: {{ flink_job_names['transaction-event-processor'].taskslots }} + parallelism.default: 1 + jobmanager.execution.failover-strategy: region + taskmanager.memory.network.fraction: 0.1 diff --git a/pipelines/backup/cassandra-backup/Jenkinsfile b/pipelines/backup/cassandra-backup/Jenkinsfile index 227dbd6f09..412f97658f 100644 --- a/pipelines/backup/cassandra-backup/Jenkinsfile +++ b/pipelines/backup/cassandra-backup/Jenkinsfile @@ -25,7 +25,7 @@ node() { jobName = sh(returnStdout: true, script: "echo $JOB_NAME").split('/')[-1].trim()
currentWs = sh(returnStdout: true, script: 'pwd').trim() ansiblePlaybook = "${currentWs}/ansible/cassandra-backup.yml" - ansibleExtraArgs = "--extra-vars \"remote=${params.remote} data_dir=${params.data_dir}\" --vault-password-file /var/lib/jenkins/secrets/vault-pass" + ansibleExtraArgs = "--vault-password-file /var/lib/jenkins/secrets/vault-pass" values.put('currentWs', currentWs) values.put('env', envDir) values.put('module', module) diff --git a/pipelines/build/build.sh b/pipelines/build/build.sh new file mode 100755 index 0000000000..269930c4ae --- /dev/null +++ b/pipelines/build/build.sh @@ -0,0 +1,11 @@ +#!/bin/bash +# Build script +set -eo pipefail + +build_tag=$1 +name=$2 +node=$3 +org=$4 + +docker build -f pipelines/build/${name}/Dockerfile --label commitHash=$(git rev-parse --short HEAD) -t ${org}/${name}:${build_tag} . +echo {\"image_name\" : \"${name}\", \"image_tag\" : \"${build_tag}\", \"node_name\" : \"$node\"} > metadata.json diff --git a/pipelines/build/learning/Dockerfile b/pipelines/build/learning/Dockerfile new file mode 100644 index 0000000000..86bfbf142d --- /dev/null +++ b/pipelines/build/learning/Dockerfile @@ -0,0 +1,6 @@ +FROM tomcat:9.0.62-jdk11-openjdk +RUN rm -rf /usr/local/tomcat/webapps/* +COPY platform-modules/service/target/learning-service.war /usr/local/tomcat/webapps/ +# COPY ./platform-modules/service/src/main/resources/application.conf /usr/local/tomcat/config/ +ENV JAVA_OPTS -Dconfig.file=/usr/local/tomcat/config/application.conf +CMD ["catalina.sh", "run"] diff --git a/pipelines/build/learning/Jenkinsfile b/pipelines/build/learning/Jenkinsfile index 4df6e90574..e57a054a7d 100644 --- a/pipelines/build/learning/Jenkinsfile +++ b/pipelines/build/learning/Jenkinsfile @@ -47,7 +47,7 @@ node() { cp platform-tools/spikes/content-tool/target/content-tool-*.jar lp_artifacts zip -j lp_artifacts.zip:${artifact_version} lp_artifacts/* """ - } + } else { sh """ mkdir lp_artifacts @@ -55,7 +55,7 @@ node() { zip -j lp_artifacts.zip:${artifact_version} lp_artifacts/* """ } - + archiveArtifacts artifacts: "lp_artifacts.zip:${artifact_version}", fingerprint: true, onlyIfSuccessful: true sh """echo {\\"artifact_name\\" : \\"lp_artifacts.zip\\", \\"artifact_version\\" : \\"${artifact_version}\\", \\"node_name\\" : \\"${env.NODE_NAME}\\"} > metadata.json""" archiveArtifacts artifacts: 'metadata.json', onlyIfSuccessful: true @@ -71,4 +71,4 @@ node() { slack_notify(currentBuild.result) email_notify() } -} +} \ No newline at end of file diff --git a/pipelines/build/learning/Jenkinsfile_containerization b/pipelines/build/learning/Jenkinsfile_containerization new file mode 100644 index 0000000000..3b0c581286 --- /dev/null +++ b/pipelines/build/learning/Jenkinsfile_containerization @@ -0,0 +1,63 @@ +@Library('deploy-conf') _ +node() { + try { + String ANSI_GREEN = "\u001B[32m" + String ANSI_NORMAL = "\u001B[0m" + String ANSI_BOLD = "\u001B[1m" + String ANSI_RED = "\u001B[31m" + String ANSI_YELLOW = "\u001B[33m" + + ansiColor('xterm') { + stage('Checkout') { + cleanWs() + checkout scm + commit_hash = sh(script: 'git rev-parse --short HEAD', returnStdout: true).trim() + build_tag = sh(script: "echo " + params.github_release_tag.split('/')[-1] + "_" + commit_hash + "_" + env.BUILD_NUMBER, returnStdout: true).trim() + echo "build_tag: " + build_tag + artifact_version = sh(script: "echo " + params.github_release_tag.split('/')[-1] + "_" + commit_hash + "_" + env.BUILD_NUMBER, returnStdout: true).trim() + echo "artifact_version: "+ artifact_version + } + } + + stage('Pre-Build') { + 
sh """ + java -version + rm -rf /data/logs/* + rm -rf /data/graphDB/* + rm -rf /data/testgraphDB/* + rm -rf /data/testGraphDB/* + vim -esnc '%s/dialcode.es_conn_info="localhost:9200"/dialcode.es_conn_info="10.6.0.11:9200"/g|:wq' platform-core/unit-tests/src/test/resources/application.conf + vim -esnc '%s/search.es_conn_info="localhost:9200"/search.es_conn_info="10.6.0.11:9200"/g|:wq' platform-core/unit-tests/src/test/resources/application.conf + """ + } + + stage('Build') { + env.NODE_ENV = "build" + print "Environment will be : ${env.NODE_ENV}" + sh 'mvn clean install -DskipTests -P ${profile_id} -T10' + } + + stage('Post_Build-Action') { + jacoco exclusionPattern: '**/common/**,**/dto/**,**/enums/**,**/pipeline/**,**/servlet/**,**/interceptor/**,**/batch/**,**/models/**,**/model/**,**/EnrichActor*.class,**/language/controller/**,**/wordchain/**,**/importer/**,**/Base**,**/ControllerUtil**,**/Indowordnet**,**/Import**' + + } + + stage('Package') { + sh('chmod 777 pipelines/build/build.sh') + sh("pipelines/build/build.sh ${build_tag} learning ${env.NODE_NAME} ${hub_org}") + } + + stage('ArchiveArtifacts') { + archiveArtifacts "metadata.json" + currentBuild.description = "${build_tag}" + } + } + catch (err) { + currentBuild.result = "FAILURE" + throw err + } + finally { + slack_notify(currentBuild.result) + email_notify() + } +} \ No newline at end of file diff --git a/pipelines/build/learning/auto_build_deploy b/pipelines/build/learning/auto_build_deploy index 8e5b5db73d..05735c6086 100644 --- a/pipelines/build/learning/auto_build_deploy +++ b/pipelines/build/learning/auto_build_deploy @@ -13,11 +13,6 @@ node() { tag_name = env.JOB_NAME.split("/")[-1] pre_checks() cleanWs() - if (!tag_name.contains(env.public_repo_branch)) { - println("Error.. Tag does not contain " + env.public_repo_branch) - error("Oh ho! Tag is not a release candidate..
Skipping build") - } - cleanWs() def scmVars = checkout scm checkout scm: [$class: 'GitSCM', branches: [[name: "refs/tags/$tag_name"]], userRemoteConfigs: [[url: scmVars.GIT_URL]]] commit_hash = sh(script: 'git rev-parse --short HEAD', returnStdout: true).trim() diff --git a/pipelines/build/yarn/Jenkinsfile b/pipelines/build/yarn/Jenkinsfile deleted file mode 100644 index 6116b4e291..0000000000 --- a/pipelines/build/yarn/Jenkinsfile +++ /dev/null @@ -1,45 +0,0 @@ -@Library('deploy-conf') _ -node() { - try { - String ANSI_GREEN = "\u001B[32m" - String ANSI_NORMAL = "\u001B[0m" - String ANSI_BOLD = "\u001B[1m" - String ANSI_RED = "\u001B[31m" - String ANSI_YELLOW = "\u001B[33m" - - ansiColor('xterm') { - stage('Checkout') { - cleanWs() - checkout scm - commit_hash = sh(script: 'git rev-parse --short HEAD', returnStdout: true).trim() - artifact_version = sh(script: "echo " + params.github_release_tag.split('/')[-1] + "_" + commit_hash + "_" + env.BUILD_NUMBER, returnStdout: true).trim() - echo "artifact_version: "+ artifact_version - } - } - - stage('Build') { - sh 'mvn clean package -DskipTests -P samza-jobs' - } - - stage('Archive artifacts'){ - sh """ - mkdir lp_yarn_artifacts - cp platform-jobs/samza/distribution/target/distribution-*.tar.gz lp_yarn_artifacts - zip -j lp_yarn_artifacts.zip:${artifact_version} lp_yarn_artifacts/* - """ - archiveArtifacts artifacts: "lp_yarn_artifacts.zip:${artifact_version}", fingerprint: true, onlyIfSuccessful: true - sh """echo {\\"artifact_name\\" : \\"lp_yarn_artifacts.zip\\", \\"artifact_version\\" : \\"${artifact_version}\\", \\"node_name\\" : \\"${env.NODE_NAME}\\"} > metadata.json""" - archiveArtifacts artifacts: 'metadata.json', onlyIfSuccessful: true - currentBuild.result = "SUCCESS" - currentBuild.description = "Artifact: ${artifact_version}, Public: ${params.github_release_tag}" - } - } - catch (err) { - currentBuild.result = "FAILURE" - throw err - } - finally { - slack_notify(currentBuild.result) - email_notify() - } -} diff --git a/pipelines/build/yarn/auto_build_deploy b/pipelines/build/yarn/auto_build_deploy deleted file mode 100644 index 17b4654463..0000000000 --- a/pipelines/build/yarn/auto_build_deploy +++ /dev/null @@ -1,63 +0,0 @@ -@Library('deploy-conf') _ -node() { - try { - String ANSI_GREEN = "\u001B[32m" - String ANSI_NORMAL = "\u001B[0m" - String ANSI_BOLD = "\u001B[1m" - String ANSI_RED = "\u001B[31m" - String ANSI_YELLOW = "\u001B[33m" - - ansiColor('xterm') { - stage('Checkout') { - tag_name = env.JOB_NAME.split("/")[-1] - module = env.JOB_NAME.split("/")[-3] - envDir = env.JOB_NAME.split("/")[-4] - pre_checks() - cleanWs() - def scmVars = checkout scm - checkout scm: [$class: 'GitSCM', branches: [[name: "refs/tags/$tag_name"]], userRemoteConfigs: [[url: scmVars.GIT_URL]]] - commit_hash = sh(script: 'git rev-parse --short HEAD', returnStdout: true).trim() - artifact_version = tag_name + "_" + commit_hash - echo "artifact_version: "+ artifact_version - } - } - -// stage Build learning. -// println ANSI_BOLD + ANSI_GREEN + "Triggering KnowledgePlatform build.." 
+ ANSI_NORMAL
-// lpbuild = build job: "AutoBuild/$envDir/$module/Learning", parameters: [string(name: 'github_release_tag', value: "$tag_name")]
-
-// if (lpbuild.currentResult == "SUCCESS") {
-// stage Build
- sh 'mvn clean package -DskipTests -P samza-jobs'
-
-
-// stage Archive artifacts
- sh """
- mkdir lp_yarn_artifacts
- cp platform-jobs/samza/distribution/target/distribution-*.tar.gz lp_yarn_artifacts
- zip -j lp_yarn_artifacts.zip:${artifact_version} lp_yarn_artifacts/*
- """
- archiveArtifacts artifacts: "lp_yarn_artifacts.zip:${artifact_version}", fingerprint: true, onlyIfSuccessful: true
- sh """echo {\\"artifact_name\\" : \\"lp_yarn_artifacts.zip\\", \\"artifact_version\\" : \\"${artifact_version}\\", \\"node_name\\" : \\"${env.NODE_NAME}\\"} > metadata.json"""
- archiveArtifacts artifacts: 'metadata.json', onlyIfSuccessful: true
- currentBuild.result = "SUCCESS"
- currentBuild.description = "Artifact: ${artifact_version}, Public: ${params.github_release_tag}"
-
- currentBuild.result = "SUCCESS"
- slack_notify(currentBuild.result, tag_name)
- email_notify()
- auto_build_deploy()
-// }
-// else {
-// println (ANSI_BOLD + ANSI_RED + "knowledge platform build failed. Skipping build" + ANSI_NORMAL)
-// error "knowledge platform build failed"
-// }
-
- }
- catch (err) {
- currentBuild.result = "FAILURE"
- slack_notify(currentBuild.result, tag_name)
- email_notify()
- throw err
- }
-}
diff --git a/pipelines/deploy/yarn/Jenkinsfile b/pipelines/deploy/yarn/Jenkinsfile
deleted index cf551df104..0000000000
--- a/pipelines/deploy/yarn/Jenkinsfile
+++ /dev/null
@@ -1,60 +0,0 @@
-@Library('deploy-conf') _
-node() {
- try {
- String ANSI_GREEN = "\u001B[32m"
- String ANSI_NORMAL = "\u001B[0m"
- String ANSI_BOLD = "\u001B[1m"
- String ANSI_RED = "\u001B[31m"
- String ANSI_YELLOW = "\u001B[33m"
-
- stage('checkout public repo') {
- folder = new File("$WORKSPACE/.git")
- if (folder.exists())
- {
- println "Found .git folder. Clearing it.."
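The deleted yarn build job above and the deleted deploy job below were coupled through a one-line metadata.json handoff: the build side wrote `artifact_name`/`artifact_version`/`node_name` via `echo ... > metadata.json` and archived it next to the artifact, and downstream deploy jobs read those coordinates back (the new build.sh keeps the same pattern with `image_name`/`image_tag`). A minimal sketch of that contract, assuming Jackson, which this repo already uses elsewhere (the class itself is illustrative, not part of the diff):

```java
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.File;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: the one-line metadata.json contract shared by the
// build pipelines (writers) and deploy pipelines (readers) in this diff.
public class BuildMetadata {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    static void write(File target, String artifactName, String artifactVersion, String nodeName) throws IOException {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("artifact_name", artifactName);       // e.g. "lp_yarn_artifacts.zip"
        metadata.put("artifact_version", artifactVersion); // "<tag>_<commitHash>_<buildNumber>"
        metadata.put("node_name", nodeName);               // Jenkins node that built it
        MAPPER.writeValue(target, metadata);
    }

    @SuppressWarnings("unchecked")
    static Map<String, String> read(File source) throws IOException {
        return MAPPER.readValue(source, Map.class);
    }
}
```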
- sh'git clean -fxd' - } - checkout scm - } - - ansiColor('xterm') { - values = lp_dp_params() - stage('get artifact') { - currentWs = sh(returnStdout: true, script: 'pwd').trim() - artifact = values.artifact_name + ":" + values.artifact_version - values.put('currentWs', currentWs) - values.put('artifact', artifact) - artifact_download(values) - } - stage('deploy artifact') { - sh """ - unzip ${artifact} - mv distribution-*.tar.gz ansible - rm -rf ansible/roles/samza-jobs/files/jobs - mkdir ansible/roles/samza-jobs/files/jobs - tar -xvf ansible/distribution-*.tar.gz -C ansible/roles/samza-jobs/files/jobs/ - - """ - ansiblePlaybook = "${currentWs}/ansible/lp_samza_deploy.yml" - ansibleExtraArgs = "--vault-password-file /var/lib/jenkins/secrets/vault-pass" - values.put('ansiblePlaybook', ansiblePlaybook) - values.put('ansibleExtraArgs', ansibleExtraArgs) - println values - ansible_playbook_run(values) - currentBuild.result = "SUCCESS" - currentBuild.description = "Artifact: ${values.artifact_version}, Private: ${params.private_branch}, Public: ${params.branch_or_tag}" - archiveArtifacts artifacts: "${artifact}", fingerprint: true, onlyIfSuccessful: true - archiveArtifacts artifacts: 'metadata.json', onlyIfSuccessful: true - } - } - summary() - } - catch (err) { - currentBuild.result = "FAILURE" - throw err - } - finally { - slack_notify(currentBuild.result) - email_notify() - } -} \ No newline at end of file diff --git a/platform-core/graph-engine/module/graph-dac-api/src/main/java/org/sunbird/graph/dac/model/Filter.java b/platform-core/graph-engine/module/graph-dac-api/src/main/java/org/sunbird/graph/dac/model/Filter.java index f7e0564958..ff5cef741e 100644 --- a/platform-core/graph-engine/module/graph-dac-api/src/main/java/org/sunbird/graph/dac/model/Filter.java +++ b/platform-core/graph-engine/module/graph-dac-api/src/main/java/org/sunbird/graph/dac/model/Filter.java @@ -104,6 +104,8 @@ public String getCypher(SearchCriteria sc, String param) { sb.append(" ").append(param).append(property).append(" in {").append(pIndex).append("} "); sc.params.put("" + pIndex, value); pIndex += 1; + } else if (SearchConditions.OP_IS.equals(getOperator())) { + sb.append(" ").append(param).append(property).append(" is ").append(value).append(" "); } sc.pIndex = pIndex; return sb.toString(); diff --git a/platform-core/graph-engine/module/graph-dac-api/src/main/java/org/sunbird/graph/dac/model/SearchConditions.java b/platform-core/graph-engine/module/graph-dac-api/src/main/java/org/sunbird/graph/dac/model/SearchConditions.java index b2b182c210..07095d7c40 100644 --- a/platform-core/graph-engine/module/graph-dac-api/src/main/java/org/sunbird/graph/dac/model/SearchConditions.java +++ b/platform-core/graph-engine/module/graph-dac-api/src/main/java/org/sunbird/graph/dac/model/SearchConditions.java @@ -20,6 +20,7 @@ public class SearchConditions implements Serializable { public static final String OP_LESS_OR_EQUAL = "<="; public static final String OP_NOT_EQUAL = "!="; public static final String OP_IN = "in"; + public static final String OP_IS = "is"; static List operators = new ArrayList(); @@ -34,5 +35,6 @@ public class SearchConditions implements Serializable { operators.add(OP_LESS_OR_EQUAL); operators.add(OP_NOT_EQUAL); operators.add(OP_IN); + operators.add(OP_IS); } } diff --git a/platform-core/graph-engine/module/graph-dac/src/main/java/org/sunbird/graph/dac/mgr/impl/Neo4JBoltSearchMgrImpl.java 
b/platform-core/graph-engine/module/graph-dac/src/main/java/org/sunbird/graph/dac/mgr/impl/Neo4JBoltSearchMgrImpl.java
index b330fd4328..dfb20b65be 100644
--- a/platform-core/graph-engine/module/graph-dac/src/main/java/org/sunbird/graph/dac/mgr/impl/Neo4JBoltSearchMgrImpl.java
+++ b/platform-core/graph-engine/module/graph-dac/src/main/java/org/sunbird/graph/dac/mgr/impl/Neo4JBoltSearchMgrImpl.java
@@ -4,6 +4,7 @@
 import java.util.List;
 import java.util.Map;
+import org.sunbird.common.Platform;
 import org.sunbird.common.dto.Property;
 import org.sunbird.common.dto.Request;
 import org.sunbird.common.dto.Response;
@@ -247,6 +248,8 @@ public Response searchNodes(Request request) {
 } else {
 try {
 List nodes = Neo4JBoltSearchOperations.searchNodes(graphId, searchCriteria, getTags, request);
+boolean isRelativePathEnabled = Platform.config.hasPath("cloudstorage.metadata.replace_absolute_path")?Platform.config.getBoolean("cloudstorage.metadata.replace_absolute_path"):false;
+if(isRelativePathEnabled) updateAbsolutePath(nodes);
 return OK(GraphDACParams.node_list.name(), nodes);
 } catch (Exception e) {
 return ERROR(e);
@@ -321,4 +324,34 @@ public Response getSubGraph(Request request) {
 }
 }
+ private Node updateAbsolutePath(Node node) {
+ Map metadata = updateAbsolutePath(node.getMetadata());
+ node.setMetadata(metadata);
+ return node;
+ }
+
+ private java.util.List updateAbsolutePath(java.util.List nodes) {
+ for(Node node: nodes) {
+ updateAbsolutePath(node);
+ }
+ return nodes;
+ }
+
+ private Map updateAbsolutePath(Map data) {
+ String relativePathPrefix = Platform.config.getString("cloudstorage.relative_path_prefix");
+ List cspMeta = Platform.config.getStringList("cloudstorage.metadata.list");
+ String absolutePath = Platform.config.getString("cloudstorage.read_base_path") + java.io.File.separator + Platform.config.getString("cloud_storage_container");
+ if (data !=null && !data.isEmpty()) {
+ for (Map.Entry entry : data.entrySet()) {
+ if(cspMeta.contains(entry.getKey())) {
+ if(entry.getValue() instanceof String) {
+ data.replace(entry.getKey(), ((String) entry.getValue()).replaceAll(relativePathPrefix, absolutePath));
+ }
+ }
+ }
+ }
+ return data;
+ }
+
+ }
diff --git a/platform-core/unit-tests/src/test/resources/application.conf b/platform-core/unit-tests/src/test/resources/application.conf
index 169b02f75e..a1d51f9470 100644
--- a/platform-core/unit-tests/src/test/resources/application.conf
+++ b/platform-core/unit-tests/src/test/resources/application.conf
@@ -226,4 +226,10 @@ content.license = "CC BY 4.0"
 content.tagging.backward_enable=true
 content.tagging.property="subject,medium"
-kp.search_service.base_url="http://search-service"
\ No newline at end of file
+kp.search_service.base_url="http://search-service"
+
+cloud_storage_type="azure"
+cloud_storage_key="accesskeyyyy"
+cloud_storage_secret="secretxxx="
+cloud_storage_container="sunbird-content-dev"
+cloud_storage_endpoint=""
\ No newline at end of file
diff --git a/platform-jobs/.gitignore b/platform-jobs/.gitignore
deleted index c195647c3a..0000000000
--- a/platform-jobs/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-**/deploy/samza
diff --git a/platform-jobs/pom.xml b/platform-jobs/pom.xml
deleted index c562efae5b..0000000000
--- a/platform-jobs/pom.xml
+++ /dev/null
@@ -1,83 +0,0 @@
-
-
- 4.0.0
-
- org.sunbird
- sunbird-platform
- 1.1-SNAPSHOT
- ../pom.xml
-
- platform-jobs
- pom
- Base for all platform jobs
-
-
- UTF-8
- 4.2.4.RELEASE
- 2.3.1
- 1.8
- 1.8
-
-
-
- samza
-
-
-
-
-
-
-
-
-
org.apache.maven.plugins - maven-compiler-plugin - 2.3.2 - - 1.8 - 1.8 - - - - - - - org.apache.maven.plugins - maven-surefire-plugin - 2.20 - - - - org.jacoco - jacoco-maven-plugin - 0.7.9 - - - **/common/** - **/dto/** - **/enums/** - **/pipeline/** - **/servlet/** - **/interceptor/** - - - - - default-prepare-agent - - prepare-agent - - - - default-report - prepare-package - - report - - - - - - - \ No newline at end of file diff --git a/platform-jobs/samza/auto-creator/pom.xml b/platform-jobs/samza/auto-creator/pom.xml deleted file mode 100644 index 9ecf52adb4..0000000000 --- a/platform-jobs/samza/auto-creator/pom.xml +++ /dev/null @@ -1,87 +0,0 @@ - - - - samza - org.sunbird - 1.1-SNAPSHOT - - 4.0.0 - auto-creator - 0.0.39 - - - - com.konghq - unirest-java - 3.7.02 - - - org.sunbird - course-common - 1.1-SNAPSHOT - - - unirest-java - com.mashape.unirest - - - - - org.sunbird - unit-tests - 1.1-SNAPSHOT - test - - - org.mockito - mockito-all - 1.10.19 - test - - - org.powermock - powermock-api-mockito - 1.7.4 - test - - - org.powermock - powermock-module-junit4 - 1.7.4 - test - - - - - - - org.apache.maven.plugins - maven-compiler-plugin - - ${java.version} - ${java.version} - - - - - maven-assembly-plugin - - - src/main/assembly/src.xml - - - - - make-assembly - package - - single - - - - - - - \ No newline at end of file diff --git a/platform-jobs/samza/auto-creator/src/main/assembly/src.xml b/platform-jobs/samza/auto-creator/src/main/assembly/src.xml deleted file mode 100644 index aa00816eba..0000000000 --- a/platform-jobs/samza/auto-creator/src/main/assembly/src.xml +++ /dev/null @@ -1,69 +0,0 @@ - - - - - distribution - - tar.gz - - false - - - ${basedir} - - README* - LICENSE* - NOTICE* - - - - - - ${basedir}/src/main/resources/log4j.xml - lib - - - - ${basedir}/src/main/config/auto-creator.properties - config - true - - - - - bin - - org.apache.samza:samza-shell:tgz:dist:* - - 0744 - true - - - lib - - org.apache.samza:samza-api - org.sunbird:auto-creator - org.apache.samza:samza-core_2.11 - org.apache.samza:samza-kafka_2.11 - org.apache.samza:samza-yarn_2.11 - org.apache.samza:samza-log4j - org.apache.kafka:kafka_2.11 - org.apache.hadoop:hadoop-hdfs - - true - - - \ No newline at end of file diff --git a/platform-jobs/samza/auto-creator/src/main/config/auto-creator.properties b/platform-jobs/samza/auto-creator/src/main/config/auto-creator.properties deleted file mode 100644 index b3f049495b..0000000000 --- a/platform-jobs/samza/auto-creator/src/main/config/auto-creator.properties +++ /dev/null @@ -1,89 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=__env__.auto-creator -job.container.count=__auto_creator_container_count__ - -# YARN -yarn.package.path=http://__yarn_host__:__yarn_port__/__env__/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.__env__.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory -output.metrics.job.name=auto-creator -output.metrics.topic.name=__env__.pipeline_metrics - -# Task -task.class=org.sunbird.jobs.samza.task.AutoCreatorTask -task.inputs=kafka.__env__.auto.creation.job.request -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=__samza_checkpoint_replication_factor__ -task.commit.ms=60000 
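For context on the `task.*` block just above (from the deleted auto-creator.properties): in Samza's low-level API, which these jobs were built on, `task.class` names the implementation, `task.inputs` binds it to the Kafka stream it consumes, and `task.window.ms` sets the cadence of `window()` callbacks, used here to flush metrics snapshots. A minimal skeleton under those assumptions (not the actual AutoCreatorTask, which additionally routes through the shared BaseTask/ISamzaService pair shown further down):

```java
import org.apache.samza.config.Config;
import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.task.InitableTask;
import org.apache.samza.task.MessageCollector;
import org.apache.samza.task.StreamTask;
import org.apache.samza.task.TaskContext;
import org.apache.samza.task.TaskCoordinator;
import org.apache.samza.task.WindowableTask;

// Minimal sketch of how the task.* properties map onto Samza's legacy API.
public class SketchTask implements StreamTask, InitableTask, WindowableTask {
    private String failedTopic;

    @Override
    public void init(Config config, TaskContext context) {
        // e.g. output.failed.events.topic.name=__env__.auto.creation.job.request.failed,
        // the retry topic the service publishes failed events to
        failedTopic = config.get("output.failed.events.topic.name");
    }

    @Override
    public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {
        // called once per event from task.inputs (kafka.__env__.auto.creation.job.request)
    }

    @Override
    public void window(MessageCollector collector, TaskCoordinator coordinator) {
        // fires every task.window.ms (300000 ms above); used here to emit metrics
    }
}
```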
-task.window.ms=300000 -task.opts=-Dfile.encoding=UTF8 - -# Serializers -serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory -serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory - -# Systems -systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory -systems.kafka.samza.msg.serde=json -systems.kafka.streams.metrics.samza.msg.serde=metrics -systems.kafka.consumer.zookeeper.connect=__zookeepers__ -systems.kafka.consumer.auto.offset.reset=smallest -systems.kafka.samza.offset.default=oldest -systems.kafka.producer.bootstrap.servers=__kafka_brokers__ - -# Job Coordinator -job.coordinator.system=kafka - -# Normally, this would be 3, but we have only one broker. -job.coordinator.replication.factor=__samza_coordinator_replication_factor__ - -#Job Specific Config -graph.passport.key.base=__graph_passport_key__ -output.failed.events.topic.name=__env__.auto.creation.job.request.failed -lp.tempfile.location=__lp_tmpfile_location__ -max.iteration.count.samza.job=__max_iteration_count_for_samza_job__ - -kp.content_service.base_url=__kp_content_service_base_url__ -kp.learning_service.base_url=__kp_learning_service_base_url__ -kp.search_service_base_url=__kp_search_service_base_url__ - -auto_creator.actions=auto-create -auto_creator.allowed_object_types=Content -auto_creator.content_mandatory_fields=name,code,mimeType,primaryCategory,artifactUrl,lastPublishedBy -#TODO: Need to test, if collectionId will be overridden by publish, is there any impact -auto_creator.content_props_to_removed=identifier,downloadUrl,artifactUrl,variants,createdOn,collections,children,lastUpdatedOn,SYS_INTERNAL_LAST_UPDATED_ON,versionKey,s3Key,status,pkgVersion,toc_url,mimeTypesCount,contentTypesCount,leafNodesCount,childNodes,prevState,lastPublishedOn,flagReasons,compatibilityLevel,size,publishChecklist,publishComment,lastPublishedBy,rejectReasons,rejectComment,badgeAssertions,leafNodes,sYS_INTERNAL_LAST_UPDATED_ON,previewUrl,channel,objectType,visibility,version,pragma,prevStatus,streamingUrl,idealScreenSize,contentDisposition,lastStatusChangedOn,idealScreenDensity,lastSubmittedOn,publishError,flaggedBy,flags,lastFlaggedOn,publisher,lastUpdatedBy,lastSubmittedBy,uploadError,lockKey,publish_type,reviewError,totalCompressedSize,origin,originData,importError,questions -auto_creator.bulk_upload.mime_types=video/mp4 -auto_creator.artifact_upload.max_size=157286400 -auto_creator.content_create_props=name,code,mimeType,contentType,framework,processId,primaryCategory -auto_creator.artifact_upload.allowed_source=__auto_creator_artifact_allowed_sources__ -# Delay between each api call in seconds -auto_creator.api_call_delay=1 - -auto_creator_g_service_acct_cred=__auto_creator_g_service_acct_cred__ -auto_creator.gdrive.application_name=drive-download -auto_creator.initial_backoff_delay=120000 -auto_creator.maximum_backoff_delay=1200000 -auto_creator.increment_backoff_delay=2 - - -# Folder Config -cloud_storage.content.folder=content -cloud_storage.artefact.folder=artifact - -# Cloud store details -cloud_storage_type=__cloud_storage_type__ -azure_storage_key=__azure_storage_key__ -azure_storage_secret=__azure_storage_secret__ -azure_storage_container=__azure_storage_container__ -aws_storage_key=__aws_access_key_id__ -aws_storage_secret=__aws_secret_access_key__ -aws_storage_container=__aws_storage_container__ - - - diff --git a/platform-jobs/samza/auto-creator/src/main/config/local.auto-creator.properties.properties 
b/platform-jobs/samza/auto-creator/src/main/config/local.auto-creator.properties.properties deleted file mode 100644 index 45783a38b0..0000000000 --- a/platform-jobs/samza/auto-creator/src/main/config/local.auto-creator.properties.properties +++ /dev/null @@ -1,74 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=local.auto-creator -job.container.count=1 - -# YARN -yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.local.lp.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory -output.metrics.job.name=course-batch-updater -output.metrics.topic.name=local.lp.metrics - -# Task -task.class=org.sunbird.job.samza.task.AutoCreatorTask -task.inputs=kafka.local.auto.creation.job.request -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=1 -task.commit.ms=60000 -task.window.ms=300000 - -# Serializers -serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory -serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory - -# Systems -systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory -systems.kafka.samza.msg.serde=json -systems.kafka.streams.metrics.samza.msg.serde=metrics -systems.kafka.consumer.zookeeper.connect=localhost:2181 -systems.kafka.consumer.auto.offset.reset=smallest -systems.kafka.producer.bootstrap.servers=localhost:9092 - -# Job Coordinator -job.coordinator.system=kafka -# Normally, this would be 3, but we have only one broker. 
-job.coordinator.replication.factor=1 - -#Remote Debug Configuration -# task.opts=-agentlib:jdwp=transport=dt_socket,address=localhost:9009,server=y,suspend=y - -#Job Specific Config -output.failed.events.topic.name=__env__.learning.events.failed -lp.tempfile.location=__lp_tmpfile_location__ -max.iteration.count.samza.job=__max_iteration_count_for_samza_job__ - -kp.content_service.base_url=__kp_content_service_base_url__ -kp.learning_service.base_url=__kp_learning_service_base_url__ -kp.search_service_base_url=__kp_search_service_base_url__ - -auto_creator.actions=auto-create -auto_creator.allowed_object_types=Content -auto_creator.content_mandatory_fields=identifier,name,description,code,mimeType,contentType,artifactUrl,lastPublishedBy -auto_creator.content_props_to_removed=identifier,downloadUrl,artifactUrl,variants,createdOn,collections,children,lastUpdatedOn,SYS_INTERNAL_LAST_UPDATED_ON,versionKey,s3Key,status,pkgVersion,toc_url,mimeTypesCount,contentTypesCount,leafNodesCount,childNodes,prevState,lastPublishedOn,flagReasons,compatibilityLevel,size,publishChecklist,publishComment,lastPublishedBy,rejectReasons,rejectComment,badgeAssertions,leafNodes -auto_creator.bulk_upload.mime_types=video/mp4 -auto_creator.artifact_upload.max_size=62914560 - -# Folder Config -cloud_storage.content.folder=content -cloud_storage.artefact.folder=artifact - -# Cloud store details -cloud_storage_type=__cloud_storage_type__ -azure_storage_key=__azure_storage_key__ -azure_storage_secret=__azure_storage_secret__ -azure_storage_container=__azure_storage_container__ -aws_storage_key=__aws_access_key_id__ -aws_storage_secret=__aws_secret_access_key__ -aws_storage_container=__aws_storage_container__ - diff --git a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/service/AutoCreatorService.java b/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/service/AutoCreatorService.java deleted file mode 100644 index ca60887eed..0000000000 --- a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/service/AutoCreatorService.java +++ /dev/null @@ -1,111 +0,0 @@ -package org.sunbird.jobs.samza.service; - -import org.apache.commons.collections.MapUtils; -import org.apache.commons.lang3.StringUtils; -import org.apache.samza.config.Config; -import org.apache.samza.system.SystemStream; -import org.apache.samza.task.MessageCollector; -import org.sunbird.common.Platform; -import org.sunbird.common.exception.ServerException; -import org.sunbird.jobs.samza.exception.PlatformErrorCodes; -import org.sunbird.jobs.samza.service.task.JobMetrics; -import org.sunbird.jobs.samza.util.AutoCreatorParams; -import org.sunbird.jobs.samza.util.ContentUtil; -import org.sunbird.jobs.samza.util.FailedEventsUtil; -import org.sunbird.jobs.samza.util.JSONUtils; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.jobs.samza.util.SamzaCommonParams; - -import java.util.*; - -public class AutoCreatorService implements ISamzaService { - private static JobLogger LOGGER = new JobLogger(AutoCreatorService.class); - private Config config = null; - private SystemStream failedEventStream; - private static Integer MAX_ITERATION_COUNT = null; - private List ALLOWED_OBJECT_TYPES = null; - private ContentUtil contentUtil = null; - - @Override - public void initialize(Config config) throws Exception { - this.config = config; - JSONUtils.loadProperties(config); - LOGGER.info("Service config initialized"); - failedEventStream = new SystemStream("kafka", 
config.get("output.failed.events.topic.name")); - LOGGER.info("Stream initialized for Failed Events"); - MAX_ITERATION_COUNT = (Platform.config.hasPath("max.iteration.count.samza.job")) ? - Platform.config.getInt("max.iteration.count.samza.job") : 2; - ALLOWED_OBJECT_TYPES = Arrays.asList(Platform.config.getString("auto_creator.allowed_object_types").split(",")); - contentUtil = new ContentUtil(); - LOGGER.info("ContentUtil initialized."); - } - - @Override - public void processMessage(Map message, JobMetrics metrics, MessageCollector collector) throws Exception { - if (null == message) { - LOGGER.info("Null Event Received. So Skipped Processing."); - return; - } - Map edata = (Map) message.get(SamzaCommonParams.edata.name()); - Map object = (Map) message.get(SamzaCommonParams.object.name()); - Map context = (Map) message.get(AutoCreatorParams.context.name()); - try { - Integer currentIteration = (Integer) edata.getOrDefault(SamzaCommonParams.iteration.name(),1); - String channel = (String) context.getOrDefault(AutoCreatorParams.channel.name(), ""); - String identifier = (String) object.getOrDefault(AutoCreatorParams.id.name(), ""); - String objectType = (String) edata.getOrDefault(AutoCreatorParams.objectType.name(), ""); - String repository = (String) edata.getOrDefault(AutoCreatorParams.repository.name(), ""); - Map metadata = (Map) edata.getOrDefault(AutoCreatorParams.metadata.name(), new HashMap()); - List> collection = (List>) edata.getOrDefault(AutoCreatorParams.collection.name(), new ArrayList>()); - String stage = (String) edata.getOrDefault(AutoCreatorParams.stage.name(), ""); - - if (!validateEvent(currentIteration, channel, identifier, objectType, metadata)) { - LOGGER.info("Event Ignored. Event Validation Failed for auto-creator operation : " + edata.get("action") + " | Event : " + message); - return; - } - - switch (objectType.toLowerCase()) { - case "content": { - if(!contentUtil.validateStage(stage)) { - LOGGER.info("Event Ignored. Content Stage Validation Failed for :" + identifier + " | Stage : " + stage + " Allowed Stages are : " + contentUtil.ALLOWED_CONTENT_STAGE); - return; - } - if (!(contentUtil.validateMetadata(metadata))) { - LOGGER.info("Event Ignored. Event Metadata Validation Failed for :" + identifier + " | Metadata : " + metadata + " Required fields are : " + contentUtil.REQUIRED_METADATA_FIELDS); - return; - } - contentUtil.process(channel, identifier, edata); - break; - } - default: { - LOGGER.info("Event Ignored. Event objectType doesn't match with allowed objectType."); - } - } - } catch (Exception e) { - LOGGER.error("AutoCreatorService :: Message processing failed for mid : " + message.get("mid"), message, e); - metrics.incErrorCounter(); - Integer currentIteration = (Integer) edata.getOrDefault(SamzaCommonParams.iteration.name(), 1); - if (currentIteration < MAX_ITERATION_COUNT) { - ((Map) message.get(SamzaCommonParams.edata.name())).put(SamzaCommonParams.iteration.name(), currentIteration + 1); - FailedEventsUtil.pushEventForRetry(failedEventStream, message, metrics, collector, - PlatformErrorCodes.PROCESSING_ERROR.name(), e); - LOGGER.info("Failed Event Sent To Kafka Topic : " + config.get("output.failed.events.topic.name") + " | for mid : " + message.get("mid"), message); - }else{ - LOGGER.info("Event Reached Maximum Retry Limit having mid : " + message.get("mid"), message); - } - if ((e instanceof ServerException) && StringUtils.equalsIgnoreCase(((ServerException) e).getErrCode(), "ERR_API_CALL")) { - LOGGER.error("Error While making api calls. 
", e); - throw e; - } - } - } - - private Boolean validateEvent(Integer currentIteration, String channel, String identifier, String objectType, Map metadata) { - if ((currentIteration <= MAX_ITERATION_COUNT) && (StringUtils.isNotBlank(channel) && StringUtils.isNotBlank(identifier) && MapUtils.isNotEmpty(metadata)) && - (StringUtils.isNotBlank(objectType) && ALLOWED_OBJECT_TYPES.contains(objectType))) { - return true; - } - return false; - } - -} diff --git a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/task/AutoCreatorTask.java b/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/task/AutoCreatorTask.java deleted file mode 100644 index 4c78084dc6..0000000000 --- a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/task/AutoCreatorTask.java +++ /dev/null @@ -1,59 +0,0 @@ -package org.sunbird.jobs.samza.task; - -import org.apache.commons.lang3.StringUtils; -import org.apache.samza.system.OutgoingMessageEnvelope; -import org.apache.samza.system.SystemStream; -import org.apache.samza.task.MessageCollector; -import org.apache.samza.task.TaskCoordinator; -import org.sunbird.common.Platform; -import org.sunbird.common.exception.ServerException; -import org.sunbird.jobs.samza.service.AutoCreatorService; -import org.sunbird.jobs.samza.service.ISamzaService; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.jobs.samza.task.BaseTask; - - -import java.util.Arrays; -import java.util.Map; - -public class AutoCreatorTask extends BaseTask { - - private ISamzaService service = new AutoCreatorService(); - - private static JobLogger LOGGER = new JobLogger(AutoCreatorTask.class); - - public ISamzaService initialize() throws Exception { - LOGGER.info("Task initialized"); - this.action = Platform.config.hasPath("auto_creator.actions") ? - Arrays.asList(Platform.config.getString("auto_creator.actions").split(",")) : Arrays.asList("auto-create"); - LOGGER.info("Available Actions : " + this.action); - this.jobStartMessage = "Started processing of auto-creator samza job"; - this.jobEndMessage = "Completed processing of auto-creator job"; - this.jobClass = "org.sunbird.jobs.samza.task.AutoCreatorTask"; - return service; - } - - @Override - public void process(Map message, MessageCollector collector, TaskCoordinator coordinator) throws Exception { - try { - LOGGER.info("Starting Task Process for auto-creator operation for mid : " + message.get("mid") + " at :: " + System.currentTimeMillis()); - long startTime = System.currentTimeMillis(); - service.processMessage(message, metrics, collector); - long endTime = System.currentTimeMillis(); - LOGGER.info("Successfully completed processing for auto-creator operation for mid : " + message.get("mid") + " at :: " + System.currentTimeMillis()); - } catch (Exception e) { - LOGGER.error("AutoCreatorTask ::: process ::: Message processing failed.", message, e); - if ((e instanceof ServerException) && StringUtils.equalsIgnoreCase(((ServerException) e).getErrCode(), "ERR_API_CALL")) { - LOGGER.error("Error While making api calls. 
", e); - throw e; - } - } - } - - @Override - public void window(MessageCollector collector, TaskCoordinator coordinator) throws Exception { - Map event = metrics.collect(); - collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", metrics.getTopic()), event)); - metrics.clear(); - } -} \ No newline at end of file diff --git a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/AutoCreatorParams.java b/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/AutoCreatorParams.java deleted file mode 100644 index 45f94007d2..0000000000 --- a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/AutoCreatorParams.java +++ /dev/null @@ -1,8 +0,0 @@ -package org.sunbird.jobs.samza.util; - -public enum AutoCreatorParams { - channel, id, objectType, metadata, artifactUrl, status, request, filters, origin, originData, count, content, identifier, - repository, pkgVersion, lastPublishedBy, children, childNodes, rootId,unitId, context, collection, - processId, versionKey, importError, textbookInfo, unitIdentifiers, mimeType, stage - -} diff --git a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/ContentUtil.java b/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/ContentUtil.java deleted file mode 100644 index 0dc3555d39..0000000000 --- a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/ContentUtil.java +++ /dev/null @@ -1,717 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import com.fasterxml.jackson.databind.ObjectMapper; -import kong.unirest.HttpResponse; -import kong.unirest.Unirest; -import org.apache.commons.collections4.CollectionUtils; -import org.apache.commons.collections4.MapUtils; -import org.apache.commons.io.FileUtils; -import org.apache.commons.io.FilenameUtils; -import org.apache.commons.lang.StringUtils; -import org.apache.tika.Tika; -import org.sunbird.common.Platform; -import org.sunbird.common.Slug; -import org.sunbird.common.dto.Response; -import org.sunbird.common.enums.TaxonomyErrorCodes; -import org.sunbird.common.exception.ResponseCode; -import org.sunbird.common.exception.ServerException; -import org.sunbird.common.util.HttpDownloadUtility; -import org.sunbird.common.util.S3PropertyReader; -import org.sunbird.learning.common.enums.ContentErrorCodes; -import org.sunbird.learning.util.CloudStore; - -import java.io.File; -import java.io.IOException; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.HashMap; -import java.util.List; -import java.util.Map; -import java.util.stream.Collectors; - -public class ContentUtil { - - private static final String KP_CS_BASE_URL = Platform.config.getString("kp.content_service.base_url"); - private static final String KP_LEARNING_BASE_URL = Platform.config.getString("kp.learning_service.base_url"); - private static final String KP_SEARCH_URL = Platform.config.getString("kp.search_service_base_url") + "/v3/search"; - private static final String PASSPORT_KEY = Platform.config.getString("graph.passport.key.base"); - private static final String TEMP_FILE_LOCATION = Platform.config.hasPath("lp.tempfile.location") ? 
- Platform.config.getString("lp.tempfile.location") : "/tmp/content"; - public static final List REQUIRED_METADATA_FIELDS = Arrays.asList(Platform.config.getString("auto_creator.content_mandatory_fields").split(",")); - public static final List METADATA_FIELDS_TO_BE_REMOVED = Arrays.asList(Platform.config.getString("auto_creator.content_props_to_removed").split(",")); - private static final List SEARCH_FIELDS = Arrays.asList("identifier", "mimeType", "pkgVersion", "channel", "status", "origin", "originData","artifactUrl"); - private static final List SEARCH_EXISTS_FIELDS = Arrays.asList("originData"); - private static final List FINAL_STATUS = Arrays.asList("Live", "Unlisted", "Processing"); - private static final String DEFAULT_CONTENT_TYPE = "application/json"; - private static final int IDX_CLOUD_KEY = 0; - private static final int IDX_CLOUD_URL = 1; - private static final String CONTENT_FOLDER = "cloud_storage.content.folder"; - private static final String ARTEFACT_FOLDER = "cloud_storage.artefact.folder"; - private static final Long CONTENT_UPLOAD_ARTIFACT_MAX_SIZE = Platform.config.hasPath("auto_creator.artifact_upload.max_size") ? Platform.config.getLong("auto_creator.artifact_upload.max_size") : 62914560; - private static final List BULK_UPLOAD_MIMETYPES = Platform.config.hasPath("auto_creator.bulk_upload.mime_types") ? Arrays.asList(Platform.config.getString("auto_creator.bulk_upload.mime_types").split(",")) : new ArrayList(); - private static final List CONTENT_CREATE_PROPS = Platform.config.hasPath("auto_creator.content_create_props") ? Arrays.asList(Platform.config.getString("auto_creator.content_create_props").split(",")) : new ArrayList(); - private static final List ALLOWED_ARTIFACT_SOURCE = Platform.config.hasPath("auto_creator.artifact_upload.allowed_source") ? Arrays.asList(Platform.config.getString("auto_creator.artifact_upload.allowed_source").split(",")) : new ArrayList(); - private static final Integer API_CALL_DELAY = Platform.config.hasPath("auto_creator.api_call_delay") ? Platform.config.getInt("auto_creator.api_call_delay") : 2; - public static final List ALLOWED_CONTENT_STAGE = Platform.config.hasPath("auto_creator.allowed_content_stages") ? Arrays.asList(Platform.config.getString("auto_creator.allowed_content_stages").split(",")) : Arrays.asList("create", "upload", "review", "publish"); - private static ObjectMapper mapper = new ObjectMapper(); - private static Tika tika = new Tika(); - private static JobLogger LOGGER = new JobLogger(ContentUtil.class); - - - public Boolean validateMetadata(Map metadata) { - List reqFields = REQUIRED_METADATA_FIELDS.stream().filter(x -> null == metadata.get(x)).collect(Collectors.toList()); - return CollectionUtils.isEmpty(reqFields) ? true : false; - } - - public Boolean validateStage(String stage) { - return StringUtils.isNotBlank(stage) ? 
ALLOWED_CONTENT_STAGE.contains(stage) : true; - } - - public void process(String channelId, String identifier, Map edata) throws Exception { - String stage = (String) edata.getOrDefault(AutoCreatorParams.stage.name(), ""); - String repository = (String) edata.getOrDefault(AutoCreatorParams.repository.name(), ""); - Map metadata = (Map) edata.getOrDefault(AutoCreatorParams.metadata.name(), new HashMap()); - Map filteredMetadata = metadata.entrySet().stream().filter(x -> !METADATA_FIELDS_TO_BE_REMOVED.contains(x.getKey())).collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)); - String mimeType = (String) metadata.getOrDefault(AutoCreatorParams.mimeType.name(), ""); - Integer delayUpload = StringUtils.equalsIgnoreCase(mimeType, "application/vnd.ekstep.h5p-archive") ? 6 * API_CALL_DELAY : API_CALL_DELAY; - List> collection = (List>) edata.getOrDefault(AutoCreatorParams.collection.name(), new ArrayList>()); - Map textbookInfo = (Map) edata.getOrDefault(AutoCreatorParams.textbookInfo.name(), new HashMap()); - String newIdentifier = (String) edata.get(AutoCreatorParams.identifier.name()); - LOGGER.info("ContentUtil :: process :: started processing for: " + identifier + " | Channel : " + channelId + " | Metadata : " + metadata+ " | collection :"+collection +" | textbookInfo : "+textbookInfo); - String contentStage = ""; - String internalId = ""; - Boolean isCreated = false; - Boolean isUploaded = false; - Boolean isReviewed = false; - Boolean isPublished = false; - Double pkgVersion = Double.parseDouble(String.valueOf(metadata.getOrDefault(AutoCreatorParams.pkgVersion.name(), "0.0"))); - Map createMetadata = filteredMetadata.entrySet().stream().filter(x -> CONTENT_CREATE_PROPS.contains(x.getKey())).collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)); - Map updateMetadata = filteredMetadata.entrySet().stream().filter(x->!CONTENT_CREATE_PROPS.contains(x.getKey())).collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)); - Map reqOriginData = (Map) edata.getOrDefault(AutoCreatorParams.originData.name(), new HashMap()); - String originId = (String) reqOriginData.getOrDefault(AutoCreatorParams.identifier.name(), ""); - if (MapUtils.isNotEmpty(reqOriginData) && StringUtils.isNotBlank(originId)) { - Map contentMetadata = getOriginContent(channelId, identifier); - if (MapUtils.isNotEmpty(contentMetadata)) { - internalId = originId; - contentStage = "na"; - } - } - - if (StringUtils.isBlank(contentStage)) { - Map contentMetadata = searchContent(identifier); - if (MapUtils.isEmpty(contentMetadata)) { - contentStage = "create"; - } else { - contentStage = getContentStage(identifier, pkgVersion, contentMetadata); - internalId = (String) contentMetadata.get("contentId"); - } - } - - try { - switch (contentStage) { - case "create": { - Map result = create(channelId, identifier, newIdentifier, repository, createMetadata); - internalId = (String) result.get(AutoCreatorParams.identifier.name()); - if (StringUtils.isNotBlank(internalId)) { - isCreated = true; - updateMetadata.put(AutoCreatorParams.versionKey.name(), (String) result.get(AutoCreatorParams.versionKey.name())); - } - } - case "update": { - if (!isCreated) { - Map readMetadata = read(channelId, internalId); - updateMetadata.put(AutoCreatorParams.versionKey.name(), (String) readMetadata.get(AutoCreatorParams.versionKey.name())); - } - update(channelId, internalId, updateMetadata); - if (StringUtils.equalsIgnoreCase("create", stage)) - break; - } - case "upload": { - isUploaded = upload(channelId, internalId, 
metadata); - if(StringUtils.equalsIgnoreCase("upload", stage)) - break; - delay(delayUpload); - } - case "review": { - isReviewed = review(channelId, internalId); - if(StringUtils.equalsIgnoreCase("review", stage)) - break; - delay(API_CALL_DELAY); - } - case "publish": { - isPublished = publish(channelId, internalId, (String) metadata.get(AutoCreatorParams.lastPublishedBy.name())); - break; - } - default: { - LOGGER.info("ContentUtil :: process :: Event Skipped for operations (create, upload, publish) for: " + identifier + " | Content Stage : " + contentStage); - } - } - }catch (Exception e) { - if(StringUtils.isNotBlank(internalId)) - updateStatus(channelId, internalId, e.getMessage()); - throw e; - } - - if(CollectionUtils.isNotEmpty(collection) && (isUploaded || isReviewed || isPublished || StringUtils.equalsIgnoreCase("na", contentStage))) { - linkCollection(channelId, identifier, collection, internalId); - } else if(MapUtils.isNotEmpty(textbookInfo) && (isUploaded || isReviewed || isPublished || StringUtils.equalsIgnoreCase("na", contentStage))) { - linkTextbook(channelId, identifier, textbookInfo, internalId); - }else { - LOGGER.info("ContentUtil :: process :: Textbook Linking Skipped because received empty collection/textbookInfo for : " + identifier); - } - LOGGER.info("ContentUtil :: process :: finished processing for: " + identifier); - } - - private void updateStatus(String channelId, String identifier, String message) throws Exception { - String errorMsg = StringUtils.isNotBlank(message) ? message : "Processing Error"; - String url = KP_LEARNING_BASE_URL + "/system/v3/content/update/" + identifier; - Map request = new HashMap() {{ - put("request", new HashMap() {{ - put("content", new HashMap() {{ - put(AutoCreatorParams.importError.name(), errorMsg); - put(AutoCreatorParams.status.name(), "Failed"); - }}); - }}); - }}; - Map header = new HashMap() {{ - put("X-Channel-Id", channelId); - put("Content-Type", DEFAULT_CONTENT_TYPE); - }}; - Response resp = UnirestUtil.patch(url, request, header); - if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) { - String node_id = (String) resp.getResult().get("node_id"); - if (StringUtils.isNotBlank(node_id)) { - LOGGER.info("ContentUtil :: updateStatus :: Content failed status successfully updated for : " + identifier); - } - else - throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Content update status Call Failed For : " + identifier); - } else { - LOGGER.info("ContentUtil :: updateStatus :: Invalid Response received while updating failed status for : " + identifier + getErrorDetails(resp)); - throw new ServerException("ERR_API_CALL", "Invalid Response received while updating content status for : " + identifier + getErrorDetails(resp)); - } - } - - private Map searchContent(String identifier) throws Exception { - Map result = new HashMap(); - Map header = new HashMap() {{ - put("Content-Type", DEFAULT_CONTENT_TYPE); - }}; - - Map request = new HashMap() {{ - put(AutoCreatorParams.request.name(), new HashMap() {{ - put(AutoCreatorParams.filters.name(), new HashMap() {{ - put(AutoCreatorParams.objectType.name(), "Content"); - put(AutoCreatorParams.status.name(), Arrays.asList()); - put(AutoCreatorParams.origin.name(), identifier); - }}); - put("exists", SEARCH_EXISTS_FIELDS); - put("fields", SEARCH_FIELDS); - }}); - }}; - Response resp = UnirestUtil.post(KP_SEARCH_URL, request, header); - if ((null != resp && resp.getResponseCode() == ResponseCode.OK)) { - if 
(MapUtils.isNotEmpty(resp.getResult()) && (Integer) resp.getResult().get(AutoCreatorParams.count.name()) > 0) { - List contents = (List) resp.getResult().get(AutoCreatorParams.content.name()); - contents.stream().map(obj -> (Map) obj).forEach(map -> { - String contentId = (String) map.get(AutoCreatorParams.identifier.name()); - Map originData = null; - try { - originData = mapper.readValue((String) map.get(AutoCreatorParams.originData.name()), Map.class); - } catch (IOException e) { - e.printStackTrace(); - } - if (MapUtils.isNotEmpty(originData)) { - String originId = (String) originData.get(AutoCreatorParams.identifier.name()); - String repository = (String) originData.get(AutoCreatorParams.repository.name()); - if (StringUtils.equalsIgnoreCase(identifier, originId) && StringUtils.isNotBlank(repository)) { - result.put("contentId", contentId); - result.put(AutoCreatorParams.status.name(), map.get(AutoCreatorParams.status.name())); - result.put(AutoCreatorParams.artifactUrl.name(), map.get(AutoCreatorParams.artifactUrl.name())); - result.put(AutoCreatorParams.pkgVersion.name(), map.get(AutoCreatorParams.pkgVersion.name())); - LOGGER.info("ContentUtil :: searchContent :: Internal Content Found with Identifier : " + contentId + " for :" + identifier + " | Result : " + result); - } - } else - LOGGER.info("Received empty originData for " + identifier); - }); - } else - LOGGER.info("ContentUtil :: searchContent :: Received 0 count while searching content for : " + identifier); - - } else { - LOGGER.info("ContentUtil :: searchContent :: Invalid Response received while searching content for : " + identifier + getErrorDetails(resp)); - throw new ServerException("ERR_API_CALL", "Invalid Response received while searching content for : " + identifier + getErrorDetails(resp)); - } - return result; - } - - private String getContentStage(String identifier, Double pkgVersion, Map metadata) { - String result = "na"; - String status = (String) metadata.get(AutoCreatorParams.status.name()); - String artifactUrl = (String) metadata.get(AutoCreatorParams.artifactUrl.name()); - Double pkgVer = 0.0; - try { - pkgVer = (Double) metadata.getOrDefault(AutoCreatorParams.pkgVersion.name(), 0.0); - } catch (ClassCastException ccex) { - pkgVer = Double.valueOf((Integer) metadata.getOrDefault(AutoCreatorParams.pkgVersion.name(), 0)); - } - if (!FINAL_STATUS.contains(status)) - result = StringUtils.isNotBlank(artifactUrl) ? 
"review" : "update"; - else if (pkgVersion > pkgVer) - result = "update"; - else - LOGGER.info("ContentUtil :: getContentStage :: Skipped Processing for : " + identifier + " | Internal Identifier : " + metadata.get("contentId") + " ,Status : " + status + " , artifactUrl : " + artifactUrl); - return result; - } - - - private Map create(String channelId, String identifier, String newIdentifier, String repository, Map metadata) throws Exception { - String contentId = ""; - String url = KP_CS_BASE_URL + "/content/v3/create"; - Map metaFields = new HashMap(); - metaFields.putAll(metadata); - if(StringUtils.isNotBlank(newIdentifier)) - metaFields.put(AutoCreatorParams.identifier.name(), newIdentifier); - else { - metaFields.put(AutoCreatorParams.identifier.name(), identifier); - metaFields.put(AutoCreatorParams.origin.name(), identifier); - metaFields.put(AutoCreatorParams.originData.name(), new HashMap(){{ - put(AutoCreatorParams.identifier.name(), identifier); - put(AutoCreatorParams.repository.name(), repository); - }}); - } - Map request = new HashMap() {{ - put("request", new HashMap() {{ - put("content", metaFields); - }}); - }}; - LOGGER.info("ContentUtil :: create :: create request : "+request); - Map header = new HashMap() {{ - put("X-Channel-Id", channelId); - put("Content-Type", DEFAULT_CONTENT_TYPE); - }}; - Response resp = UnirestUtil.post(url, request, header); - if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) { - contentId = (String) resp.getResult().get("identifier"); - LOGGER.info("ContentUtil :: create :: Content Created Successfully with identifier : " + contentId); - } else { - LOGGER.info("ContentUtil :: create :: Invalid Response received while creating content for : " + identifier + getErrorDetails(resp)); - throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while creating content for : " + identifier+ getErrorDetails(resp)); - } - return resp.getResult(); - } - - private Map read(String channelId, String identifier) throws Exception { - String contentId = ""; - String url = KP_CS_BASE_URL + "/content/v3/read/" + identifier; - LOGGER.info("ContentUtil :: read :: Reading content having identifier : "+identifier); - Map header = new HashMap() {{ - put("X-Channel-Id", channelId); - put("Content-Type", DEFAULT_CONTENT_TYPE); - }}; - Response resp = UnirestUtil.get(url, "mode=edit", header); - if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) { - contentId = ((String) ((Map)resp.getResult().getOrDefault("content", new HashMap())).getOrDefault("identifier", "")).replace(".img", ""); - if(StringUtils.equalsIgnoreCase(identifier, contentId)) - LOGGER.info("ContentUtil :: read :: Content Fetched Successfully with identifier : " + contentId); - else throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while reading content for : " + identifier); - } else { - LOGGER.info("ContentUtil :: read :: Invalid Response received while reading content for : " + identifier + getErrorDetails(resp)); - throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while reading content for : " + identifier + getErrorDetails(resp)); - } - return ((Map) resp.getResult().getOrDefault("content", new HashMap())); - } - - private void update(String channelId, String internalId, Map updateMetadata) throws Exception { - String url = KP_CS_BASE_URL + "/content/v3/update/" + 
internalId;
-        String appIconUrl = (String) updateMetadata.getOrDefault("appIcon", "");
-        if(appIconUrl != null && !appIconUrl.trim().isEmpty()) {
-            LOGGER.info("ContentUtil :: update :: Initiating Icon download for : " + internalId + " | appIconUrl : " + appIconUrl);
-            File file = getFile(internalId, appIconUrl, "image");
-            LOGGER.info("ContentUtil :: update :: Icon downloaded for : " + internalId + " | appIconUrl : " + appIconUrl);
-            if (null == file || !file.exists()) {
-                throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Error Occurred while downloading appIcon file for " + internalId + " | File Url : " + appIconUrl);
-            }
-            String[] urls = uploadArtifact(file, internalId);
-            if (null != urls && StringUtils.isNotBlank(urls[1])) {
-                String appIconBlobUrl = urls[IDX_CLOUD_URL];
-                LOGGER.info("ContentUtil :: update :: Icon Uploaded Successfully to cloud for : " + internalId + " | appIconUrl : " + appIconUrl + " | appIconBlobUrl : " + appIconBlobUrl);
-                updateMetadata.put("appIcon", appIconBlobUrl);
-            }
-        }
-        Map request = new HashMap() {{
-            put("request", new HashMap() {{
-                put("content", updateMetadata);
-            }});
-        }};
-        LOGGER.info("ContentUtil :: update :: update request : "+request);
-        Map header = new HashMap() {{
-            put("X-Channel-Id", channelId);
-            put("Content-Type", DEFAULT_CONTENT_TYPE);
-        }};
-        Response resp = UnirestUtil.patch(url, request, header);
-        if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-            String contentId = (String) resp.getResult().get("identifier");
-            LOGGER.info("ContentUtil :: update :: Content Update Successfully having identifier : " + contentId);
-        } else {
-            LOGGER.info("ContentUtil :: update :: Invalid Response received while updating content for : " + internalId + getErrorDetails(resp));
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while updating content for : " + internalId + getErrorDetails(resp));
-        }
-    }
-
-    /*private void upload(String channelId, String identifier, File file) throws Exception {
-        if (null != file && !file.exists())
-            LOGGER.info("ContentUtil :: upload :: File Path for " + identifier + "is : " + file.getAbsolutePath() + " | File Size : " + file.length());
-        String preSignedUrl = getPreSignedUrl(identifier, file.getName());
-        String fileUrl = preSignedUrl.split("\\?")[0];
-        Boolean isUploaded = uploadBlob(identifier, preSignedUrl, file);
-        if (isUploaded) {
-            String url = KP_CS_BASE_URL + "/content/v3/upload/" + identifier;
-            Map header = new HashMap() {{
-                put("X-Channel-Id", channelId);
-            }};
-            Response resp = UnirestUtil.post(url, "fileUrl", fileUrl, header);
-            if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-                String artifactUrl = (String) resp.getResult().get(AutoCreatorParams.artifactUrl.name());
-                if (StringUtils.isNotBlank(artifactUrl) && StringUtils.equalsIgnoreCase(fileUrl, artifactUrl))
-                    LOGGER.info("ContentUtil :: upload :: Content Uploaded Successfully for : " + identifier + " | artifactUrl : " + artifactUrl);
-            } else {
-                LOGGER.info("ContentUtil :: upload :: Invalid Response received while uploading for: " + identifier + getErrorDetails(resp));
-                throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while uploading : " + identifier);
-            }
-        } else {
-            LOGGER.info("ContentUtil :: upload :: Blob upload failed for: " + identifier);
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Upload failed for: " + identifier);
-        }
-    }*/
-
-    private Boolean upload(String channelId, String identifier, Map metadata) throws Exception {
-        Response resp = null;
-        Long downloadStartTime = System.currentTimeMillis();
-        String sourceUrl = (String) metadata.get(AutoCreatorParams.artifactUrl.name());
-        String mimeType = (String) metadata.getOrDefault("mimeType", "");
-        if (CollectionUtils.isNotEmpty(ALLOWED_ARTIFACT_SOURCE) && CollectionUtils.isEmpty(ALLOWED_ARTIFACT_SOURCE.stream().filter(x -> sourceUrl.contains(x)).collect(Collectors.toList()))) {
-            LOGGER.info("Artifact Source is not from allowed one for : " + identifier + " | artifactUrl: " + sourceUrl + " | Allowed Sources : " + ALLOWED_ARTIFACT_SOURCE);
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Artifact Source is not from allowed one for : " + identifier + " | artifactUrl: " + sourceUrl + " | Allowed Sources : " + ALLOWED_ARTIFACT_SOURCE);
-        }
-        File file = getFile(identifier, sourceUrl, mimeType);
-        Long downloadEndTime = System.currentTimeMillis();
-        LOGGER.info("ContentUtil :: upload :: Total time taken for download: " + (downloadEndTime - downloadStartTime));
-        if (null == file || !file.exists()) {
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Error Occurred while downloading file for " + identifier + " | File Url : "+sourceUrl);
-        }
-        LOGGER.info("ContentUtil :: upload :: File Path for " + identifier + "is : " + file.getAbsolutePath() + " | File Size : " + file.length());
-        Long size = FileUtils.sizeOf(file);
-        LOGGER.info("ContentUtil :: upload :: file size (MB): " + (size / 1048576));
-        String url = KP_CS_BASE_URL + "/content/v3/upload/" + identifier + "?validation=false";
-        if (StringUtils.isNotBlank(mimeType) && (StringUtils.equalsIgnoreCase("application/vnd.ekstep.h5p-archive", mimeType) && !StringUtils.equalsIgnoreCase("h5p", FilenameUtils.getExtension(file.getAbsolutePath()))))
-            url = url + "&fileFormat=composed-h5p-zip";
-        LOGGER.info("Upload API URL : " + url);
-        Map header = new HashMap() {{
-            put("X-Channel-Id", channelId);
-        }};
-
-        if (size > CONTENT_UPLOAD_ARTIFACT_MAX_SIZE && !BULK_UPLOAD_MIMETYPES.contains(mimeType)) {
-            LOGGER.info("ContentUtil :: upload :: File Size is larger than allowed file size allowed in upload api for : " + identifier + " | File Size (MB): " + (size / 1048576) + " | mimeType : " + mimeType);
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "File Size is larger than allowed file size allowed in upload api for : " + identifier + " | File Size (MB): " + (size / 1048576) + " | mimeType : " + mimeType);
-        }
-
-        Long uploadStartTime = System.currentTimeMillis();
-        String[] urls = uploadArtifact(file, identifier);
-        Long uploadEndTime = System.currentTimeMillis();
-        LOGGER.info("ContentUtil :: upload :: Total time taken for upload: " + (uploadEndTime - uploadStartTime));
-        if (null != urls && StringUtils.isNotBlank(urls[1])) {
-            String uploadUrl = urls[IDX_CLOUD_URL];
-            LOGGER.info("ContentUtil :: upload :: Artifact Uploaded Successfully to cloud for : " + identifier + " | uploadUrl : " + uploadUrl);
-            resp = UnirestUtil.post(url, "fileUrl", uploadUrl, header);
-        }
-
-        if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-            String artifactUrl = (String) resp.getResult().get(AutoCreatorParams.artifactUrl.name());
-            if (StringUtils.isNotBlank(artifactUrl)) {
-                LOGGER.info("ContentUtil :: upload :: Content Uploaded Successfully for : " + identifier + " | artifactUrl : " + artifactUrl);
-                return true;
-            }
-        } else {
-            LOGGER.info("ContentUtil :: upload :: Invalid Response received while uploading for: " + identifier + getErrorDetails(resp));
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while uploading : " + identifier + getErrorDetails(resp));
-        }
-        return false;
-    }
-
-    private Boolean review(String channelId, String identifier) throws Exception {
-        String url = KP_LEARNING_BASE_URL + "/content/v3/review/" + identifier;
-        Map request = new HashMap() {{
-            put("request", new HashMap() {{
-                put("content", new HashMap());
-            }});
-        }};
-
-        Map header = new HashMap() {{
-            put("X-Channel-Id", channelId);
-            put("Content-Type", DEFAULT_CONTENT_TYPE);
-        }};
-        Response resp = UnirestUtil.post(url, request, header);
-        if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-            String contentId = (String) resp.getResult().get("node_id");
-            if(StringUtils.isNotBlank(contentId)) {
-                LOGGER.info("ContentUtil :: review :: Content Sent For Review Successfully having identifier : " + contentId);
-                return true;
-            }
-        } else {
-            LOGGER.info("ContentUtil :: review :: Invalid Response received while sending content to review for : " + identifier + getErrorDetails(resp));
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while sending content to review for : " + identifier + getErrorDetails(resp));
-        }
-        return false;
-    }
-
-    private Boolean publish(String channelId, String identifier, String lastPublishedBy) throws Exception {
-        String url = KP_LEARNING_BASE_URL + "/content/v3/publish/" + identifier;
-        Map request = new HashMap() {{
-            put("request", new HashMap() {{
-                put("content", new HashMap() {{
-                    put(AutoCreatorParams.lastPublishedBy.name(), lastPublishedBy);
-                }});
-            }});
-        }};
-        Map header = new HashMap() {{
-            put("X-Channel-Id", channelId);
-            put("Content-Type", DEFAULT_CONTENT_TYPE);
-        }};
-        Response resp = UnirestUtil.post(url, request, header);
-        if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-            String publishStatus = (String) resp.getResult().get("publishStatus");
-            if (StringUtils.isNotBlank(publishStatus)) {
-                LOGGER.info("ContentUtil :: publish :: Content sent for publish successfully for : " + identifier);
-                return true;
-            }
-            else
-                throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Content Publish Call Failed For : " + identifier);
-        } else {
-            LOGGER.info("ContentUtil :: publish :: Invalid Response received while publishing content for : " + identifier + getErrorDetails(resp));
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while publishing content for : " + identifier + getErrorDetails(resp));
-        }
-
-    }
-
-    private String getPreSignedUrl(String identifier, String fileName) throws Exception {
-        String preSignedUrl = "";
-        Map request = new HashMap(){{
-            put("request", new HashMap(){{
-                put("content", new HashMap(){{
-                    put("fileName", fileName);
-                }});
-            }});
-        }};
-        Map header = new HashMap(){{
-            put("Content-Type","application/json");
-        }};
-        String url = KP_CS_BASE_URL + "/content/v3/upload/url/" + identifier;
-        Response resp = UnirestUtil.post(url, request, header);
-        if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-            preSignedUrl = (String) resp.getResult().get("pre_signed_url");
-            return preSignedUrl;
-        } else {
-            LOGGER.info("ContentUtil :: getPreSignedUrl :: Invalid Response received while generating pre-signed url for : " + identifier + getErrorDetails(resp));
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while generating pre-signed url for : " + identifier + getErrorDetails(resp));
-        }
-    }
-
-    private Boolean uploadBlob(String identifier, String url, File file) throws Exception {
-        Boolean result = false;
-        String contentType = tika.detect(file);
-        LOGGER.info("contentType of file : "+contentType);
-        Map header = new HashMap(){{
-            put("x-ms-blob-type", "BlockBlob");
-            put("Content-Type", contentType);
-        }};
-        HttpResponse response = Unirest.put(url).headers(header).field("file", new File(file.getAbsolutePath())).asString();
-        if (null != response && response.getStatus()==201) {
-            result = true;
-        } else {
-            LOGGER.info("ContentUtil :: uploadBlob :: Invalid Response received while uploading file to blob store for : " + identifier + " | Response Code : " + response.getStatus());
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while uploading file to blob store for : " + identifier);
-        }
-        return result;
-    }
-
-    private void linkCollection(String channel, String eventObjectId, List<Map<String, Object>> collection, String resourceId) throws Exception {
-        for (Map textbookInfo : collection) {
-            String textbookId = (String) textbookInfo.getOrDefault(AutoCreatorParams.identifier.name(), "");
-            String unitId = (String) textbookInfo.getOrDefault(AutoCreatorParams.unitId.name(), "");
-            if (StringUtils.isNotBlank(textbookId) && StringUtils.isNotEmpty(unitId)) {
-                Map rootHierarchy = null;
-                rootHierarchy = getHierarchy(textbookId);
-                if (validateHierarchy(textbookId, rootHierarchy, unitId)) {
-                    Map hierarchyReq = new HashMap() {{
-                        put(AutoCreatorParams.request.name(), new HashMap() {{
-                            put(AutoCreatorParams.rootId.name(), textbookId);
-                            put(AutoCreatorParams.unitId.name(), unitId);
-                            put(AutoCreatorParams.children.name(), Arrays.asList(resourceId));
-                        }});
-                    }};
-                    addToHierarchy(channel, textbookId, hierarchyReq);
-                } else {
-                    LOGGER.info("ContentUtil :: linkCollection :: Hierarchy Validation Failed For : " + textbookId);
-                }
-            } else {
-                LOGGER.info("ContentUtil :: linkCollection :: Collection Linking Skipped because required data is not available for : " + eventObjectId);
-            }
-        }
-    }
-
-    //TODO: Remove this method in release-3.3.0, Added only for backward compatibility
-    private void linkTextbook(String channel, String eventObjectId, Map textbookInfo, String resourceId) throws Exception {
-        String textbookId = (String) textbookInfo.getOrDefault(AutoCreatorParams.identifier.name(), "");
-        List unitIdentifiers = (List) textbookInfo.getOrDefault(AutoCreatorParams.unitIdentifiers.name(), new ArrayList());
-        if (StringUtils.isNotBlank(textbookId) && CollectionUtils.isNotEmpty(unitIdentifiers)) {
-            Map rootHierarchy = getHierarchy(textbookId);
-            List childNodes = (List) rootHierarchy.getOrDefault(AutoCreatorParams.childNodes.name(), new ArrayList());
-            if (CollectionUtils.isNotEmpty(childNodes) && childNodes.containsAll(unitIdentifiers)) {
-                Map hierarchyReq = new HashMap() {{
-                    put(AutoCreatorParams.request.name(), new HashMap() {{
-                        put(AutoCreatorParams.rootId.name(), textbookId);
-                        put(AutoCreatorParams.unitId.name(), unitIdentifiers.get(0));
-                        put(AutoCreatorParams.children.name(), Arrays.asList(resourceId));
-                    }});
-                }};
-                addToHierarchy(channel, textbookId, hierarchyReq);
-            } else {
-                LOGGER.info("ContentUtil :: linkTextbook :: Hierarchy Validation Failed For : " + textbookId);
-            }
-        } else {
-            LOGGER.info("ContentUtil :: linkTextbook :: Textbook Linking Skipped because required data is not available for : " + eventObjectId);
-        }
-    }
-
-    private boolean validateHierarchy(String textbookId, Map rootHierarchy, String unitId) {
-        List childNodes = (List) rootHierarchy.getOrDefault(AutoCreatorParams.childNodes.name(), new ArrayList());
-        if (CollectionUtils.isNotEmpty(childNodes) && childNodes.contains(unitId)) {
-            return true;
-        } else {
-            LOGGER.info("ContentUtil :: validateHierarchy :: Unit Identifier is not found under : " + textbookId);
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Unit Identifier is not found under : " + textbookId);
-        }
-    }
-
-    private Boolean addToHierarchy(String channel, String textbookId, Map hierarchyReq) throws Exception {
-        Boolean result = false;
-        String url = KP_CS_BASE_URL + "/content/v3/hierarchy/add";
-        Map header = new HashMap() {{
-            put("X-Channel-Id", channel);
-            put("Content-Type", DEFAULT_CONTENT_TYPE);
-        }};
-        Response resp = UnirestUtil.patch(url, hierarchyReq, header);
-        if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-            String contentId = (String) resp.getResult().get("rootId");
-            if (StringUtils.equalsIgnoreCase(contentId, textbookId)) {
-                LOGGER.info("ContentUtil :: addToHierarchy :: Content Hierarchy Updated Successfully for: " + textbookId);
-                result = true;
-            }
-        } else {
-            LOGGER.info("ContentUtil :: updateHierarchy :: Invalid Response received while adding resource to hierarchy for : " + textbookId + getErrorDetails(resp));
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while adding resource to hierarchy for : " + textbookId + getErrorDetails(resp));
-        }
-        return result;
-    }
-
-    private Map getHierarchy(String identifier) throws Exception {
-        Map result = new HashMap();
-        String url = KP_CS_BASE_URL + "/content/v3/hierarchy/" + identifier;
-        Map header = new HashMap(){{
-            put("Content-Type", DEFAULT_CONTENT_TYPE);
-        }};
-        Response resp = UnirestUtil.get(url, "mode=edit", header);
-        if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-            result = (Map) resp.getResult().getOrDefault("content", new HashMap());
-            return result;
-        } else {
-            LOGGER.info("ContentUtil :: getHierarchy :: Invalid Response received while fetching hierarchy for : " + identifier + getErrorDetails(resp));
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Invalid Response received while fetching hierarchy for : " + identifier + getErrorDetails(resp));
-        }
-    }
-
-    private Map getOriginContent(String channelId, String identifier) throws Exception {
-        String contentId = "";
-        String url = KP_CS_BASE_URL + "/content/v3/read/" + identifier;
-        LOGGER.info("ContentUtil :: getOriginContent :: Reading origin content having identifier : " + identifier);
-        Map header = new HashMap() {{
-            put("X-Channel-Id", channelId);
-            put("Content-Type", DEFAULT_CONTENT_TYPE);
-        }};
-        Response resp = UnirestUtil.get(url, "mode=edit", header);
-        if ((null != resp && resp.getResponseCode() == ResponseCode.OK) && MapUtils.isNotEmpty(resp.getResult())) {
-            contentId = ((String) ((Map) resp.getResult().getOrDefault("content", new HashMap())).getOrDefault("identifier", "")).replace(".img", "");
-            if (StringUtils.equalsIgnoreCase(identifier, contentId)) {
-                LOGGER.info("ContentUtil :: getOriginContent :: Origin Content Fetched Successfully with identifier : " + contentId);
-                return ((Map) resp.getResult().getOrDefault("content", new HashMap()));
-            } else
-                throw new ServerException("ERR_API_CALL", "Identifier Mismatched while reading content for : " + identifier);
-        } else if (null != resp && resp.getResponseCode() == ResponseCode.RESOURCE_NOT_FOUND) {
-            LOGGER.info("ContentUtil :: getOriginContent :: Origin Content Not Found With Identifier : " + identifier + getErrorDetails(resp));
-        } else
-            throw new ServerException("ERR_API_CALL", "ContentUtil :: getOriginContent :: Invalid Response Received While Reading Origin Content With Identifier : " + identifier + getErrorDetails(resp));
-        return new HashMap();
-    }
-
-    private String getBasePath(String objectId) {
-        return StringUtils.isNotBlank(objectId) ? TEMP_FILE_LOCATION + File.separator + objectId + File.separator + "_temp_" + System.currentTimeMillis(): TEMP_FILE_LOCATION + File.separator + "_temp_" + System.currentTimeMillis();
-    }
-
-    private String getFileNameFromURL(String fileUrl) {
-        String fileName = FilenameUtils.getBaseName(fileUrl) + "_" + System.currentTimeMillis();
-        if (!FilenameUtils.getExtension(fileUrl).isEmpty())
-            fileName += "." + FilenameUtils.getExtension(fileUrl);
-        return fileName;
-    }
-
-    private File getFile(String identifier, String fileUrl, String mimeType) throws Exception {
-        File file = null;
-        try {
-            if (StringUtils.isNotBlank(fileUrl) && fileUrl.contains("drive.google.com")) {
-                String fileId = fileUrl.split("download&id=")[1];
-                if(StringUtils.isBlank(fileId))
-                    throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), "Invalid fileUrl received for : " + identifier + " | fileUrl : " + fileUrl);
-                while (null == file && GoogleDriveUtil.BACKOFF_DELAY <= GoogleDriveUtil.MAXIMUM_BACKOFF_DELAY) {
-                    file = GoogleDriveUtil.downloadFile(fileId, getBasePath(identifier), mimeType);
-                }
-            } else {
-                file = HttpDownloadUtility.downloadFile(fileUrl, getBasePath(identifier));
-            }
-            return file;
-        } catch (Exception e) {
-            if(e instanceof ServerException)
-                throw e;
-            else {
-                LOGGER.info("Invalid fileUrl received for : " + identifier + " | fileUrl : " + fileUrl + "Exception is : " + e.getMessage());
-                throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), "Invalid fileUrl received for : " + identifier + " | fileUrl : " + fileUrl);
-            }
-        }
-    }
-
-    private String[] uploadArtifact(File uploadedFile, String identifier) {
-        String[] urlArray = new String[] {};
-        try {
-            String folder = S3PropertyReader.getProperty(CONTENT_FOLDER);
-            folder = folder + "/" + Slug.makeSlug(identifier, true) + "/" + S3PropertyReader.getProperty(ARTEFACT_FOLDER);
-            urlArray = CloudStore.uploadFile(folder, uploadedFile, true);
-        } catch (Exception e) {
-            LOGGER.info("ContentUtil :: uploadArtifact :: Exception occurred while uploading artifact for : " + identifier + "Exception is : " + e.getMessage());
-            e.printStackTrace();
-            throw new ServerException(ContentErrorCodes.ERR_CONTENT_UPLOAD_FILE.name(),
-                    "Error while uploading the File.", e);
-        }
-        return urlArray;
-    }
-
-    private void delay(long time) {
-        try {
-            Thread.sleep(time * 1000);
-        } catch (Exception e) {
-
-        }
-    }
-
-    private static String getErrorDetails(Response resp) {
-        return (null != resp) ? (" | Response Code :" + resp.getResponseCode().toString() + " | Result : " + resp.getResult() + " | Error Message : " + resp.getParams().getErrmsg()) : " | Null Response Received.";
-    }
-
-}
diff --git a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/GoogleDriveUtil.java b/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/GoogleDriveUtil.java
deleted file mode 100644
index 9e9a5fc3aa..0000000000
--- a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/GoogleDriveUtil.java
+++ /dev/null
@@ -1,156 +0,0 @@
-package org.sunbird.jobs.samza.util;
-
-import com.google.api.client.auth.oauth2.Credential;
-import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
-import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
-import com.google.api.client.googleapis.json.GoogleJsonResponseException;
-import com.google.api.client.googleapis.services.GoogleClientRequestInitializer;
-import com.google.api.client.http.HttpResponseException;
-import com.google.api.client.http.HttpTransport;
-import com.google.api.client.json.JsonFactory;
-import com.google.api.client.json.jackson2.JacksonFactory;
-import com.google.api.services.drive.Drive;
-import com.google.api.services.drive.DriveRequestInitializer;
-import com.google.api.services.drive.DriveScopes;
-import org.apache.commons.lang.StringUtils;
-import org.sunbird.common.Platform;
-import org.sunbird.common.Slug;
-import org.sunbird.common.enums.TaxonomyErrorCodes;
-import org.sunbird.common.exception.ServerException;
-
-import java.io.ByteArrayInputStream;
-import java.io.File;
-import java.io.FileOutputStream;
-import java.io.InputStream;
-import java.io.OutputStream;
-import java.nio.charset.Charset;
-import java.util.Arrays;
-import java.util.Collections;
-import java.util.List;
-
-public class GoogleDriveUtil {
-
-    private static final JsonFactory JSON_FACTORY = new JacksonFactory();
-    private static final String ERR_MSG = "Please Provide Valid Google Drive URL!";
-    private static final String SERVICE_ERROR = "Unable to Connect to Google Service. Please Try Again After Sometime!";
-    private static final List errorCodes = Arrays.asList("dailyLimitExceeded402", "limitExceeded",
-            "dailyLimitExceeded", "quotaExceeded", "userRateLimitExceeded", "quotaExceeded402", "keyExpired",
-            "keyInvalid");
-    private static final List SCOPES = Arrays.asList(DriveScopes.DRIVE_READONLY);
-    private static final String APP_NAME = Platform.config.hasPath("auto_creator.gdrive.application_name") ? Platform.config.getString("auto_creator.gdrive.application_name") : "drive-download-sunbird";
-    private static final String SERVICE_ACC_CRED = Platform.config.getString("auto_creator_g_service_acct_cred");
-    public static final Integer INITIAL_BACKOFF_DELAY = Platform.config.hasPath("auto_creator.initial_backoff_delay") ? Platform.config.getInt("auto_creator.initial_backoff_delay") : 1200000; // 20 min
-    public static final Integer MAXIMUM_BACKOFF_DELAY = Platform.config.hasPath("auto_creator.maximum_backoff_delay") ? Platform.config.getInt("auto_creator.maximum_backoff_delay") : 3900000; // 65 min
-    public static final Integer INCREMENT_BACKOFF_DELAY = Platform.config.hasPath("auto_creator.increment_backoff_delay") ? Platform.config.getInt("auto_creator.increment_backoff_delay") : 300000; // 5 min
-    public static Integer BACKOFF_DELAY = INITIAL_BACKOFF_DELAY;
-    private static boolean limitExceeded = false;
-    private static Drive drive = null;
-    private static JobLogger LOGGER = new JobLogger(GoogleDriveUtil.class);
-
-    static {
-        try {
-            HttpTransport HTTP_TRANSPORT = GoogleNetHttpTransport.newTrustedTransport();
-            drive = new Drive.Builder(HTTP_TRANSPORT, JSON_FACTORY, getCredentials()).setApplicationName(APP_NAME).build();
-        } catch (Exception e) {
-            LOGGER.error("Error occurred while creating google drive client ::: " + e.getMessage(), e);
-            e.printStackTrace();
-            throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), "Error occurred while creating google drive client ::: "+ e.getMessage());
-        }
-    }
-
-    private static Credential getCredentials() throws Exception {
-        InputStream credentialsStream = new ByteArrayInputStream(SERVICE_ACC_CRED.getBytes(Charset.forName("UTF-8")));
-        GoogleCredential credential = GoogleCredential.fromStream(credentialsStream).createScoped(SCOPES);
-        return credential;
-    }
-
-    public static File downloadFile(String fileId, String saveDir, String mimeType) throws Exception {
-        try {
-            Drive.Files.Get getFile = drive.files().get(fileId);
-            getFile.setFields("id,name,size,owners,mimeType,properties,permissionIds,webContentLink");
-            com.google.api.services.drive.model.File googleDriveFile = getFile.execute();
-            LOGGER.info("GoogleDriveUtil :: downloadFile ::: Drive File Details:: " + googleDriveFile);
-            String fileName = googleDriveFile.getName();
-            String fileMimeType = googleDriveFile.getMimeType();
-            LOGGER.info("GoogleDriveUtil :: downloadFile ::: Node mimeType :: "+mimeType + " | File mimeType :: "+fileMimeType);
-            if(!StringUtils.equalsIgnoreCase(mimeType,"image"))
-                validateMimeType(fileId, mimeType, fileMimeType);
-            File saveFile = new File(saveDir);
-            if (!saveFile.exists()) {
-                saveFile.mkdirs();
-            }
-            String saveFilePath = saveDir + File.separator + fileName;
-            LOGGER.info("GoogleDriveUtil :: downloadFile :: File Id :" + fileId + " | Save File Path: " + saveFilePath);
-
-            OutputStream outputStream = new FileOutputStream(saveFilePath);
-            getFile.executeMediaAndDownloadTo(outputStream);
-            outputStream.close();
-            File file = new File(saveFilePath);
-            file = Slug.createSlugFile(file);
-            LOGGER.info("GoogleDriveUtil :: downloadFile :: File Downloaded Successfully. Sluggified File Name: " + file.getAbsolutePath());
-            if (null != file && BACKOFF_DELAY != INITIAL_BACKOFF_DELAY)
-                BACKOFF_DELAY = INITIAL_BACKOFF_DELAY;
-            return file;
-        } catch(GoogleJsonResponseException ge) {
-            LOGGER.error("GoogleDriveUtil :: downloadFile :: GoogleJsonResponseException :: Error Occurred while downloading file having id "+fileId + " | Error is ::"+ge.getDetails().toString(), ge);
-            throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), "Invalid Response Received From Google API for file Id : " + fileId + " | Error is : " + ge.getDetails().toString());
-        } catch(HttpResponseException he) {
-            LOGGER.error("GoogleDriveUtil :: downloadFile :: HttpResponseException :: Error Occurred while downloading file having id "+fileId + " | Error is ::"+he.getContent(), he);
-            he.printStackTrace();
-            if(he.getStatusCode() == 403) {
-                if (BACKOFF_DELAY <= MAXIMUM_BACKOFF_DELAY)
-                    delay(BACKOFF_DELAY);
-                if (BACKOFF_DELAY == 2400000)
-                    BACKOFF_DELAY += 1500000;
-                else
-                    BACKOFF_DELAY = BACKOFF_DELAY * INCREMENT_BACKOFF_DELAY;
-            } else throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), "Invalid Response Received From Google API for file Id : " + fileId + " | Error is : " + he.getContent());
-        } catch (Exception e) {
-            LOGGER.error("GoogleDriveUtil :: downloadFile :: Exception :: Error Occurred While Downloading Google Drive File having Id " + fileId + " : " + e.getMessage(), e);
-            e.printStackTrace();
-            if(e instanceof ServerException)
-                throw e;
-            else throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), "Invalid Response Received From Google API for file Id : " + fileId + " | Error is : " + e.getMessage());
-        }
-        return null;
-    }
-
-    private static void validateMimeType(String fileId, String mimeType, String fileMimeType) {
-        String errMsg = "Invalid File Url! File MimeType Is Not Same As Object MimeType for File Id : " + fileId + " | File MimeType is : " +fileMimeType + " | Node MimeType is : "+mimeType;
-        switch (mimeType){
-            case "application/vnd.ekstep.h5p-archive" : {
-                if(!(StringUtils.equalsIgnoreCase("application/x-zip", fileMimeType) || StringUtils.equalsIgnoreCase("application/zip", fileMimeType)))
-                    throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), errMsg);
-                break;
-            }
-            case "application/epub" : {
-                if(!StringUtils.equalsIgnoreCase("application/epub+zip", fileMimeType))
-                    throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), errMsg);
-                break;
-            }
-            case "audio/mp3" : {
-                if(!StringUtils.equalsIgnoreCase("audio/mpeg", fileMimeType))
-                    throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), errMsg);
-                break;
-            }
-            case "application/vnd.ekstep.html-archive" : {
-                if(!StringUtils.equalsIgnoreCase("application/x-zip-compressed", fileMimeType))
-                    throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), errMsg);
-                break;
-            }
-            default: {
-                if(!StringUtils.equalsIgnoreCase(mimeType, fileMimeType))
-                    throw new ServerException(TaxonomyErrorCodes.ERR_INVALID_UPLOAD_FILE_URL.name(), errMsg);
-            }
-        }
-    }
-
-    public static void delay(int time) {
-        LOGGER.info("delay is called with : " + time);
-        try {
-            Thread.sleep(time);
-        } catch (Exception e) {
-
-        }
-    }
-}
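The deleted `GoogleDriveUtil` above backs off on HTTP 403 (quota) responses by growing `BACKOFF_DELAY`, though its increment logic is irregular: it multiplies by `INCREMENT_BACKOFF_DELAY` and special-cases the 2400000 ms value. The conventional pattern it approximates is capped exponential backoff. A minimal sketch under that assumption, with the retried action and the delay bounds supplied by the caller:

```java
import java.util.function.Supplier;

public class Backoff {
    // Retry `action` with exponentially growing sleeps until it succeeds
    // or the delay exceeds `maxDelayMs`, at which point the failure is rethrown.
    public static <T> T withBackoff(Supplier<T> action, long initialDelayMs, long maxDelayMs)
            throws InterruptedException {
        long delay = initialDelayMs;
        while (true) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (delay > maxDelayMs) throw e; // give up once the cap is passed
                Thread.sleep(delay);             // wait before the next attempt
                delay *= 2;                      // double the delay each round
            }
        }
    }
}
```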
diff --git a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/UnirestUtil.java b/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/UnirestUtil.java
deleted file mode 100644
index 6219df79b5..0000000000
--- a/platform-jobs/samza/auto-creator/src/main/java/org/sunbird/jobs/samza/util/UnirestUtil.java
+++ /dev/null
@@ -1,148 +0,0 @@
-package org.sunbird.jobs.samza.util;
-
-import com.fasterxml.jackson.databind.ObjectMapper;
-import kong.unirest.HttpResponse;
-import kong.unirest.Unirest;
-import org.apache.commons.collections4.MapUtils;
-import org.apache.commons.lang3.StringUtils;
-import org.sunbird.common.Platform;
-import org.sunbird.common.dto.Response;
-import org.sunbird.common.enums.TaxonomyErrorCodes;
-import org.sunbird.common.exception.ServerException;
-
-import java.io.File;
-import java.util.Map;
-
-public class UnirestUtil {
-
-    private static ObjectMapper mapper = new ObjectMapper();
-    private static JobLogger LOGGER = new JobLogger(UnirestUtil.class);
-    public static final Long INITIAL_BACKOFF_DELAY = Platform.config.hasPath("auto_creator.internal_api.initial_backoff_delay") ? Platform.config.getLong("auto_creator.internal_api.initial_backoff_delay") : 10000; // 10 seconds
-    public static final Long MAXIMUM_BACKOFF_DELAY = Platform.config.hasPath("auto_creator.internal_api.maximum_backoff_delay") ? Platform.config.getLong("auto_creator.internal_api.initial_backoff_delay") : 300000; // 5 min
-    public static final Integer INCREMENT_BACKOFF_DELAY = Platform.config.hasPath("auto_creator.increment_backoff_delay") ? Platform.config.getInt("auto_creator.increment_backoff_delay") : 2;
-    public static Long BACKOFF_DELAY = INITIAL_BACKOFF_DELAY;
-
-    public static Response post(String url, Map requestMap, Map headerParam)
-            throws Exception {
-        Response resp = null;
-        validateRequest(url, headerParam);
-        if (MapUtils.isEmpty(requestMap))
-            throw new ServerException("ERR_INVALID_REQUEST_BODY", "Request Body is Missing!");
-        try {
-            while (null == resp) {
-                HttpResponse response = Unirest.post(url).headers(headerParam).body(mapper.writeValueAsString(requestMap)).asString();
-                resp = getResponse(url, response);
-            }
-            return resp;
-        } catch (Exception e) {
-            throw new ServerException("ERR_API_CALL", "Something Went Wrong While Making API Call | Error is: " + e.getMessage());
-        }
-    }
-
-    public static Response patch(String url, Map requestMap, Map headerParam)
-            throws Exception {
-        Response resp = null;
-        validateRequest(url, headerParam);
-        if (MapUtils.isEmpty(requestMap))
-            throw new ServerException("ERR_INVALID_REQUEST_BODY", "Request Body is Missing!");
-        try {
-            while (null == resp) {
-                HttpResponse response = Unirest.patch(url).headers(headerParam).body(mapper.writeValueAsString(requestMap)).asString();
-                resp = getResponse(url, response);
-            }
-            return resp;
-        } catch (Exception e) {
-            throw new ServerException("ERR_API_CALL", "Something Went Wrong While Making API Call | Error is: " + e.getMessage());
-        }
-    }
-
-    public static Response post(String url, String paramName, File value, Map headerParam)
-            throws Exception {
-        Response resp = null;
-        validateRequest(url, headerParam);
-        if (null == value || null == value)
-            throw new ServerException("ERR_INVALID_REQUEST_PARAM", "Invalid Request Param!");
-        try {
-            while (null == resp) {
-                HttpResponse response = Unirest.post(url).headers(headerParam).multiPartContent().field(paramName, new File(value.getAbsolutePath())).asString();
-                resp = getResponse(url, response);
-            }
-            return resp;
-        } catch (Exception e) {
-            throw new ServerException("ERR_API_CALL", "Something Went Wrong While Making API Call | Error is: " + e.getMessage());
-        }
-    }
-
-    public static Response post(String url, String paramName, String value, Map headerParam)
-            throws Exception {
-        Response resp = null;
-        validateRequest(url, headerParam);
-        if (null == value || null == value)
-            throw new ServerException("ERR_INVALID_REQUEST_PARAM", "Invalid Request Param!");
-        try {
-            while (null == resp) {
-                HttpResponse response = Unirest.post(url).headers(headerParam).multiPartContent().field(paramName, value).asString();
-                resp = getResponse(url, response);
-            }
-            return resp;
-        } catch (Exception e) {
-            throw new ServerException("ERR_API_CALL", "Something Went Wrong While Making API Call | Error is: " + e.getMessage());
-        }
-    }
-
-    public static Response get(String url, String queryParam, Map headerParam)
-            throws Exception {
-        Response resp = null;
-        validateRequest(url, headerParam);
-        String reqUrl = StringUtils.isNotBlank(queryParam) ? url + "?" + queryParam : url;
-        try {
-            while (null == resp) {
-                HttpResponse response = Unirest.get(reqUrl).headers(headerParam).asString();
-                resp = getResponse(reqUrl, response);
-            }
-            return resp;
-        } catch (Exception e) {
-            throw new ServerException("ERR_API_CALL", "Something Went Wrong While Making API Call | Error is: " + e.getMessage());
-        }
-    }
-
-    private static void validateRequest(String url, Map headerParam) {
-        if (StringUtils.isBlank(url))
-            throw new ServerException("ERR_INVALID_URL", "Url Parameter is Missing!");
-        if (null == headerParam)
-            throw new ServerException("ERR_INVALID_HEADER_PARAM", "Header Parameter is Missing!");
-    }
-
-    private static Response getResponse(String url, HttpResponse response) {
-        Response resp = null;
-        if (null != response && StringUtils.isNotBlank(response.getBody())) {
-            try {
-                resp = mapper.readValue(response.getBody(), Response.class);
-                BACKOFF_DELAY = INITIAL_BACKOFF_DELAY;
-            } catch (Exception e) {
-                LOGGER.error("UnirestUtil ::: getResponse ::: Error occurred while parsing api response for url ::: " + url + ". | Error is: " + e.getMessage(), e);
-                LOGGER.info("UnirestUtil :::: BACKOFF_DELAY ::: " + BACKOFF_DELAY);
-                if (BACKOFF_DELAY <= MAXIMUM_BACKOFF_DELAY) {
-                    long delay = BACKOFF_DELAY;
-                    BACKOFF_DELAY = BACKOFF_DELAY * INCREMENT_BACKOFF_DELAY;
-                    LOGGER.info("UnirestUtil :::: BACKOFF_DELAY after increment::: " + BACKOFF_DELAY);
-                    delay(delay);
-                } else throw new ServerException("ERR_API_CALL", "Unable to parse response data for url: "+ url +" | Error is: " + e.getMessage());
-            }
-        } else {
-            LOGGER.info("Null Response Received While Making Api Call!");
-            throw new ServerException("ERR_API_CALL", "Null Response Received While Making Api Call!");
-        }
-        return resp;
-    }
-
-    private static void delay(long time) {
-        LOGGER.info("UnirestUtil :::: backoff delay is called with : " + time);
-        try {
-            Thread.sleep(time);
-        } catch (Exception e) {
-
-        }
-    }
-
-}
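The deleted `UnirestUtil` loops until a response body parses into a `Response`, doubling `BACKOFF_DELAY` after each parse failure (for example when an upstream service temporarily returns non-JSON error pages) and resetting it on the first success. A compact sketch of that parse-retry loop follows; it uses Jackson's `ObjectMapper` as the deleted code does, while the `fetch` supplier is a hypothetical stand-in for the Unirest call:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Map;
import java.util.function.Supplier;

public class ParseRetry {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Re-fetch and re-parse until the body is valid JSON or the delay cap is exceeded.
    public static Map<?, ?> parseWithRetry(Supplier<String> fetch, long initialDelayMs,
                                           long maxDelayMs) throws Exception {
        long delay = initialDelayMs;
        while (true) {
            String body = fetch.get();                      // make the HTTP call
            try {
                return MAPPER.readValue(body, Map.class);   // success ends the loop
            } catch (Exception e) {
                if (delay > maxDelayMs) throw e;            // bubble up once the cap is hit
                Thread.sleep(delay);
                delay *= 2;                                 // doubling, as in the deleted code's default factor of 2
            }
        }
    }
}
```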
diff --git a/platform-jobs/samza/auto-creator/src/main/resources/actor-config.xml b/platform-jobs/samza/auto-creator/src/main/resources/actor-config.xml
deleted file mode 100644
index f349225f43..0000000000
--- a/platform-jobs/samza/auto-creator/src/main/resources/actor-config.xml
+++ /dev/null
@@ -1,24 +0,0 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
\ No newline at end of file
diff --git a/platform-jobs/samza/auto-creator/src/main/resources/application.conf b/platform-jobs/samza/auto-creator/src/main/resources/application.conf
deleted file mode 100644
index 55482aeae5..0000000000
--- a/platform-jobs/samza/auto-creator/src/main/resources/application.conf
+++ /dev/null
@@ -1,13 +0,0 @@
-LearningActorSystem{
-  default-dispatcher {
-    type = "Dispatcher"
-    executor = "fork-join-executor"
-    fork-join-executor {
-      parallelism-min = 1
-      parallelism-factor = 2.0
-      parallelism-max = 4
-    }
-    # Throughput for default Dispatcher, set to 1 for as fair as possible
-    throughput = 1
-  }
-}
diff --git a/platform-jobs/samza/auto-creator/src/main/resources/log4j.xml b/platform-jobs/samza/auto-creator/src/main/resources/log4j.xml
deleted file mode 100644
index d2db3940cc..0000000000
--- a/platform-jobs/samza/auto-creator/src/main/resources/log4j.xml
+++ /dev/null
@@ -1,20 +0,0 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
\ No newline at end of file
diff --git a/platform-jobs/samza/common/.gitignore b/platform-jobs/samza/common/.gitignore
deleted file mode 100644
index b83d22266a..0000000000
--- a/platform-jobs/samza/common/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-/target/
diff --git a/platform-jobs/samza/common/pom.xml b/platform-jobs/samza/common/pom.xml
deleted file mode 100644
index 4eedd8abfa..0000000000
--- a/platform-jobs/samza/common/pom.xml
+++ /dev/null
@@ -1,116 +0,0 @@
-<project>
-	<modelVersion>4.0.0</modelVersion>
-	<parent>
-		<groupId>org.sunbird</groupId>
-		<artifactId>samza</artifactId>
-		<version>1.1-SNAPSHOT</version>
-	</parent>
-	<artifactId>samza-common</artifactId>
-	<dependencies>
-		<dependency>
-			<groupId>org.sunbird</groupId>
-			<artifactId>unit-tests</artifactId>
-			<version>1.1-SNAPSHOT</version>
-			<scope>test</scope>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.samza</groupId>
-			<artifactId>samza-api</artifactId>
-			<version>${samza.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.samza</groupId>
-			<artifactId>samza-core_${scala.version}</artifactId>
-			<version>${samza.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.samza</groupId>
-			<artifactId>samza-yarn_${scala.version}</artifactId>
-			<version>${samza.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.samza</groupId>
-			<artifactId>samza-kafka_${scala.version}</artifactId>
-			<version>${samza.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.samza</groupId>
-			<artifactId>samza-log4j</artifactId>
-			<version>${samza.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.hadoop</groupId>
-			<artifactId>hadoop-yarn-client</artifactId>
-			<version>${hadoop.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.hadoop</groupId>
-			<artifactId>hadoop-yarn-common</artifactId>
-			<version>${hadoop.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.hadoop</groupId>
-			<artifactId>hadoop-hdfs</artifactId>
-			<version>${hadoop.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.httpcomponents</groupId>
-			<artifactId>httpclient</artifactId>
-			<version>4.5.2</version>
-		</dependency>
-		<dependency>
-			<groupId>org.apache.samza</groupId>
-			<artifactId>samza-shell</artifactId>
-			<classifier>dist</classifier>
-			<type>tgz</type>
-			<version>${samza.version}</version>
-		</dependency>
-		<dependency>
-			<groupId>com.typesafe</groupId>
-			<artifactId>config</artifactId>
-			<version>1.3.1</version>
-		</dependency>
-		<dependency>
-			<groupId>org.sunbird</groupId>
-			<artifactId>platform-telemetry</artifactId>
-			<version>1.1-SNAPSHOT</version>
-		</dependency>
-		<dependency>
-			<groupId>com.google.guava</groupId>
-			<artifactId>guava</artifactId>
-			<version>19.0</version>
-		</dependency>
-		<dependency>
-			<groupId>org.sunbird</groupId>
-			<artifactId>learning-actors</artifactId>
-			<version>1.1-SNAPSHOT</version>
-			<exclusions>
-				<exclusion>
-					<artifactId>log4j-1.2-api</artifactId>
-					<groupId>org.apache.logging.log4j</groupId>
-				</exclusion>
-			</exclusions>
-		</dependency>
-		<dependency>
-			<groupId>org.mockito</groupId>
-			<artifactId>mockito-core</artifactId>
-			<version>2.0.31-beta</version>
-			<scope>test</scope>
-		</dependency>
-	</dependencies>
-	<build>
-		<plugins>
-			<plugin>
-				<groupId>org.apache.maven.plugins</groupId>
-				<artifactId>maven-compiler-plugin</artifactId>
-				<configuration>
-					<source>1.8</source>
-					<target>1.8</target>
-				</configuration>
-			</plugin>
-		</plugins>
-	</build>
-</project>
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/exception/PlatformErrorCodes.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/exception/PlatformErrorCodes.java
deleted file mode 100644
index a6fe50749a..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/exception/PlatformErrorCodes.java
+++ /dev/null
@@ -1,6 +0,0 @@
-package org.sunbird.jobs.samza.exception;
-
-public enum PlatformErrorCodes {
-
-    ERR_DEFINITION_NOT_FOUND, ERR_HOST_UNAVAILABLE, SYSTEM_ERROR, DATA_ERROR, PROCESSING_ERROR, PUBLISH_FAILED
-}
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/exception/PlatformException.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/exception/PlatformException.java
deleted file mode 100644
index 206fee7ae5..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/exception/PlatformException.java
+++ /dev/null
@@ -1,24 +0,0 @@
-package org.sunbird.jobs.samza.exception;
-
-import org.sunbird.common.exception.MiddlewareException;
-
-public class PlatformException extends MiddlewareException {
-
-    private static final long serialVersionUID = -8708641286413033915L;
-
-    public PlatformException(String errCode, String message) {
-        super(errCode, message);
-    }
-
-    public PlatformException(String errCode, String message, Object... params) {
-        super(errCode, message, params);
-    }
-
-    public PlatformException(String errCode, String message, Throwable root) {
-        super(errCode, message, root);
-    }
-
-    public PlatformException(String errCode, String message, Throwable root, Object... params) {
-        super(errCode, message, root, params);
-    }
-}
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/serializers/EkstepJsonSerde.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/serializers/EkstepJsonSerde.java
deleted file mode 100644
index 5574b1dba8..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/serializers/EkstepJsonSerde.java
+++ /dev/null
@@ -1,109 +0,0 @@
-package org.sunbird.jobs.samza.serializers;
-
-import java.io.UnsupportedEncodingException;
-import java.util.HashMap;
-import java.util.Map;
-
-import org.apache.samza.SamzaException;
-import org.apache.samza.serializers.Serde;
-import org.codehaus.jackson.map.ObjectMapper;
-import org.codehaus.jackson.type.TypeReference;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-/*
- * A serializer for JSON strings that
- * 1. returns a LinkedHashMap upon deserialization.
- * 2. enforces the 'dash-separated' property naming convention.
- *
- * @author Mahesh Kumar Gangula
- */
-
-public class EkstepJsonSerde<T> implements Serde<T> {
-
-    private static final Logger LOG = LoggerFactory.getLogger(EkstepJsonSerde.class);
-    private final Class<T> clazz;
-    private transient ObjectMapper mapper = new ObjectMapper();
-
-    /**
-     * Constructs a EkstepJsonSerde that returns a LinkedHashMap<String,
-     * Object> upon deserialization.
-     */
-    public EkstepJsonSerde() {
-        this(null);
-    }
-
-    /**
-     * Constructs a EkstepJsonSerde that (de)serializes POJOs of class
-     * {@code clazz}.
-     *
-     * @param clazz
-     *            the class of the POJO being (de)serialized.
-     */
-    public EkstepJsonSerde(Class<T> clazz) {
-        this.clazz = clazz;
-    }
-
-    public static <T> EkstepJsonSerde<T> of(Class<T> clazz) {
-        return new EkstepJsonSerde<>(clazz);
-    }
-
-    @Override
-    public byte[] toBytes(T obj) {
-        if (obj != null) {
-            try {
-                String str = mapper.writeValueAsString(obj);
-                return str.getBytes("UTF-8");
-            } catch (Exception e) {
-                throw new SamzaException("Error serializing data.", e);
-            }
-        } else {
-            return null;
-        }
-    }
-
-    @SuppressWarnings("unchecked")
-    @Override
-    public T fromBytes(byte[] bytes) {
-        if (bytes != null) {
-            String str = null;
-            try {
-                str = new String(bytes, "UTF-8");
-                if (clazz != null) {
-                    return mapper.readValue(str, clazz);
-                } else {
-                    return mapper.readValue(str, new TypeReference() {
-                    });
-                }
-            } catch (UnsupportedEncodingException e) {
-                LOG.error("Error deserializing data. Unsupported encoding: " + bytes, e);
-                Map map = exceptionMap(bytes, "Error deserializing data. Unsupported encoding", e);
-                return (T) map;
-            } catch (Exception e) {
-                LOG.error("Error deserializing data: " + str, e);
-                Map map = exceptionMap(str, "Error deserializing data", e);
-                return (T) map;
-            }
-        } else {
-            LOG.error("Bytes data is null");
-            Map map = exceptionMap(bytes, "Bytes data is null", null);
-            return (T) map;
-        }
-    }
-
-    public Map exceptionMap(Object data, String message, Exception e) {
-        Map map = new HashMap();
-        if (data instanceof Byte)
-            map.put("bytes", data);
-        if (data instanceof String)
-            map.put("str", data);
-        map.put("message", message);
-        map.put("exception", e);
-        map.put("serde", "error");
-        return map;
-    }
-}
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/serializers/EkstepJsonSerdeFactory.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/serializers/EkstepJsonSerdeFactory.java
deleted file mode 100644
index 92034a4feb..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/serializers/EkstepJsonSerdeFactory.java
+++ /dev/null
@@ -1,16 +0,0 @@
-package org.sunbird.jobs.samza.serializers;
-
-import org.apache.samza.config.Config;
-import org.apache.samza.serializers.SerdeFactory;
-
-/**
- *
- * @author Mahesh Kumar Gangula
- *
- */
-
-public class EkstepJsonSerdeFactory implements SerdeFactory {
-    public EkstepJsonSerde getSerde(String name, Config config) {
-        return new EkstepJsonSerde<>();
-    }
-}
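The deleted serde's contract is simply a UTF-8 JSON round-trip: `toBytes` writes the object as a JSON string, `fromBytes` reads it back into either a supplied class or a generic map, and decode failures are returned as a marker map containing `"serde": "error"` instead of being thrown, so a task can detect and skip poison messages. A trimmed round-trip example using plain Jackson, outside the Samza runtime:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;

public class SerdeRoundTrip {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // Serialize: object -> JSON string -> UTF-8 bytes (what toBytes does).
        byte[] bytes = mapper.writeValueAsString(Collections.singletonMap("eid", "BE_JOB_REQUEST"))
                             .getBytes(StandardCharsets.UTF_8);
        // Deserialize: bytes -> JSON string -> Map (what fromBytes does with clazz == null).
        Map<?, ?> decoded = mapper.readValue(new String(bytes, StandardCharsets.UTF_8), Map.class);
        System.out.println(decoded.get("eid")); // prints BE_JOB_REQUEST
        // The deleted serde would instead return {"serde":"error", ...} on a decode failure.
    }
}
```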
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/ISamzaService.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/ISamzaService.java
deleted file mode 100644
index cc0a1010aa..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/ISamzaService.java
+++ /dev/null
@@ -1,26 +0,0 @@
-package org.sunbird.jobs.samza.service;
-
-import java.util.Map;
-
-import org.apache.samza.config.Config;
-import org.apache.samza.task.MessageCollector;
-import org.sunbird.jobs.samza.service.task.JobMetrics;
-
-public interface ISamzaService {
-
-    /**
-     *
-     * @param config
-     * @throws Exception
-     */
-    public void initialize(Config config) throws Exception;
-
-    /**
-     * The class processMessage is mainly responsible for processing the messages sent from consumers based on required
-     * specifications
-     *
-     * @param MessageData The messageData
-     */
-    public void processMessage(Map message, JobMetrics metrics, MessageCollector collector) throws Exception;
-
-}
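Every samza job in this codebase implemented this two-method contract: a one-time `initialize(Config)` and a per-message `processMessage(...)`. A skeletal implementation showing the shape; the class name `NoOpService` is illustrative, and the sketch compiles only against this module's own dependencies:

```java
import java.util.Map;
import org.apache.samza.config.Config;
import org.apache.samza.task.MessageCollector;
import org.sunbird.jobs.samza.service.ISamzaService;
import org.sunbird.jobs.samza.service.task.JobMetrics;

// Illustrative only: a no-op service that counts every message as a success.
public class NoOpService implements ISamzaService {
    @Override
    public void initialize(Config config) throws Exception {
        // Real services read connection strings, warm caches, etc. here.
    }

    @Override
    public void processMessage(Map message, JobMetrics metrics, MessageCollector collector) throws Exception {
        metrics.incSuccessCounter(); // real services branch on message.get("edata")
    }
}
```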
"high-watermark")).toString()); - long checkPointOffset = Long.valueOf(containerMetricsRegistry.get("org.apache.samza.checkpoint.OffsetManagerMetrics") - .get(getSamzaMetricKey(sysPartition, "checkpointed-offset")).toString()); - String lagMessage = "Job Name : " + getJobName() + " , partition : " + sysPartition.getPartition().getPartitionId() + " , Stream : " + sysPartition.toString() + " , Current High Water Mark Offset : " + highWatermarkOffset + " , Current Checkpoint Offset : " + checkPointOffset + " , consumer lag : " + (highWatermarkOffset - checkPointOffset) + " , timestamp :" + System.currentTimeMillis(); - System.out.println(lagMessage); - LOGGER.info(lagMessage); - consumerLag += highWatermarkOffset - checkPointOffset; - this.partition = sysPartition.getPartition().getPartitionId(); - } - } - - } catch (Exception e) { - LOGGER.error("Exception Occurred While Computing Consumer Lag. Exception is : ", "", e); - } - return consumerLag; - } - - private String getSamzaMetricKey(SystemStreamPartition partition, String samzaMetricName) { - return String.format("%s-%s-%s-%s", - partition.getSystem(), partition.getStream(), partition.getPartition().getPartitionId(), samzaMetricName); - } - - public Map collect() { - LOGGER.info("collect is called for Job : "+getJobName()+" , partition : "+partition); - Map metricsEvent = new HashMap<>(); - metricsEvent.put("job-name", jobName); - metricsEvent.put("success-message-count", successMessageCount.getCount()); - metricsEvent.put("failed-message-count", failedMessageCount.getCount()); - metricsEvent.put("error-message-count", errorMessageCount.getCount()); - metricsEvent.put("skipped-message-count", skippedMessageCount.getCount()); - metricsEvent.put("partition",partition); - metricsEvent.put("consumer-lag", - computeConsumerLag(((MetricsRegistryMap) context.getSamzaContainerContext().metricsRegistry).metrics())); - return metricsEvent; - } - -} \ No newline at end of file diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/util/AbstractESIndexer.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/util/AbstractESIndexer.java deleted file mode 100644 index e3aa023b71..0000000000 --- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/util/AbstractESIndexer.java +++ /dev/null @@ -1,20 +0,0 @@ -/** - * - */ -package org.sunbird.jobs.samza.service.util; - -/** - * @author pradyumna - * - */ -public abstract class AbstractESIndexer { - - /** - * - */ - public AbstractESIndexer() { - init(); - } - - protected abstract void init(); -} diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/AbstractTask.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/AbstractTask.java deleted file mode 100644 index 4656d9a472..0000000000 --- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/AbstractTask.java +++ /dev/null @@ -1,204 +0,0 @@ -package org.sunbird.jobs.samza.task; - -import java.util.HashMap; -import java.util.Map; -import java.util.UUID; - -import org.apache.commons.lang3.StringUtils; -import org.apache.samza.config.Config; -import org.apache.samza.system.IncomingMessageEnvelope; -import org.apache.samza.system.OutgoingMessageEnvelope; -import org.apache.samza.system.SystemStream; -import org.apache.samza.task.InitableTask; -import org.apache.samza.task.MessageCollector; -import org.apache.samza.task.StreamTask; -import org.apache.samza.task.TaskContext; -import org.apache.samza.task.TaskCoordinator; 
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/util/AbstractESIndexer.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/util/AbstractESIndexer.java
deleted file mode 100644
index e3aa023b71..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/service/util/AbstractESIndexer.java
+++ /dev/null
@@ -1,20 +0,0 @@
-/**
- *
- */
-package org.sunbird.jobs.samza.service.util;
-
-/**
- * @author pradyumna
- *
- */
-public abstract class AbstractESIndexer {
-
-    /**
-     *
-     */
-    public AbstractESIndexer() {
-        init();
-    }
-
-    protected abstract void init();
-}
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/AbstractTask.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/AbstractTask.java
deleted file mode 100644
index 4656d9a472..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/AbstractTask.java
+++ /dev/null
@@ -1,204 +0,0 @@
-package org.sunbird.jobs.samza.task;
-
-import java.util.HashMap;
-import java.util.Map;
-import java.util.UUID;
-
-import org.apache.commons.lang3.StringUtils;
-import org.apache.samza.config.Config;
-import org.apache.samza.system.IncomingMessageEnvelope;
-import org.apache.samza.system.OutgoingMessageEnvelope;
-import org.apache.samza.system.SystemStream;
-import org.apache.samza.task.InitableTask;
-import org.apache.samza.task.MessageCollector;
-import org.apache.samza.task.StreamTask;
-import org.apache.samza.task.TaskContext;
-import org.apache.samza.task.TaskCoordinator;
-import org.apache.samza.task.WindowableTask;
-import org.sunbird.common.Platform;
-import org.sunbird.jobs.samza.service.ISamzaService;
-import org.sunbird.jobs.samza.service.task.JobMetrics;
-import org.sunbird.jobs.samza.util.SamzaCommonParams;
-import org.sunbird.learning.util.ControllerUtil;
-import org.sunbird.telemetry.TelemetryGenerator;
-import org.sunbird.telemetry.TelemetryParams;
-import org.sunbird.telemetry.handler.Level;
-
-public abstract class AbstractTask extends BaseTask {
-
-    protected JobMetrics metrics;
-    private Config config = null;
-    private String eventId = "";
-    protected String jobType = "";
-    protected String jobStartMessage = "";
-    protected String jobEndMessage = "";
-    protected String jobClass = "";
-
-    private static String mid = "LP."+UUID.randomUUID();
-    private static String startJobEventId = "JOB_START";
-    private static String endJobEventId = "JOB_END";
-    private static int MAXITERTIONCOUNT= 2;
-    private static ControllerUtil controllerUtil = new ControllerUtil();
-
-    @Override
-    public void init(Config config, TaskContext context) throws Exception {
-        metrics = new JobMetrics(context, config.get("output.metrics.job.name"), config.get("output.metrics.topic.name"));
-        ISamzaService service = initialize();
-        service.initialize(config);
-        this.config = config;
-        this.eventId = "BE_JOB_REQUEST";
-    }
-
-    public abstract ISamzaService initialize() throws Exception;
-
-    protected int getMaxIterations() {
-        if(Platform.config.hasPath("max.iteration.count.samza.job"))
-            return Platform.config.getInt("max.iteration.count.samza.job");
-        else
-            return MAXITERTIONCOUNT;
-    }
-
-    @SuppressWarnings("unchecked")
-    @Override
-    public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) throws Exception {
-        Map message = (Map) envelope.getMessage();
-        Map execution = new HashMap<>();
-        int maxIterations = getMaxIterations();
-        String eid = (String) message.get(SamzaCommonParams.eid.name());
-        Map edata = (Map) message.getOrDefault(SamzaCommonParams.edata.name(), new HashMap());
-        if(StringUtils.equalsIgnoreCase(this.eventId, eid)) {
-            String requestedJobType = (String) edata.get(SamzaCommonParams.action.name());
-            if(StringUtils.equalsIgnoreCase(this.jobType, requestedJobType)) {
-                int currentIteration = ((Number) edata.get(SamzaCommonParams.iteration.name())).intValue();
-                preProcess(message, collector, execution, maxIterations, currentIteration);
-                process(message, collector, coordinator);
-                postProcess(message, collector, execution, maxIterations, currentIteration);
-            } else if(StringUtils.equalsIgnoreCase("definition_update", requestedJobType)){
-                String graphId = edata.getOrDefault("graphId","").toString();
-                String objectType = edata.getOrDefault("objectType","").toString();
-                controllerUtil.updateDefinitionCache(graphId, objectType);
-            }else{
-                //Throw exception has to be added.
-            }
-        } else {
-            //Throw exception has to be added.
-        }
-    }
-
-    public abstract void process(Map message, MessageCollector collector, TaskCoordinator coordinator) throws Exception;
-
-    public void preProcess(Map message, MessageCollector collector, Map execution, int maxIterationCount, int iterationCount) {
-        if (isInvalidMessage(message)) {
-            String event = generateEvent(Level.ERROR.name(), "Samza job de-serialization error", message);
-            collector.send(new OutgoingMessageEnvelope(new SystemStream(SamzaCommonParams.kafka.name(), this.config.get("kafka.topics.backend.telemetry")), event));
-        }
-        try {
-            if(iterationCount <= maxIterationCount) {
-                Map jobStartEvent = getJobEvent("JOBSTARTEVENT", message);
-
-                execution.put(SamzaCommonParams.submitted_date.name(), (long)message.get(SamzaCommonParams.ets.name()));
-                execution.put(SamzaCommonParams.processing_date.name(), (long)jobStartEvent.get(SamzaCommonParams.ets.name()));
-                execution.put(SamzaCommonParams.latency.name(), (long)jobStartEvent.get(SamzaCommonParams.ets.name()) - (long)message.get(SamzaCommonParams.ets.name()));
-
-                pushEvent(jobStartEvent, collector, this.config.get("kafka.topics.backend.telemetry"));
-            }
-        }catch (Exception e) {
-            e.printStackTrace();
-        }
-    }
-
-
-    @SuppressWarnings("unchecked")
-    public void postProcess(Map message, MessageCollector collector, Map execution, int maxIterationCount, int iterationCount) throws Exception {
-        try {
-            if(iterationCount <= maxIterationCount) {
-                Map jobEndEvent = getJobEvent("JOBENDEVENT", message);
-
-                execution.put(SamzaCommonParams.completed_date.name(), (long)jobEndEvent.get(SamzaCommonParams.ets.name()));
-                execution.put(SamzaCommonParams.execution_time.name(), (long)jobEndEvent.get(SamzaCommonParams.ets.name()) - (long)execution.get(SamzaCommonParams.processing_date.name()));
-                Map eks = (Map)((Map)jobEndEvent.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.eks.name());
-                eks.put(SamzaCommonParams.execution.name(), execution);
-                //addExecutionTime(jobEndEvent, execution); //Call to add execution time
-
-                pushEvent(jobEndEvent, collector, this.config.get("kafka.topics.backend.telemetry"));
-            }
-            String eventExecutionStatus = (String)((Map) message.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.status.name());
-            if(StringUtils.equalsIgnoreCase(eventExecutionStatus, SamzaCommonParams.FAILED.name()) && iterationCount < maxIterationCount) {
-                ((Map) message.get(SamzaCommonParams.edata.name())).put(SamzaCommonParams.iteration.name(), iterationCount+1);
-                collector.send(new OutgoingMessageEnvelope(new SystemStream(SamzaCommonParams.kafka.name(), this.config.get("kafka.topics.failed")), message));
-            }
-        }catch(Exception e) {
-            e.printStackTrace();
-        }
-    }
-
-    /*@SuppressWarnings("unchecked")
-    private void addExecutionTime(Map jobEndEvent, Map execution) {
-        Map eks = (Map)((Map)jobEndEvent.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.eks.name());
-        eks.put(SamzaCommonParams.execution.name(), execution);
-    }*/
-
-    private void pushEvent(Map message, MessageCollector collector, String topicId) throws Exception {
-        try {
-            //TODO: Fix Event Template for "START" & "END" Event and enable below line for backend telemetry.
-            //collector.send(new OutgoingMessageEnvelope(new SystemStream(SamzaCommonParams.kafka.name(), topicId), message));
-        } catch (Exception e) {
-            e.printStackTrace();
-        }
-    }
-
-    @SuppressWarnings("unchecked")
-    public Map getJobEvent(String jobEvendID, Map message){
-
-        long unixTime = System.currentTimeMillis();
-        Map jobEvent = new HashMap<>();
-
-        jobEvent.put(SamzaCommonParams.ets.name(), unixTime);
-        jobEvent.put(SamzaCommonParams.mid.name(), mid);
-
-        Map edata = new HashMap<>();
-        Map eks = new HashMap<>();
-        eks.put(SamzaCommonParams.ets.name(), message.get(SamzaCommonParams.ets.name()));
-        eks.put(SamzaCommonParams.action.name(), ((Map) message.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.action.name()));
-        eks.put(SamzaCommonParams.iteration.name(), ((Map) message.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.iteration.name()));
-        eks.put(SamzaCommonParams.status.name(), ((Map) message.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.status.name()));
-        eks.put(SamzaCommonParams.reqid.name(), message.get(SamzaCommonParams.mid.name()));
-        edata.put(SamzaCommonParams.eks.name(), eks);
-        edata.put(SamzaCommonParams.level.name(), SamzaCommonParams.INFO.name());
-        edata.put(SamzaCommonParams.jobclass.name(), this.jobClass);
-        edata.put(SamzaCommonParams.object.name(), message.get("object"));
-
-
-        if(StringUtils.equalsIgnoreCase(jobEvendID, "JOBSTARTEVENT")) {
-            jobEvent.put(SamzaCommonParams.eid.name(), startJobEventId);
-            edata.put(SamzaCommonParams.message.name(), this.jobStartMessage);
-        }
-        else if(StringUtils.equalsIgnoreCase(jobEvendID, "JOBENDEVENT")) {
-            jobEvent.put(SamzaCommonParams.eid.name(), endJobEventId);
-            edata.put(SamzaCommonParams.message.name(), this.jobEndMessage);
-        }
-
-        jobEvent.put(SamzaCommonParams.edata.name(), edata);
-        return jobEvent;
-    }
-
-    private String generateEvent(String logLevel, String message, Map data) {
-        Map context = new HashMap();
-        context.put(TelemetryParams.ACTOR.name(), "org.sunbird.learning.platform");
-        context.put(TelemetryParams.ENV.name(), "content");
-        context.put(TelemetryParams.CHANNEL.name(), Platform.config.getString("channel.default"));
-        return TelemetryGenerator.log(context, "system", logLevel, message);
-    }
-
-    protected boolean isInvalidMessage(Map message) {
-        return (message == null || (null != message && message.containsKey("serde")
-                && "error".equalsIgnoreCase((String) message.get("serde"))));
-    }
-
-    @Override
-    public void window(MessageCollector collector, TaskCoordinator coordinator) throws Exception {
-        Map event = metrics.collect();
-        collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", metrics.getTopic()), event));
-        metrics.clear();
-    }
-}
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/BaseTask.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/BaseTask.java
deleted file mode 100644
index a3a6d2f0b9..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/task/BaseTask.java
+++ /dev/null
@@ -1,14 +0,0 @@
-package org.sunbird.jobs.samza.task;
-
-import org.apache.samza.task.InitableTask;
-import org.apache.samza.task.StreamTask;
-import org.apache.samza.task.WindowableTask;
-
-/**
- * Base Class for Samza Task
- *
- * @author Kumar Gauraw
- */
-public abstract class BaseTask implements StreamTask, InitableTask, WindowableTask {
-    //TODO: Provide Common Method Implementation Here.
-}
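The retry scheme in the deleted `AbstractTask.postProcess` is worth noting: when a message's status is FAILED and the iteration count is still below the maximum, the message is re-queued to a failure topic with `iteration + 1`, giving each message a bounded number of attempts. A distilled sketch of that re-queue decision, with the event shape simplified to a plain map:

```java
import java.util.HashMap;
import java.util.Map;

public class BoundedRetry {
    // Returns the edata to re-queue, or null when the message should not be retried.
    public static Map<String, Object> nextAttempt(Map<String, Object> edata, int maxIterations) {
        int iteration = ((Number) edata.getOrDefault("iteration", 0)).intValue();
        boolean failed = "FAILED".equalsIgnoreCase(String.valueOf(edata.get("status")));
        if (!failed || iteration >= maxIterations)
            return null; // either succeeded or retries are exhausted
        Map<String, Object> retry = new HashMap<>(edata);
        retry.put("iteration", iteration + 1); // bump the attempt counter before re-queueing
        return retry;
    }
}
```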
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/FailedEventsUtil.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/FailedEventsUtil.java
deleted file mode 100644
index d0203fcd13..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/FailedEventsUtil.java
+++ /dev/null
@@ -1,41 +0,0 @@
-package org.sunbird.jobs.samza.util;
-
-import java.util.Arrays;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Map;
-
-import org.apache.commons.lang3.exception.ExceptionUtils;
-import org.apache.samza.system.OutgoingMessageEnvelope;
-import org.apache.samza.system.SystemStream;
-import org.apache.samza.task.MessageCollector;
-import org.sunbird.jobs.samza.service.task.JobMetrics;
-
-/**
- * @author gauraw
- *
- */
-public class FailedEventsUtil {
-
-    private static JobLogger LOGGER = new JobLogger(FailedEventsUtil.class);
-
-    public static void pushEventForRetry(SystemStream sysStream, Map eventMessage,
-            JobMetrics metrics, MessageCollector collector, String errorCode, Throwable error) {
-        Map failedEventMap = new HashMap();
-        String errorString[] = ExceptionUtils.getStackTrace(error).split("\\n\\t");
-
-        List stackTrace;
-        if(errorString.length > 21) {
-            stackTrace = Arrays.asList(errorString).subList((errorString.length - 21), errorString.length - 1);
-        }else{
-            stackTrace = Arrays.asList(errorString);
-        }
-
-        failedEventMap.put("errorCode", errorCode);
-        failedEventMap.put("error", error.getMessage() + " : : " + stackTrace);
-        eventMessage.put("jobName", metrics.getJobName());
-        eventMessage.put("failInfo", failedEventMap);
-        collector.send(new OutgoingMessageEnvelope(sysStream, eventMessage));
-        LOGGER.debug("Event sent to fail topic for job : " + metrics.getJobName());
-    }
-}
diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/JSONUtils.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/JSONUtils.java
deleted file mode 100644
index a441b9d78c..0000000000
--- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/JSONUtils.java
+++ /dev/null
@@ -1,32 +0,0 @@
-package org.sunbird.jobs.samza.util;
-
-import com.typesafe.config.ConfigFactory;
-import org.apache.commons.lang.StringUtils;
-import org.apache.samza.config.Config;
-import org.codehaus.jackson.map.ObjectMapper;
-import org.sunbird.common.Platform;
-
-import java.util.HashMap;
-import java.util.Map;
-import java.util.Map.Entry;
-
-public class JSONUtils {
-
-    private static ObjectMapper mapper = new ObjectMapper();;
-
-    public static String serialize(Object object) throws Exception {
-        return mapper.writeValueAsString(object);
-    }
-
-    public static void loadProperties(Config config){
-        Map props = new HashMap();
-        for (Entry entry : config.entrySet()) {
-            if (StringUtils.equalsIgnoreCase("True", entry.getValue()) || StringUtils.equalsIgnoreCase("False", entry.getValue()))
-                props.put(entry.getKey(), entry.getValue().toLowerCase());
-            else
-                props.put(entry.getKey(), entry.getValue());
-        }
-        com.typesafe.config.Config conf = ConfigFactory.parseMap(props);
-        Platform.loadProperties(conf);
-    }
-}
\ No newline at end of file
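The deleted `FailedEventsUtil` keeps failure payloads bounded by attaching only roughly the last 20 frames of a stack trace to the failed event. A standalone sketch of that trimming; the split pattern mirrors the deleted code, while the `keep` parameter generalizes its hard-coded 21:

```java
import java.util.Arrays;
import java.util.List;

public class StackTraceTrim {
    // Keep at most the last `keep` lines of a stack trace string.
    public static List<String> tail(String stackTrace, int keep) {
        String[] lines = stackTrace.split("\\n\\t");
        if (lines.length <= keep)
            return Arrays.asList(lines);
        return Arrays.asList(lines).subList(lines.length - keep, lines.length);
    }
}
```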
-1,90 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import java.text.MessageFormat; -import java.util.Map; - -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -public class JobLogger { - - private final Logger logger; - - @SuppressWarnings("rawtypes") - public JobLogger(Class clazz) { - logger = LoggerFactory.getLogger(clazz); - } - - public void debug(String msg, Map event) { - if (logger.isDebugEnabled()) - try { - debug(msg, JSONUtils.serialize(event)); - } catch (Exception e) { - e.printStackTrace(); - } - } - - public void info(String msg, Map event) { - if (logger.isInfoEnabled()) - try { - info(msg, JSONUtils.serialize(event)); - } catch (Exception e) { - e.printStackTrace(); - } - } - - public void error(String msg, Map event, Throwable t) { - if (logger.isErrorEnabled()) - try { - error(msg, JSONUtils.serialize(event), t); - } catch (Exception e) { - e.printStackTrace(); - } - } - - public void debug(String msg) { - logger.debug(getLogMessage(msg, null)); - } - - public void debug(String msg, String event) { - logger.debug(getLogMessage(msg, event)); - } - - public void info(String msg) { - logger.info(getLogMessage(msg, null)); - } - - public void warn(String msg) { - logger.warn(getLogMessage(msg, null)); - } - - public void warn(String msg, Map event) { - if (logger.isWarnEnabled()) { - try { - warn(msg, JSONUtils.serialize(event)); - } catch (Exception e) { - e.printStackTrace(); - } - } - } - - private void warn(String msg, String event) { - logger.warn(getLogMessage(msg, event)); - } - - public void info(String msg, String event) { - logger.info(getLogMessage(msg, event)); - } - - public void error(String msg, Throwable t) { - logger.error(getLogMessage(msg, null), t); - } - - public void error(String msg, String event, Throwable t) { - logger.error(getLogMessage(msg, event), t); - } - - private String getLogMessage(String msg, String event) { - return event == null ? 
MessageFormat.format("Message: {0}", msg) : MessageFormat.format("Message: {0} | event:{1}", msg, event); - } -} diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/SamzaCommonParams.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/SamzaCommonParams.java deleted file mode 100644 index c5a2fc7b0c..0000000000 --- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/SamzaCommonParams.java +++ /dev/null @@ -1,6 +0,0 @@ -package org.sunbird.jobs.samza.util; - -public enum SamzaCommonParams { - kafka, eid, edata, action, iteration, status, SUCCESS, FAILED, ets, submitted_date, processing_date, completed_date, latency, execution_time, execution, mid, reqid, eks, - level, INFO, message, object, jobclass, domain, context -} diff --git a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/TrackableENUM.java b/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/TrackableENUM.java deleted file mode 100644 index e252797d1f..0000000000 --- a/platform-jobs/samza/common/src/main/java/org/sunbird/jobs/samza/util/TrackableENUM.java +++ /dev/null @@ -1,5 +0,0 @@ -package org.sunbird.jobs.samza.util; - -public enum TrackableENUM { - Yes, No -} diff --git a/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/JSONUtilTest.java b/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/JSONUtilTest.java deleted file mode 100644 index 93d40374cb..0000000000 --- a/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/JSONUtilTest.java +++ /dev/null @@ -1,33 +0,0 @@ -package org.eksep.samza.jobs.test; - -import java.util.HashMap; -import java.util.Map; - -import org.sunbird.common.Platform; -import org.junit.Assert; -import org.junit.Test; - -import com.typesafe.config.ConfigFactory; - -public class JSONUtilTest { - - static Map configMap = new HashMap(); - - static { - configMap.put("elastic-search-host", "http://localhost"); - configMap.put("elastic-search-port", "9200"); - configMap.put("graph.dir", "/data/graphDB/"); - configMap.put("route.bolt.write.domain", "bolt://localhost:7687"); - configMap.put("graph.bolt.enable", "true"); - } - - @Test - public void loadConfigProps_1() { - com.typesafe.config.Config conf = ConfigFactory.parseMap(configMap); - Platform.loadProperties(conf); - String route = Platform.config.getString("route.bolt.write.domain"); - Assert.assertEquals("bolt://localhost:7687", route); - Assert.assertEquals("/data/graphDB/", Platform.config.getString("graph.dir")); - Assert.assertTrue(Platform.config.getBoolean("graph.bolt.enable")); - } -} diff --git a/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/JobMetricsTest.java b/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/JobMetricsTest.java deleted file mode 100644 index 9fa4472090..0000000000 --- a/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/JobMetricsTest.java +++ /dev/null @@ -1,120 +0,0 @@ -package org.eksep.samza.jobs.test; - -import org.apache.samza.Partition; -import org.apache.samza.metrics.Counter; -import org.apache.samza.metrics.Metric; -import org.apache.samza.metrics.MetricsRegistry; -import org.apache.samza.system.SystemStreamPartition; -import org.apache.samza.task.TaskContext; -import org.sunbird.jobs.samza.service.task.JobMetrics; -import org.junit.Before; -import org.junit.Test; - -import java.util.HashSet; -import java.util.Map; -import java.util.Set; -import java.util.concurrent.ConcurrentHashMap; - -import 
static org.mockito.Mockito.anyString; -import static org.mockito.Mockito.mock; -import static org.mockito.Mockito.stub; -import static org.mockito.Mockito.when; -import static org.junit.Assert.assertEquals; - -/** - * JobMetrics Test for Consumer Lag Computation - * - * @author Kumar Gauraw - */ -public class JobMetricsTest { - - private TaskContext contextMock; - private JobMetrics jobMetricsMock; - - @Before - public void setUp() { - contextMock = mock(TaskContext.class); - MetricsRegistry metricsRegistry = mock(MetricsRegistry.class); - Counter counter = mock(Counter.class); - stub(metricsRegistry.newCounter(anyString(), anyString())).toReturn(counter); - stub(contextMock.getMetricsRegistry()).toReturn(metricsRegistry); - } - - @Test - public void testConsumerLagWithMultipleTopicEventProcessed() { - - jobMetricsMock = new JobMetrics(contextMock); - - Set<SystemStreamPartition> systemStreamPartitions = new HashSet<>(); - SystemStreamPartition systemStreamTopic1Partition0 = new SystemStreamPartition("kafka", "topic1", new Partition(0)); - SystemStreamPartition systemStreamTopic2Partition0 = new SystemStreamPartition("kafka", "topic2", new Partition(0)); - systemStreamPartitions.add(systemStreamTopic1Partition0); - systemStreamPartitions.add(systemStreamTopic2Partition0); - - Map<String, ConcurrentHashMap<String, Metric>> concurrentHashMap = MetricsStreamStub.getMetricMap(MetricsStreamStub.METRIC_STREAM_SOME_EVENT_MULTI_PARTITION); - - when(contextMock.getSystemStreamPartitions()).thenReturn(systemStreamPartitions); - long consumer_lag = jobMetricsMock.computeConsumerLag(concurrentHashMap); - assertEquals(55, consumer_lag); - - } - - @Test - public void testConsumerLagWithMultiplePartitionEventProcessed() { - - jobMetricsMock = new JobMetrics(contextMock); - - Set<SystemStreamPartition> systemStreamPartitions = new HashSet<>(); - SystemStreamPartition systemStreamTopic1Partition0 = new SystemStreamPartition("kafka", "topic1", new Partition(0)); - SystemStreamPartition systemStreamTopic1Partition1 = new SystemStreamPartition("kafka", "topic1", new Partition(1)); - systemStreamPartitions.add(systemStreamTopic1Partition0); - systemStreamPartitions.add(systemStreamTopic1Partition1); - - Map<String, ConcurrentHashMap<String, Metric>> concurrentHashMap = MetricsStreamStub.getMetricMap(MetricsStreamStub.METRIC_STREAM_SOME_EVENT_SINGLE_TOPIC_MULTI_PARTITION); - - when(contextMock.getSystemStreamPartitions()).thenReturn(systemStreamPartitions); - long consumer_lag = jobMetricsMock.computeConsumerLag(concurrentHashMap); - assertEquals(55, consumer_lag); - - } - - @Test - public void testConsumerLagWithNoEventProcessed() { - - jobMetricsMock = new JobMetrics(contextMock); - - Set<SystemStreamPartition> systemStreamPartitions = new HashSet<>(); - SystemStreamPartition systemStreamPartition = new SystemStreamPartition("kafka", "test.topic", new Partition(0)); - systemStreamPartitions.add(systemStreamPartition); - - Map<String, ConcurrentHashMap<String, Metric>> concurrentHashMap = MetricsStreamStub.getMetricMap(MetricsStreamStub.METRIC_STREAM_NO_EVENT); - - when(contextMock.getSystemStreamPartitions()).thenReturn(systemStreamPartitions); - long consumer_lag = jobMetricsMock.computeConsumerLag(concurrentHashMap); - assertEquals(0, consumer_lag); - - } - - @Test - public void testConsumerLagWithSystemCommandStream() { - jobMetricsMock = new JobMetrics(contextMock); - Set<SystemStreamPartition> systemStreamPartitions = new HashSet<>(); - - SystemStreamPartition streamSysCommand = new SystemStreamPartition("kafka", "system.command", new Partition(0)); - SystemStreamPartition streamJobReq = new SystemStreamPartition("kafka", "learning.job.request", new Partition(0)); - systemStreamPartitions.add(streamSysCommand); - 
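// [Editor's note, not part of the deleted source] With the stub offsets defined in
// MetricsStreamStub below, consumer lag is the sum of (high-watermark - checkpointed-offset)
// over the registered partitions, and the system.command stream is excluded. This test
// therefore expects (200 - 100) = 100 rather than (100 - 90) + (200 - 100) = 110.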
systemStreamPartitions.add(streamJobReq); - - Map> concurrentHashMap = MetricsStreamStub.getMetricMap(MetricsStreamStub.SAMZA_EVENT_STREAM_WITH_SYSTEM_COMMAND); - - when(contextMock.getSystemStreamPartitions()).thenReturn(systemStreamPartitions); - long consumer_lag = jobMetricsMock.computeConsumerLag(concurrentHashMap); - System.out.println("consumer_lag :"+consumer_lag); - //Before Ignoring system.command stream - //assertEquals(110, consumer_lag); - - //After Ignoring system.command stream - assertEquals(100, consumer_lag); - - } -} diff --git a/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/MetricsStreamStub.java b/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/MetricsStreamStub.java deleted file mode 100644 index 31cf1fa233..0000000000 --- a/platform-jobs/samza/common/src/test/java/org/sunbird/samza/jobs/test/MetricsStreamStub.java +++ /dev/null @@ -1,138 +0,0 @@ -package org.eksep.samza.jobs.test; - -import com.google.gson.Gson; -import com.google.gson.reflect.TypeToken; -import org.apache.samza.metrics.Counter; -import org.apache.samza.metrics.Metric; - -import java.lang.reflect.Type; -import java.util.Map; -import java.util.concurrent.ConcurrentHashMap; - -/** - * Stub for Consumer Metric Stream - * - * @author Kumar Gauraw - */ -public class MetricsStreamStub { - - public static final String METRIC_STREAM_NO_EVENT = "{\n" + - " \"org.apache.samza.system.kafka.KafkaSystemConsumerMetrics\": {\n" + - " \"kafka-test.topic-0-high-watermark\": {\n" + - " \"name\": \"kafka-test.topic-0-high-watermark\",\n" + - " \"count\": {\n" + - " \"value\": 0\n" + - " }\n" + - " }\n" + - " },\n" + - " \"org.apache.samza.checkpoint.OffsetManagerMetrics\": {\n" + - " \"kafka-test.topic-0-checkpointed-offset\": {\n" + - " \"name\": \"kafka-test.topic-0-checkpointed-offset\",\n" + - " \"count\": {\n" + - " \"value\": 0\n" + - " }\n" + - " }\n" + - " }\n" + - "}"; - - public static final String METRIC_STREAM_SOME_EVENT_MULTI_PARTITION = "{\n" + - " \"org.apache.samza.system.kafka.KafkaSystemConsumerMetrics\": {\n" + - " \"kafka-topic1-0-high-watermark\": {\n" + - " \"name\": \"kafka-topic1-0-high-watermark\",\n" + - " \"count\": {\n" + - " \"value\": 10\n" + - " }\n" + - " },\n" + - " \"kafka-topic2-0-high-watermark\": {\n" + - " \"name\": \"kafka-topic2-0-high-watermark\",\n" + - " \"count\": {\n" + - " \"value\": 100\n" + - " }\n" + - " }\n" + - " },\n" + - " \"org.apache.samza.checkpoint.OffsetManagerMetrics\": {\n" + - " \"kafka-topic1-0-checkpointed-offset\": {\n" + - " \"name\": \"kafka-topic1-0-checkpointed-offset\",\n" + - " \"count\": {\n" + - " \"value\": 5\n" + - " }\n" + - " },\n" + - " \"kafka-topic2-0-checkpointed-offset\": {\n" + - " \"name\": \"kafka-topic2-0-checkpointed-offset\",\n" + - " \"count\": {\n" + - " \"value\": 50\n" + - " }\n" + - " }\n" + - " }\n" + - "}"; - - public static final String SAMZA_EVENT_STREAM_WITH_SYSTEM_COMMAND = "{\n" + - " \"org.apache.samza.system.kafka.KafkaSystemConsumerMetrics\": {\n" + - " \"kafka-system.command-0-high-watermark\": {\n" + - " \"name\": \"kafka-system.command-0-high-watermark\",\n" + - " \"count\": {\n" + - " \"value\": 100\n" + - " }\n" + - " },\n" + - " \"kafka-learning.job.request-0-high-watermark\": {\n" + - " \"name\": \"kafka-learning.job.request-0-high-watermark\",\n" + - " \"count\": {\n" + - " \"value\": 200\n" + - " }\n" + - " }\n" + - " },\n" + - " \"org.apache.samza.checkpoint.OffsetManagerMetrics\": {\n" + - " \"kafka-system.command-0-checkpointed-offset\": {\n" + - " \"name\": 
\"kafka-system.command-0-checkpointed-offset\",\n" + - " \"count\": {\n" + - " \"value\": 90\n" + - " }\n" + - " },\n" + - " \"kafka-learning.job.request-0-checkpointed-offset\": {\n" + - " \"name\": \"kafka-learning.job.request-0-checkpointed-offset\",\n" + - " \"count\": {\n" + - " \"value\": 100\n" + - " }\n" + - " }\n" + - " }\n" + - "}"; - - public static final String METRIC_STREAM_SOME_EVENT_SINGLE_TOPIC_MULTI_PARTITION = "{\n" + - " \"org.apache.samza.system.kafka.KafkaSystemConsumerMetrics\": {\n" + - " \"kafka-topic1-0-high-watermark\": {\n" + - " \"name\": \"kafka-topic1-0-high-watermark\",\n" + - " \"count\": {\n" + - " \"value\": 10\n" + - " }\n" + - " },\n" + - " \"kafka-topic1-1-high-watermark\": {\n" + - " \"name\": \"kafka-topic1-1-high-watermark\",\n" + - " \"count\": {\n" + - " \"value\": 100\n" + - " }\n" + - " }\n" + - " },\n" + - " \"org.apache.samza.checkpoint.OffsetManagerMetrics\": {\n" + - " \"kafka-topic1-0-checkpointed-offset\": {\n" + - " \"name\": \"kafka-topic1-0-checkpointed-offset\",\n" + - " \"count\": {\n" + - " \"value\": 5\n" + - " }\n" + - " },\n" + - " \"kafka-topic1-1-checkpointed-offset\": {\n" + - " \"name\": \"kafka-topic1-1-checkpointed-offset\",\n" + - " \"count\": {\n" + - " \"value\": 50\n" + - " }\n" + - " }\n" + - " }\n" + - "}"; - - - public static Map> getMetricMap(String message) { - Type type = new TypeToken>>() { - }.getType(); - return (Map>) new Gson().fromJson(message, type); - } - -} diff --git a/platform-jobs/samza/course-common/pom.xml b/platform-jobs/samza/course-common/pom.xml deleted file mode 100644 index f5eb486a11..0000000000 --- a/platform-jobs/samza/course-common/pom.xml +++ /dev/null @@ -1,30 +0,0 @@ - - 4.0.0 - - org.sunbird - samza - 1.1-SNAPSHOT - - course-common - - - org.sunbird - samza-common - 1.1-SNAPSHOT - - - - - - - org.apache.maven.plugins - maven-compiler-plugin - - 1.8 - 1.8 - - - - - diff --git a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/task/BaseTask.java b/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/task/BaseTask.java deleted file mode 100644 index 44b985bbd0..0000000000 --- a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/task/BaseTask.java +++ /dev/null @@ -1,189 +0,0 @@ -package org.sunbird.jobs.samza.task; - -import org.apache.commons.lang3.StringUtils; -import org.apache.samza.config.Config; -import org.apache.samza.system.IncomingMessageEnvelope; -import org.apache.samza.system.OutgoingMessageEnvelope; -import org.apache.samza.system.SystemStream; -import org.apache.samza.task.InitableTask; -import org.apache.samza.task.MessageCollector; -import org.apache.samza.task.StreamTask; -import org.apache.samza.task.TaskContext; -import org.apache.samza.task.TaskCoordinator; -import org.apache.samza.task.WindowableTask; -import org.sunbird.common.Platform; -import org.sunbird.jobs.samza.service.ISamzaService; -import org.sunbird.jobs.samza.service.task.JobMetrics; -import org.sunbird.jobs.samza.util.SamzaCommonParams; -import org.sunbird.telemetry.TelemetryGenerator; -import org.sunbird.telemetry.TelemetryParams; -import org.sunbird.telemetry.handler.Level; - -import java.util.ArrayList; -import java.util.HashMap; -import java.util.List; -import java.util.Map; -import java.util.UUID; - -public abstract class BaseTask implements StreamTask, InitableTask, WindowableTask { - protected JobMetrics metrics; - protected Config config = null; - protected String eventId = ""; - protected List action = new ArrayList<>(); - protected String 
jobStartMessage = ""; - protected String jobEndMessage = ""; - protected String jobClass = ""; - - protected static String mid = "LP."+ UUID.randomUUID(); - protected static String startJobEventId = "JOB_START"; - protected static String endJobEventId = "JOB_END"; - protected static int MAXITERTIONCOUNT= 2; - - @Override - public void init(Config config, TaskContext context) throws Exception { - metrics = new JobMetrics(context, config.get("output.metrics.job.name"), config.get("output.metrics.topic.name")); - this.config = config; - this.eventId = "BE_JOB_REQUEST"; - ISamzaService service = initialize(); - service.initialize(config); - } - - public abstract ISamzaService initialize() throws Exception; - - protected int getMaxIterations() { - if(Platform.config.hasPath("max.iteration.count.samza.job")) - return Platform.config.getInt("max.iteration.count.samza.job"); - else - return MAXITERTIONCOUNT; - } - - @SuppressWarnings("unchecked") - @Override - public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) throws Exception { - Map message = (Map) envelope.getMessage(); - Map execution = new HashMap<>(); - int maxIterations = getMaxIterations(); - String eid = (String) message.get(SamzaCommonParams.eid.name()); - Map edata = (Map) message.getOrDefault(SamzaCommonParams.edata.name(), new HashMap()); - if(StringUtils.equalsIgnoreCase(this.eventId, eid)) { - String action = (String) edata.get(SamzaCommonParams.action.name()); - if(this.action.contains(action)) { - int currentIteration = ((Number) edata.get(SamzaCommonParams.iteration.name())).intValue(); - - preProcess(message, collector, execution, maxIterations, currentIteration); - process(message, collector, coordinator); - postProcess(message, collector, execution, maxIterations, currentIteration); - } else{ - //Throw exception has to be added. - } - } else { - //Throw exception has to be added. 
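// [Editor's note, not part of the deleted source] The two "Throw exception has to be added"
// TODOs above were never implemented, so events with an unknown eid or action are silently
// dropped. A concrete implementation could fail fast here, for example (hypothetical error code):
//   throw new ClientException("ERR_UNSUPPORTED_EVENT", "Event not supported: " + eid);
// ClientException(String, String) is the exception type already used by MergeUserCoursesService
// further down in this diff.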
- } - } - - public abstract void process(Map message, MessageCollector collector, TaskCoordinator coordinator) throws Exception; - - public void preProcess(Map message, MessageCollector collector, Map execution, int maxIterationCount, int iterationCount) { - if (isInvalidMessage(message)) { - String event = generateEvent(Level.ERROR.name(), "Samza job de-serialization error", message); - collector.send(new OutgoingMessageEnvelope(new SystemStream(SamzaCommonParams.kafka.name(), this.config.get("kafka.topics.backend.telemetry")), event)); - } - try { - if(iterationCount <= maxIterationCount) { - Map jobStartEvent = getJobEvent("JOBSTARTEVENT", message); - - execution.put(SamzaCommonParams.submitted_date.name(), (long)message.get(SamzaCommonParams.ets.name())); - execution.put(SamzaCommonParams.processing_date.name(), (long)jobStartEvent.get(SamzaCommonParams.ets.name())); - execution.put(SamzaCommonParams.latency.name(), (long)jobStartEvent.get(SamzaCommonParams.ets.name()) - (long)message.get(SamzaCommonParams.ets.name())); - - pushEvent(jobStartEvent, collector, this.config.get("kafka.topics.backend.telemetry")); - } - }catch (Exception e) { - e.printStackTrace(); - } - } - - - @SuppressWarnings("unchecked") - public void postProcess(Map message, MessageCollector collector, Map execution, int maxIterationCount, int iterationCount) throws Exception { - try { - if(iterationCount <= maxIterationCount) { - Map jobEndEvent = getJobEvent("JOBENDEVENT", message); - - execution.put(SamzaCommonParams.completed_date.name(), (long)jobEndEvent.get(SamzaCommonParams.ets.name())); - execution.put(SamzaCommonParams.execution_time.name(), (long)jobEndEvent.get(SamzaCommonParams.ets.name()) - (long)execution.get(SamzaCommonParams.processing_date.name())); - Map eks = (Map)((Map)jobEndEvent.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.eks.name()); - eks.put(SamzaCommonParams.execution.name(), execution); - //addExecutionTime(jobEndEvent, execution); //Call to add execution time - - pushEvent(jobEndEvent, collector, this.config.get("kafka.topics.backend.telemetry")); - } - }catch(Exception e) { - e.printStackTrace(); - } - } - - private void pushEvent(Map message, MessageCollector collector, String topicId) throws Exception { - try { - //TODO: Fix Event Template for "START" & "END" Event and enable below line for backend telemetry. 
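// [Editor's note, not part of the deleted source] Because the send below stays commented out,
// pushEvent() is effectively a no-op: the JOB_START/JOB_END envelopes built by getJobEvent()
// are constructed and then discarded, and only the metrics snapshot emitted from window()
// actually reaches Kafka.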
- //collector.send(new OutgoingMessageEnvelope(new SystemStream(SamzaCommonParams.kafka.name(), topicId), message)); - } catch (Exception e) { - e.printStackTrace(); - } - } - - @SuppressWarnings("unchecked") - public Map getJobEvent(String jobEvendID, Map message){ - - long unixTime = System.currentTimeMillis(); - Map jobEvent = new HashMap<>(); - - jobEvent.put(SamzaCommonParams.ets.name(), unixTime); - jobEvent.put(SamzaCommonParams.mid.name(), mid); - - Map edata = new HashMap<>(); - Map eks = new HashMap<>(); - eks.put(SamzaCommonParams.ets.name(), message.get(SamzaCommonParams.ets.name())); - eks.put(SamzaCommonParams.action.name(), ((Map) message.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.action.name())); - eks.put(SamzaCommonParams.iteration.name(), ((Map) message.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.iteration.name())); - eks.put(SamzaCommonParams.status.name(), ((Map) message.get(SamzaCommonParams.edata.name())).get(SamzaCommonParams.status.name())); - eks.put(SamzaCommonParams.reqid.name(), message.get(SamzaCommonParams.mid.name())); - edata.put(SamzaCommonParams.eks.name(), eks); - edata.put(SamzaCommonParams.level.name(), SamzaCommonParams.INFO.name()); - edata.put(SamzaCommonParams.jobclass.name(), this.jobClass); - edata.put(SamzaCommonParams.object.name(), message.get("object")); - - - if(StringUtils.equalsIgnoreCase(jobEvendID, "JOBSTARTEVENT")) { - jobEvent.put(SamzaCommonParams.eid.name(), startJobEventId); - edata.put(SamzaCommonParams.message.name(), this.jobStartMessage); - } - else if(StringUtils.equalsIgnoreCase(jobEvendID, "JOBENDEVENT")) { - jobEvent.put(SamzaCommonParams.eid.name(), endJobEventId); - edata.put(SamzaCommonParams.message.name(), this.jobEndMessage); - } - - jobEvent.put(SamzaCommonParams.edata.name(), edata); - return jobEvent; - } - - protected boolean isInvalidMessage(Map message) { - return (message == null || (null != message && message.containsKey("serde") - && "error".equalsIgnoreCase((String) message.get("serde")))); - } - - @Override - public void window(MessageCollector collector, TaskCoordinator coordinator) throws Exception { - Map event = metrics.collect(); - collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", metrics.getTopic()), event)); - metrics.clear(); - } - - private String generateEvent(String logLevel, String message, Map data) { - Map context = new HashMap(); - context.put(TelemetryParams.ACTOR.name(), "org.sunbird.learning.platform"); - context.put(TelemetryParams.ENV.name(), "content"); - context.put(TelemetryParams.CHANNEL.name(), Platform.config.getString("channel.default")); - return TelemetryGenerator.log(context, "system", logLevel, message); - } -} diff --git a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/CassandraConnector.java b/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/CassandraConnector.java deleted file mode 100644 index 91d3fad667..0000000000 --- a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/CassandraConnector.java +++ /dev/null @@ -1,37 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import com.datastax.driver.core.Cluster; -import com.datastax.driver.core.ConsistencyLevel; -import com.datastax.driver.core.QueryOptions; -import com.datastax.driver.core.Session; -import org.apache.samza.config.Config; - -import java.net.InetSocketAddress; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.List; - -public class CassandraConnector { - private 
Session session = null; - - public CassandraConnector(Config config) { - List connectionInfo = Arrays.asList(config.get("cassandra.connection.platform_courses", "localhost:9042").split(",")); - List addresses = getSocketAddress(connectionInfo); - session = Cluster.builder().addContactPointsWithPorts(addresses).withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.QUORUM)).build().connect(); - } - - private static List getSocketAddress(List hosts) { - List connectionList = new ArrayList<>(); - for (String connection : hosts) { - String[] conn = connection.split(":"); - String host = conn[0]; - int port = Integer.valueOf(conn[1]); - connectionList.add(new InetSocketAddress(host, port)); - } - return connectionList; - } - - public Session getSession() { - return session; - } -} diff --git a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/RedisConnect.java b/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/RedisConnect.java deleted file mode 100644 index b7120e6e5e..0000000000 --- a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/RedisConnect.java +++ /dev/null @@ -1,37 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import org.apache.samza.config.Config; -import redis.clients.jedis.Jedis; - -public class RedisConnect { - - private Config config; - - public RedisConnect(Config config) { - this.config = config; - } - - private Jedis getConnection(long backoffTimeInMillis) { - String redisHost = config.get("redis.host", "localhost"); - Integer redisPort = config.getInt("redis.port", 6379); - if(backoffTimeInMillis > 0) { - try { - Thread.sleep(backoffTimeInMillis); - } catch (InterruptedException e) { - e.printStackTrace(); - } - } - return new Jedis(redisHost, redisPort, 30000); - } - - public Jedis getConnection(int db, long backoffTimeInMillis) { - - Jedis jedis = getConnection(backoffTimeInMillis); - jedis.select(db); - return jedis; - } - - public Jedis getConnection() { - return getConnection(0); - } -} diff --git a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/SunbirdCassandraColumnMapper.java b/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/SunbirdCassandraColumnMapper.java deleted file mode 100644 index 4c3131630b..0000000000 --- a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/SunbirdCassandraColumnMapper.java +++ /dev/null @@ -1,56 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import java.util.Map; -import java.util.HashMap; - -public class SunbirdCassandraColumnMapper { - - private static Map COLUMN_MAPPING = new HashMap<>(); - - public static Map getColumnMapping() { - if(COLUMN_MAPPING.isEmpty()) { - //sunbird_courses.user_courses table - COLUMN_MAPPING.put("batchid", "batchId"); - COLUMN_MAPPING.put("userid", "userId"); - COLUMN_MAPPING.put("active", "active"); - COLUMN_MAPPING.put("addedby", "addedBy"); - COLUMN_MAPPING.put("completedon","completedOn"); - COLUMN_MAPPING.put("completionpercentage","completionPercentage"); - COLUMN_MAPPING.put("contentstatus", "contentStatus"); - COLUMN_MAPPING.put("courseid", "courseId"); - COLUMN_MAPPING.put("datetime", "dateTime"); - COLUMN_MAPPING.put("delta", "delta"); - COLUMN_MAPPING.put("enrolleddate","enrolledDate"); - COLUMN_MAPPING.put("grade","grade"); - COLUMN_MAPPING.put("lastreadcontentid", "lastReadContentId"); - COLUMN_MAPPING.put("lastreadcontentstatus", "lastReadContentStatus"); - COLUMN_MAPPING.put("progress","progress"); - 
COLUMN_MAPPING.put("status","status"); - - //sunbird_courses.content_consumption table - COLUMN_MAPPING.put("contentid", "contentId"); - COLUMN_MAPPING.put("completedcount", "completedCount"); - COLUMN_MAPPING.put("contentversion", "contentVersion"); - COLUMN_MAPPING.put("lastaccesstime", "lastAccessTime"); - COLUMN_MAPPING.put("lastcompletedtime","lastCompletedTime"); - COLUMN_MAPPING.put("lastupdatedtime","lastUpdatedTime"); - COLUMN_MAPPING.put("result", "result"); - COLUMN_MAPPING.put("score", "score"); - COLUMN_MAPPING.put("viewcount", "viewCount"); - - //sunbird_courses.course_batch table - COLUMN_MAPPING.put("createdby", "createdBy"); - COLUMN_MAPPING.put("createddate","createdDate"); - COLUMN_MAPPING.put("createdfor","createdFor"); - COLUMN_MAPPING.put("description", "description"); - COLUMN_MAPPING.put("enddate", "endDate"); - COLUMN_MAPPING.put("enrollmentenddate","enrollmentEndDate"); - COLUMN_MAPPING.put("enrollmenttype","enrollmentType"); - COLUMN_MAPPING.put("mentors", "mentors"); - COLUMN_MAPPING.put("name", "name"); - COLUMN_MAPPING.put("startdate","startDate"); - COLUMN_MAPPING.put("updateddate","updatedDate"); - } - return COLUMN_MAPPING; - } -} diff --git a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/SunbirdCassandraUtil.java b/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/SunbirdCassandraUtil.java deleted file mode 100644 index 29cf94461b..0000000000 --- a/platform-jobs/samza/course-common/src/main/java/org/sunbird/jobs/samza/util/SunbirdCassandraUtil.java +++ /dev/null @@ -1,102 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import com.datastax.driver.core.ResultSet; -import com.datastax.driver.core.Row; -import com.datastax.driver.core.Session; -import com.datastax.driver.core.querybuilder.QueryBuilder; -import com.datastax.driver.core.querybuilder.Select; -import com.datastax.driver.core.querybuilder.Update; -import com.datastax.driver.core.querybuilder.Insert; -import com.datastax.driver.core.querybuilder.Delete; -import org.apache.commons.collections4.CollectionUtils; -import org.apache.commons.collections4.MapUtils; -import org.sunbird.cassandra.connector.util.CassandraConnector; - -import java.util.ArrayList; -import java.util.HashMap; -import java.util.List; -import java.util.Map; - -public class SunbirdCassandraUtil { - - private static Map COLUMN_MAPPING = SunbirdCassandraColumnMapper.getColumnMapping(); - - public static void update(Session session, String keyspace, String table, Map propertiesToUpdate, Map propertiesToSelect) { - Update.Where updateQuery = QueryBuilder.update(keyspace, table).where(); - propertiesToUpdate.entrySet().forEach(entry -> updateQuery.with(QueryBuilder.set(entry.getKey(), entry.getValue()))); - propertiesToSelect.entrySet().forEach(entry -> { - if (entry.getValue() instanceof List) - updateQuery.and(QueryBuilder.in(entry.getKey(), (List) entry.getValue())); - else - updateQuery.and(QueryBuilder.eq(entry.getKey(), entry.getValue())); - }); - - - session.execute(updateQuery); - } - - public static ResultSet read(Session session, String keyspace, String table, Map propertiesToSelect) { - Select.Where selectQuery = QueryBuilder.select().all().from(keyspace, table).where(); - propertiesToSelect.entrySet().forEach(entry -> { - if (entry.getValue() instanceof List) - selectQuery.and(QueryBuilder.in(entry.getKey(), (List) entry.getValue())); - else - selectQuery.and(QueryBuilder.eq(entry.getKey(), entry.getValue())); - }); - ResultSet results = session.execute(selectQuery); - 
return results; - } - - public static List> readAsListOfMap(Session session, String keyspace, String table, Map propertiesToSelect) { - ResultSet resultSet = read(session, keyspace, table, convertKeyCase(propertiesToSelect)); - List rows = resultSet.all(); - List> response = new ArrayList>(); - if (CollectionUtils.isNotEmpty(rows)) { - for (Row row : rows) { - Map rowMap = new HashMap(); - row.getColumnDefinitions().forEach(column -> rowMap.put(COLUMN_MAPPING.get(column.getName()), row.getObject(column.getName()))); - response.add(rowMap); - } - } - return response; - } - - public static List> readAsListOfMap(String keyspace, String table, Map propertiesToSelect) { - Session session = CassandraConnector.getSession("platform-courses"); - return readAsListOfMap(session, keyspace, table, propertiesToSelect); - } - - public static void upsert(String keyspace, String table, Map properties) { - Session session = CassandraConnector.getSession("platform-courses"); - Insert insertQuery = QueryBuilder.insertInto(keyspace, table); - convertKeyCase(properties).entrySet().forEach(entry -> insertQuery.value(entry.getKey(), entry.getValue())); - session.execute(insertQuery); - } - - public static void delete(String keyspace, String table, Map properties) { - Session session = CassandraConnector.getSession("platform-courses"); - Delete.Where deleteQuery = QueryBuilder.delete().from(keyspace, table).where(); - convertKeyCase(properties).entrySet().forEach(entry -> { - deleteQuery.and(QueryBuilder.eq(entry.getKey(), entry.getValue())); - }); - session.execute(deleteQuery); - } - - private static Map convertKeyCase(Map properties) { - Map keyLowerCaseMap = new HashMap<>(); - if (MapUtils.isNotEmpty(properties)) { - properties.entrySet().forEach(entry -> { - if (null != entry && null != entry.getKey()) { - keyLowerCaseMap.put(entry.getKey().toLowerCase(), entry.getValue()); - } - }); - } - return keyLowerCaseMap; - } - - public static ResultSet execute(Session cassandraSession, String query) { - ResultSet results = cassandraSession.execute(query); - return results; - } - -} \ No newline at end of file diff --git a/platform-jobs/samza/distribution/.gitignore b/platform-jobs/samza/distribution/.gitignore deleted file mode 100644 index b83d22266a..0000000000 --- a/platform-jobs/samza/distribution/.gitignore +++ /dev/null @@ -1 +0,0 @@ -/target/ diff --git a/platform-jobs/samza/distribution/pom.xml b/platform-jobs/samza/distribution/pom.xml deleted file mode 100644 index 968b50ffef..0000000000 --- a/platform-jobs/samza/distribution/pom.xml +++ /dev/null @@ -1,69 +0,0 @@ - - 4.0.0 - - org.sunbird - samza - 1.1-SNAPSHOT - - distribution - pom - Distribution - - - org.sunbird - qrcode-image-generator - 0.0.31 - tar.gz - distribution - - - org.sunbird - publish-pipeline - 0.0.386 - tar.gz - distribution - - - org.sunbird - merge-user-courses - 0.0.19 - tar.gz - distribution - - - org.sunbird - auto-creator - 0.0.39 - tar.gz - distribution - - - org.sunbird - mvc-processor-indexer - 1.3.7 - tar.gz - distribution - - - - - - maven-assembly-plugin - - - distro-assembly - package - - single - - - - src/main/assembly/src.xml - - - - - - - - diff --git a/platform-jobs/samza/distribution/src/main/assembly/src.xml b/platform-jobs/samza/distribution/src/main/assembly/src.xml deleted file mode 100644 index 92246c444b..0000000000 --- a/platform-jobs/samza/distribution/src/main/assembly/src.xml +++ /dev/null @@ -1,14 +0,0 @@ - - distribution - false - - tar.gz - - - - - false - false - - - \ No newline at end of file diff --git 
a/platform-jobs/samza/merge-user-courses/pom.xml b/platform-jobs/samza/merge-user-courses/pom.xml deleted file mode 100644 index e7cb6dc116..0000000000 --- a/platform-jobs/samza/merge-user-courses/pom.xml +++ /dev/null @@ -1,139 +0,0 @@ - - - - samza - org.sunbird - 1.1-SNAPSHOT - - 4.0.0 - - merge-user-courses - - - UTF-8 - 0.12.0 - 2.11 - 2.6.2 - - 0.0.19 - - - org.sunbird - course-common - 1.1-SNAPSHOT - - - org.sunbird - unit-tests - 1.1-SNAPSHOT - test - - - org.mockito - mockito-all - 1.10.19 - test - - - com.fasterxml.jackson.core - jackson-databind - 2.7.8 - - - com.fasterxml.jackson.core - jackson-core - 2.6.0 - - - com.fasterxml.jackson.core - jackson-annotations - 2.7.8 - - - org.powermock - powermock-api-mockito - 1.7.4 - test - - - org.powermock - powermock-module-junit4 - 1.7.4 - test - - - - - - org.apache.maven.plugins - maven-compiler-plugin - - 1.8 - 1.8 - - - - - maven-assembly-plugin - - - src/main/assembly/src.xml - - - - - make-assembly - package - - single - - - - - - org.apache.maven.plugins - maven-surefire-plugin - 2.20 - - - - org.jacoco - jacoco-maven-plugin - 0.7.9 - - - **/common/** - **/dto/** - **/enums/** - **/pipeline/** - **/servlet/** - **/interceptor/** - - - - - default-prepare-agent - - prepare-agent - - - - default-report - prepare-package - - report - - - - report-aggregate - verify - - report-aggregate - - - - - - - \ No newline at end of file diff --git a/platform-jobs/samza/merge-user-courses/src/main/assembly/src.xml b/platform-jobs/samza/merge-user-courses/src/main/assembly/src.xml deleted file mode 100644 index b8c4bf8a85..0000000000 --- a/platform-jobs/samza/merge-user-courses/src/main/assembly/src.xml +++ /dev/null @@ -1,69 +0,0 @@ - - - - - distribution - - tar.gz - - false - - - ${basedir} - - README* - LICENSE* - NOTICE* - - - - - - ${basedir}/src/main/resources/log4j.xml - lib - - - - ${basedir}/src/main/config/merge-user-courses.properties - config - true - - - - - bin - - org.apache.samza:samza-shell:tgz:dist:* - - 0744 - true - - - lib - - org.apache.samza:samza-api - org.sunbird:merge-user-courses - org.apache.samza:samza-core_2.11 - org.apache.samza:samza-kafka_2.11 - org.apache.samza:samza-yarn_2.11 - org.apache.samza:samza-log4j - org.apache.kafka:kafka_2.11 - org.apache.hadoop:hadoop-hdfs - - true - - - \ No newline at end of file diff --git a/platform-jobs/samza/merge-user-courses/src/main/config/local.merge-user-courses.properties b/platform-jobs/samza/merge-user-courses/src/main/config/local.merge-user-courses.properties deleted file mode 100644 index d162cc8e60..0000000000 --- a/platform-jobs/samza/merge-user-courses/src/main/config/local.merge-user-courses.properties +++ /dev/null @@ -1,81 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=local.merge-user-courses - -# YARN -yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.local.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory - -# Task -task.class=org.sunbird.jobs.samza.task.MergeUserCoursesTask -task.inputs=kafka.local.lms.user.account.merge -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=1 -task.commit.ms=60000 -task.window.ms=300000 - -# Serializers 
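# [Editor's note] The serde names registered below ("json" and "metrics") are what the
# systems.kafka.*.msg.serde settings further down refer to; Samza resolves each name via
# its serializers.registry.<name>.class key.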
-serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory -serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory - -# Systems -systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory -systems.kafka.samza.msg.serde=json -systems.kafka.streams.metrics.samza.msg.serde=metrics -systems.kafka.consumer.zookeeper.connect=localhost:2181 -systems.kafka.consumer.auto.offset.reset=smallest -systems.kafka.samza.offset.default=oldest -systems.kafka.producer.bootstrap.servers=localhost:9092 - -# Job Coordinator -job.coordinator.system=kafka - -# Normally, this would be 3, but we have only one broker. -job.coordinator.replication.factor=1 - -# Job specific configuration - -# Metrics -output.metrics.job.name=merge-user-courses -output.metrics.topic.name=local.pipeline_metrics -kafka.topics.backend.telemetry=local.telemetry.raw - -#Failed Topic Config -output.failed.events.topic.name=local.learning.events.failed - -# Retry Topic -kafka.topics.failed=local.lms.user.account.merge - -#Remote Debug Configuration -# task.opts=-agentlib:jdwp=transport=dt_socket,address=localhost:9009,server=y,suspend=y - -# Configuration for default channel ID -channel.default=in.ekstep - -#elastic-search -sunbird_es_cluster=local.lms.es.cluster -sunbird_es_host=127.0.0.1 -sunbird_es_port=9200 - -cassandra.lp.connection=localhost:9042 -cassandra.lpa.connection=localhost:9042 - -cassandra.connection.platform_courses=localhost:9042 -kp.learning_service.base_url=https://dev.sunbirded.org/action -courses.keyspace.name=sunbird_courses -search.es_conn_info=localhost:9200 -job.time_zone=IST -sunbird.installation=local -user.courses.table=user_enrolments -content.consumption.table=user_content_consumption -user.courses.es.index=user-courses -user.courses.es.type=_doc -course.batch.updater.kafka.topic=local.coursebatch.job.request -max.iteration.count.samza.job=2 -course.date.format=yyyy-MM-dd HH:mm:ss:SSSZ \ No newline at end of file diff --git a/platform-jobs/samza/merge-user-courses/src/main/config/merge-user-courses.properties b/platform-jobs/samza/merge-user-courses/src/main/config/merge-user-courses.properties deleted file mode 100644 index f386bc21c3..0000000000 --- a/platform-jobs/samza/merge-user-courses/src/main/config/merge-user-courses.properties +++ /dev/null @@ -1,78 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=__env__.merge-user-courses - -# YARN -yarn.package.path=http://__yarn_host__:__yarn_port__/__env__/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.__env__.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory - -# Task -task.class=org.sunbird.jobs.samza.task.MergeUserCoursesTask -task.inputs=kafka.__env__.lms.user.account.merge -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=__samza_checkpoint_replication_factor__ -task.commit.ms=60000 -task.window.ms=300000 - -# Serializers -serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory -serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory - -# Systems -systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory 
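# [Editor's note] Tokens of the form __env__, __zookeepers__, __kafka_brokers__ etc. in this
# file are deploy-time placeholders substituted per environment, unlike the
# local.merge-user-courses.properties variant above, which hard-codes localhost endpoints.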
-systems.kafka.samza.msg.serde=json -systems.kafka.streams.metrics.samza.msg.serde=metrics -systems.kafka.consumer.zookeeper.connect=__zookeepers__ -systems.kafka.consumer.auto.offset.reset=smallest -systems.kafka.samza.offset.default=oldest -systems.kafka.producer.bootstrap.servers=__kafka_brokers__ - -# Job Coordinator -job.coordinator.system=kafka - -# Normally, this would be 3, but we have only one broker. -job.coordinator.replication.factor=__samza_coordinator_replication_factor__ - -# Job specific configuration - -# Metrics -output.metrics.job.name=merge-user-courses -output.metrics.topic.name=__env__.pipeline_metrics -kafka.topics.backend.telemetry=__env__.telemetry.raw - -#Failed Topic Config -output.failed.events.topic.name=__env__.learning.events.failed - -# Retry Topic -kafka.topics.failed=__env__.lms.user.account.merge - -# Configuration for default channel ID -channel.default=in.ekstep - -#elastic-search -sunbird_es_cluster=__lms_es_cluster__ -sunbird_es_host=__lms_es_host__ -sunbird_es_port=__lms_es_port__ - -cassandra.lp.connection=__cassandra_lp_connection__ -cassandra.lpa.connection=__cassandra_lpa_connection__ - -cassandra.connection.platform_courses=__cassandra_sunbird_connection__ -kp.learning_service.base_url=__kp_learning_service_base_url__ -courses.keyspace.name=sunbird_courses -search.es_conn_info=__search_lms_es_host__ -job.time_zone=IST -sunbird.installation=__sunbird_installation__ -user.courses.table=user_enrolments -content.consumption.table=user_content_consumption -user.courses.es.index=user-courses -user.courses.es.type=_doc -course.batch.updater.kafka.topic=__env__.coursebatch.job.request -max.iteration.count.samza.job=__max_iteration_count_for_samza_job__ -course.date.format=yyyy-MM-dd HH:mm:ss:SSSZ \ No newline at end of file diff --git a/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/model/BatchEnrollmentSyncModel.java b/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/model/BatchEnrollmentSyncModel.java deleted file mode 100644 index 7e29db4a19..0000000000 --- a/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/model/BatchEnrollmentSyncModel.java +++ /dev/null @@ -1,32 +0,0 @@ -package org.sunbird.jobs.samza.model; - -public class BatchEnrollmentSyncModel { - - private String batchId; - private String userId; - private String courseId; - - public String getBatchId() { - return batchId; - } - - public void setBatchId(String batchId) { - this.batchId = batchId; - } - - public String getUserId() { - return userId; - } - - public void setUserId(String userId) { - this.userId = userId; - } - - public String getCourseId() { - return courseId; - } - - public void setCourseId(String courseId) { - this.courseId = courseId; - } -} diff --git a/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/service/MergeUserCoursesService.java b/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/service/MergeUserCoursesService.java deleted file mode 100644 index 91bf959f85..0000000000 --- a/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/service/MergeUserCoursesService.java +++ /dev/null @@ -1,460 +0,0 @@ -package org.sunbird.jobs.samza.service; - -import com.datastax.driver.core.RegularStatement; -import com.datastax.driver.core.Session; -import com.datastax.driver.core.querybuilder.Batch; -import com.datastax.driver.core.querybuilder.QueryBuilder; -import com.datastax.driver.core.querybuilder.Update; -import 
com.fasterxml.jackson.databind.ObjectMapper; -import org.apache.commons.collections4.CollectionUtils; -import org.apache.commons.collections4.MapUtils; -import org.apache.commons.lang3.StringUtils; -import org.apache.samza.config.Config; -import org.apache.samza.system.OutgoingMessageEnvelope; -import org.apache.samza.system.SystemStream; -import org.apache.samza.task.MessageCollector; -import org.sunbird.common.Platform; -import org.sunbird.common.exception.ClientException; -import org.sunbird.jobs.samza.exception.PlatformErrorCodes; -import org.sunbird.jobs.samza.service.ISamzaService; -import org.sunbird.jobs.samza.service.task.JobMetrics; -import org.sunbird.jobs.samza.util.FailedEventsUtil; -import org.sunbird.jobs.samza.util.JSONUtils; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.searchindex.elasticsearch.ElasticSearchUtil; -import org.sunbird.jobs.samza.model.BatchEnrollmentSyncModel; -import org.sunbird.jobs.samza.util.CassandraConnector; -import org.sunbird.jobs.samza.util.MergeUserCoursesParams; -import org.sunbird.jobs.samza.util.SunbirdCassandraUtil; - -import java.text.ParseException; -import java.text.SimpleDateFormat; -import java.util.*; -import java.util.stream.Collectors; - -public class MergeUserCoursesService implements ISamzaService { - private static JobLogger LOGGER = new JobLogger(MergeUserCoursesService.class); - private SystemStream systemStream; - private Config config = null; - private static final String UNDERSCORE = "_"; - private ObjectMapper mapper = new ObjectMapper(); - private static final String ACTION = "merge-user-courses-and-cert"; - private static int MAXITERTIONCOUNT = 2; - - private static String KEYSPACE; - private static String CONTENT_CONSUMPTION_TABLE; - private static String USER_COURSES_TABLE; - private static String USER_COURSE_ES_INDEX; - private static String USER_COURSE_ES_TYPE; - private static String COURSE_BATCH_UPDATER_KAFKA_TOPIC; - private static String COURSE_DATE_FORMAT; - private static SimpleDateFormat DateFormatter; - private static String USER_ACTIVITY_AGG; - private Session cassandraSession = null; - - protected int getMaxIterations() { - if (Platform.config.hasPath("max.iteration.count.samza.job")) - return Platform.config.getInt("max.iteration.count.samza.job"); - else - return MAXITERTIONCOUNT; - } - - private boolean validateObject(Map edata) { - String action = (String) edata.get(MergeUserCoursesParams.action.name()); - Integer iteration = (Integer) edata.get(MergeUserCoursesParams.iteration.name()); - if (StringUtils.equalsIgnoreCase(ACTION, action) && (iteration <= getMaxIterations())) { - return true; - } - return false; - } - - private static void initializeConfigurations() { - KEYSPACE = Platform.config.hasPath("courses.keyspace.name") ? - Platform.config.getString("courses.keyspace.name") : "sunbird_courses"; - - CONTENT_CONSUMPTION_TABLE = Platform.config.hasPath("content.consumption.table") ? - Platform.config.getString("content.consumption.table") : "user_content_consumption"; - - USER_COURSES_TABLE = Platform.config.hasPath("user.courses.table") ? - Platform.config.getString("user.courses.table") : "user_enrolments"; - - USER_COURSE_ES_INDEX = Platform.config.hasPath("user.courses.es.index") ? - Platform.config.getString("user.courses.es.index") : "user-courses"; - - USER_COURSE_ES_TYPE = Platform.config.hasPath("user.courses.es.type") ? 
- Platform.config.getString("user.courses.es.type") : "_doc"; - - COURSE_BATCH_UPDATER_KAFKA_TOPIC = Platform.config.getString("course.batch.updater.kafka.topic"); - - COURSE_DATE_FORMAT = Platform.config.hasPath("course.date.format") ? - Platform.config.getString("course.date.format") : "yyyy-MM-dd HH:mm:ss:SSSZ"; - - USER_ACTIVITY_AGG = "user_activity_agg"; - - DateFormatter = new SimpleDateFormat(COURSE_DATE_FORMAT); - } - - @Override - public void initialize(Config config) throws Exception { - this.config = config; - JSONUtils.loadProperties(config); - initializeConfigurations(); - this.cassandraSession = new CassandraConnector(config).getSession(); - LOGGER.info("MergeUserCoursesService:initialize: Service config initialized"); - ElasticSearchUtil.initialiseESClient(USER_COURSE_ES_INDEX, Platform.config.getString("search.es_conn_info")); - LOGGER.info("MergeUserCoursesService:initialize: ESClient initialized for index:" + USER_COURSE_ES_INDEX); - systemStream = new SystemStream("kafka", config.get("output.failed.events.topic.name")); - LOGGER.info("MergeUserCoursesService:initialize: Stream initialized for Failed Events"); - } - - @Override - public void processMessage(Map message, JobMetrics metrics, MessageCollector collector) throws Exception { - if (MapUtils.isEmpty(message)) { - LOGGER.info("MergeUserCoursesService:processMessage: Ignoring the event since message is empty."); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.DATA_ERROR.name(), new ClientException("ERR_MERGE_USER_COURSES_SAMZA", "message is empty")); - metrics.incSkippedCounter(); - return; - } - - Map edata = (Map) message.get(MergeUserCoursesParams.edata.name()); - if (MapUtils.isEmpty(edata)) { - LOGGER.info("MergeUserCoursesService:processMessage: Ignoring the event since edata is empty."); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.DATA_ERROR.name(), new ClientException("ERR_MERGE_USER_COURSES_SAMZA", "message.edata is empty")); - metrics.incSkippedCounter(); - return; - } - - String fromUserId = (String) edata.get(MergeUserCoursesParams.fromAccountId.name()); - String toUserId = (String) edata.get(MergeUserCoursesParams.toAccountId.name()); - - if (StringUtils.isBlank(fromUserId) || StringUtils.isBlank(toUserId) || !validateObject(edata)) { - LOGGER.info("MergeUserCoursesService:processMessage: Ignoring the event due to invalid edata:" + edata); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.DATA_ERROR.name(), new ClientException("ERR_MERGE_USER_COURSES_SAMZA", "message.edata values are not valid")); - metrics.incSkippedCounter(); - return; - } - - try { - mergeContentConsumption(fromUserId, toUserId); - mergeUserBatches(fromUserId, toUserId); - generateBatchEnrollmentSyncEvents(toUserId, collector); - mergeUserActivityAggregates(fromUserId, toUserId); - metrics.incSuccessCounter(); - LOGGER.info("MergeUserCoursesService:processMessage: Event processed successfully", message); - } catch (Exception e) { - edata.put(MergeUserCoursesParams.status.name(), MergeUserCoursesParams.FAILED.name()); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.PROCESSING_ERROR.name(), e); - throw e; - } - - } - - private void generateBatchEnrollmentSyncEvents(String userId, MessageCollector collector) throws Exception { - List objects = getBatchDetailsOfUser(userId); - if (CollectionUtils.isNotEmpty(objects)) { - for 
(BatchEnrollmentSyncModel model : objects) { - Map event = getBatchEnrollmentSyncEvent(model); - collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", COURSE_BATCH_UPDATER_KAFKA_TOPIC), event)); - } - } - } - - private void mergeUserBatches(String fromUserId, String toUserId) throws Exception { - List fromBatches = getBatchDetailsOfUser(fromUserId); - List toBatches = getBatchDetailsOfUser(toUserId); - - Map fromBatchIds = new HashMap<>(); - Map toBatchIds = new HashMap<>(); - if (CollectionUtils.isNotEmpty(fromBatches)) { - for (BatchEnrollmentSyncModel fromBatch : fromBatches) { - if (StringUtils.isNotBlank(fromBatch.getBatchId())) - fromBatchIds.put(fromBatch.getBatchId(), fromBatch); - } - } - if (CollectionUtils.isNotEmpty(toBatches)) { - for (BatchEnrollmentSyncModel toBatch : toBatches) { - if (StringUtils.isNotBlank(toBatch.getBatchId())) - toBatchIds.put(toBatch.getBatchId(), toBatch); - } - } - - List batchIdsToBeMigrated = (List) CollectionUtils.subtract(fromBatchIds.keySet(), toBatchIds.keySet()); - - //Migrate batch records in Cassandra and ES - if (CollectionUtils.isNotEmpty(batchIdsToBeMigrated)) { - for (String batchId : batchIdsToBeMigrated) { - String courseId = fromBatchIds.get(batchId).getCourseId(); - Map userCourse = getUserCourse(batchId, fromUserId, courseId); - if (MapUtils.isNotEmpty(userCourse)) { - userCourse.put(MergeUserCoursesParams.userId.name(), toUserId); - LOGGER.info("MergeUserCoursesService:mergeUserBatches: Merging batch:" + batchId + " updated record:" + userCourse); - SunbirdCassandraUtil.upsert(KEYSPACE, USER_COURSES_TABLE, userCourse); - - /*String documentJson = ElasticSearchUtil.getDocumentAsStringById(USER_COURSE_ES_INDEX, USER_COURSE_ES_TYPE, - batchId + UNDERSCORE + fromUserId); - Map userCourseDoc = mapper.readValue(documentJson, Map.class); - userCourseDoc.put(MergeUserCoursesParams.userId.name(), toUserId); - userCourseDoc.put(MergeUserCoursesParams.id.name(), batchId + UNDERSCORE + toUserId); - userCourseDoc.put(MergeUserCoursesParams.identifier.name(), batchId + UNDERSCORE + toUserId); - ElasticSearchUtil.addDocumentWithId(USER_COURSE_ES_INDEX, USER_COURSE_ES_TYPE, - batchId + UNDERSCORE + toUserId, mapper.writeValueAsString(userCourseDoc));*/ - } else { - LOGGER.info("MergeUserCoursesService:mergeUserBatches: user_courses record with batchId:" + batchId + " userId:" + fromUserId + " found in ES but not in Cassandra"); - } - } - } - } - - private void mergeContentConsumption(String fromUserId, String toUserId) { - //Get content consumption data - List> fromContentConsumptionList = getContentConsumption(fromUserId); - List> toContentConsumptionList = getContentConsumption(toUserId); - - if (CollectionUtils.isNotEmpty(fromContentConsumptionList)) { - for (Map contentConsumption : fromContentConsumptionList) { - Map matchingRecord = getMatchingRecord(contentConsumption, toContentConsumptionList); - if (MapUtils.isEmpty(matchingRecord)) { - matchingRecord = contentConsumption; - matchingRecord.put(MergeUserCoursesParams.userId.name(), toUserId); - } else { - mergeContentConsumptionRecord(contentConsumption, matchingRecord); - } - SunbirdCassandraUtil.upsert(KEYSPACE, CONTENT_CONSUMPTION_TABLE, matchingRecord); - } - } - } - - private void mergeContentConsumptionRecord(Map oldRecord, Map newRecord) { - /* - * for status, progress, datetime, lastaccesstime, lastcompletedtime, lastupdatedtime fields, - * max value should be considered - * for completedcount, viewcount fields, sum of both records should be considered - * */ - 
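// [Editor's note, worked example with hypothetical values] Merging
// old {status: 1, viewcount: 2, completedcount: 1} into new {status: 2, viewcount: 3, completedcount: 0}
// yields status 2 (Max) but viewcount 5 and completedcount 1 (Sum), per the rules in the
// comment above.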
newRecord.put(MergeUserCoursesParams.status.name(), getUpdatedValue("Integer", "Max", - MergeUserCoursesParams.status.name(), oldRecord, newRecord)); - newRecord.put(MergeUserCoursesParams.progress.name(), getUpdatedValue("Integer", "Max", - MergeUserCoursesParams.progress.name(), oldRecord, newRecord)); - newRecord.put(MergeUserCoursesParams.viewCount.name(), getUpdatedValue("Integer", "Sum", - MergeUserCoursesParams.viewCount.name(), oldRecord, newRecord)); - newRecord.put(MergeUserCoursesParams.completedCount.name(), getUpdatedValue("Integer", "Sum", - MergeUserCoursesParams.completedCount.name(), oldRecord, newRecord)); - - newRecord.put(MergeUserCoursesParams.dateTime.name(), getUpdatedValue("Date", "Max", - MergeUserCoursesParams.dateTime.name(), oldRecord, newRecord)); - newRecord.put(MergeUserCoursesParams.lastAccessTime.name(), getUpdatedValue("DateString", "Max", - MergeUserCoursesParams.lastAccessTime.name(), oldRecord, newRecord)); - newRecord.put(MergeUserCoursesParams.lastCompletedTime.name(), getUpdatedValue("DateString", "Max", - MergeUserCoursesParams.lastCompletedTime.name(), oldRecord, newRecord)); - newRecord.put(MergeUserCoursesParams.lastUpdatedTime.name(), getUpdatedValue("DateString", "Max", - MergeUserCoursesParams.lastUpdatedTime.name(), oldRecord, newRecord)); - } - - private Object getUpdatedValue(String dataType, String operation, String fieldName, Map oldRecord, Map newRecord) { - if (null == oldRecord.get(fieldName)) { - return newRecord.get(fieldName); - } - if (null == newRecord.get(fieldName)) { - return oldRecord.get(fieldName); - } - switch (dataType) { - case "Integer": - if (oldRecord.get(fieldName) instanceof Integer && - newRecord.get(fieldName) instanceof Integer) { - int val1 = (int) oldRecord.get(fieldName); - int val2 = (int) newRecord.get(fieldName); - if (StringUtils.equalsIgnoreCase("Sum", operation)) { - return val1 + val2; - } else if (StringUtils.equalsIgnoreCase("Max", operation)) { - return val1 > val2 ? 
val1 : val2; - } - } - break; - case "DateString": - if (oldRecord.get(fieldName) instanceof String && - newRecord.get(fieldName) instanceof String) { - String dateStr1 = (String) oldRecord.get(fieldName); - String dateStr2 = (String) newRecord.get(fieldName); - Date date1; - Date date2; - try { - date1 = DateFormatter.parse(dateStr1); - } catch (ParseException pe) { - LOGGER.info("MergeUserCoursesService:getUpdatedValue: Date Parsing failed for field:" + fieldName + " value:" + dateStr1); - return dateStr2; - } - try { - date2 = DateFormatter.parse(dateStr2); - } catch (ParseException pe) { - LOGGER.info("MergeUserCoursesService:getUpdatedValue: Date Parsing failed for field:" + fieldName + " value:" + dateStr2); - return dateStr1; - } - if (StringUtils.equalsIgnoreCase("Max", operation)) { - if (date1.after(date2)) { - return dateStr1; - } else { - return dateStr2; - } - } - } - break; - case "Date": - if (oldRecord.get(fieldName) instanceof Date && - newRecord.get(fieldName) instanceof Date) { - Date date1 = (Date) oldRecord.get(fieldName); - Date date2 = (Date) newRecord.get(fieldName); - if (StringUtils.equalsIgnoreCase("Max", operation)) { - if (date1.after(date2)) { - return date1; - } else { - return date2; - } - } - } - break; - } - return newRecord.get(fieldName); - } - - private Map getMatchingRecord(Map contentConsumption, List> toContentConsumptionList) { - Map matchingRecord = new HashMap(); - if (CollectionUtils.isNotEmpty(toContentConsumptionList)) { - for (Map toContentConsumption : toContentConsumptionList) { - if (StringUtils.equalsIgnoreCase((String) contentConsumption.get(MergeUserCoursesParams.contentId.name()), (String) toContentConsumption.get(MergeUserCoursesParams.contentId.name())) && - StringUtils.equalsIgnoreCase((String) contentConsumption.get(MergeUserCoursesParams.batchId.name()), (String) toContentConsumption.get(MergeUserCoursesParams.batchId.name())) && - StringUtils.equalsIgnoreCase((String) contentConsumption.get(MergeUserCoursesParams.courseId.name()), (String) toContentConsumption.get(MergeUserCoursesParams.courseId.name()))) { - matchingRecord = toContentConsumption; - break; - } - } - } - return matchingRecord; - } - - private List> getContentConsumption(String userId) { - Map key = new HashMap<>(); - key.put(MergeUserCoursesParams.userId.name(), userId); - return SunbirdCassandraUtil.readAsListOfMap(KEYSPACE, CONTENT_CONSUMPTION_TABLE, key); - } - - private Map getUserCourse(String batchId, String userId, String courseId) { - Map key = new HashMap<>(); - key.put(MergeUserCoursesParams.batchId.name(), batchId); - key.put(MergeUserCoursesParams.userId.name(), userId); - key.put(MergeUserCoursesParams.courseId.name(), courseId); - List> data = SunbirdCassandraUtil.readAsListOfMap(KEYSPACE, USER_COURSES_TABLE, key); - return CollectionUtils.isEmpty(data) ? 
new HashMap() : data.get(0); - } - - private List getBatchDetailsOfUser(String userId) throws Exception { - List objects = new ArrayList<>(); - Map searchQuery = new HashMap<>(); - List userIdList = new ArrayList<>(); - userIdList.add(userId); - searchQuery.put(MergeUserCoursesParams.userId.name(), userIdList); - Map key = new HashMap<>(); - key.put(MergeUserCoursesParams.userId.name(), userIdList); - List> documents = SunbirdCassandraUtil.readAsListOfMap(KEYSPACE, USER_COURSES_TABLE, key); - //List documents = ElasticSearchUtil.textSearchReturningId(searchQuery, USER_COURSE_ES_INDEX, USER_COURSE_ES_TYPE); - if (CollectionUtils.isNotEmpty(documents)) { - documents.forEach(doc -> { - BatchEnrollmentSyncModel model = new BatchEnrollmentSyncModel(); - model.setBatchId((String) doc.get(MergeUserCoursesParams.batchId.name())); - model.setUserId((String) doc.get(MergeUserCoursesParams.userId.name())); - model.setCourseId((String) doc.get(MergeUserCoursesParams.courseId.name())); - objects.add(model); - }); - } - return objects; - } - - private Map getBatchEnrollmentSyncEvent(BatchEnrollmentSyncModel model) { - return new HashMap() {{ - put("actor", new HashMap() {{ - put("id", "Course Batch Updater"); - put("type", "System"); - }}); - put("eid", "BE_JOB_REQUEST"); - put("edata", new HashMap() {{ - put("action", "batch-enrolment-sync"); - put("iteration", 1); - put("batchId", model.getBatchId()); - put("userId", model.getUserId()); - put("courseId", model.getCourseId()); - put("reset", Arrays.asList("completionPercentage", "status", "progress")); - }}); - put("ets", System.currentTimeMillis()); - put("context", new HashMap() {{ - put("pdata", new HashMap() {{ - put("ver", "1.0"); - put("id", "org.sunbird.platform"); - }}); - }}); - put("mid", "LP." + System.currentTimeMillis() + "." 
+ UUID.randomUUID()); - put("object", new HashMap() {{ - put("id", model.getBatchId() + UNDERSCORE + model.getUserId()); - put("type", "CourseBatchEnrolment"); - }}); - }}; - } - - - private void mergeUserActivityAggregates(String fromUserId, String toUserId) throws Exception { - List fromBatches = getBatchDetailsOfUser(fromUserId); - if(CollectionUtils.isNotEmpty(fromBatches)) { - List fromCourseIds = fromBatches.stream().map(enrol -> enrol.getCourseId()).collect(Collectors.toList()); - List toCourseIds = fromBatches.stream().map(enrol -> enrol.getCourseId()).collect(Collectors.toList()); - Map key = new HashMap<>(); - key.put(MergeUserCoursesParams.activity_type.name(), "Course"); - key.put(MergeUserCoursesParams.user_id.name(), fromUserId); - key.put(MergeUserCoursesParams.activity_id.name(), fromCourseIds); - List> fromData = SunbirdCassandraUtil.readAsListOfMap(KEYSPACE, USER_ACTIVITY_AGG, key); - key.put(MergeUserCoursesParams.activity_id.name(), toCourseIds); - List> toData = SunbirdCassandraUtil.readAsListOfMap(KEYSPACE, USER_ACTIVITY_AGG, key); - Map toDataMap = toData.stream().collect(Collectors.toMap(m -> (String)m.get("context_id"), m -> m)); - List updateQueryList = new ArrayList<>(); - if(CollectionUtils.isNotEmpty(fromData)) { - fromData.stream().filter(data -> MapUtils.isNotEmpty(data)).collect(Collectors.toList()).forEach(data -> { - data.put(MergeUserCoursesParams.user_id.name(), toUserId); - Map fromAgg = (Map) data.get("agg"); - Map toAgg = (Map) ((Map)toDataMap.getOrDefault(data.get("context_id"), new HashMap())).getOrDefault("agg", new HashMap()); - data.put("agg", new HashMap(){{ - put("completedCount", Math.max(fromAgg.getOrDefault("completedCount", 0), toAgg.getOrDefault("completedCount", 0))); - }}); - data.put("agg_last_updated", new HashMap(){{ - put("completedCount", new Date()); - }}); - Map dataToSelect = new HashMap() {{ - put(MergeUserCoursesParams.activity_type.name(), "Course"); - put(MergeUserCoursesParams.activity_id.name(), data.get("activity_id")); - put(MergeUserCoursesParams.user_id.name(), toUserId); - put("context_id", data.get("context_id")); - }}; - updateQueryList.add(updateQuery(KEYSPACE, USER_ACTIVITY_AGG, data, dataToSelect)); - }); - } - if(CollectionUtils.isNotEmpty(updateQueryList)){ - Batch batch = QueryBuilder.batch(updateQueryList.toArray(new RegularStatement[updateQueryList.size()])); - cassandraSession.execute(batch); - } - } - - } - - - public Update.Where updateQuery(String keyspace, String table, Map propertiesToUpdate, Map propertiesToSelect) { - Update.Where updateQuery = QueryBuilder.update(keyspace, table).where(); - propertiesToUpdate.entrySet().forEach(entry -> updateQuery.with(QueryBuilder.set(entry.getKey(), entry.getValue()))); - propertiesToSelect.entrySet().forEach(entry -> { - if (entry.getValue() instanceof List) - updateQuery.and(QueryBuilder.in(entry.getKey(), (List) entry.getValue())); - else - updateQuery.and(QueryBuilder.eq(entry.getKey(), entry.getValue())); - }); - return updateQuery; - } - -} diff --git a/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/task/MergeUserCoursesTask.java b/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/task/MergeUserCoursesTask.java deleted file mode 100644 index 3c360c690b..0000000000 --- a/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/task/MergeUserCoursesTask.java +++ /dev/null @@ -1,40 +0,0 @@ -package org.sunbird.jobs.samza.task; - - -import org.apache.samza.task.MessageCollector; -import 
org.apache.samza.task.TaskCoordinator; -import org.sunbird.jobs.samza.service.ISamzaService; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.jobs.samza.service.MergeUserCoursesService; - -import java.util.Arrays; -import java.util.Map; - -public class MergeUserCoursesTask extends BaseTask { - - private ISamzaService service = new MergeUserCoursesService(); - private static JobLogger LOGGER = new JobLogger(MergeUserCoursesTask.class); - - @Override - public ISamzaService initialize() throws Exception { - LOGGER.info("MergeUserCoursesTask:initialize: Task initialized"); - this.action = Arrays.asList("merge-user-courses-and-cert"); - this.jobStartMessage = "Started processing of merge-user-courses samza job"; - this.jobEndMessage = "merge-user-courses job processing complete"; - this.jobClass = "org.sunbird.jobs.samza.task.MergeUserCoursesTask"; - return service; - } - - @Override - public void process(Map message, MessageCollector collector, TaskCoordinator coordinator) throws Exception { - try { - LOGGER.info("MergeUserCoursesTask:process: Starting to process for mid : " + message.get("mid") + " at :: " + System.currentTimeMillis()); - service.processMessage(message, metrics, collector); - LOGGER.info("MergeUserCoursesTask:process: Successfully completed processing for mid : " + message.get("mid") + " at :: " + System.currentTimeMillis()); - } catch (Exception e) { - metrics.incErrorCounter(); - LOGGER.error("MergeUserCoursesTask:process: Message processing failed", message, e); - } - } - -} diff --git a/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/util/MergeUserCoursesParams.java b/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/util/MergeUserCoursesParams.java deleted file mode 100644 index 8645c89bcd..0000000000 --- a/platform-jobs/samza/merge-user-courses/src/main/java/org/sunbird/jobs/samza/util/MergeUserCoursesParams.java +++ /dev/null @@ -1,9 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import org.sunbird.graph.dac.util.RelationType; - -public enum MergeUserCoursesParams { - userId, batchId, contentId, courseId, status, edata, id, identifier, action, fromAccountId, - toAccountId, FAILED, iteration, progress, dateTime, lastAccessTime, lastCompletedTime, - lastUpdatedTime, completedCount, viewCount, activity_type, activity_id, user_id; -} diff --git a/platform-jobs/samza/merge-user-courses/src/main/resources/log4j.xml b/platform-jobs/samza/merge-user-courses/src/main/resources/log4j.xml deleted file mode 100644 index 0f37824c0c..0000000000 --- a/platform-jobs/samza/merge-user-courses/src/main/resources/log4j.xml +++ /dev/null @@ -1,20 +0,0 @@ [20 deleted lines of log4j.xml; the XML markup was stripped during text extraction and is not recoverable] \ No newline at end of file diff --git a/platform-jobs/samza/mvc-processor-indexer/pom.xml b/platform-jobs/samza/mvc-processor-indexer/pom.xml deleted file mode 100644 index 38fb46df62..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/pom.xml +++ /dev/null @@ -1,114 +0,0 @@ - - - 4.0.0 - - org.sunbird - samza - 1.1-SNAPSHOT - - 1.3.7 - mvc-processor-indexer - - - - org.mockito - mockito-all - 1.10.19 - test - - - org.sunbird - unit-tests - 1.1-SNAPSHOT - test - - - org.sunbird - samza-common - 1.1-SNAPSHOT - - - io.netty - netty-transport - - - io.netty - netty - - - io.netty - netty-handler - - - org.sunbird - searchindex-elasticsearch - - - - - org.sunbird - mvcsearchindex-elasticsearch - 1.1-SNAPSHOT - jar - - - org.apache.logging.log4j - log4j-api - - - org.apache.logging.log4j - log4j-core - - - - - io.netty - 
netty-all - 4.1.16.Final - - - org.powermock - powermock-module-junit4 - 1.7.4 - test - - - org.powermock - powermock-api-mockito - 1.7.4 - test - - - - - - org.apache.maven.plugins - maven-compiler-plugin - - 1.8 - 1.8 - - - - - maven-assembly-plugin - - - src/main/assembly/src.xml - - - - - make-assembly - package - - single - - - - - - - \ No newline at end of file diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/assembly/src.xml b/platform-jobs/samza/mvc-processor-indexer/src/main/assembly/src.xml deleted file mode 100644 index c6a072bd63..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/main/assembly/src.xml +++ /dev/null @@ -1,70 +0,0 @@ - - - - - distribution - - tar.gz - - false - - - ${basedir} - - README* - LICENSE* - NOTICE* - - - - - - ${basedir}/src/main/resources/log4j.xml - lib - - - - - ${basedir}/src/main/config/mvc-processor-indexer.properties - config - true - - - - - bin - - org.apache.samza:samza-shell:tgz:dist:* - - 0744 - true - - - lib - - org.apache.samza:samza-api - org.sunbird:mvc-processor-indexer - org.apache.samza:samza-core_2.11 - org.apache.samza:samza-kafka_2.11 - org.apache.samza:samza-yarn_2.11 - org.apache.samza:samza-log4j - org.apache.kafka:kafka_2.11 - org.apache.hadoop:hadoop-hdfs - - true - - - diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/config/local.mvc-processor-indexer.properties b/platform-jobs/samza/mvc-processor-indexer/src/main/config/local.mvc-processor-indexer.properties deleted file mode 100644 index 8525e9692a..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/main/config/local.mvc-processor-indexer.properties +++ /dev/null @@ -1,78 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=local.mvc-processor-indexer - -# YARN -yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory - -# Task -task.class=org.sunbird.mvcjobs.samza.task.MVCSearchIndexerTask -task.inputs=kafka.local.mvc.processor.job.request -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=1 -task.commit.ms=60000 -task.window.ms=300000 -task.broadcast.inputs=kafka.dev.system.command#0 - -# Serializers -serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory -serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory - -# Systems -systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory -systems.kafka.samza.msg.serde=json -systems.kafka.streams.metrics.samza.msg.serde=metrics -systems.kafka.consumer.zookeeper.connect=localhost:2181 -systems.kafka.consumer.auto.offset.reset=smallest -systems.kafka.producer.bootstrap.servers=localhost:9092 - -# Job Coordinator -job.coordinator.system=kafka -# Normally, this would be 3, but we have only one broker. 
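# (Editorial note: job.coordinator.replication.factor controls the replication of
# the job's internal coordinator stream in Kafka; on a multi-broker cluster you
# would typically keep the usual value, e.g. job.coordinator.replication.factor=3)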
-job.coordinator.replication.factor=1 - -# Job specific config properties -search.es_conn_info=localhost:9200 -platform-api-url= -ekstepPlatformApiUserId=system - -content.keyspace.name=content_store -cassandra.lp.connection=127.0.0.1:9042 - -# Consistency Level for Multi Node Cassandra cluster -cassandra.lp.consistency.level=QUORUM - -# Metrics -output.metrics.job.name=mvc-processor-indexer -output.metrics.topic.name=local.pipeline_metrics - -# Nested Fields -nested.fields=badgeAssertions,targets,badgeAssociations,plugins,batches - -#Failed Topic Config -output.failed.events.topic.name=local.mvc.events.failed - -#Remote Debug Configuration -task.opts=-agentlib:jdwp=transport=dt_socket,address=localhost:9009,server=y,suspend=y - -telemetry_env=local -installation.id=local - -# Configuration for default channel ID -channel.default=in.ekstep - -# Definition update window -definitions.update.window.ms=300000 - -# Filter Metadata based on Definition while indexing into ES. -restrict.metadata.objectTypes=Content,ContentImage - -kp.content_service.base_url=localhost:3000 -cassandra.keyspace=sunbirddev_content_store \ No newline at end of file diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/config/mvc-processor-indexer.properties b/platform-jobs/samza/mvc-processor-indexer/src/main/config/mvc-processor-indexer.properties deleted file mode 100644 index f1123305c7..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/main/config/mvc-processor-indexer.properties +++ /dev/null @@ -1,99 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=__env__.mvc-processor-indexer -job.container.count=__mvc_search_indexer_container_count__ - - -# YARN -yarn.package.path=http://__yarn_host__:__yarn_port__/__env__/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.__env__.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory - -# Task -task.class=org.sunbird.mvcjobs.samza.task.MVCSearchIndexerTask -task.inputs=kafka.__env__.mvc.processor.job.request -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=__samza_checkpoint_replication_factor__ -task.commit.ms=60000 -task.window.ms=300000 -task.opts=-Dfile.encoding=UTF8 -#task.broadcast.inputs=kafka.__env__.system.command#0 - -# Serializers -serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory -serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory - -# Systems -systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory -systems.kafka.samza.msg.serde=json -systems.kafka.streams.metrics.samza.msg.serde=metrics -systems.kafka.consumer.zookeeper.connect=__zookeepers__ -systems.kafka.consumer.auto.offset.reset=smallest -systems.kafka.samza.offset.default=oldest -systems.kafka.producer.bootstrap.servers=__kafka_brokers__ - -# Job Coordinator -job.coordinator.system=kafka -# Normally, this would be 3, but we have only one broker. 
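# (Editorial note: the __token__ values throughout this file are deployment-time
# placeholders substituted by the provisioning scripts; for a production Kafka
# cluster this is commonly 3, e.g. job.coordinator.replication.factor=3)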
-job.coordinator.replication.factor=__samza_coordinator_replication_factor__ - -# Job specific config properties -search.es_conn_info=__search_es7_host__ -platform-api-url=__lp_url__ -ekstepPlatformApiUserId=ilimi - -# neo4j configurations -redis.host=__redis_host__ -redis.port=__redis_port__ -redis.maxConnections=128 -akka.request_timeout=30 -environment.id=__environment_id__ -graph.passport.key.base=__graph_passport_key__ -route.domain=__lp_bolt_url__ -route.bolt.read.domain=__lp_bolt_read_url__ -route.bolt.write.domain=__lp_bolt_write_url__ -route.all=__other_bolt_url__ -route.bolt.read.all=__other_bolt_read_url__ -route.bolt.write.all=__other_bolt_write_url__ -shard.id=__mw_shard_id__ -graph.dir="/data/graphDB" -graph.ids=domain,language,en,hi,ka,te,ta -platform.auth.check.enabled=false -platform.cache.ttl=3600000 - -# Metrics -output.metrics.job.name=mvc-processor-indexer -output.metrics.topic.name=__env__.pipeline_metrics - -# Nested Fields -nested.fields=trackable,credentials - -#Failed Topic Config -output.failed.events.topic.name=__env__.mvc.events.failed - -telemetry_env=__env_name__ -installation.id=__installation_id__ - -# Configuration for default channel ID -channel.default=in.ekstep - -# Definition update window -definitions.update.window.ms=300000 - -# Filter Metadata based on Definition while indexing into ES. -#restrict.metadata.objectTypes=Content,ContentImage,AssessmentItem,Channel,Framework,Category,CategoryInstance,Term,Concept,Dimension,Domain - -#kafka.topic.system.command=__env__.system.command - -kp.content_service.base_url=__kp_content_service_base_url__ - -cassandra.lp.connection=__cassandra_lp_connection__ -cassandra.keyspace = __keyspace_name__ - -ml.keyword.api=__ml-keywordapi__ -ml.vector.api=__ml-keywordapi__ \ No newline at end of file diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/MVCProcessorService.java b/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/MVCProcessorService.java deleted file mode 100644 index 006c514342..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/MVCProcessorService.java +++ /dev/null @@ -1,96 +0,0 @@ -package org.sunbird.mvcjobs.samza.service; - -import org.apache.commons.lang3.BooleanUtils; -import org.apache.samza.config.Config; -import org.apache.samza.system.SystemStream; -import org.apache.samza.task.MessageCollector; -import org.sunbird.jobs.samza.exception.PlatformErrorCodes; -import org.sunbird.jobs.samza.exception.PlatformException; -import org.sunbird.jobs.samza.service.ISamzaService; -import org.sunbird.jobs.samza.service.task.JobMetrics; -import org.sunbird.mvcjobs.samza.service.util.ContentUtil; -import org.sunbird.mvcjobs.samza.service.util.MVCProcessorCassandraIndexer; -import org.sunbird.mvcjobs.samza.service.util.MVCProcessorESIndexer; -import org.sunbird.jobs.samza.util.FailedEventsUtil; -import org.sunbird.jobs.samza.util.JSONUtils; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.searchindex.util.CompositeSearchConstants; -import org.elasticsearch.client.transport.NoNodeAvailableException; - -import java.util.Map; - -public class MVCProcessorService implements ISamzaService { - - private JobLogger LOGGER = new JobLogger(MVCProcessorService.class); - private Config config = null; - private MVCProcessorESIndexer mvcIndexer = null; - private SystemStream systemStream = null; - private MVCProcessorCassandraIndexer cassandraManager ; - public 
MVCProcessorService() {} - - public MVCProcessorService(MVCProcessorESIndexer mvcIndexer) throws Exception { - this.mvcIndexer = mvcIndexer; - } - - @Override - public void initialize(Config config) throws Exception { - this.config = config; - JSONUtils.loadProperties(config); - LOGGER.info("Service config initialized"); - systemStream = new SystemStream("kafka", config.get("output.failed.events.topic.name")); - mvcIndexer = mvcIndexer == null ? new MVCProcessorESIndexer(): mvcIndexer; - mvcIndexer.createMVCSearchIndex(); - LOGGER.info(CompositeSearchConstants.MVC_SEARCH_INDEX + " created"); - cassandraManager = new MVCProcessorCassandraIndexer(); - } - - @Override - public void processMessage(Map message, JobMetrics metrics, MessageCollector collector) - throws Exception { - Object index = message.get("index"); - Boolean shouldindex = BooleanUtils.toBoolean(null == index ? "true" : index.toString()); - String identifier = (String) ((Map) message.get("object")).get("id"); - if (!BooleanUtils.isFalse(shouldindex)) { - LOGGER.debug("Indexing event into ES"); - try { - processMessage(message); - LOGGER.debug("Record Added/Updated into mvc index for " + identifier); - metrics.incSuccessCounter(); - } catch (PlatformException ex) { - LOGGER.error("Error while processing message:", message, ex); - metrics.incFailedCounter(); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.SYSTEM_ERROR.name(), ex); - } catch (Exception ex) { - LOGGER.error("Error while processing message:", message, ex); - metrics.incErrorCounter(); - if (null != message) { - String errorCode = ex instanceof NoNodeAvailableException ? PlatformErrorCodes.SYSTEM_ERROR.name() - : PlatformErrorCodes.PROCESSING_ERROR.name(); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - errorCode, ex); - } - } - } else { - LOGGER.info("Learning event not qualified for indexing"); - } - } - - public void processMessage(Map message) throws Exception { - if (message != null && message.get("eventData") != null) { - Map eventData = (Map) message.get("eventData"); - String action = eventData.get("action").toString(); - String objectId = (String) ((Map) message.get("object")).get("id"); - if(!action.equalsIgnoreCase("update-content-rating")) { - if (action.equalsIgnoreCase("update-es-index")) { - eventData = ContentUtil.getContentMetaData(eventData, objectId); - } - LOGGER.info("MVCProcessorService :: processMessage ::: Calling cassandra insertion for " + objectId); - cassandraManager.insertIntoCassandra(eventData, objectId); - } - LOGGER.info("MVCProcessorService :: processMessage ::: Calling elasticsearch insertion for " + objectId); - mvcIndexer.upsertDocument(objectId, eventData); - } - } - -} diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/CassandraConnector.java b/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/CassandraConnector.java deleted file mode 100644 index 093369c5ad..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/CassandraConnector.java +++ /dev/null @@ -1,95 +0,0 @@ -package org.sunbird.mvcjobs.samza.service.util; - -import com.datastax.driver.core.BoundStatement; -import com.datastax.driver.core.Cluster; -import com.datastax.driver.core.PreparedStatement; -import com.datastax.driver.core.Session; -import org.apache.commons.lang3.StringUtils; -import org.sunbird.common.Platform; 
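// (Editorial sketch, not part of the original file: for a hypothetical property map
//  such as {"source": "x", "level1_name": ["y"]}, updateContentProperties(id, map)
//  below prepares CQL of the form
//    UPDATE content_data SET last_updated_on = dateOf(now()), source = ?, level1_name = ? where content_id = ?
//  and binds each non-null map value in iteration order, followed by the content id.)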
-import org.sunbird.jobs.samza.util.JobLogger; - -import java.net.InetSocketAddress; -import java.util.*; - -public class CassandraConnector { - private static JobLogger LOGGER = new JobLogger(CassandraConnector.class); - - static String arr[],table = "content_data"; - static Session session; - static public Session getSession() { - if(session != null) { - LOGGER.info("CassandraSession Exists"); - return session; - } - String serverIP = Platform.config.getString("cassandra.lp.connection"); - LOGGER.info("Cassandra keyspace is " + Platform.config.getString("cassandra.keyspace")); - if(serverIP == null) { - LOGGER.info("Server ip of cassandra is null"); - } - LOGGER.info("Server ip of cassandra is " + serverIP); - List connectionInfo = Arrays.asList(serverIP.split(",")); - List addressList = getSocketAddress(connectionInfo); - Cluster cluster = Cluster.builder() - .addContactPointsWithPorts(addressList) - .build(); - - session = cluster.connect(Platform.config.getString("cassandra.keyspace")); - LOGGER.info("The server IP " + serverIP + "\n Session created " + session); - return session; - } - public static void updateContentProperties(String contentId, Map map) { - Session session = getSession(); - if (null == map || map.isEmpty()) - return; - String query = getUpdateQuery(map.keySet()); - if(query == null) - return; - PreparedStatement ps = session.prepare(query); - Object[] values = new Object[map.size() + 1]; - try { - int i = 0; - for (Map.Entry entry : map.entrySet()) { - - if (null == entry.getValue()) { - continue; - } else { - values[i] = entry.getValue(); - } - - i += 1; - } - values[i] = contentId; - BoundStatement bound = ps.bind(values); - LOGGER.info("Executing the statement to insert into cassandra for identifier " + contentId); - session.execute(bound); - } catch (Exception e) { - System.out.println("Exception " + e); - LOGGER.info("Exception while inserting data into cassandra " + e); - } - } - private static String getUpdateQuery(Set properties) { - StringBuilder sb = new StringBuilder(); - if (null != properties && !properties.isEmpty()) { - sb.append("UPDATE " + table + " SET last_updated_on = dateOf(now()), "); - StringBuilder updateFields = new StringBuilder(); - for (String property : properties) { - if (StringUtils.isBlank(property)) - return null; - updateFields.append(property.trim()).append(" = ?, "); - } - sb.append(StringUtils.removeEnd(updateFields.toString(), ", ")); - sb.append(" where content_id = ?"); - } - return sb.toString(); - } - private static List getSocketAddress(List hosts) { - List connectionList = new ArrayList<>(); - for (String connection : hosts) { - String[] conn = connection.split(":"); - String host = conn[0]; - int port = Integer.valueOf(conn[1]); - connectionList.add(new InetSocketAddress(host, port)); - } - return connectionList; - } -} diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/ContentUtil.java b/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/ContentUtil.java deleted file mode 100644 index c4940b5681..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/ContentUtil.java +++ /dev/null @@ -1,47 +0,0 @@ -package org.sunbird.mvcjobs.samza.service.util; - -import com.fasterxml.jackson.databind.ObjectMapper; -import org.sunbird.common.Platform; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.searchindex.util.HTTPUtil; - -import java.util.HashMap; -import 
java.util.Map; - -public class ContentUtil { - private static JobLogger LOGGER = new JobLogger(ContentUtil.class); - - public static Map getContentMetaData(Map newmap, String identifier) throws Exception { - ObjectMapper mapper = new ObjectMapper(); - String contentReadURL = ""; - try { - contentReadURL = Platform.config.hasPath("kp.content_service.base_url") ? Platform.config.getString("kp.content_service.base_url") : ""; - LOGGER.info("ContentUtil :: getContentMetaData ::: Making API call to read content " + contentReadURL + "/content/v3/read/"); - String content = HTTPUtil.makeGetRequest(contentReadURL + "/content/v3/read/" + identifier); - LOGGER.info("ContentUtil :: getContentMetaData ::: retrieved content meta " + content); - Map obj = mapper.readValue(content, Map.class); - Map contentobj = (HashMap) (((HashMap) obj.get("result")).get("content")); - newmap = filterData(newmap, contentobj); - - } catch (Exception e) { - LOGGER.info("ContentUtil :: getContentMetaData ::: Error while fetching content metadata " + e); - throw new Exception("Get content metadata failed"); - } - return newmap; - } - public static Map filterData(Map obj, Map content) { - String elasticSearchParamArr[] = {"organisation","channel","framework","board","medium","subject","gradeLevel","name","description","language","appId","appIcon","appIconLabel","contentEncoding","identifier","node_id","nodeType","mimeType","resourceType","contentType","allowedContentTypes","objectType","posterImage","artifactUrl","launchUrl","previewUrl","streamingUrl","downloadUrl","status","pkgVersion","source","lastUpdatedOn","ml_contentText","ml_contentTextVector","ml_Keywords","level1Name","level1Concept","level2Name","level2Concept","level3Name","level3Concept","textbook_name","sourceURL","label","all_fields"}; - String key = null; - Object value = null; - for(int i = 0; i < elasticSearchParamArr.length; i++) { - key = elasticSearchParamArr[i]; - value = content.containsKey(key) ?
content.get(key) : null; - if(value != null) { - obj.put(key,value); - value = null; - } - } - return obj; - - } -} diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/MVCProcessorCassandraIndexer.java b/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/MVCProcessorCassandraIndexer.java deleted file mode 100644 index f042a15151..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/MVCProcessorCassandraIndexer.java +++ /dev/null @@ -1,149 +0,0 @@ -package org.sunbird.mvcjobs.samza.service.util; - -import org.apache.commons.lang3.StringUtils; -import org.sunbird.common.Platform; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.searchindex.util.HTTPUtil; -import org.json.JSONArray; -import org.json.JSONObject; - -import java.util.*; -import java.util.concurrent.CompletableFuture; - -public class MVCProcessorCassandraIndexer { - String mlworkbenchapirequest = "", mlvectorListRequest = "" , jobname = "" ; - Map mapStage1 = new HashMap<>(); - List level1concept = null,level2concept = null, level3concept = null , textbook_name , level1_name , level2_name , level3_name ; - private JobLogger LOGGER = new JobLogger(MVCProcessorCassandraIndexer.class); - public MVCProcessorCassandraIndexer() { - mlworkbenchapirequest = "{\"request\":{ \"input\" :{ \"content\" : [] } } }"; - mlvectorListRequest = "{\"request\":{\"text\":[],\"cid\": \"\",\"language\":\"en\",\"method\":\"BERT\",\"params\":{\"dim\":768,\"seq_len\":25}}}"; - jobname = "vidyadaan_content_keyword_tagging"; - - } - // Insert to cassandra - public void insertIntoCassandra(Map obj, String identifier) throws Exception { - String action = obj.get("action").toString(); - - if(StringUtils.isNotBlank(action)) { - if(action.equalsIgnoreCase("update-es-index")) { - LOGGER.info("MVCProcessorCassandraIndexer :: getContentMetaData ::: extracting required fields" + obj); - extractFieldsToBeInserted(obj); - LOGGER.info("MVCProcessorCassandraIndexer :: getContentMetaData ::: making ml workbench api request"); - getMLKeywords(obj); - LOGGER.info("MVCProcessorCassandraIndexer :: insertIntoCassandra ::: update-es-index-1 event"); - LOGGER.info("MVCProcessorCassandraIndexer :: insertIntoCassandra ::: Inserting into cassandra stage-1"); - CassandraConnector.updateContentProperties(identifier,mapStage1); - } else if(action.equalsIgnoreCase("update-ml-keywords")) { - LOGGER.info("MVCProcessorCassandraIndexer :: insertIntoCassandra ::: update-ml-keywords"); - String ml_contentText; - List ml_Keywords; - ml_contentText = obj.get("ml_contentText") != null ? obj.get("ml_contentText").toString() : null; - ml_Keywords = obj.get("ml_Keywords") != null ? (List) obj.get("ml_Keywords") : null; - - getMLVectors(ml_contentText,identifier); - Map mapForStage2 = new HashMap<>(); - mapForStage2.put("ml_keywords",ml_Keywords); - mapForStage2.put("ml_content_text",ml_contentText); - CassandraConnector.updateContentProperties(identifier,mapForStage2); - - } - else if(action.equalsIgnoreCase("update-ml-contenttextvector")) { - LOGGER.info("MVCProcessorCassandraIndexer :: insertIntoCassandra ::: update-ml-contenttextvector event"); - List> ml_contentTextVectorList; - Set ml_contentTextVector = null; - ml_contentTextVectorList = obj.get("ml_contentTextVector") != null ? 
(List>) obj.get("ml_contentTextVector") : null; - if(ml_contentTextVectorList != null) - { - ml_contentTextVector = new HashSet(ml_contentTextVectorList.get(0)); - - } - Map mapForStage3 = new HashMap<>(); - mapForStage3.put("ml_content_text_vector",ml_contentTextVector); - CassandraConnector.updateContentProperties(identifier,mapForStage3); - - - } - } - } - - // Extract the fields to be inserted into Cassandra - private void extractFieldsToBeInserted(Map contentobj) { - if(contentobj.containsKey("level1Concept")){ - level1concept = (List)contentobj.get("level1Concept"); - mapStage1.put("level1_concept", level1concept); - } - if(contentobj.containsKey("level2Concept")){ - level2concept = (List)contentobj.get("level2Concept"); - mapStage1.put("level2_concept", level2concept); - } - if(contentobj.containsKey("level3Concept")){ - level3concept = (List)contentobj.get("level3Concept"); - mapStage1.put("level3_concept", level3concept); - } - if(contentobj.containsKey("textbook_name")){ - textbook_name = (List)contentobj.get("textbook_name"); - mapStage1.put("textbook_name", textbook_name); - } - if(contentobj.containsKey("level1Name")){ - level1_name = (List)contentobj.get("level1Name"); - mapStage1.put("level1_name", level1_name); - } - if(contentobj.containsKey("level2Name")){ - level2_name = (List)contentobj.get("level2Name"); - mapStage1.put("level2_name", level2_name); - } - if(contentobj.containsKey("level3Name")){ - level3_name = (List)contentobj.get("level3Name"); - mapStage1.put("level3_name", level3_name); - } - if(contentobj.containsKey("source")){ - mapStage1.put("source",contentobj.get("source")); - } - if(contentobj.containsKey("sourceURL")){ - mapStage1.put("sourceurl",contentobj.get("sourceURL")); - } - LOGGER.info("MVCProcessorCassandraIndexer :: extracted metadata"); - - } - - // POST request to the ML keywords API - void getMLKeywords(Map contentdef) throws Exception { - JSONObject obj = new JSONObject(mlworkbenchapirequest); - JSONObject req = ((JSONObject) (obj.get("request"))); - JSONObject input = (JSONObject) req.get("input"); - JSONArray content = (JSONArray) input.get("content"); - content.put(contentdef); - req.put("job", jobname); - LOGGER.info("MVCProcessorCassandraIndexer :: getMLKeywords ::: The ML workbench URL is " + "http://" + Platform.config.getString("ml.keyword.api") + ":3579/daggit/submit"); - - try { - String resp = HTTPUtil.makePostRequest("http://" + Platform.config.getString("ml.keyword.api") + ":3579/daggit/submit", obj.toString()); - LOGGER.info("MVCProcessorCassandraIndexer :: getMLKeywords ::: The ML workbench response is " + resp); - - } catch (Exception e) { - LOGGER.info("MVCProcessorCassandraIndexer :: getMLKeywords ::: ML workbench api request failed "); - } - - } - - - // POST request to the ML vector API - public void getMLVectors(String contentText, String identifier) throws Exception { - String mlVectorApi = Platform.config.hasPath("ml.vector.api") ?
Platform.config.getString("ml.vector.api") : ""; - JSONObject obj = new JSONObject(mlvectorListRequest); - JSONObject req = ((JSONObject) (obj.get("request"))); - JSONArray text = (JSONArray) req.get("text"); - req.put("cid", identifier); - text.put(contentText); - LOGGER.info("MVCProcessorCassandraIndexer :: getMLVectors ::: The ML vector URL is " + "http://" + mlVectorApi + ":1729/ml/vector/ContentText"); - - try { - String resp = HTTPUtil.makePostRequest("http://" + mlVectorApi + ":1729/ml/vector/ContentText", obj.toString()); - LOGGER.info("MVCProcessorCassandraIndexer :: getMLVectors ::: ML vector api request response is " + resp); - } catch (Exception e) { - LOGGER.info("MVCProcessorCassandraIndexer :: getMLVectors ::: ML vector api request failed "); - } - } - -} diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/MVCProcessorESIndexer.java b/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/MVCProcessorESIndexer.java deleted file mode 100644 index f2e4a35fee..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/service/util/MVCProcessorESIndexer.java +++ /dev/null @@ -1,130 +0,0 @@ -/** - * - */ -package org.sunbird.mvcjobs.samza.service.util; - -import org.apache.commons.collections4.MapUtils; -import org.codehaus.jackson.map.ObjectMapper; -import org.codehaus.jackson.type.TypeReference; -import org.sunbird.common.Platform; -import org.sunbird.jobs.samza.service.util.AbstractESIndexer; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.learning.util.ControllerUtil; -import org.sunbird.mvcsearchindex.elasticsearch.ElasticSearchUtil; -import org.sunbird.searchindex.util.CompositeSearchConstants; - -import java.io.IOException; -import java.util.*; -import java.util.concurrent.CompletableFuture; - - -/** - * @author pradyumna - * - */ -public class MVCProcessorESIndexer extends AbstractESIndexer { - - private JobLogger LOGGER = new JobLogger(MVCProcessorESIndexer.class); - private ObjectMapper mapper = new ObjectMapper(); - private ControllerUtil util = new ControllerUtil(); - private static List NESTED_FIELDS = Platform.config.hasPath("nested.fields")? 
Arrays.asList(Platform.config.getString("nested.fields").split(",")): new ArrayList(); - - @Override - public void init() { - ElasticSearchUtil.initialiseESClient(CompositeSearchConstants.MVC_SEARCH_INDEX, - Platform.config.getString("search.es_conn_info")); - } - - /** - * @return - */ - - - public void createMVCSearchIndex() throws IOException { - String alias = "mvc-content"; - String settings = "{\"settings\":{\"index\":{\"max_ngram_diff\":\"29\",\"mapping\":{\"total_fields\":{\"limit\":\"1500\"}},\"number_of_shards\":\"5\",\"provided_name\":\"mvc-content-v1\",\"creation_date\":\"1593168273071\",\"analysis\":{\"filter\":{\"mynGram\":{\"token_chars\":[\"letter\",\"digit\",\"whitespace\",\"punctuation\",\"symbol\"],\"min_gram\":\"1\",\"type\":\"nGram\",\"max_gram\":\"30\"}},\"analyzer\":{\"cs_index_analyzer\":{\"filter\":[\"lowercase\",\"mynGram\"],\"type\":\"custom\",\"tokenizer\":\"standard\"},\"keylower\":{\"filter\":\"lowercase\",\"tokenizer\":\"keyword\"},\"ml_custom_analyzer\":{\"type\":\"standard\",\"stopwords\":[\"_english_\",\"_hindi_\"]},\"cs_search_analyzer\":{\"filter\":[\"lowercase\"],\"type\":\"custom\",\"tokenizer\":\"standard\"}}},\"number_of_replicas\":\"1\",\"uuid\":\"esGBPk9aQiqeRWrJA4wu9g\",\"version\":{\"created\":\"7050099\"}}}}"; - String mappings = "{\"mappings\":{\"dynamic\":\"strict\",\"properties\":{\"all_fields\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"allowedContentTypes\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"appIcon\":{\"type\":\"text\",\"index\":false},\"appIconLabel\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"appId\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"artifactUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"board\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"channel\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"contentEncoding\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"contentType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"description\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_t
o\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"downloadUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"framework\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"gradeLevel\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"identifier\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"label\":{\"type\":\"text\",\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"standard\"},\"language\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"lastUpdatedOn\":{\"type\":\"date\",\"fields\":{\"raw\":{\"type\":\"date\"}}},\"launchUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level1Concept\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level1Name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level2Concept\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level2Name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level3Concept\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level3Name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"medium\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"mimeType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"ml_Keywords\":{\"type\":\"text\",\"analyzer\":\"ml_custom_analyzer\",\"search_analyz
er\":\"standard\"},\"ml_contentText\":{\"type\":\"text\",\"analyzer\":\"ml_custom_analyzer\",\"search_analyzer\":\"standard\"},\"ml_contentTextVector\":{\"type\":\"dense_vector\",\"dims\":768},\"name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"nodeType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"node_id\":{\"type\":\"long\",\"fields\":{\"raw\":{\"type\":\"long\"}}},\"objectType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"organisation\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"pkgVersion\":{\"type\":\"double\",\"fields\":{\"raw\":{\"type\":\"double\"}}},\"posterImage\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"previewUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"resourceType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"source\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"sourceURL\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"status\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"streamingUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"subject\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"textbook_name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"}}}}"; - String esindex = 
"{\"aliases\":{\"mvc-content\":{}},\"settings\":{\"index\":{\"max_ngram_diff\":\"29\",\"mapping\":{\"total_fields\":{\"limit\":\"1500\"}},\"number_of_shards\":\"5\",\"analysis\":{\"filter\":{\"mynGram\":{\"token_chars\":[\"letter\",\"digit\",\"whitespace\",\"punctuation\",\"symbol\"],\"min_gram\":\"1\",\"type\":\"nGram\",\"max_gram\":\"30\"}},\"analyzer\":{\"cs_index_analyzer\":{\"filter\":[\"lowercase\",\"mynGram\"],\"type\":\"custom\",\"tokenizer\":\"standard\"},\"keylower\":{\"filter\":\"lowercase\",\"tokenizer\":\"keyword\"},\"cs_search_analyzer\":{\"filter\":[\"lowercase\"],\"type\":\"custom\",\"tokenizer\":\"standard\"},\"ml_custom_analyzer\":{\"type\":\"standard\",\"stopwords\":[\"_english_\",\"_hindi_\"]}}},\"number_of_replicas\":\"1\"}},\"mappings\":{\"dynamic\":\"strict\",\"properties\":{\"organisation\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"channel\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"framework\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"board\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"medium\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"subject\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"gradeLevel\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"description\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"language\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"appId\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"appIcon\":{\"type\":\"text\",\"index\":false},\"appIconLabel\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_searc
h_analyzer\"},\"contentEncoding\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"identifier\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"node_id\":{\"type\":\"long\",\"fields\":{\"raw\":{\"type\":\"long\"}}},\"nodeType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"mimeType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"resourceType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"contentType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"allowedContentTypes\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"objectType\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"posterImage\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"artifactUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"launchUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"previewUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"streamingUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"downloadUrl\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"status\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"pkgVersion\":{\"type\":\"double\"
,\"fields\":{\"raw\":{\"type\":\"double\"}}},\"lastUpdatedOn\":{\"type\":\"date\",\"fields\":{\"raw\":{\"type\":\"date\"}}},\"textbook_name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level1Name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level1Concept\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level2Name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level2Concept\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level3Name\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"level3Concept\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"source\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"sourceURL\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"copy_to\":[\"all_fields\"],\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"},\"ml_Keywords\":{\"type\":\"text\",\"analyzer\":\"ml_custom_analyzer\",\"search_analyzer\":\"standard\"},\"ml_contentText\":{\"type\":\"text\",\"analyzer\":\"ml_custom_analyzer\",\"search_analyzer\":\"standard\"},\"ml_contentTextVector\":{\"type\":\"dense_vector\",\"dims\":768},\"label\":{\"type\":\"text\",\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"standard\"},\"all_fields\":{\"type\":\"text\",\"fields\":{\"raw\":{\"type\":\"text\",\"analyzer\":\"keylower\",\"fielddata\":true}},\"analyzer\":\"cs_index_analyzer\",\"search_analyzer\":\"cs_search_analyzer\"}}}}"; - ElasticSearchUtil.addIndex(CompositeSearchConstants.MVC_SEARCH_INDEX, - CompositeSearchConstants.MVC_SEARCH_INDEX_TYPE, settings, mappings,alias,esindex); - } - - @SuppressWarnings({ "rawtypes", "unchecked" }) - - - public void upsertDocument(String uniqueId, Map jsonIndexDocument) throws Exception { - - - String action = jsonIndexDocument.get("action").toString(); - jsonIndexDocument = removeExtraParams(jsonIndexDocument); - proocessNestedProps(jsonIndexDocument); - String jsonAsString = mapper.writeValueAsString(jsonIndexDocument); - switch (action) { - case "update-es-index": { - // Insert a new doc - ElasticSearchUtil.addDocumentWithId(CompositeSearchConstants.MVC_SEARCH_INDEX, - uniqueId, jsonAsString); - break; - } - case "update-content-rating" : { - String resp = 
ElasticSearchUtil.getDocumentAsStringById(CompositeSearchConstants.MVC_SEARCH_INDEX,
-                    uniqueId);
-            if (null != resp && resp.contains(uniqueId)) {
-                LOGGER.info("ES Document Found With Identifier " + uniqueId + " | Updating Content Rating.");
-                Map metadata = (Map) jsonIndexDocument.get("metadata");
-                String finalJsonindexasString = mapper.writeValueAsString(metadata);
-                CompletableFuture.runAsync(() -> {
-                    ElasticSearchUtil.updateDocument(CompositeSearchConstants.MVC_SEARCH_INDEX,
-                            uniqueId, finalJsonindexasString);
-                });
-            } else
-                LOGGER.info("ES Document Not Found With Identifier " + uniqueId + " | Skipped Updating Content Rating.");
-            break; // without this break a rating event falls through and overwrites the document below
-        }
-        case "update-ml-contenttextvector": {
-            List<List<Double>> ml_contentTextVectorList;
-            Set<Double> ml_contentTextVector = null;
-            ml_contentTextVectorList = jsonIndexDocument.get("ml_contentTextVector") != null ? (List<List<Double>>) jsonIndexDocument.get("ml_contentTextVector") : null;
-            if (ml_contentTextVectorList != null) {
-                ml_contentTextVector = new HashSet<>(ml_contentTextVectorList.get(0));
-            }
-            jsonIndexDocument.put("ml_contentTextVector", ml_contentTextVector);
-            jsonAsString = mapper.writeValueAsString(jsonIndexDocument);
-            // no break: falls through to "update-ml-keywords" so the re-serialized document is written to ES
-        }
-        case "update-ml-keywords": {
-            // Update a doc
-            ElasticSearchUtil.updateDocument(CompositeSearchConstants.MVC_SEARCH_INDEX,
-                    uniqueId, jsonAsString);
-            break;
-        }
-        default:
-            LOGGER.info("No Action Matched. Skipped Processing Event For " + uniqueId);
-        }
-    }
-
-    private void proocessNestedProps(Map jsonIndexDocument) throws IOException {
-        if (MapUtils.isNotEmpty(jsonIndexDocument)) {
-            for (String propertyName : jsonIndexDocument.keySet()) {
-                if (NESTED_FIELDS.contains(propertyName)) {
-                    Map propertyNewValue = mapper.readValue((String) jsonIndexDocument.get(propertyName),
-                            new TypeReference<Map<String, Object>>() {
-                            });
-                    jsonIndexDocument.put(propertyName, propertyNewValue);
-                }
-            }
-        }
-    }
-
-    // Remove params which should not be inserted into ES
-    public Map removeExtraParams(Map obj) {
-        obj.remove("action");
-        obj.remove("stage");
-        return obj;
-    }
-
-}
diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/task/MVCSearchIndexerTask.java b/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/task/MVCSearchIndexerTask.java
deleted file mode 100644
index 88d84d2469..0000000000
--- a/platform-jobs/samza/mvc-processor-indexer/src/main/java/org/sunbird/mvcjobs/samza/task/MVCSearchIndexerTask.java
+++ /dev/null
@@ -1,100 +0,0 @@
-package org.sunbird.mvcjobs.samza.task;
-
-import org.apache.commons.collections.MapUtils;
-import org.apache.commons.lang.StringUtils;
-import org.apache.samza.config.Config;
-import org.apache.samza.system.IncomingMessageEnvelope;
-import org.apache.samza.system.OutgoingMessageEnvelope;
-import org.apache.samza.system.SystemStream;
-import org.apache.samza.task.MessageCollector;
-import org.apache.samza.task.TaskContext;
-import org.apache.samza.task.TaskCoordinator;
-import org.sunbird.jobs.samza.task.BaseTask;
-import org.sunbird.mvcjobs.samza.service.MVCProcessorService;
-import org.sunbird.jobs.samza.service.ISamzaService;
-import org.sunbird.jobs.samza.service.task.JobMetrics;
-import org.sunbird.jobs.samza.util.JobLogger;
-import org.sunbird.jobs.samza.util.SamzaCommonParams;
-import org.sunbird.learning.util.ControllerUtil;
-
-import java.util.HashMap;
-import java.util.Map;
-
-
-public class MVCSearchIndexerTask extends BaseTask {
-
-    private JobLogger LOGGER = new JobLogger(MVCSearchIndexerTask.class);
-    private ControllerUtil controllerUtil = new ControllerUtil();
-
-    private ISamzaService service;
-    private JobMetrics metrics;
-
-    public ISamzaService getService() {
-        return service;
-    }
-
-    public MVCSearchIndexerTask(Config config, TaskContext context, ISamzaService service) throws Exception {
-        init(config, context, service);
-    }
-
-    public MVCSearchIndexerTask() {
-
-    }
-
-    public void init(Config config, TaskContext context, ISamzaService service) throws Exception {
-        try {
-            metrics = new JobMetrics(context, config.get("output.metrics.job.name"), config.get("output.metrics.topic.name"));
-            this.service = (service == null ? new MVCProcessorService() : service);
-            this.service.initialize(config);
-            LOGGER.info("Task initialized");
-        } catch (Exception ex) {
-            LOGGER.error("Task initialization failed", ex);
-            throw ex;
-        }
-    }
-
-    @Override
-    public void init(Config config, TaskContext context) throws Exception {
-        init(config, context, null);
-    }
-
-    @Override
-    public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) throws Exception {
-        Map outgoingMap = getMessage(envelope);
-        try {
-            if (outgoingMap.containsKey(SamzaCommonParams.edata.name())) {
-                Map edata = (Map) outgoingMap.getOrDefault(SamzaCommonParams.edata.name(), new HashMap());
-                if (MapUtils.isNotEmpty(edata) && StringUtils.equalsIgnoreCase("definition_update", edata.getOrDefault("action", "").toString())) {
-                    LOGGER.info("definition_update event received for objectType: " + edata.getOrDefault("objectType", "").toString());
-                    String graphId = edata.getOrDefault("graphId", "").toString();
-                    String objectType = edata.getOrDefault("objectType", "").toString();
-                    controllerUtil.updateDefinitionCache(graphId, objectType);
-                }
-            } else {
-                service.processMessage(outgoingMap, metrics, collector);
-            }
-        } catch (Exception e) {
-            metrics.incErrorCounter();
-            LOGGER.error("Error while processing message:", outgoingMap, e);
-        }
-    }
-
-    @SuppressWarnings("unchecked")
-    private Map getMessage(IncomingMessageEnvelope envelope) {
-        try {
-            return (Map) envelope.getMessage();
-        } catch (Exception e) {
-            e.printStackTrace();
-            LOGGER.error("Invalid message:" + envelope.getMessage(), e);
-            return new HashMap();
-        }
-    }
-
-    @Override
-    public void window(MessageCollector collector, TaskCoordinator coordinator) {
-        Map event = metrics.collect();
-        collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", metrics.getTopic()), event));
-        metrics.clear();
-    }
-}
\ No newline at end of file
diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/resources/application.conf b/platform-jobs/samza/mvc-processor-indexer/src/main/resources/application.conf
deleted file mode 100644
index e69de29bb2..0000000000
diff --git a/platform-jobs/samza/mvc-processor-indexer/src/main/resources/log4j.xml b/platform-jobs/samza/mvc-processor-indexer/src/main/resources/log4j.xml
deleted file mode 100644
index d2db3940cc..0000000000
--- a/platform-jobs/samza/mvc-processor-indexer/src/main/resources/log4j.xml
+++ /dev/null
@@ -1,20 +0,0 @@
-[20 deleted lines of log4j XML configuration; the element markup was lost in extraction and is not recoverable]
\ No newline at end of file
diff --git a/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/ContentUtilTest.java b/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/ContentUtilTest.java
deleted file mode 100644
index a511d1816e..0000000000
--- a/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/ContentUtilTest.java
+++ /dev/null
@@ -1,48 +0,0 @@
-package
org.sunbird.mvcjobs.samza.test; - -import com.google.gson.Gson; -import org.sunbird.mvcjobs.samza.service.util.ContentUtil; -import org.sunbird.searchindex.util.HTTPUtil; -import org.junit.Before; -import org.junit.Test; -import org.junit.runner.RunWith; -import org.mockito.Mockito; -import org.mockito.MockitoAnnotations; -import org.powermock.api.mockito.PowerMockito; -import org.powermock.core.classloader.annotations.PowerMockIgnore; -import org.powermock.core.classloader.annotations.PrepareForTest; -import org.powermock.modules.junit4.PowerMockRunner; -import org.apache.samza.config.Config; - -import java.io.IOException; -import java.util.Map; - -import static org.mockito.Mockito.*; - -@RunWith(PowerMockRunner.class) -@PrepareForTest({HTTPUtil.class}) -@PowerMockIgnore({"javax.management.*", "sun.security.ssl.*", "javax.net.ssl.*" , "javax.crypto.*"}) -public class ContentUtilTest { - private Config configMock; - private String getResp = "{\"id\":\"api.content.read\",\"ver\":\"1.0\",\"ts\":\"2020-07-21T05:38:46.301Z\",\"params\":{\"resmsgid\":\"7224a4d0-cb14-11ea-9313-0912071b8abe\",\"msgid\":\"722281f0-cb14-11ea-9313-0912071b8abe\",\"status\":\"successful\",\"err\":null,\"errmsg\":null},\"responseCode\":\"OK\",\"result\":{\"content\":{\"ownershipType\":[\"createdBy\"],\"code\":\"test.res.1\",\"channel\":\"in.ekstep\",\"language\":[\"English\"],\"mediaType\":\"content\",\"osId\":\"org.sunbird.quiz.app\",\"languageCode\":[\"en\"],\"version\":2,\"versionKey\":\"1591949601174\",\"license\":\"CC BY 4.0\",\"idealScreenDensity\":\"hdpi\",\"framework\":\"NCFCOPY\",\"s3Key\":\"content/do_113041248230580224116/artifact/validecml_1591949596304.zip\",\"createdBy\":\"95e4942d-cbe8-477d-aebd-ad8e6de4bfc8\",\"compatibilityLevel\":1,\"name\":\"Resource Content 1\",\"status\":\"Draft\",\"level1Concept\":[\"Addition\"],\"level1Name\":[\"Math-Magic\"],\"textbook_name\":[\"How Many Times?\"],\"sourceURL\":\"https://diksha.gov.in/play/content/do_30030488\",\"source\":[\"Diksha 1\"]}}}"; - private String eventData = "{\"identifier\":\"do_113041248230580224116\",\"action\":\"update-es-index\",\"stage\":1}"; - private String uniqueId = "do_113041248230580224116"; - @Before - public void setup(){ - MockitoAnnotations.initMocks(this); - configMock = mock(Config.class); - stub(configMock.get("nested.fields")).toReturn("badgeAssertions,targets,badgeAssociations,plugins,me_totalTimeSpent,me_totalPlaySessionCount,me_totalTimeSpentInSec,batches"); - } - - @Test - public void getContentMetaData()throws Exception { - PowerMockito.mockStatic(HTTPUtil.class); - when(HTTPUtil.makeGetRequest(Mockito.anyString())).thenReturn(getResp); - ContentUtil.getContentMetaData(getEvent(eventData),uniqueId); - } - - public Map getEvent(String message) throws IOException { - return new Gson().fromJson(message, Map.class); - } -} - diff --git a/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/MVCProcessorCassandraTest.java b/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/MVCProcessorCassandraTest.java deleted file mode 100644 index 570839cb57..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/MVCProcessorCassandraTest.java +++ /dev/null @@ -1,77 +0,0 @@ -package org.sunbird.mvcjobs.samza.test; - -import com.google.gson.Gson; -import org.sunbird.mvcjobs.samza.service.util.CassandraConnector; -import org.sunbird.mvcjobs.samza.service.util.MVCProcessorCassandraIndexer; -import 
org.sunbird.mvcsearchindex.elasticsearch.ElasticSearchUtil; -import org.sunbird.searchindex.util.HTTPUtil; -import org.junit.Before; -import org.junit.Test; -import org.junit.runner.RunWith; -import org.mockito.Mockito; -import org.mockito.MockitoAnnotations; -import org.powermock.api.mockito.PowerMockito; -import org.powermock.core.classloader.annotations.PowerMockIgnore; -import org.powermock.core.classloader.annotations.PrepareForTest; -import org.powermock.modules.junit4.PowerMockRunner; -import org.apache.samza.config.Config; - -import java.io.IOException; -import java.util.Map; - -import static org.mockito.Mockito.*; - -@RunWith(PowerMockRunner.class) -@PrepareForTest({ElasticSearchUtil.class, Config.class,MVCProcessorCassandraIndexer.class,CassandraConnector.class, HTTPUtil.class}) -@PowerMockIgnore({"javax.management.*", "sun.security.ssl.*", "javax.net.ssl.*" , "javax.crypto.*"}) -public class MVCProcessorCassandraTest { - private String uniqueId = "do_113041248230580224116"; - private String eventData = "{\"identifier\":\"do_113041248230580224116\",\"action\":\"update-es-index\",\"stage\":1,\"ownershipType\":[\"createdBy\"],\"code\":\"test.res.1\",\"channel\":\"in.ekstep\",\"language\":[\"English\"],\"mediaType\":\"content\",\"osId\":\"org.sunbird.quiz.app\",\"languageCode\":[\"en\"],\"version\":2,\"versionKey\":\"1591949601174\",\"license\":\"CC BY 4.0\",\"idealScreenDensity\":\"hdpi\",\"framework\":\"NCFCOPY\",\"s3Key\":\"content/do_113041248230580224116/artifact/validecml_1591949596304.zip\",\"createdBy\":\"95e4942d-cbe8-477d-aebd-ad8e6de4bfc8\",\"compatibilityLevel\":1,\"name\":\"Resource Content 1\",\"status\":\"Draft\",\"level1Concept\":[\"Addition\"],\"level1Name\":[\"Math-Magic\"],\"textbook_name\":[\"How Many Times?\"],\"sourceURL\":\"https://diksha.gov.in/play/content/do_30030488\",\"source\":[\"Diksha 1\"]}"; - private String eventData2 = "{\"action\":\"update-ml-keywords\",\"stage\":\"2\",\"ml_Keywords\":[\"maths\",\"addition\",\"add\"],\"ml_contentText\":\"This is the content text for addition of two numbers.\"}"; - private String eventData3 = "{\"action\":\"update-ml-contenttextvector\",\"stage\":3,\"ml_contentTextVector\":[[0.2961231768131256, 0.13621050119400024, 0.655802309513092, -0.33641257882118225]]}"; - private String getResp = "{\"id\":\"api.content.read\",\"ver\":\"1.0\",\"ts\":\"2020-07-21T05:38:46.301Z\",\"params\":{\"resmsgid\":\"7224a4d0-cb14-11ea-9313-0912071b8abe\",\"msgid\":\"722281f0-cb14-11ea-9313-0912071b8abe\",\"status\":\"successful\",\"err\":null,\"errmsg\":null},\"responseCode\":\"OK\",\"result\":{\"content\":{\"ownershipType\":[\"createdBy\"],\"code\":\"test.res.1\",\"channel\":\"in.ekstep\",\"language\":[\"English\"],\"mediaType\":\"content\",\"osId\":\"org.sunbird.quiz.app\",\"languageCode\":[\"en\"],\"version\":2,\"versionKey\":\"1591949601174\",\"license\":\"CC BY 4.0\",\"idealScreenDensity\":\"hdpi\",\"framework\":\"NCFCOPY\",\"s3Key\":\"content/do_113041248230580224116/artifact/validecml_1591949596304.zip\",\"createdBy\":\"95e4942d-cbe8-477d-aebd-ad8e6de4bfc8\",\"compatibilityLevel\":1,\"name\":\"Resource Content 1\",\"status\":\"Draft\",\"level1Concept\":[\"Addition\"],\"level1Name\":[\"Math-Magic\"],\"textbook_name\":[\"How Many Times?\"],\"sourceURL\":\"https://diksha.gov.in/play/content/do_30030488\",\"source\":[\"Diksha 1\"]}}}"; - private String postReqStage1Resp = "{\"id\":\"api.daggit\",\"params\":{\"err\":\"null\",\"errmsg\":\"Dag Initialization 
failed\",\"msgid\":\"\",\"resmsgid\":\"null\",\"status\":\"success\"},\"responseCode\":\"OK\",\"result\":{\"execution_date\":\"2020-07-08\",\"experiment_name\":\"Content_tagging_20200708-141923\",\"status\":200},\"ts\":\"2020-07-08 14:19:23:1594198163\",\"ver\":\"v1\"}"; - private String postReqStage2Resp = "{\"ets\":\"2020-07-14 15:27:23:1594720643\",\"id\":\"api.ml.vector\",\"params\":{\"err\":\"null\",\"errmsg\":\"null\",\"msgid\":\"\",\"resmsgid\":\"null\",\"status\":\"success\"},\"result\":{\"action\":\"get_BERT_embedding\",\"vector\":[[]]}}"; - private Config configMock; - - @Before - public void setup(){ - MockitoAnnotations.initMocks(this); - configMock = mock(Config.class); - stub(configMock.get("nested.fields")).toReturn("badgeAssertions,targets,badgeAssociations,plugins,me_totalTimeSpent,me_totalPlaySessionCount,me_totalTimeSpentInSec,batches"); - } - - @Test - public void testInsertToCassandraForStage1() throws Exception { - PowerMockito.mockStatic(HTTPUtil.class); - when(HTTPUtil.makePostRequest(Mockito.anyString(),Mockito.anyString())).thenReturn(postReqStage1Resp); - PowerMockito.mockStatic(CassandraConnector.class); - PowerMockito.doNothing().when(CassandraConnector.class); - CassandraConnector.updateContentProperties(Mockito.anyString(),Mockito.anyMap()); - MVCProcessorCassandraIndexer cassandraManager = new MVCProcessorCassandraIndexer(); - cassandraManager.insertIntoCassandra(getEvent(eventData),uniqueId); - } - @Test - public void testInsertToCassandraForStage2() throws Exception { - PowerMockito.mockStatic(HTTPUtil.class); - when(HTTPUtil.makePostRequest(Mockito.anyString(),Mockito.anyString())).thenReturn(postReqStage2Resp); - PowerMockito.mockStatic(CassandraConnector.class); - PowerMockito.doNothing().when(CassandraConnector.class); - CassandraConnector.updateContentProperties(Mockito.anyString(),Mockito.anyMap()); - MVCProcessorCassandraIndexer cassandraManager = new MVCProcessorCassandraIndexer(); - cassandraManager.insertIntoCassandra(getEvent(eventData2),uniqueId); - } - @Test - public void testInsertToCassandraForStage3() throws Exception { - PowerMockito.mockStatic(CassandraConnector.class); - PowerMockito.doNothing().when(CassandraConnector.class); - CassandraConnector.updateContentProperties(Mockito.anyString(),Mockito.anyMap()); - MVCProcessorCassandraIndexer cassandraManager = new MVCProcessorCassandraIndexer(); - cassandraManager.insertIntoCassandra(getEvent(eventData3),uniqueId); - } - - public Map getEvent(String message) throws IOException { - return new Gson().fromJson(message, Map.class); - } - -} diff --git a/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/MVCProcessorESIndexerTest.java b/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/MVCProcessorESIndexerTest.java deleted file mode 100644 index 848317d98f..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/test/java/org/sunbird/mvcjobs/samza/test/MVCProcessorESIndexerTest.java +++ /dev/null @@ -1,89 +0,0 @@ - -package org.sunbird.mvcjobs.samza.test; - -import com.google.gson.Gson; -import org.apache.commons.lang.StringUtils; -import org.sunbird.mvcjobs.samza.service.util.MVCProcessorESIndexer; -import org.sunbird.mvcsearchindex.elasticsearch.ElasticSearchUtil; -import static org.junit.Assert.assertTrue; - -import org.junit.Before; -import org.junit.Test; -import org.junit.runner.RunWith; -import org.mockito.Mockito; -import org.mockito.MockitoAnnotations; -import org.powermock.api.mockito.PowerMockito; 
-import org.powermock.core.classloader.annotations.PowerMockIgnore; -import org.powermock.core.classloader.annotations.PrepareForTest; -import org.powermock.modules.junit4.PowerMockRunner; -import org.apache.samza.config.Config; - -import java.io.IOException; -import java.util.Map; - -import static org.mockito.Mockito.*; - -@RunWith(PowerMockRunner.class) -@PrepareForTest({ElasticSearchUtil.class, Config.class, MVCProcessorESIndexer.class}) -@PowerMockIgnore({"javax.management.*", "sun.security.ssl.*", "javax.net.ssl.*" , "javax.crypto.*"}) -public class MVCProcessorESIndexerTest { - private String uniqueId = "do_113041248230580224116"; - private String eventDataNewDoc = "{\"identifier\":\"do_113041248230580224116\",\"action\":\"update-es-index\",\"stage\":1}"; - private String eventDataMlKeywords = "{\"action\":\"update-ml-keywords\",\"stage\":\"2\",\"ml_Keywords\":[\"maths\",\"addition\",\"add\"],\"ml_contentText\":\"This is the content text for addition of two numbers.\"}"; - private String eventDataContentRating = "{\"action\":\"update-content-rating\",\"stage\":4,\"metadata\":{\"me_averageRating\":\"1\",\"me_total_time_spent_in_app\":\"2\",\"me_total_time_spent_in_portal\":\"3\",\"me_total_time_spent_in_desktop\":\"4\",\"me_total_play_sessions_in_app\":\"5\",\"me_total_play_sessions_in_portal\":\"6\",\"me_total_play_sessions_in_desktop\":\"7\"}}"; - private String eventDataContentTextVector = "{\"action\":\"update-ml-contenttextvector\",\"stage\":3,\"ml_contentTextVector\":[[1.1,2,7.4,68]]}"; - private Config configMock; - private MVCProcessorESIndexer mvcProcessorESIndexer = new MVCProcessorESIndexer(); - - @Before - public void setup(){ - MockitoAnnotations.initMocks(this); - configMock = mock(Config.class); - stub(configMock.get("nested.fields")).toReturn("badgeAssertions,targets,badgeAssociations,plugins,me_totalTimeSpent,me_totalPlaySessionCount,me_totalTimeSpentInSec,batches"); - PowerMockito.mockStatic(ElasticSearchUtil.class); - PowerMockito.doNothing().when(ElasticSearchUtil.class); - } - - @Test - public void testUpsertDocumentCaseUpdateEsIndex() throws Exception { - ElasticSearchUtil.addDocumentWithId(Mockito.anyString(),Mockito.anyString(),Mockito.anyString()); - mvcProcessorESIndexer.upsertDocument(uniqueId,getEvent(eventDataNewDoc)); - when(ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString())).thenReturn(uniqueId); - String doc = ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString()); - assertTrue(StringUtils.contains(doc, uniqueId)); - } - - @Test - public void testUpsertDocumentUpdateMlKeywords() throws Exception { - ElasticSearchUtil.updateDocument(Mockito.anyString(),Mockito.anyString(),Mockito.anyString()); - mvcProcessorESIndexer.upsertDocument(uniqueId,getEvent(eventDataMlKeywords)); - when(ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString())).thenReturn(uniqueId); - String doc = ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString()); - assertTrue(StringUtils.contains(doc, uniqueId)); - } - - @Test - public void testUpsertDocumentUpdateMlContentTextVector() throws Exception { - ElasticSearchUtil.updateDocument(Mockito.anyString(),Mockito.anyString(),Mockito.anyString()); - mvcProcessorESIndexer.upsertDocument(uniqueId,getEvent(eventDataContentTextVector)); - when(ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString())).thenReturn(uniqueId); - String doc = 
ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString()); - assertTrue(StringUtils.contains(doc, uniqueId)); - } - - @Test - public void testUpsertDocumentUpdateContentRating() throws Exception { - ElasticSearchUtil.updateDocument(Mockito.anyString(),Mockito.anyString(),Mockito.anyString()); - when(ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString())).thenReturn(uniqueId); - mvcProcessorESIndexer.upsertDocument(uniqueId,getEvent(eventDataContentRating)); - String doc = ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString()); - assertTrue(StringUtils.contains(doc, uniqueId)); - } - - - public Map getEvent(String message) throws IOException { - return new Gson().fromJson(message, Map.class); - } - -} - diff --git a/platform-jobs/samza/mvc-processor-indexer/src/test/resources/application.conf b/platform-jobs/samza/mvc-processor-indexer/src/test/resources/application.conf deleted file mode 100644 index ca201c8819..0000000000 --- a/platform-jobs/samza/mvc-processor-indexer/src/test/resources/application.conf +++ /dev/null @@ -1,40 +0,0 @@ -# Graph Configuration -graph.dir=/data/graphDB -akka.request_timeout=30 -environment.id=10000000 -graph.ids=domain -graph.passport.key.base=31b6fd1c4d64e745c867e61a45edc34a -route.domain="bolt://localhost:7687" -route.bolt.write.domain="bolt://localhost:7687" -route.bolt.read.domain="bolt://localhost:7687" -route.bolt.comment.domain="bolt://localhost:7687" -route.all="bolt://localhost:7687" -route.bolt.write.all="bolt://localhost:7687" -route.bolt.read.all="bolt://localhost:7687" -route.bolt.comment.all="bolt://localhost:7687" -shard.id=1 -platform.auth.check.enabled=false -platform.cache.ttl=3600000 - -# Elasticsearch properties -search.es_conn_info="localhost:9200" -search.fields.query=["name^100","title^100","lemma^100","code^100","tags^100","domain","subject","description^10","keywords^25","ageGroup^10","filter^10","theme^10","genre^10","objects^25","contentType^100","language^200","teachingMode^25","skills^10","learningObjective^10","curriculum^100","gradeLevel^100","developer^100","attributions^10","owner^50","text","words","releaseNotes"] -search.fields.date=["lastUpdatedOn","createdOn","versionDate","lastSubmittedOn","lastPublishedOn"] -search.batch.size=500 -search.connection.timeout=30 -search.connection.timeout=30 -platform-api-url="http://localhost:8080/language-service" - -LearningActorSystem{ - default-dispatcher { - type = "Dispatcher" - executor = "fork-join-executor" - fork-join-executor { - parallelism-min = 1 - parallelism-factor = 2.0 - parallelism-max = 4 - } - # Throughput for default Dispatcher, set to 1 for as fair as possible - throughput = 1 - } -} diff --git a/platform-jobs/samza/pom.xml b/platform-jobs/samza/pom.xml deleted file mode 100644 index 687411d755..0000000000 --- a/platform-jobs/samza/pom.xml +++ /dev/null @@ -1,46 +0,0 @@ - - 4.0.0 - - org.sunbird - platform-jobs - 1.1-SNAPSHOT - - - UTF-8 - 0.14.1 - 1.8 - 2.11 - 2.6.2 - 1.1.0 - - org.sunbird - samza - pom - EkStep Platform Samza Jobs - This Project Contains all the backend jobs, they may be the Pipeline Consumers. 
- - common - course-common - publish-pipeline - qrcode-image-generator - distribution - qr-image-generator - merge-user-courses - auto-creator - mvc-processor-indexer - - - - - - org.apache.maven.plugins - maven-compiler-plugin - - ${java.version} - ${java.version} - - - - - \ No newline at end of file diff --git a/platform-jobs/samza/publish-pipeline/.gitignore b/platform-jobs/samza/publish-pipeline/.gitignore deleted file mode 100644 index b83d22266a..0000000000 --- a/platform-jobs/samza/publish-pipeline/.gitignore +++ /dev/null @@ -1 +0,0 @@ -/target/ diff --git a/platform-jobs/samza/publish-pipeline/pom.xml b/platform-jobs/samza/publish-pipeline/pom.xml deleted file mode 100644 index 01f8da356f..0000000000 --- a/platform-jobs/samza/publish-pipeline/pom.xml +++ /dev/null @@ -1,87 +0,0 @@ - - 4.0.0 - - org.sunbird - samza - 1.1-SNAPSHOT - - publish-pipeline - 0.0.386 - - - org.sunbird - samza-common - 1.1-SNAPSHOT - - - org.sunbird - content-manager - 1.1-beta - jar - - - org.apache.kafka - kafka-clients - - - - - org.sunbird - unit-tests - 1.1-SNAPSHOT - test - - - org.mockito - mockito-all - 1.10.19 - test - - - com.fasterxml.jackson.core - jackson-databind - 2.7.8 - - - com.fasterxml.jackson.core - jackson-core - 2.6.0 - - - com.fasterxml.jackson.core - jackson-annotations - 2.7.8 - - - - - - org.apache.maven.plugins - maven-compiler-plugin - - 1.8 - 1.8 - - - - - maven-assembly-plugin - - - src/main/assembly/src.xml - - - - - make-assembly - package - - single - - - - - - - diff --git a/platform-jobs/samza/publish-pipeline/src/main/assembly/src.xml b/platform-jobs/samza/publish-pipeline/src/main/assembly/src.xml deleted file mode 100644 index 70c2e6a187..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/assembly/src.xml +++ /dev/null @@ -1,69 +0,0 @@ - - - - - distribution - - tar.gz - - false - - - ${basedir} - - README* - LICENSE* - NOTICE* - - - - - - ${basedir}/src/main/resources/log4j.xml - lib - - - - ${basedir}/src/main/config/publish-pipeline.properties - config - true - - - - - bin - - org.apache.samza:samza-shell:tgz:dist:* - - 0744 - true - - - lib - - org.apache.samza:samza-api - org.sunbird:publish-pipeline - org.apache.samza:samza-core_2.11 - org.apache.samza:samza-kafka_2.11 - org.apache.samza:samza-yarn_2.11 - org.apache.samza:samza-log4j - org.apache.kafka:kafka_2.11 - org.apache.hadoop:hadoop-hdfs - - true - - - diff --git a/platform-jobs/samza/publish-pipeline/src/main/config/local.publish-pipeline.properties b/platform-jobs/samza/publish-pipeline/src/main/config/local.publish-pipeline.properties deleted file mode 100644 index a3049416de..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/config/local.publish-pipeline.properties +++ /dev/null @@ -1,167 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=dev.publish.pipeline - -# YARN -yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.dev.lp.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory - -# Task -task.class=org.sunbird.jobs.samza.task.PublishPipelineTask -#task.inputs=kafka.telemetry.raw -task.inputs=kafka.local.learning.job.request -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=1 
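# NOTE: the active task.opts line below attaches the JDWP debug agent with
# suspend=y, so the container JVM pauses on startup until a debugger connects
# to localhost:9009; drop suspend=y (or switch to the commented
# -Dfile.encoding=UTF8 variant a few lines down) when no debugger is attached.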
-task.commit.ms=60000
-task.opts=-agentlib:jdwp=transport=dt_socket,address=localhost:9009,server=y,suspend=y
-task.window.ms=300000
-#task.opts=-Dfile.encoding=UTF8
-task.broadcast.inputs=kafka.dev.system.command#0
-
-# Serializers
-serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory
-serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
-
-# Systems
-systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
-systems.kafka.samza.msg.serde=json
-systems.kafka.streams.metrics.samza.msg.serde=metrics
-systems.kafka.consumer.zookeeper.connect=localhost:2181
-systems.kafka.consumer.auto.offset.reset=largest
-systems.kafka.producer.bootstrap.servers=localhost:9092
-
-# Job Coordinator
-job.coordinator.system=kafka
-# Normally, this would be 3, but we have only one broker.
-job.coordinator.replication.factor=1
-
-# Job specific config properties
-graph.dir="/data/graphDB"
-redis.host=localhost
-redis.port=6379
-redis.maxConnections=128
-akka.request_timeout=30
-environment.id=10000000
-graph.ids=domain
-graph.passport.key.base=31b6fd1c4d64e745c867e61a45edc34a
-route.domain=bolt://localhost:7687
-route.bolt.write.domain=bolt://localhost:7687
-route.bolt.read.domain=bolt://localhost:7687
-route.bolt.comment.domain=bolt://localhost:7687
-route.all=bolt://localhost:7687
-route.bolt.write.all=bolt://localhost:7687
-route.bolt.read.all=bolt://localhost:7687
-route.bolt.comment.all=bolt://localhost:7687
-shard.id=1
-platform.auth.check.enabled=false
-platform.cache.ttl=3600000
-backend_telemetry_topic=local.telemetry.backend
-failed_event_topic=local.learning.job.request
-
-#Current environment
-cloud_storage.env=dev
-
-#Folder configuration
-cloud_storage.content.folder=content
-cloud_storage.itemset.folder = "itemset"
-cloud_storage.asset.folder=assets
-cloud_storage.artefact.folder=artifact
-cloud_storage.bundle.folder=bundle
-cloud_storage.media.folder=media
-cloud_storage.ecar.folder=ecar_files
-cloud_storage.upload.url.ttl=600
-
-
-# Media download configuration
-content.media.base.url=https://dev.ekstep.in
-plugin.media.base.url=https://dev.ekstep.in
-
-# Directory locations where unzipped files are stored
-dist.directory = /data/tmp/dist/
-output.zipfile = /data/tmp/story.zip
-source.folder = /data/tmp/temp2/
-save.directory = /data/tmp/temp/
-
-MAX_CONTENT_PACKAGE_FILE_SIZE_LIMIT = 52428800
-MAX_ASSET_FILE_SIZE_LIMIT = 20971520
-RETRY_ASSET_DOWNLOAD_COUNT = 1
-
-platform-api-url=http://localhost:8080/learning-service
-
-lp.tempfile.location=__lp_tmpfile_location__
-publish.collection.fullecar.disable=true
-max.iteration.count.samza.job=2
-publish.content.limit=200
-
-#Remote Debug Configuration
-#task.opts=-agentlib:jdwp=transport=dt_socket,address=localhost:9009,server=y,suspend=y
-
-# Metrics
-output.metrics.job.name=publish-pipeline
-output.metrics.topic.name=__env__.pipeline_metrics
-
-#Failed Topic Config
-output.failed.events.topic.name=local.learning.events.failed
-
-telemetry_env=LOCAL
-# Configuration for default channel ID
-channel.default=in.ekstep
-
-#Streamable media type list
-stream.mime.type=video/mp4,video/webm
-stream.keyspace.name=platform_db
-stream.keyspace.table=job_request
-
-cassandra.lp.connection=localhost:9042
-cassandra.lpa.connection=localhost:9042
-
-search.es_conn_info=localhost:9200
-
-#restrict.metadata.objectTypes=Content,ContentImage
-
-content.nested.fields=badgeAssertions,targets,badgeAssociations
-
-
-# Max size(width/height) of thumbnail in pixels
-max.thumbnail.size.pixels=150
-
-installation.id=
-
-# Cloud store details (Please replace them for local testing)
-cloud_storage_type=
-azure_storage_key=
-azure_storage_secret=
-
-azure_storage_container=
-aws_storage_key=
-aws_storage_secret=
-aws_storage_container=
-
-#Post publish Job topic name
-post.publish.event.topic=local.content.postpublish.request
-post.publish.mvc.topic=local.mvc.processor.job.request
-kp.print.service.base.url=http://11.2.2.4:5001
-lp.assessment.tmp_file_location=/tmp/
-lp.assessment.template_name=questionSetTemplate.vm
-
-content.tagging.backward_enable=true
-content.tagging.property=subject,medium
-
-# For enabling transfer of content from one path to other
-content.upload.context.driven=true
-
-# PDF generation for contents linked to ItemSet
-itemset.generate.pdf=true
-content.streaming_enabled=true
-
-# This is added to handle large artifacts sizes differently
-content.artifact.size.for_online=209715200
-
-#Content Type Primary Category mapping
-contentTypeToPrimaryCategory={\"ClassroomTeachingVideo\":\"Explanation Content\",\"ConceptMap\":\"Learning Resource\",\"Course\":\"Course\",\"CuriosityQuestionSet\":\"Practice Question Set\",\"eTextBook\":\"eTextbook\",\"ExperientialResource\":\"Learning Resource\",\"ExplanationResource\":\"Explanation Content\",\"ExplanationVideo\":\"Explanation Content\",\"FocusSpot\":\"Teacher Resource\",\"LearningOutcomeDefinition\":\"Teacher Resource\",\"MarkingSchemeRubric\":\"Teacher Resource\",\"PedagogyFlow\":\"Teacher Resource\",\"PracticeQuestionSet\":\"Practice Question Set\",\"PracticeResource\":\"Practice Question Set\",\"SelfAssess\":\"Course Assessment\",\"TeachingMethod\":\"Teacher Resource\",\"TextBook\":\"Digital Textbook\",\"Collection\":\"Content Playlist\",\"ExplanationReadingMaterial\":\"Learning Resource\",\"LearningActivity\":\"Learning Resource\",\"LessonPlan\":\"Content Playlist\",\"LessonPlanResource\":\"Teacher Resource\",\"PreviousBoardExamPapers\":\"Learning Resource\",\"TVLesson\":\"Explanation Content\",\"OnboardingResource\":\"Learning Resource\",\"ReadingMaterial\":\"Learning Resource\",\"Template\":\"Template\",\"Asset\":\"Asset\",\"Plugin\":\"Plugin\",\"LessonPlanUnit\":\"Lesson Plan Unit\",\"CourseUnit\":\"Course Unit\",\"TextBookUnit\":\"Textbook Unit\"}
\ No newline at end of file
diff --git a/platform-jobs/samza/publish-pipeline/src/main/config/publish-pipeline.properties b/platform-jobs/samza/publish-pipeline/src/main/config/publish-pipeline.properties
deleted file mode 100644
index fc8c47972b..0000000000
--- a/platform-jobs/samza/publish-pipeline/src/main/config/publish-pipeline.properties
+++ /dev/null
@@ -1,192 +0,0 @@
-# Job
-job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
-job.name=__env__.publish-pipeline
-job.container.count=__publish_pipeline_container_count__
-
-# YARN
-yarn.package.path=http://__yarn_host__:__yarn_port__/__env__/${project.artifactId}-${pom.version}-distribution.tar.gz
-yarn.container.memory.mb=__yarn_container_memory_mb__
-
-# Metrics
-metrics.reporters=snapshot,jmx
-metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
-metrics.reporter.snapshot.stream=kafka.__env__.metrics
-metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
-
-# Task
-task.class=org.sunbird.jobs.samza.task.PublishPipelineTask
-task.inputs=kafka.__env__.learning.job.request
-task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
-task.checkpoint.system=kafka
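# NOTE: double-underscore tokens such as __samza_checkpoint_replication_factor__
# and __env__ below are deployment-time placeholders, presumably substituted by
# the environment's provisioning scripts; this file is a template and is not
# usable until they are replaced (compare local.publish-pipeline.properties
# above, which carries concrete local values).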
-task.checkpoint.replication.factor=__samza_checkpoint_replication_factor__
-task.commit.ms=60000
-task.window.ms=300000
-task.opts=__publish_pipeline_task_opts__
-task.broadcast.inputs=kafka.__env__.system.command#0
-
-# Serializers
-serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory
-serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
-
-# Systems
-systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
-systems.kafka.samza.msg.serde=json
-systems.kafka.streams.metrics.samza.msg.serde=metrics
-systems.kafka.consumer.zookeeper.connect=__zookeepers__
-systems.kafka.consumer.auto.offset.reset=smallest
-systems.kafka.samza.offset.default=oldest
-systems.kafka.producer.bootstrap.servers=__kafka_brokers__
-
-
-# Job Coordinator
-job.coordinator.system=kafka
-# Normally, this would be 3, but we have only one broker.
-job.coordinator.replication.factor=__samza_coordinator_replication_factor__
-
-# Job specific configuration
-redis.host=__redis_host__
-redis.port=__redis_port__
-redis.maxConnections=128
-akka.request_timeout=600
-environment.id=__environment_id__
-graph.passport.key.base=__graph_passport_key__
-route.domain=__lp_bolt_url__
-route.bolt.read.domain=__lp_bolt_read_url__
-route.bolt.write.domain=__lp_bolt_write_url__
-route.all=__other_bolt_url__
-route.bolt.read.all=__other_bolt_read_url__
-route.bolt.write.all=__other_bolt_write_url__
-shard.id=__mw_shard_id__
-
-content.keyspace.name=__keyspace_name__
-content.keyspace.table=__keyspace_table__
-assessment.keyspace.name=__keyspace_name__
-hierarchy.keyspace.name=__hierarchy_keyspace_name__
-content.hierarchy.table=content_hierarchy
-CONTENT_TO_VEC_URL=__content_to_vec_url__
-platform-api-url=__lp_url__
-ekstepPlatformApiUserId=ilimi
-graph.dir="/data/graphDB"
-graph.ids=["domain","language","en","hi","ka","te","ta"]
-platform.auth.check.enabled=false
-platform.cache.ttl=3600000
-kafka.topics.backend.telemetry=__env__.telemetry.raw
-kafka.topics.failed=__env__.learning.job.request
-
-#Current environment
-cloud_storage.env=__cloud_storage_config_environment__
-
-#Folder configuration
-cloud_storage.content.folder=content
-cloud_storage.itemset.folder=itemset
-cloud_storage.asset.folder=assets
-cloud_storage.artefact.folder=artifact
-cloud_storage.bundle.folder=bundle
-cloud_storage.media.folder=media
-cloud_storage.ecar.folder=ecar_files
-cloud_storage.upload.url.ttl=600
-
-
-# Media download configuration
-content.media.base.url=__content_media_base_url__
-plugin.media.base.url=__plugin_media_base_url__
-
-# Directory locations where unzipped files are stored
-dist.directory=/tmp/dist/
-output.zipfile=/tmp/story.zip
-source.folder=/tmp/temp2/
-save.directory=/tmp/temp/
-
-MAX_CONTENT_PACKAGE_FILE_SIZE_LIMIT=52428800
-MAX_ASSET_FILE_SIZE_LIMIT=20971520
-RETRY_ASSET_DOWNLOAD_COUNT=1
-
-lp.tempfile.location=__lp_tmpfile_location__
-max.iteration.count.samza.job=__max_iteration_count_for_samza_job__
-publish.content.limit=200
-
-
-# Metrics
-output.metrics.job.name=publish-pipeline
-output.metrics.topic.name=__env__.pipeline_metrics
-
-#Failed Topic Config
-output.failed.events.topic.name=__env__.learning.events.failed
-
-telemetry_env=__env_name__
-installation.id=__installation_id__
-
-# Cloud store details
-cloud_storage_type=__cloud_storage_type__
-azure_storage_key=__azure_storage_key__
-azure_storage_secret=__azure_storage_secret__
-azure_storage_container=__azure_storage_container__
-aws_storage_key=__aws_access_key_id__
-aws_storage_secret=__aws_secret_access_key__
-aws_storage_container=__aws_storage_container__
-
-# Configuration for default channel ID
-channel.default=in.ekstep
-
-
-content.publish.invoke_web_hook=__invoke_web_hook__
-
-#Streamable media type list
-stream.mime.type=__streaming_mime_type__
-stream.keyspace.name=__env___platform_db
-stream.keyspace.table=job_request
-
-cassandra.lp.connection=__cassandra_lp_connection__
-cassandra.lpa.connection=__cassandra_lpa_connection__
-
-#restrict.metadata.objectTypes=Content,ContentImage
-
-kafka.topic.system.command=__env__.system.command
-
-# Consistency Level for Multi Node Cassandra cluster
-cassandra.lp.consistency.level=QUORUM
-
-compositesearch.index.name=__compositesearch_index_name__
-
-content.nested.fields=badgeAssertions,targets,badgeAssociations
-
-search.es_conn_info=__search_es_host__
-
-# Content Tagging Config for Backward Compatibility in Mobile App
-
-content.cache.read=true
-content.cache.hierarchy=true
-
-# Max size(width/height) of thumbnail in pixels
-max.thumbnail.size.pixels=150
-
-#Post publish Job topic name
-post.publish.event.topic=__env__.content.postpublish.request
-post.publish.mvc.topic=__env__.mvc.processor.job.request
-# Print service config
-#kp.print.service.base.url=__kp_print_service_base_url__
-kp.print.service.base.url=__kp_print_service_base_url__
-lp.assessment.tmp_file_location=/tmp/
-lp.assessment.template_name=questionSetTemplate.vm
-
-# Content Tagging Config for Backward Compatibility in Mobile App
-content.tagging.backward_enable=true
-content.tagging.property=subject,medium
-
-# For enabling transfer of content from one path to other
-content.upload.context.driven=true
-
-# PDF generation for contents linked to ItemSet
-itemset.generate.pdf=__itemset_generate_pdf__
-content.streaming_enabled=__content_streaming_enabled__
-
-#Configuration added to handle large artifacts
-content.artifact.size.for_online=209715200
-
-#Content Type Primary Category mapping
-contentTypeToPrimaryCategory={\"ClassroomTeachingVideo\":\"Explanation Content\",\"ConceptMap\":\"Learning Resource\",\"Course\":\"Course\",\"CuriosityQuestionSet\":\"Practice Question Set\",\"eTextBook\":\"eTextbook\",\"ExperientialResource\":\"Learning Resource\",\"ExplanationResource\":\"Explanation Content\",\"ExplanationVideo\":\"Explanation Content\",\"FocusSpot\":\"Teacher Resource\",\"LearningOutcomeDefinition\":\"Teacher Resource\",\"MarkingSchemeRubric\":\"Teacher Resource\",\"PedagogyFlow\":\"Teacher Resource\",\"PracticeQuestionSet\":\"Practice Question Set\",\"PracticeResource\":\"Practice Question Set\",\"SelfAssess\":\"Course Assessment\",\"TeachingMethod\":\"Teacher Resource\",\"TextBook\":\"Digital Textbook\",\"Collection\":\"Content Playlist\",\"ExplanationReadingMaterial\":\"Learning Resource\",\"LearningActivity\":\"Learning Resource\",\"LessonPlan\":\"Content Playlist\",\"LessonPlanResource\":\"Teacher Resource\",\"PreviousBoardExamPapers\":\"Learning Resource\",\"TVLesson\":\"Explanation Content\",\"OnboardingResource\":\"Learning Resource\",\"ReadingMaterial\":\"Learning Resource\",\"Template\":\"Template\",\"Asset\":\"Asset\",\"Plugin\":\"Plugin\",\"LessonPlanUnit\":\"Lesson Plan Unit\",\"CourseUnit\":\"Course Unit\",\"TextBookUnit\":\"Textbook Unit\"}
-
-# master Category Cache Properties
-master.category.cache.read=true
-master.category.cache.ttl=86400
-master.category.validation.enabled=__master_category_validation_enabled__
\ No newline at end of file
diff --git
a/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/service/PublishPipelineService.java b/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/service/PublishPipelineService.java deleted file mode 100644 index 7926c45733..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/service/PublishPipelineService.java +++ /dev/null @@ -1,366 +0,0 @@ -package org.sunbird.jobs.samza.service; - -import com.fasterxml.jackson.core.type.TypeReference; -import com.fasterxml.jackson.databind.ObjectMapper; -import org.apache.commons.collections.MapUtils; -import org.apache.commons.io.FileUtils; -import org.apache.commons.lang3.StringUtils; -import org.apache.samza.config.Config; -import org.apache.samza.system.OutgoingMessageEnvelope; -import org.apache.samza.system.SystemStream; -import org.apache.samza.task.MessageCollector; -import org.sunbird.common.Platform; -import org.sunbird.common.exception.ClientException; -import org.sunbird.common.exception.ServerException; -import org.sunbird.content.common.ContentErrorMessageConstants; -import org.sunbird.content.enums.ContentErrorCodeConstants; -import org.sunbird.content.enums.ContentWorkflowPipelineParams; -import org.sunbird.content.pipeline.initializer.InitializePipeline; -import org.sunbird.content.publish.PublishManager; -import org.sunbird.graph.dac.model.Node; -import org.sunbird.jobs.samza.exception.PlatformErrorCodes; -import org.sunbird.jobs.samza.exception.PlatformException; -import org.sunbird.jobs.samza.service.task.JobMetrics; -import org.sunbird.jobs.samza.util.FailedEventsUtil; -import org.sunbird.jobs.samza.util.JSONUtils; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.jobs.samza.util.PublishPipelineParams; -import org.sunbird.learning.router.LearningRequestRouterPool; -import org.sunbird.learning.util.ControllerUtil; -import org.sunbird.telemetry.dto.TelemetryBJREvent; -import org.sunbird.telemetry.logger.TelemetryManager; - -import com.rits.cloning.Cloner; - -import java.io.File; -import java.text.SimpleDateFormat; -import java.util.*; - -public class PublishPipelineService implements ISamzaService { - - private static JobLogger LOGGER = new JobLogger(PublishPipelineService.class); - - private Map parameterMap = new HashMap(); - - protected static final String DEFAULT_CONTENT_IMAGE_OBJECT_SUFFIX = ".img"; - - private ControllerUtil util = new ControllerUtil(); - - private Config config = null; - - private static int MAXITERTIONCOUNT = 2; - - private SystemStream systemStream = null; - private SystemStream postPublishStream = null; - private SystemStream postPublishMVCStream = null; - private static SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ"); - - private static ObjectMapper mapper = new ObjectMapper(); - - protected int getMaxIterations() { - if (Platform.config.hasPath("max.iteration.count.samza.job")) - return Platform.config.getInt("max.iteration.count.samza.job"); - else - return MAXITERTIONCOUNT; - } - - @Override - public void initialize(Config config) throws Exception { - this.config = config; - JSONUtils.loadProperties(config); - LOGGER.info("Service config initialized"); - LearningRequestRouterPool.init(); - LOGGER.info("Akka actors initialized"); - systemStream = new SystemStream("kafka", config.get("output.failed.events.topic.name")); - LOGGER.info("Stream initialized for Failed Events"); - postPublishStream = new SystemStream("kafka", config.get("post.publish.event.topic")); - LOGGER.info("Stream 
initialized for Post Publish Events"); - postPublishMVCStream = new SystemStream("kafka",config.get("post.publish.mvc.topic")); - LOGGER.info("Stream initialized for Post Publish MVC Content Events"); - - } - - @Override - @SuppressWarnings("unchecked") - public void processMessage(Map message, JobMetrics metrics, MessageCollector collector) - throws Exception { - - if (null == message) { - LOGGER.info("Ignoring the message because it is not valid for publishing."); - return; - } - Map edata = (Map) message.get(PublishPipelineParams.edata.name()); - Map object = (Map) message.get(PublishPipelineParams.object.name()); - - if (!validateObject(edata) || null == object) { - LOGGER.info("Ignoring the message because it is not valid for publishing."); - return; - } - - String nodeId = (String) object.get(PublishPipelineParams.id.name()); - if (StringUtils.isNotBlank(nodeId)) { - try { - Node node = getNode(nodeId); - if (null != node) { - if (prePublishValidation(node, (Map) edata.get("metadata"))) { - LOGGER.info( - "Node fetched for publish and content enrichment operation : " + node.getIdentifier()); - prePublishUpdate(edata, node); - - processJob(edata, nodeId, metrics, collector); - } - } else { - metrics.incSkippedCounter(); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.PROCESSING_ERROR.name(), new ServerException("ERR_PUBLISH_PIPELINE", "Please check neo4j connection or identfier to publish")); - LOGGER.debug("Invalid Node Object. Unable to process the event", message); - } - } catch (PlatformException e) { - LOGGER.error("Failed to process message", message, e); - metrics.incFailedCounter(); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.PROCESSING_ERROR.name(), e); - } catch (Exception e) { - LOGGER.error("Failed to process message", message, e); - metrics.incErrorCounter(); - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.SYSTEM_ERROR.name(), e); - } - } else { - FailedEventsUtil.pushEventForRetry(systemStream, message, metrics, collector, - PlatformErrorCodes.SYSTEM_ERROR.name(), new ServerException("ERR_PUBLISH_PIPELINE", "Id is blank")); - metrics.incSkippedCounter(); - LOGGER.debug("Invalid NodeId. Unable to process the event", message); - } - } - - private boolean prePublishValidation(Node node, Map eventMetadata) { - Map objMetadata = (Map) node.getMetadata(); - - double eventPkgVersion = ((eventMetadata.get("pkgVersion") == null) ? 0d - : ((Number)eventMetadata.get("pkgVersion")).doubleValue()); - double objPkgVersion = ((objMetadata.get("pkgVersion") == null) ? 
0d : ((Number) objMetadata.get("pkgVersion")).doubleValue()); - - return (objPkgVersion <= eventPkgVersion); - } - - private void processJob(Map edata, String contentId, JobMetrics metrics, MessageCollector collector) throws Exception { - - Node node = getNode(contentId); - String publishType = (String) edata.get(PublishPipelineParams.publish_type.name()); - node.getMetadata().put(PublishPipelineParams.publish_type.name(), publishType); - publishContent(node, edata, metrics, collector); - } - - @SuppressWarnings("unchecked") - private void prePublishUpdate(Map edata, Node node) { - Map metadata = (Map) edata.get("metadata"); - node.getMetadata().putAll(metadata); - - String prevState = (String) node.getMetadata().get(ContentWorkflowPipelineParams.status.name()); - node.getMetadata().put(ContentWorkflowPipelineParams.prevState.name(), prevState); - node.getMetadata().put("status", "Processing"); - - util.updateNode(node); - edata.put(PublishPipelineParams.status.name(), PublishPipelineParams.Processing.name()); - LOGGER.debug("Node status :: Processing for NodeId :: " + node.getIdentifier()); - } - - private Node getNode(String nodeId) { - Node node = null; - String imgNodeId = nodeId + DEFAULT_CONTENT_IMAGE_OBJECT_SUFFIX; - node = util.getNode(PublishPipelineParams.domain.name(), imgNodeId); - if (null == node) { - node = util.getNode(PublishPipelineParams.domain.name(), nodeId); - } - return node; - } - - private void publishContent(Node node, Map edata, JobMetrics metrics, MessageCollector collector) throws Exception { - boolean published = true; - LOGGER.debug("Publish processing start for content: " + node.getIdentifier()); - publishNode(node, (String) node.getMetadata().get(PublishPipelineParams.mimeType.name())); - Node publishedNode = getNode(node.getIdentifier().replace(".img", "")); - if (StringUtils.equalsIgnoreCase((String) publishedNode.getMetadata().get(PublishPipelineParams.status.name()), - PublishPipelineParams.Failed.name())) { - edata.put(PublishPipelineParams.status.name(), PublishPipelineParams.FAILED.name()); - LOGGER.debug("Node publish operation :: FAILED :: For NodeId :: " + node.getIdentifier()); - throw new PlatformException(PlatformErrorCodes.PUBLISH_FAILED.name(), - "Node publish operation failed for Node Id:" + node.getIdentifier()); - } else { - metrics.incSuccessCounter(); - edata.put(PublishPipelineParams.status.name(), PublishPipelineParams.SUCCESS.name()); - LOGGER.debug("Node publish operation :: SUCCESS :: For NodeId :: " + node.getIdentifier()); - pushInstructionEvent(publishedNode, collector); - } - } - - protected static String format(Date date) { - if (null != date) { - try { - return sdf.format(date); - } catch (Exception e) { - TelemetryManager.error("Error! While Converting the Date Format."+ date, e); - } - } - return null; - } - - private void publishNode(Node node, String mimeType) { - if (null == node) - throw new ClientException(ContentErrorCodeConstants.INVALID_CONTENT.name(), - ContentErrorMessageConstants.INVALID_CONTENT - + " | ['null' or Invalid Content Node (Object). 
Async Publish Operation Failed.]"); - Cloner cloner = new Cloner(); - Node cloneNode = cloner.deepClone(node); - String nodeId = node.getIdentifier().replace(".img", ""); - LOGGER.info("Publish processing start for node: " + nodeId); - String basePath = PublishManager.getBasePath(nodeId, this.config.get("lp.tempfile.location")); - LOGGER.info("Base path to store files: " + basePath); - try { - setContentBody(node, mimeType); - LOGGER.debug("Fetched body from cassandra"); - parameterMap.put(PublishPipelineParams.node.name(), node); - parameterMap.put(PublishPipelineParams.ecmlType.name(), PublishManager.isECMLContent(mimeType)); - LOGGER.info("Initializing the publish pipeline for: " + node.getIdentifier()); - InitializePipeline pipeline = new InitializePipeline(basePath, nodeId); - pipeline.init(PublishPipelineParams.publish.name(), parameterMap); - } catch (Exception e) { - e.printStackTrace(); - LOGGER.info( - "Something Went Wrong While Performing 'Content Publish' Operation in Async Mode. | [Content Id: " - + nodeId + "]", - e.getMessage()); - cloneNode.getMetadata().put(PublishPipelineParams.publishError.name(), e.getMessage()); - cloneNode.getMetadata().put(PublishPipelineParams.status.name(), PublishPipelineParams.Failed.name()); - util.updateNode(cloneNode); - } finally { - try { - FileUtils.deleteDirectory(new File(basePath.replace(nodeId, ""))); - } catch (Exception e2) { - LOGGER.error("Error while deleting base Path: " + basePath, e2); - e2.printStackTrace(); - } - } - } - - private void setContentBody(Node node, String mimeType) { - if (PublishManager.isECMLContent(mimeType)) { - node.getMetadata().put(PublishPipelineParams.body.name(), - PublishManager.getContentBody(node.getIdentifier())); - } - } - - private boolean validateObject(Map edata) { - String action = (String) edata.get("action"); - String contentType = (String) edata.get(PublishPipelineParams.contentType.name()); - Integer iteration = (Integer) edata.get(PublishPipelineParams.iteration.name()); - //TODO: remove contentType validation - if (StringUtils.equalsIgnoreCase("publish", action) && (!StringUtils.equalsIgnoreCase(contentType, - PublishPipelineParams.Asset.name())) && (iteration <= getMaxIterations())) { - return true; - } - return false; - } - - private void pushInstructionEvent(Node node, MessageCollector collector) throws Exception { - Map actor = new HashMap(); - Map context = new HashMap(); - Map object = new HashMap(); - Map edata = new HashMap(); - String mimeType = (String) node.getMetadata().get("mimeType"); - String sourceURL = node.getMetadata().get("sourceURL") != null ? (String)node.getMetadata().get("sourceURL") : null; - if(StringUtils.isNotBlank(sourceURL)){ - Map mvcProcessorEvent = generateInstructionEventMetadata(actor, context, object, edata, node.getMetadata(), node.getIdentifier(), "link-dialcode"); - mvcProcessorEvent= updatevaluesForMVCEvent(mvcProcessorEvent); - if (MapUtils.isEmpty(mvcProcessorEvent)) { - TelemetryManager.error("Post Publish event is not generated properly. 
#postPublishJob : " + mvcProcessorEvent); - throw new ClientException("MVC_JOB_REQUEST_EXCEPTION", "Event is not generated properly."); - } - collector.send(new OutgoingMessageEnvelope(postPublishMVCStream, mvcProcessorEvent)); - LOGGER.info("All Events sent to post publish mvc event topic"); - } - - Map postPublishEvent = generateInstructionEventMetadata(actor, context, object, edata, node.getMetadata(), node.getIdentifier(), "post-publish-process"); - if (MapUtils.isEmpty(postPublishEvent)) { - TelemetryManager.error("Post Publish event is not generated properly. #postPublishJob : " + postPublishEvent); - throw new ClientException("BE_JOB_REQUEST_EXCEPTION", "Event is not generated properly."); - } - collector.send(new OutgoingMessageEnvelope(postPublishStream, postPublishEvent)); - - LOGGER.info("All Events sent to post publish event topic"); - } - - Map updatevaluesForMVCEvent(Map mvcProcessorEvent) { - mvcProcessorEvent.put("eventData",mvcProcessorEvent.get("edata")); - mvcProcessorEvent.put("eid","MVC_JOB_PROCESSOR"); - mvcProcessorEvent.remove("edata"); - Map eventData = (Map) mvcProcessorEvent.get("eventData"); - eventData.put("identifier",eventData.get("id")); - eventData.remove("id"); - eventData.remove("iteration"); - eventData.remove("mimeType"); - eventData.remove("contentType"); - eventData.remove("pkgVersion"); - eventData.remove("status"); - eventData.put("action","update-es-index"); - eventData.put("stage",1); - return mvcProcessorEvent; - } - - private Map generateInstructionEventMetadata(Map actor, Map context, - Map object, Map edata, Map metadata, String contentId, String action) { - TelemetryBJREvent te = new TelemetryBJREvent(); - actor.put("id", "Post Publish Processor"); - actor.put("type", "System"); - context.put("channel", metadata.get("channel")); - Map pdata = new HashMap<>(); - pdata.put("id", "org.sunbird.platform"); - pdata.put("ver", "1.0"); - context.put("pdata", pdata); - if (Platform.config.hasPath("cloud_storage.env")) { - String env = Platform.config.getString("cloud_storage.env"); - context.put("env", env); - } - - object.put("id", contentId); - object.put("ver", metadata.get("versionKey")); - - edata.put("action", action); - edata.put("contentType", metadata.get("contentType")); - edata.put("status", metadata.get("status")); - // TODO: remove 'id' after mvc-processor handled it. - edata.put("id", contentId); - edata.put("identifier", contentId); - edata.put("pkgVersion", metadata.get("pkgVersion")); - edata.put("mimeType", metadata.get("mimeType")); - edata.put("name", metadata.get("name")); - edata.put("createdBy", metadata.get("createdBy")); - edata.put("createdFor", metadata.get("createdFor")); - edata.put("trackable", metadata.get("trackable")); - if (metadata.get("artifactUrl") != null) { - edata.put("artifactUrl", metadata.get("artifactUrl")); - } - - // generate event structure - long unixTime = System.currentTimeMillis(); - String mid = "LP." + System.currentTimeMillis() + "." 
+ UUID.randomUUID(); - edata.put("iteration", 1); - te.setEid("BE_JOB_REQUEST"); - te.setEts(unixTime); - te.setMid(mid); - te.setActor(actor); - te.setContext(context); - te.setObject(object); - te.setEdata(edata); - Map event = null; - try { - event = mapper.convertValue(te, new TypeReference>() { - }); - } catch (Exception e) { - TelemetryManager.error("Error Generating BE_JOB_REQUEST event: " + e.getMessage(), e); - } - return event; - } - -} diff --git a/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/task/PublishPipelineTask.java b/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/task/PublishPipelineTask.java deleted file mode 100644 index a78d5a4c98..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/task/PublishPipelineTask.java +++ /dev/null @@ -1,42 +0,0 @@ -package org.sunbird.jobs.samza.task; - -import java.util.Map; - -import org.apache.samza.task.MessageCollector; -import org.apache.samza.task.TaskCoordinator; -import org.sunbird.jobs.samza.service.ISamzaService; -import org.sunbird.jobs.samza.service.PublishPipelineService; -import org.sunbird.jobs.samza.util.JobLogger; - -public class PublishPipelineTask extends AbstractTask { - - private static JobLogger LOGGER = new JobLogger(PublishPipelineTask.class); - private ISamzaService service = new PublishPipelineService(); - - public ISamzaService initialize() throws Exception { - LOGGER.info("Task initialized"); - this.jobType = "publish"; - this.jobStartMessage = "Started processing of publish samza job"; - this.jobEndMessage = "Publish job processing complete"; - this.jobClass = "org.sunbird.jobs.samza.task.PublishPipelineTask"; - - return service; - } - - @Override - public void process(Map message, MessageCollector collector, TaskCoordinator coordinator) throws Exception { - try { - //LOGGER.info("Starting of service.processMessage..."); - long startTime = System.currentTimeMillis(); - LOGGER.info("Starting of service.processMessage at :: " + startTime); - service.processMessage(message, metrics, collector); - //LOGGER.info("Completed service.processMessage..."); - long endTime = System.currentTimeMillis(); - LOGGER.info("Completed service.processMessage at :: " + endTime); - LOGGER.info("Total execution time to complete publish operation :: " + (endTime-startTime)); - } catch (Exception e) { - metrics.incErrorCounter(); - LOGGER.error("Message processing failed", message, e); - } - } -} \ No newline at end of file diff --git a/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/util/PublishPipelineParams.java b/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/util/PublishPipelineParams.java deleted file mode 100644 index b72f3dff04..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/java/org/sunbird/jobs/samza/util/PublishPipelineParams.java +++ /dev/null @@ -1,11 +0,0 @@ -package org.sunbird.jobs.samza.util; - -public enum PublishPipelineParams { - - taxonomy, taxonomy_hierarchy,search_criteria, property_keys, unique_constraint, status, Live, Unlisted, isImageObject, node_id, - cwp_element_name, param, nodeUniqueId, nodeType, state, Asset, contentType, - transactionData,properties, Flagged, FlagDraft, domain, FlagReview, data, eid, id, ecmlType, - node, Processing, Draft, edata, eks, content, mimeType, publish, Failed, publishError, body, Collection, - gradeLevel, ageGroup, medium, subject, genre, theme, keywords, concepts, visibility, channel, Default, versionKey, - 
BE_JOB_REQUEST, Content, cid, object, Pending, FAILED, SUCCESS, iteration, publish_type, kafka, Parent, children; -} diff --git a/platform-jobs/samza/publish-pipeline/src/main/resources/actor-config.xml b/platform-jobs/samza/publish-pipeline/src/main/resources/actor-config.xml deleted file mode 100644 index f349225f43..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/resources/actor-config.xml +++ /dev/null @@ -1,24 +0,0 @@ - - - - - - - - - - - - - - - - - - - - - - \ No newline at end of file diff --git a/platform-jobs/samza/publish-pipeline/src/main/resources/application.conf b/platform-jobs/samza/publish-pipeline/src/main/resources/application.conf deleted file mode 100644 index 55482aeae5..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/resources/application.conf +++ /dev/null @@ -1,13 +0,0 @@ -LearningActorSystem{ - default-dispatcher { - type = "Dispatcher" - executor = "fork-join-executor" - fork-join-executor { - parallelism-min = 1 - parallelism-factor = 2.0 - parallelism-max = 4 - } - # Throughput for default Dispatcher, set to 1 for as fair as possible - throughput = 1 - } -} diff --git a/platform-jobs/samza/publish-pipeline/src/main/resources/log4j.xml b/platform-jobs/samza/publish-pipeline/src/main/resources/log4j.xml deleted file mode 100644 index d2db3940cc..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/resources/log4j.xml +++ /dev/null @@ -1,20 +0,0 @@ - - - - - - - - - - - - - - - - - - - - \ No newline at end of file diff --git a/platform-jobs/samza/publish-pipeline/src/main/resources/questionSetTemplate.vm b/platform-jobs/samza/publish-pipeline/src/main/resources/questionSetTemplate.vm deleted file mode 100644 index baf0b5d378..0000000000 --- a/platform-jobs/samza/publish-pipeline/src/main/resources/questionSetTemplate.vm +++ /dev/null @@ -1,76 +0,0 @@ -
- [questionSetTemplate.vm body (76 lines): the HTML/Velocity markup was lost in extraction; the template rendered $title, then the $questions block, an "Answers" heading, and the $answers block]
\ No newline at end of file diff --git a/platform-jobs/samza/qr-image-generator/pom.xml b/platform-jobs/samza/qr-image-generator/pom.xml deleted file mode 100644 index 3980c3fc02..0000000000 --- a/platform-jobs/samza/qr-image-generator/pom.xml +++ /dev/null @@ -1,45 +0,0 @@ - - - - samza - org.sunbird - 1.1-SNAPSHOT - - 4.0.0 - qr-image-generator - 1.1-SNAPSHOT - - - - com.google.zxing - core - 3.3.3 - - - com.google.zxing - javase - 3.3.3 - - - org.apache.commons - commons-lang3 - 3.8.1 - - - - - - - org.apache.maven.plugins - maven-compiler-plugin - - 1.8 - 1.8 - - - - - - \ No newline at end of file diff --git a/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/generator/QRImageGenerator.java b/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/generator/QRImageGenerator.java deleted file mode 100644 index 7076baaa66..0000000000 --- a/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/generator/QRImageGenerator.java +++ /dev/null @@ -1,281 +0,0 @@ -package org.sunbird.qrimage.generator; - -import com.google.zxing.BarcodeFormat; -import com.google.zxing.EncodeHintType; -import com.google.zxing.NotFoundException; -import com.google.zxing.WriterException; -import com.google.zxing.client.j2se.BufferedImageLuminanceSource; -import com.google.zxing.common.BitMatrix; -import com.google.zxing.common.HybridBinarizer; -import com.google.zxing.qrcode.QRCodeWriter; -import com.google.zxing.qrcode.decoder.ErrorCorrectionLevel; -import org.apache.commons.lang3.StringUtils; -import org.sunbird.qrimage.request.QRImageConfig; -import org.sunbird.qrimage.request.QRImageRequest; - -import javax.imageio.ImageIO; -import java.awt.*; -import java.awt.font.TextAttribute; -import java.awt.image.BufferedImage; -import java.io.File; -import java.io.IOException; -import java.io.InputStream; -import java.util.HashMap; -import java.util.List; -import java.util.Map; -import java.util.stream.Collectors; - -public class QRImageGenerator { - - private static QRImageConfig config = getDefaultConfig(); - private static QRCodeWriter qrCodeWriter = new QRCodeWriter(); - private static Map fontStore = new HashMap(); - - - public static File generateQRImage(QRImageRequest request) throws Exception { - if (null != request && null == request.getConfig()) - request.setConfig(config); - - List dataList = request.getData(); - String data = dataList.stream().collect(Collectors.joining(",")); - String text = request.getText(); - String fileName = request.getFileName(); - - String errorCorrectionLevel = request.getConfig().getErrorCorrectionLevel(); - int pixelsPerBlock = request.getConfig().getPixelsPerBlock(); - int qrMargin = request.getConfig().getQrCodeMargin(); - String fontName = request.getConfig().getTextFontName(); - int fontSize = request.getConfig().getTextFontSize(); - double tracking = request.getConfig().getTextCharacterSpacing(); - String imageFormat = request.getConfig().getFileFormat(); - String colorModel = request.getConfig().getColorModel(); - int borderSize = request.getConfig().getImageBorderSize(); - int qrMarginBottom = request.getConfig().getQrCodeMarginBottom(); - int imageMargin = request.getConfig().getImageMargin(); - - BufferedImage qrImage = generateBaseImage(data, errorCorrectionLevel, pixelsPerBlock, qrMargin, colorModel); - - if (StringUtils.isNotBlank(text)) { - BufferedImage textImage = getTextImage(text, fontName, fontSize, tracking, colorModel); - qrImage = addTextToBaseImage(qrImage, textImage, colorModel, qrMargin, 
pixelsPerBlock, qrMarginBottom, imageMargin); - } - - if (borderSize > 0) { - drawBorder(qrImage, borderSize, imageMargin); - } - - File finalImageFile = new File(request.getTempFileLocation() + File.separator + fileName + "." + imageFormat); - finalImageFile.createNewFile(); - ImageIO.write(qrImage, imageFormat, finalImageFile); - return finalImageFile; - } - - - private static QRImageConfig getDefaultConfig(){ - QRImageConfig config = new QRImageConfig(); - config.setFileFormat("png"); - config.setErrorCorrectionLevel("H"); - config.setPixelsPerBlock(2); - config.setColorModel("Grayscale"); - config.setTextFontName("Verdana"); - config.setTextFontSize(11); - config.setTextCharacterSpacing(0.1); - config.setQrCodeMargin(3); - config.setImageBorderSize(1); - config.setImageMargin(1); - config.setQrCodeMarginBottom(1); - return config; - } - - private static BufferedImage generateBaseImage(String data, String errorCorrectionLevel, int pixelsPerBlock, int qrMargin, String colorModel) throws WriterException { - Map hintsMap = getHintsMap(errorCorrectionLevel, qrMargin); - BitMatrix defaultBitMatrix = getDefaultBitMatrix(data, hintsMap); - BitMatrix largeBitMatrix = getBitMatrix(data, defaultBitMatrix.getWidth() * pixelsPerBlock, defaultBitMatrix.getHeight() * pixelsPerBlock, hintsMap); - BufferedImage qrImage = getImage(largeBitMatrix, colorModel); - return qrImage; - } - - //Sample = 2A42UH , Verdana, 11, 0.1, Grayscale - private static BufferedImage getTextImage(String text, String fontName, int fontSize, double tracking, String colorModel) throws IOException, FontFormatException { - BufferedImage image = new BufferedImage(1, 1, getImageType(colorModel)); - Font basicFont = getFontFromStore(fontName); - - Map attributes = new HashMap(); - attributes.put(TextAttribute.TRACKING, tracking); - attributes.put(TextAttribute.WEIGHT, TextAttribute.WEIGHT_BOLD); - attributes.put(TextAttribute.SIZE, fontSize); - Font font = basicFont.deriveFont(attributes); - - Graphics2D graphics2d = image.createGraphics(); - graphics2d.setFont(font); - FontMetrics fontmetrics = graphics2d.getFontMetrics(); - int width = fontmetrics.stringWidth(text); - int height = fontmetrics.getHeight(); - graphics2d.dispose(); - - image = new BufferedImage(width, height, getImageType(colorModel)); - graphics2d = image.createGraphics(); - graphics2d.setRenderingHint(RenderingHints.KEY_ALPHA_INTERPOLATION, RenderingHints.VALUE_ALPHA_INTERPOLATION_QUALITY); - graphics2d.setRenderingHint(RenderingHints.KEY_COLOR_RENDERING, RenderingHints.VALUE_COLOR_RENDER_QUALITY); - graphics2d.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_OFF); - graphics2d.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_OFF); - - graphics2d.setColor(Color.WHITE); - graphics2d.fillRect(0, 0, image.getWidth(), image.getHeight()); - graphics2d.setColor(Color.BLACK); - - graphics2d.setFont(font); - fontmetrics = graphics2d.getFontMetrics(); - graphics2d.drawString(text, 0, fontmetrics.getAscent()); - graphics2d.dispose(); - - return image; - } - - private static BufferedImage addTextToBaseImage(BufferedImage qrImage, BufferedImage textImage, String colorModel, int qrMargin, int pixelsPerBlock, int qrMarginBottom, int imageMargin) throws NotFoundException { - BufferedImageLuminanceSource qrSource = new BufferedImageLuminanceSource(qrImage); - HybridBinarizer qrBinarizer = new HybridBinarizer(qrSource); - BitMatrix qrBits = qrBinarizer.getBlackMatrix(); - - BufferedImageLuminanceSource 
textSource = new BufferedImageLuminanceSource(textImage); - HybridBinarizer textBinarizer = new HybridBinarizer(textSource); - BitMatrix textBits = textBinarizer.getBlackMatrix(); - - if (qrBits.getWidth() > textBits.getWidth()) { - BitMatrix tempTextMatrix = new BitMatrix(qrBits.getWidth(), textBits.getHeight()); - copyMatrixDataToBiggerMatrix(textBits, tempTextMatrix); - textBits = tempTextMatrix; - } else if (qrBits.getWidth() < textBits.getWidth()) { - BitMatrix tempQrMatrix = new BitMatrix(textBits.getWidth(), qrBits.getHeight()); - copyMatrixDataToBiggerMatrix(qrBits, tempQrMatrix); - qrBits = tempQrMatrix; - } - - BitMatrix mergedMatrix = mergeMatricesOfSameWidth(qrBits, textBits, qrMargin, pixelsPerBlock, qrMarginBottom, imageMargin); - return getImage(mergedMatrix, colorModel); - } - - private static BitMatrix mergeMatricesOfSameWidth(BitMatrix firstMatrix, BitMatrix secondMatrix, int qrMargin, int pixelsPerBlock, int qrMarginBottom, int imageMargin) { - int mergedWidth = firstMatrix.getWidth() + (2 * imageMargin); - int mergedHeight = firstMatrix.getHeight() + secondMatrix.getHeight() + (2 * imageMargin); - int defaultBottomMargin = pixelsPerBlock * qrMargin; - int marginToBeRemoved = qrMarginBottom > defaultBottomMargin ? 0 : (defaultBottomMargin-qrMarginBottom); - BitMatrix mergedMatrix = new BitMatrix(mergedWidth, mergedHeight - marginToBeRemoved); - - for (int x = 0; x < firstMatrix.getWidth(); x++) { - for (int y = 0; y < firstMatrix.getHeight() - marginToBeRemoved; y++) { - if (firstMatrix.get(x, y)) { - mergedMatrix.set(x + imageMargin, y + imageMargin); - } - } - } - for (int x = 0; x < secondMatrix.getWidth(); x++) { - for (int y = 0; y < secondMatrix.getHeight(); y++) { - if (secondMatrix.get(x, y)) { - mergedMatrix.set(x + imageMargin, y + firstMatrix.getHeight() - marginToBeRemoved + imageMargin); - } - } - } - return mergedMatrix; - } - - private static void copyMatrixDataToBiggerMatrix(BitMatrix fromMatrix, BitMatrix toMatrix) { - int widthDiff = toMatrix.getWidth() - fromMatrix.getWidth(); - int leftMargin = widthDiff / 2; - for (int x = 0; x < fromMatrix.getWidth(); x++) { - for (int y = 0; y < fromMatrix.getHeight(); y++) { - if (fromMatrix.get(x, y)) { - toMatrix.set(x + leftMargin, y); - } - } - } - } - - private static void drawBorder(BufferedImage image, int borderSize, int imageMargin) { - image.createGraphics(); - Graphics2D graphics = (Graphics2D) image.getGraphics(); - graphics.setColor(Color.BLACK); - for (int i = 0; i < borderSize; i++) { - graphics.drawRect(i + imageMargin, i + imageMargin, image.getWidth() - 1 - (2 * i) - (2 * imageMargin), image.getHeight() - 1 - (2 * i) - (2 * imageMargin)); - } - graphics.dispose(); - } - - private static BufferedImage getImage(BitMatrix bitMatrix, String colorModel) { - int imageWidth = bitMatrix.getWidth(); - int imageHeight = bitMatrix.getHeight(); - BufferedImage image = new BufferedImage(imageWidth, imageHeight, getImageType(colorModel)); - image.createGraphics(); - - Graphics2D graphics = (Graphics2D) image.getGraphics(); - graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_OFF); - graphics.setColor(Color.WHITE); - graphics.fillRect(0, 0, imageWidth, imageHeight); - - graphics.setColor(Color.BLACK); - - for (int i = 0; i < imageWidth; i++) { - for (int j = 0; j < imageHeight; j++) { - if (bitMatrix.get(i, j)) { - graphics.fillRect(i, j, 1, 1); - } - } - } - graphics.dispose(); - return image; - } - - private static BitMatrix getBitMatrix(String data, int width, int 
height, Map hintsMap) throws WriterException { - BitMatrix bitMatrix = qrCodeWriter.encode(data, BarcodeFormat.QR_CODE, width, height, hintsMap); - return bitMatrix; - } - - private static BitMatrix getDefaultBitMatrix(String data, Map hintsMap) throws WriterException { - BitMatrix defaultBitMatrix = qrCodeWriter.encode(data, BarcodeFormat.QR_CODE, 0, 0, hintsMap); - return defaultBitMatrix; - } - - private static Map getHintsMap(String errorCorrectionLevel, int qrMargin) { - Map hintsMap = new HashMap(); - switch (errorCorrectionLevel) { - case "H": - hintsMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.H); - break; - case "Q": - hintsMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.Q); - break; - case "M": - hintsMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.M); - break; - case "L": - hintsMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.L); - break; - } - hintsMap.put(EncodeHintType.MARGIN, qrMargin); - return hintsMap; - } - - - - private static int getImageType(String colorModel) { - if (colorModel.equalsIgnoreCase("RGB")) { - return BufferedImage.TYPE_INT_RGB; - } else { - return BufferedImage.TYPE_BYTE_GRAY; - } - } - - private static Font loadFontStore(String fontName) throws IOException, FontFormatException { - //load the packaged font file from the root dir - String fontFile = "/"+fontName+".ttf"; - InputStream fontStream = QRImageGenerator.class.getResourceAsStream(fontFile); - Font basicFont = Font.createFont(Font.TRUETYPE_FONT, fontStream); - fontStore.put(fontName, basicFont); - - return basicFont; - } - - private static Font getFontFromStore(String fontName) throws IOException, FontFormatException { - return null != fontStore.get(fontName) ? fontStore.get(fontName) : loadFontStore(fontName); - } -} diff --git a/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/request/QRImageConfig.java b/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/request/QRImageConfig.java deleted file mode 100644 index 8a0d0cc239..0000000000 --- a/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/request/QRImageConfig.java +++ /dev/null @@ -1,104 +0,0 @@ -package org.sunbird.qrimage.request; - -public class QRImageConfig { - - private String fileFormat; - private String errorCorrectionLevel; - private int pixelsPerBlock; - private String colorModel; - private String textFontName; - private int textFontSize; - private double textCharacterSpacing; - private int qrCodeMargin; - private int imageBorderSize; - private int imageMargin; - private int qrCodeMarginBottom; - - public String getFileFormat() { - return fileFormat; - } - - public void setFileFormat(String fileFormat) { - this.fileFormat = fileFormat; - } - - public String getErrorCorrectionLevel() { - return errorCorrectionLevel; - } - - public void setErrorCorrectionLevel(String errorCorrectionLevel) { - this.errorCorrectionLevel = errorCorrectionLevel; - } - - public int getPixelsPerBlock() { - return pixelsPerBlock; - } - - public void setPixelsPerBlock(int pixelsPerBlock) { - this.pixelsPerBlock = pixelsPerBlock; - } - - public String getColorModel() { - return colorModel; - } - - public void setColorModel(String colorModel) { - this.colorModel = colorModel; - } - - public String getTextFontName() { - return textFontName; - } - - public void setTextFontName(String textFontName) { - this.textFontName = textFontName; - } - - public int getTextFontSize() { - return textFontSize; - } - - public void setTextFontSize(int 
textFontSize) { - this.textFontSize = textFontSize; - } - - public double getTextCharacterSpacing() { - return textCharacterSpacing; - } - - public void setTextCharacterSpacing(double textCharacterSpacing) { - this.textCharacterSpacing = textCharacterSpacing; - } - - public int getQrCodeMargin() { - return qrCodeMargin; - } - - public void setQrCodeMargin(int qrCodeMargin) { - this.qrCodeMargin = qrCodeMargin; - } - - public int getImageBorderSize() { - return imageBorderSize; - } - - public void setImageBorderSize(int imageBorderSize) { - this.imageBorderSize = imageBorderSize; - } - - public int getImageMargin() { - return imageMargin; - } - - public void setImageMargin(int imageMargin) { - this.imageMargin = imageMargin; - } - - public int getQrCodeMarginBottom() { - return qrCodeMarginBottom; - } - - public void setQrCodeMarginBottom(int qrCodeMarginBottom) { - this.qrCodeMarginBottom = qrCodeMarginBottom; - } -} diff --git a/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/request/QRImageRequest.java b/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/request/QRImageRequest.java deleted file mode 100644 index e55528c4a6..0000000000 --- a/platform-jobs/samza/qr-image-generator/src/main/java/org/sunbird/qrimage/request/QRImageRequest.java +++ /dev/null @@ -1,54 +0,0 @@ -package org.sunbird.qrimage.request; - -import java.util.List; - -public class QRImageRequest { - - private List data; - private String text; - private String fileName; - private QRImageConfig config; - private String tempFileLocation; - - public QRImageRequest(){} - - public QRImageRequest(String tempFileLocation) { - this.tempFileLocation = tempFileLocation; - } - - public List getData() { - return data; - } - - public void setData(List data) { - this.data = data; - } - - public String getText() { - return text; - } - - public void setText(String text) { - this.text = text; - } - - public String getFileName() { - return fileName; - } - - public void setFileName(String fileName) { - this.fileName = fileName; - } - - public QRImageConfig getConfig() { - return config; - } - - public void setConfig(QRImageConfig config) { - this.config = config; - } - - public String getTempFileLocation() { - return this.tempFileLocation; - } -} diff --git a/platform-jobs/samza/qr-image-generator/src/main/resources/Verdana.ttf b/platform-jobs/samza/qr-image-generator/src/main/resources/Verdana.ttf deleted file mode 100755 index 18ef6e8f1f..0000000000 Binary files a/platform-jobs/samza/qr-image-generator/src/main/resources/Verdana.ttf and /dev/null differ diff --git a/platform-jobs/samza/qrcode-image-generator/.gitignore b/platform-jobs/samza/qrcode-image-generator/.gitignore deleted file mode 100644 index 57912cd344..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/.gitignore +++ /dev/null @@ -1,2 +0,0 @@ -/target/ -*.iml \ No newline at end of file diff --git a/platform-jobs/samza/qrcode-image-generator/pom.xml b/platform-jobs/samza/qrcode-image-generator/pom.xml deleted file mode 100644 index a7dd112d4f..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/pom.xml +++ /dev/null @@ -1,63 +0,0 @@ - - - - samza - org.sunbird - 1.1-SNAPSHOT - - 4.0.0 - - qrcode-image-generator - 0.0.31 - - - com.google.zxing - core - 3.3.3 - - - com.google.zxing - javase - 3.3.3 - - - org.sunbird - samza-common - 1.1-SNAPSHOT - - - - - - - org.apache.maven.plugins - maven-compiler-plugin - - 1.8 - 1.8 - - - - - maven-assembly-plugin - - - src/main/assembly/src.xml - - - - - make-assembly - 
package - - single - - - - - - - - diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/assembly/src.xml b/platform-jobs/samza/qrcode-image-generator/src/main/assembly/src.xml deleted file mode 100644 index 813101d8e0..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/assembly/src.xml +++ /dev/null @@ -1,69 +0,0 @@ - - - - - distribution - - tar.gz - - false - - - ${basedir} - - README* - LICENSE* - NOTICE* - - - - - - ${basedir}/src/main/resources/log4j.xml - lib - - - - ${basedir}/src/main/config/qrcode-image-generator.properties - config - true - - - - - bin - - org.apache.samza:samza-shell:tgz:dist:* - - 0744 - true - - - lib - - org.apache.samza:samza-api - org.sunbird:qrcode-image-generator - org.apache.samza:samza-core_2.11 - org.apache.samza:samza-kafka_2.11 - org.apache.samza:samza-yarn_2.11 - org.apache.samza:samza-log4j - org.apache.kafka:kafka_2.11 - org.apache.hadoop:hadoop-hdfs - - true - - - diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/config/local.qrcode-image-generator.properties b/platform-jobs/samza/qrcode-image-generator/src/main/config/local.qrcode-image-generator.properties deleted file mode 100644 index 55f84fae02..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/config/local.qrcode-image-generator.properties +++ /dev/null @@ -1,73 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=local.qrcode-image-generator - -# YARN -yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.dev.lp.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory - -# Task -task.class=org.sunbird.jobs.samza.task.QRCodeImageGeneratorTask -task.inputs=kafka.local.qrimage.request -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=1 -task.commit.ms=60000 -task.window.ms=300000 - -# Serializers -serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory -serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory - -# Systems -systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory -systems.kafka.samza.msg.serde=json -systems.kafka.streams.metrics.samza.msg.serde=metrics -systems.kafka.consumer.zookeeper.connect=localhost:2181 -systems.kafka.consumer.auto.offset.reset=smallest -systems.kafka.samza.offset.default=oldest -systems.kafka.producer.bootstrap.servers=localhost:9092 - -# Job Coordinator -job.coordinator.system=kafka - -# Normally, this would be 3, but we have only one broker. 
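-# (This property sets the replication factor of the Kafka topic Samza uses for
-# job coordination; a single-broker local setup can only sustain a factor of 1.)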
-job.coordinator.replication.factor=1 - -# Job specific configuration - -# Metrics -output.metrics.job.name=qrcode-image-generator -output.metrics.topic.name=local.qrimage.request - -# Cloud store details -cloud_storage_type=__cloud_storage_type__ -azure_storage_key=__azure_storage_key__ -azure_storage_secret=__azure_storage_secret__ -azure_storage_container=__azure_storage_container__ -aws_storage_key=__aws_access_key_id__ -aws_storage_secret=__aws_secret_access_key__ -aws_storage_container=__aws_storage_container__ -cloud_upload_retry_count=3 - -# Cassandra connection details -cassandra.lp.connection=localhost:9042 -cassandra.lpa.connection=localhost:9042 -cassandra.sunbird.connection=localhost:9042 - -# QR Image generation default configurations -# Thickness of white border(in pixels) around the black border of the qr image -qr_image_margin=1 -# Spacing(in pixels) between qrcode and text in the qr image -qr_image_margin_bottom=0 - -# Remote Debug Configuration -task.opts=-agentlib:jdwp=transport=dt_socket,address=localhost:9009,server=y,suspend=y - -# Temp file path to generate files -lp_tempfile_location=/tmp \ No newline at end of file diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/config/qrcode-image-generator.properties b/platform-jobs/samza/qrcode-image-generator/src/main/config/qrcode-image-generator.properties deleted file mode 100644 index b5954fd766..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/config/qrcode-image-generator.properties +++ /dev/null @@ -1,73 +0,0 @@ -# Job -job.factory.class=org.apache.samza.job.yarn.YarnJobFactory -job.name=__env__.qrcode-image-generator - -# YARN -yarn.package.path=http://__yarn_host__:__yarn_port__/__env__/${project.artifactId}-${pom.version}-distribution.tar.gz - -# Metrics -metrics.reporters=snapshot,jmx -metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory -metrics.reporter.snapshot.stream=kafka.__env__.metrics -metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory - -# Task -task.class=org.sunbird.jobs.samza.task.QRCodeImageGeneratorTask -task.inputs=kafka.__env__.qrimage.request -task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory -task.checkpoint.system=kafka -task.checkpoint.replication.factor=__samza_checkpoint_replication_factor__ -task.commit.ms=60000 -task.window.ms=300000 - -# Serializers -serializers.registry.json.class=org.sunbird.jobs.samza.serializers.EkstepJsonSerdeFactory -serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory - -# Systems -systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory -systems.kafka.samza.msg.serde=json -systems.kafka.streams.metrics.samza.msg.serde=metrics -systems.kafka.consumer.zookeeper.connect=__zookeepers__ -systems.kafka.consumer.auto.offset.reset=smallest -systems.kafka.samza.offset.default=oldest -systems.kafka.producer.bootstrap.servers=__kafka_brokers__ - -# Job Coordinator -job.coordinator.system=kafka - -# Normally, this would be 3, but we have only one broker. 
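-# (In this deployed variant the factor is injected per environment through the
-# __samza_coordinator_replication_factor__ placeholder, like the other __var__ tokens in this file.)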
-job.coordinator.replication.factor=__samza_coordinator_replication_factor__ - -# Job specific configuration - -# Metrics -output.metrics.job.name=qrcode-image-generator -output.metrics.topic.name=__env__.qrimage.request - -# Cloud store details -cloud_storage_type=__cloud_storage_type__ -azure_storage_key=__azure_storage_key__ -azure_storage_secret=__azure_storage_secret__ -azure_storage_container=__azure_storage_container__ -aws_storage_key=__aws_access_key_id__ -aws_storage_secret=__aws_secret_access_key__ -aws_storage_container=__aws_storage_container__ -cloud_upload_retry_count=__cloud_upload_retry_count__ - -# Cassandra connection details -cassandra.lp.connection=__cassandra_lp_connection__ -cassandra.lpa.connection=__cassandra_lpa_connection__ -cassandra.sunbird.connection=__cassandra_sunbird_connection__ - -# QR Image generation default configurations -# Thickness of white border(in pixels) around the black border of the qr image -qr_image_margin=1 -# Spacing(in pixels) between qrcode and text in the qr image -qr_image_margin_bottom=0 - -# Consistency Level for Multi Node Cassandra cluster -cassandra.sunbird.consistency.level=QUORUM - -# Temp file path to generate files -lp_tempfile_location=/tmp \ No newline at end of file diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/model/QRCodeGenerationRequest.java b/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/model/QRCodeGenerationRequest.java deleted file mode 100644 index a1f7963bc1..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/model/QRCodeGenerationRequest.java +++ /dev/null @@ -1,134 +0,0 @@ -package org.sunbird.jobs.samza.model; - -import java.util.List; - -public class QRCodeGenerationRequest { - - private List data; - private String errorCorrectionLevel; - private int pixelsPerBlock; - private int qrCodeMargin; - private List text; - private String textFontName; - private int textFontSize; - private double textCharacterSpacing; - private int imageBorderSize; - private String colorModel; - private List fileName; - private String fileFormat; - private int qrCodeMarginBottom; - private int imageMargin; - private String tempFilePath; - - public String getTempFilePath() { return tempFilePath; } - - public void setTempFilePath(String tempFilePath) { this.tempFilePath = tempFilePath; } - - public int getImageMargin() { return imageMargin; } - - public void setImageMargin(int imageMargin) { this.imageMargin = imageMargin; } - - public int getQrCodeMarginBottom() { - return qrCodeMarginBottom; - } - - public void setQrCodeMarginBottom(int qrCodeMarginBottom) { - this.qrCodeMarginBottom = qrCodeMarginBottom; - } - - public List getData() { - return data; - } - - public void setData(List data) { - this.data = data; - } - - public String getErrorCorrectionLevel() { - return errorCorrectionLevel; - } - - public void setErrorCorrectionLevel(String errorCorrectionLevel) { - this.errorCorrectionLevel = errorCorrectionLevel; - } - - public int getPixelsPerBlock() { - return pixelsPerBlock; - } - - public void setPixelsPerBlock(int pixelsPerBlock) { - this.pixelsPerBlock = pixelsPerBlock; - } - - public int getQrCodeMargin() { - return qrCodeMargin; - } - - public void setQrCodeMargin(int qrCodeMargin) { - this.qrCodeMargin = qrCodeMargin; - } - - public List getText() { - return text; - } - - public void setText(List text) { - this.text = text; - } - - public String getTextFontName() { - return textFontName; - } - - 
public void setTextFontName(String textFontName) { - this.textFontName = textFontName; - } - - public int getTextFontSize() { - return textFontSize; - } - - public void setTextFontSize(int textFontSize) { - this.textFontSize = textFontSize; - } - - public double getTextCharacterSpacing() { - return textCharacterSpacing; - } - - public void setTextCharacterSpacing(double textCharacterSpacing) { - this.textCharacterSpacing = textCharacterSpacing; - } - - public int getImageBorderSize() { - return imageBorderSize; - } - - public void setImageBorderSize(int imageBorderSize) { - this.imageBorderSize = imageBorderSize; - } - - public String getColorModel() { - return colorModel; - } - - public void setColorModel(String colorModel) { - this.colorModel = colorModel; - } - - public List getFileName() { - return fileName; - } - - public void setFileName(List fileName) { - this.fileName = fileName; - } - - public String getFileFormat() { - return fileFormat; - } - - public void setFileFormat(String fileFormat) { - this.fileFormat = fileFormat; - } -} diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/service/QRCodeImageGeneratorService.java b/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/service/QRCodeImageGeneratorService.java deleted file mode 100644 index 994f95ed63..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/service/QRCodeImageGeneratorService.java +++ /dev/null @@ -1,154 +0,0 @@ -package org.sunbird.jobs.samza.service; - -import org.apache.commons.lang3.StringUtils; -import org.apache.samza.config.Config; -import org.apache.samza.task.MessageCollector; -import org.sunbird.jobs.samza.model.QRCodeGenerationRequest; -import org.sunbird.jobs.samza.util.JSONUtils; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.jobs.samza.util.QRCodeImageGeneratorParams; -import org.sunbird.jobs.samza.util.QRCodeImageGeneratorUtil; -import org.sunbird.jobs.samza.util.QRCodeCassandraConnector; -import org.sunbird.jobs.samza.util.CloudStorageUtil; -import org.sunbird.jobs.samza.util.ZipEditorUtil; -import org.sunbird.jobs.samza.service.task.JobMetrics; - -import java.io.File; -import java.util.ArrayList; -import java.util.Map; -import java.util.List; - -public class QRCodeImageGeneratorService implements ISamzaService { - - private JobLogger LOGGER = new JobLogger(QRCodeImageGeneratorService.class); - - private Config appConfig = null; - - @Override - public void initialize(Config config) throws Exception { - JSONUtils.loadProperties(config); - appConfig = config; - LOGGER.info("QRCodeImageGeneratorService:initialize: Service config initialized"); - } - - @Override - public void processMessage(Map message, JobMetrics metrics, MessageCollector collector) throws Exception { - List availableImages = new ArrayList(); - File zipFile = null; - try{ - LOGGER.info("QRCodeImageGeneratorService:processMessage: Processing request: "+message); - LOGGER.info("QRCodeImageGeneratorService:processMessage: Starting message processing at "+System.currentTimeMillis()); - - if(!message.containsKey(QRCodeImageGeneratorParams.eid.name())) { - return; - } - - String eid = (String) message.get(QRCodeImageGeneratorParams.eid.name()); - if(!eid.equalsIgnoreCase(QRCodeImageGeneratorParams.BE_QR_IMAGE_GENERATOR.name())) { - return; - } - - List> dialCodes = (List>) message.get(QRCodeImageGeneratorParams.dialcodes.name()); - if(null == dialCodes || dialCodes.size()==0) { - return; - } - - Map config = 
(Map) message.get(QRCodeImageGeneratorParams.config.name()); - String imageFormat = (String) config.get(QRCodeImageGeneratorParams.imageFormat.name()); - - List dataList = new ArrayList(); - List textList = new ArrayList(); - List fileNameList = new ArrayList(); - String downloadUrl = null; - String tempFilePath = appConfig.getOrDefault(QRCodeImageGeneratorParams.lp_tempfile_location.name(), "/tmp"); - - for(Map dialCode : dialCodes) { - if(dialCode.containsKey(QRCodeImageGeneratorParams.location.name())) { - try { - downloadUrl = (String) dialCode.get(QRCodeImageGeneratorParams.location.name()); - String fileName = (String) dialCode.get(QRCodeImageGeneratorParams.id.name()); - File fileToSave = new File(tempFilePath + File.separator + fileName+"."+imageFormat); - LOGGER.info("QRCodeImageGeneratorService:processMessage: creating file - " + fileToSave.getAbsolutePath()); - fileToSave.createNewFile(); - LOGGER.info("QRCodeImageGeneratorService:processMessage: created file - " + fileToSave.getAbsolutePath()); - CloudStorageUtil.downloadFile(downloadUrl, fileToSave); - availableImages.add(fileToSave); - continue; - } catch(Exception e) { - LOGGER.error("QRCodeImageGeneratorService:processMessage: Error while downloading image:", downloadUrl, e); - } - } - - dataList.add((String)dialCode.get(QRCodeImageGeneratorParams.data.name())); - textList.add((String)dialCode.get(QRCodeImageGeneratorParams.text.name())); - fileNameList.add((String)dialCode.get(QRCodeImageGeneratorParams.id.name())); - - } - - Map storage = (Map) message.get(QRCodeImageGeneratorParams.storage.name()); - String container = storage.get(QRCodeImageGeneratorParams.container.name()); - String path = storage.get(QRCodeImageGeneratorParams.path.name()); - String zipFileName = storage.get(QRCodeImageGeneratorParams.fileName.name()); - String processId = (String) message.get(QRCodeImageGeneratorParams.processId.name()); - - QRCodeGenerationRequest qrGenRequest = getQRCodeGenerationRequest(config, dataList, textList, fileNameList); - List generatedImages = QRCodeImageGeneratorUtil.createQRImages(qrGenRequest, appConfig, container, path); - - if(!StringUtils.isBlank(processId)) { - LOGGER.info("QRCodeImageGeneratorService:processMessage: Generating zip for QR codes with processId " + processId); - if(StringUtils.isBlank(zipFileName)) { - zipFileName = processId; - } - availableImages.addAll(generatedImages); - zipFile = ZipEditorUtil.zipFiles(availableImages, zipFileName, tempFilePath); - - String zipDownloadUrl = CloudStorageUtil.uploadFile(container, path, zipFile, false); - QRCodeCassandraConnector.updateDownloadZIPUrl(processId, zipDownloadUrl); - } else { - LOGGER.info("QRCodeImageGeneratorService:processMessage: Skipping zip creation due to missing processId."); - } - LOGGER.info("QRCodeImageGeneratorService:processMessage: Message processed successfully at "+System.currentTimeMillis()); - } catch (Exception e) { - QRCodeCassandraConnector.updateFailure((String) message.get(QRCodeImageGeneratorParams.processId.name()), - e.getMessage()); - throw e; - } finally { - if(null != zipFile) { - zipFile.delete(); - } - for(File imageFile : availableImages) { - if(null != imageFile) { - imageFile.delete(); - } - } - } - } - - private QRCodeGenerationRequest getQRCodeGenerationRequest(Map config, List dataList, List textList, List fileNameList) { - QRCodeGenerationRequest qrGenRequest = new QRCodeGenerationRequest(); - qrGenRequest.setData(dataList); - qrGenRequest.setText(textList); - qrGenRequest.setFileName(fileNameList); - 
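-        // Illustrative 'config' payload for the fields read below (example values only,
-        // mirroring the qr-image-generator defaults: png / H / 2 px-per-block / Grayscale / Verdana 11, 0.1 tracking):
-        // {"errorCorrectionLevel":"H", "pixelsPerBlock":2, "qrCodeMargin":3, "textFontName":"Verdana",
-        //  "textFontSize":11, "textCharacterSpacing":0.1, "imageFormat":"png", "colourModel":"Grayscale", "imageBorderSize":1}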
qrGenRequest.setErrorCorrectionLevel((String) config.get(QRCodeImageGeneratorParams.errorCorrectionLevel.name())); - qrGenRequest.setPixelsPerBlock((Integer) config.get(QRCodeImageGeneratorParams.pixelsPerBlock.name())); - qrGenRequest.setQrCodeMargin((Integer) config.get(QRCodeImageGeneratorParams.qrCodeMargin.name())); - qrGenRequest.setTextFontName((String) config.get(QRCodeImageGeneratorParams.textFontName.name())); - qrGenRequest.setTextFontSize((Integer) config.get(QRCodeImageGeneratorParams.textFontSize.name())); - qrGenRequest.setTextCharacterSpacing((Double) config.get(QRCodeImageGeneratorParams.textCharacterSpacing.name())); - qrGenRequest.setFileFormat((String) config.get(QRCodeImageGeneratorParams.imageFormat.name())); - qrGenRequest.setColorModel((String) config.get(QRCodeImageGeneratorParams.colourModel.name())); - qrGenRequest.setImageBorderSize((Integer) config.get(QRCodeImageGeneratorParams.imageBorderSize.name())); - if(config.containsKey(QRCodeImageGeneratorParams.qrCodeMarginBottom.name())) { - qrGenRequest.setQrCodeMarginBottom((Integer) config.get(QRCodeImageGeneratorParams.qrCodeMarginBottom.name())); - } else { - qrGenRequest.setQrCodeMarginBottom(appConfig.getInt(QRCodeImageGeneratorParams.qr_image_margin_bottom.name())); - } - if(config.containsKey(QRCodeImageGeneratorParams.imageMargin.name())) { - qrGenRequest.setImageMargin((Integer) config.get(QRCodeImageGeneratorParams.imageMargin.name())); - } else { - qrGenRequest.setImageMargin(appConfig.getInt(QRCodeImageGeneratorParams.qr_image_margin.name())); - } - qrGenRequest.setTempFilePath(appConfig.getOrDefault(QRCodeImageGeneratorParams.lp_tempfile_location.name(), "/tmp")); - return qrGenRequest; - } -} diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/task/QRCodeImageGeneratorTask.java b/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/task/QRCodeImageGeneratorTask.java deleted file mode 100644 index de743c33e8..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/task/QRCodeImageGeneratorTask.java +++ /dev/null @@ -1,64 +0,0 @@ -package org.sunbird.jobs.samza.task; - -import org.apache.samza.config.Config; -import org.apache.samza.system.IncomingMessageEnvelope; -import org.apache.samza.task.StreamTask; -import org.apache.samza.task.InitableTask; -import org.apache.samza.task.TaskContext; -import org.apache.samza.task.MessageCollector; -import org.apache.samza.task.TaskCoordinator; -import org.sunbird.jobs.samza.service.ISamzaService; -import org.sunbird.jobs.samza.service.QRCodeImageGeneratorService; -import org.sunbird.jobs.samza.service.task.JobMetrics; -import org.sunbird.jobs.samza.util.JobLogger; -import org.sunbird.jobs.samza.util.QRCodeImageGeneratorParams; - -import java.util.HashMap; -import java.util.Map; - -public class QRCodeImageGeneratorTask implements StreamTask, InitableTask { - - private JobLogger LOGGER = new JobLogger(QRCodeImageGeneratorTask.class); - - private JobMetrics metrics; - - private ISamzaService service = new QRCodeImageGeneratorService(); - - @Override - public void init(Config config, TaskContext context) throws Exception { - try { - metrics = new JobMetrics(context, config.get("output.metrics.job.name"), config.get("output.metrics.topic.name")); - service.initialize(config); - LOGGER.info("QRCodeImageGeneratorTask:init: Task initialized"); - } catch (Exception ex) { - LOGGER.error("QRCodeImageGeneratorTask:init: Task initialization failed", ex); - throw ex; - 
} - } - - - @Override - public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) throws Exception { - Map outgoingMap = getMessage(envelope); - try { - service.processMessage(outgoingMap, metrics, collector); - } catch (Exception e) { - LOGGER.error("QRCodeImageGeneratorTask:process: Error while processing message for process_id:: " + - (String) outgoingMap.get(QRCodeImageGeneratorParams.processId.name()), outgoingMap, e); - e.printStackTrace(); - //throw e; - } - } - - @SuppressWarnings("unchecked") - private Map getMessage(IncomingMessageEnvelope envelope) { - try { - return (Map) envelope.getMessage(); - } catch (Exception e) { - e.printStackTrace(); - LOGGER.error("QRCodeImageGeneratorTask:getMessage: Invalid message = " + envelope.getMessage(), e); - return new HashMap(); - } - } - -} diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/CloudStorageUtil.java b/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/CloudStorageUtil.java deleted file mode 100644 index 81493cae58..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/CloudStorageUtil.java +++ /dev/null @@ -1,62 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import org.apache.commons.lang3.StringUtils; -import org.sunbird.common.Platform; -import org.sunbird.common.exception.ServerException; -import org.sunbird.cloud.storage.BaseStorageService; -import org.sunbird.cloud.storage.factory.StorageConfig; -import org.sunbird.cloud.storage.factory.StorageServiceFactory; - -import scala.Option; - -import java.io.File; -import java.io.FileOutputStream; -import java.io.IOException; -import java.net.URL; -import java.nio.channels.Channels; -import java.nio.channels.FileChannel; -import java.nio.channels.ReadableByteChannel; - -public class CloudStorageUtil { - - private static BaseStorageService storageService = null; - private static String cloudStoreType = Platform.config.getString("cloud_storage_type"); - static { - - if(StringUtils.equalsIgnoreCase(cloudStoreType, "azure")) { - String storageKey = Platform.config.getString("azure_storage_key"); - String storageSecret = Platform.config.getString("azure_storage_secret"); - storageService = StorageServiceFactory.getStorageService(new StorageConfig(cloudStoreType, storageKey, storageSecret)); - }else if(StringUtils.equalsIgnoreCase(cloudStoreType, "aws")) { - String storageKey = Platform.config.getString("aws_storage_key"); - String storageSecret = Platform.config.getString("aws_storage_secret"); - storageService = StorageServiceFactory.getStorageService(new StorageConfig(cloudStoreType, storageKey, storageSecret)); - }else { - throw new ServerException("ERR_INVALID_CLOUD_STORAGE", "Error while initialising cloud storage"); - } - } - - public static String uploadFile(String container, String path, File file, boolean isDirectory) { - int retryCount = Platform.config.getInt("cloud_upload_retry_count"); - String objectKey = path + file.getName(); - String url = storageService.upload(container, - file.getAbsolutePath(), - objectKey, - Option.apply(isDirectory), - Option.apply(1), - Option.apply(retryCount),Option.empty()); - return url; - } - - public static void downloadFile(String downloadUrl, File fileToSave) throws IOException { - URL url = new URL(downloadUrl); - ReadableByteChannel readableByteChannel = Channels.newChannel(url.openStream()); - FileOutputStream fileOutputStream = new 
FileOutputStream(fileToSave); - FileChannel fileChannel = fileOutputStream.getChannel(); - fileOutputStream.getChannel().transferFrom(readableByteChannel, 0, Long.MAX_VALUE); - fileChannel.close(); - fileOutputStream.close(); - readableByteChannel.close(); - } - -} diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeCassandraConnector.java b/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeCassandraConnector.java deleted file mode 100644 index cded052963..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeCassandraConnector.java +++ /dev/null @@ -1,27 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import com.datastax.driver.core.Session; -import org.sunbird.cassandra.connector.util.CassandraConnector; - -public class QRCodeCassandraConnector { - - public static void updateDownloadUrl(String id, String downloadUrl) { - String query = "update dialcodes.dialcode_images set status=2, url='"+downloadUrl+"' where filename='"+id+"'"; - executeQuery(query); - } - - public static void updateDownloadZIPUrl(String id, String downloadZIPUrl) { - String query = "update dialcodes.dialcode_batch set status=2, url='"+downloadZIPUrl+"' where processid="+id; - executeQuery(query); - } - - public static void updateFailure(String id, String errMsg) { - String query = "update dialcodes.dialcode_batch set status=3, url='' where processid="+id; - executeQuery(query); - } - - private static void executeQuery(String query) { - Session session = CassandraConnector.getSession("sunbird"); - session.execute(query); - } -} diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeImageGeneratorParams.java b/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeImageGeneratorParams.java deleted file mode 100644 index d0a950d912..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeImageGeneratorParams.java +++ /dev/null @@ -1,10 +0,0 @@ -package org.sunbird.jobs.samza.util; - -public enum QRCodeImageGeneratorParams { - - eid, processId, objectId, dialcodes, data, text, id, location, storage, container, path, config, - errorCorrectionLevel, pixelsPerBlock, qrCodeMargin, textFontName, textFontSize, textCharacterSpacing, - imageFormat, colourModel, imageBorderSize, qrCodeMarginBottom, BE_QR_IMAGE_GENERATOR, fileName, imageMargin, - qr_image_margin_bottom, qr_image_margin, lp_tempfile_location; - -} diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeImageGeneratorUtil.java b/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeImageGeneratorUtil.java deleted file mode 100644 index 352b9b10fb..0000000000 --- a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/QRCodeImageGeneratorUtil.java +++ /dev/null @@ -1,288 +0,0 @@ -package org.sunbird.jobs.samza.util; - -import com.google.zxing.BarcodeFormat; -import com.google.zxing.EncodeHintType; -import com.google.zxing.NotFoundException; -import com.google.zxing.WriterException; -import com.google.zxing.client.j2se.BufferedImageLuminanceSource; -import com.google.zxing.common.BitMatrix; -import com.google.zxing.common.HybridBinarizer; -import com.google.zxing.qrcode.QRCodeWriter; -import com.google.zxing.qrcode.decoder.ErrorCorrectionLevel; -import 
org.apache.samza.config.Config; -import org.sunbird.jobs.samza.model.QRCodeGenerationRequest; - -import javax.imageio.ImageIO; -import java.awt.FontMetrics; -import java.awt.Font; -import java.awt.Color; -import java.awt.RenderingHints; -import java.awt.Graphics2D; -import java.awt.font.TextAttribute; -import java.awt.image.BufferedImage; -import java.awt.FontFormatException; -import java.io.File; -import java.io.IOException; -import java.io.InputStream; -import java.util.HashMap; -import java.util.Map; -import java.util.List; -import java.util.ArrayList; - -public class QRCodeImageGeneratorUtil { - - private static QRCodeWriter qrCodeWriter = new QRCodeWriter(); - private static Map fontStore = new HashMap(); - private static JobLogger LOGGER = new JobLogger(QRCodeImageGeneratorUtil.class); - - public static List createQRImages(QRCodeGenerationRequest qrGenRequest, Config appConfig, String container, String path) throws WriterException, IOException, NotFoundException, FontFormatException { - - List fileList = new ArrayList(); - - List dataList = qrGenRequest.getData(); - List textList = qrGenRequest.getText(); - List fileNameList = qrGenRequest.getFileName(); - - String errorCorrectionLevel = qrGenRequest.getErrorCorrectionLevel(); - int pixelsPerBlock = qrGenRequest.getPixelsPerBlock(); - int qrMargin = qrGenRequest.getQrCodeMargin(); - String fontName = qrGenRequest.getTextFontName(); - int fontSize = qrGenRequest.getTextFontSize(); - double tracking = qrGenRequest.getTextCharacterSpacing(); - String imageFormat = qrGenRequest.getFileFormat(); - String colorModel = qrGenRequest.getColorModel(); - int borderSize = qrGenRequest.getImageBorderSize(); - int qrMarginBottom = qrGenRequest.getQrCodeMarginBottom(); - int imageMargin = qrGenRequest.getImageMargin(); - String tempFilePath = qrGenRequest.getTempFilePath(); - - for (int i = 0; i < dataList.size(); i++) { - String data = dataList.get(i); - String text = textList.get(i); - String fileName = fileNameList.get(i); - - BufferedImage qrImage = generateBaseImage(data, errorCorrectionLevel, pixelsPerBlock, qrMargin, colorModel); - - if (null != text || "" != text) { - BufferedImage textImage = getTextImage(text, fontName, fontSize, tracking, colorModel); - qrImage = addTextToBaseImage(qrImage, textImage, colorModel, qrMargin, pixelsPerBlock, qrMarginBottom, imageMargin); - } - - if (borderSize > 0) { - drawBorder(qrImage, borderSize, imageMargin); - } - - File finalImageFile = new File(tempFilePath + File.separator + fileName + "." 
+ imageFormat); - LOGGER.info("QRCodeImageGeneratorUtil:createQRImages: creating file - " + finalImageFile.getAbsolutePath()); - finalImageFile.createNewFile(); - LOGGER.info("QRCodeImageGeneratorUtil:createQRImages: created file - " + finalImageFile.getAbsolutePath()); - ImageIO.write(qrImage, imageFormat, finalImageFile); - fileList.add(finalImageFile); - - try { - String imageDownloadUrl = CloudStorageUtil.uploadFile(container, path, finalImageFile, false); - QRCodeCassandraConnector.updateDownloadUrl(fileName, imageDownloadUrl); - } catch(Exception e) { - //ignore exception and proceed - } - } - - return fileList; - - } - - private static BufferedImage addTextToBaseImage(BufferedImage qrImage, BufferedImage textImage, String colorModel, int qrMargin, int pixelsPerBlock, int qrMarginBottom, int imageMargin) throws NotFoundException { - BufferedImageLuminanceSource qrSource = new BufferedImageLuminanceSource(qrImage); - HybridBinarizer qrBinarizer = new HybridBinarizer(qrSource); - BitMatrix qrBits = qrBinarizer.getBlackMatrix(); - - BufferedImageLuminanceSource textSource = new BufferedImageLuminanceSource(textImage); - HybridBinarizer textBinarizer = new HybridBinarizer(textSource); - BitMatrix textBits = textBinarizer.getBlackMatrix(); - - if (qrBits.getWidth() > textBits.getWidth()) { - BitMatrix tempTextMatrix = new BitMatrix(qrBits.getWidth(), textBits.getHeight()); - copyMatrixDataToBiggerMatrix(textBits, tempTextMatrix); - textBits = tempTextMatrix; - } else if (qrBits.getWidth() < textBits.getWidth()) { - BitMatrix tempQrMatrix = new BitMatrix(textBits.getWidth(), qrBits.getHeight()); - copyMatrixDataToBiggerMatrix(qrBits, tempQrMatrix); - qrBits = tempQrMatrix; - } - - BitMatrix mergedMatrix = mergeMatricesOfSameWidth(qrBits, textBits, qrMargin, pixelsPerBlock, qrMarginBottom, imageMargin); - return getImage(mergedMatrix, colorModel); - } - - private static BufferedImage generateBaseImage(String data, String errorCorrectionLevel, int pixelsPerBlock, int qrMargin, String colorModel) throws WriterException { - Map hintsMap = getHintsMap(errorCorrectionLevel, qrMargin); - BitMatrix defaultBitMatrix = getDefaultBitMatrix(data, hintsMap); - BitMatrix largeBitMatrix = getBitMatrix(data, defaultBitMatrix.getWidth() * pixelsPerBlock, defaultBitMatrix.getHeight() * pixelsPerBlock, hintsMap); - BufferedImage qrImage = getImage(largeBitMatrix, colorModel); - return qrImage; - } - - //To remove extra spaces between text and qrcode, margin below qrcode is removed - //Parameter, qrCodeMarginBottom, is introduced to add custom margin(in pixels) between qrcode and text - //Parameter, imageMargin is introduced, to add custom margin(in pixels) outside the black border of the image - private static BitMatrix mergeMatricesOfSameWidth(BitMatrix firstMatrix, BitMatrix secondMatrix, int qrMargin, int pixelsPerBlock, int qrMarginBottom, int imageMargin) { - int mergedWidth = firstMatrix.getWidth() + (2 * imageMargin); - int mergedHeight = firstMatrix.getHeight() + secondMatrix.getHeight() + (2 * imageMargin); - int defaultBottomMargin = pixelsPerBlock * qrMargin; - int marginToBeRemoved = qrMarginBottom > defaultBottomMargin ? 
0 : (defaultBottomMargin-qrMarginBottom); - BitMatrix mergedMatrix = new BitMatrix(mergedWidth, mergedHeight - marginToBeRemoved); - - for (int x = 0; x < firstMatrix.getWidth(); x++) { - for (int y = 0; y < firstMatrix.getHeight() - marginToBeRemoved; y++) { - if (firstMatrix.get(x, y)) { - mergedMatrix.set(x + imageMargin, y + imageMargin); - } - } - } - for (int x = 0; x < secondMatrix.getWidth(); x++) { - for (int y = 0; y < secondMatrix.getHeight(); y++) { - if (secondMatrix.get(x, y)) { - mergedMatrix.set(x + imageMargin, y + firstMatrix.getHeight() - marginToBeRemoved + imageMargin); - } - } - } - return mergedMatrix; - } - - private static void copyMatrixDataToBiggerMatrix(BitMatrix fromMatrix, BitMatrix toMatrix) { - int widthDiff = toMatrix.getWidth() - fromMatrix.getWidth(); - int leftMargin = widthDiff / 2; - for (int x = 0; x < fromMatrix.getWidth(); x++) { - for (int y = 0; y < fromMatrix.getHeight(); y++) { - if (fromMatrix.get(x, y)) { - toMatrix.set(x + leftMargin, y); - } - } - } - } - - private static void drawBorder(BufferedImage image, int borderSize, int imageMargin) { - image.createGraphics(); - Graphics2D graphics = (Graphics2D) image.getGraphics(); - graphics.setColor(Color.BLACK); - for (int i = 0; i < borderSize; i++) { - graphics.drawRect(i + imageMargin, i + imageMargin, image.getWidth() - 1 - (2 * i) - (2 * imageMargin), image.getHeight() - 1 - (2 * i) - (2 * imageMargin)); - } - graphics.dispose(); - } - - private static BufferedImage getImage(BitMatrix bitMatrix, String colorModel) { - int imageWidth = bitMatrix.getWidth(); - int imageHeight = bitMatrix.getHeight(); - BufferedImage image = new BufferedImage(imageWidth, imageHeight, getImageType(colorModel)); - image.createGraphics(); - - Graphics2D graphics = (Graphics2D) image.getGraphics(); - graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_OFF); - graphics.setColor(Color.WHITE); - graphics.fillRect(0, 0, imageWidth, imageHeight); - - graphics.setColor(Color.BLACK); - - for (int i = 0; i < imageWidth; i++) { - for (int j = 0; j < imageHeight; j++) { - if (bitMatrix.get(i, j)) { - graphics.fillRect(i, j, 1, 1); - } - } - } - graphics.dispose(); - return image; - } - - private static BitMatrix getBitMatrix(String data, int width, int height, Map hintsMap) throws WriterException { - BitMatrix bitMatrix = qrCodeWriter.encode(data, BarcodeFormat.QR_CODE, width, height, hintsMap); - return bitMatrix; - } - - private static BitMatrix getDefaultBitMatrix(String data, Map hintsMap) throws WriterException { - BitMatrix defaultBitMatrix = qrCodeWriter.encode(data, BarcodeFormat.QR_CODE, 0, 0, hintsMap); - return defaultBitMatrix; - } - - private static Map getHintsMap(String errorCorrectionLevel, int qrMargin) { - Map hintsMap = new HashMap(); - switch (errorCorrectionLevel) { - case "H": - hintsMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.H); - break; - case "Q": - hintsMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.Q); - break; - case "M": - hintsMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.M); - break; - case "L": - hintsMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.L); - break; - } - hintsMap.put(EncodeHintType.MARGIN, qrMargin); - return hintsMap; - } - - //Sample = 2A42UH , Verdana, 11, 0.1, Grayscale - private static BufferedImage getTextImage(String text, String fontName, int fontSize, double tracking, String colorModel) throws IOException, FontFormatException { - - BufferedImage image = new 
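The deleted QRCodeImageGeneratorUtil above drives ZXing directly: getHintsMap maps the requested error-correction level and quiet-zone margin onto EncodeHintType values, a zero-sized encode (getDefaultBitMatrix) lets the encoder pick the code's natural dimensions, and the matrix is then re-encoded at width * pixelsPerBlock before being painted onto a BufferedImage. A minimal standalone sketch of that flow, assuming the zxing core and javase artifacts on the classpath (sample data and output file are illustrative):

```java
import com.google.zxing.BarcodeFormat;
import com.google.zxing.EncodeHintType;
import com.google.zxing.WriterException;
import com.google.zxing.client.j2se.MatrixToImageWriter;
import com.google.zxing.common.BitMatrix;
import com.google.zxing.qrcode.QRCodeWriter;
import com.google.zxing.qrcode.decoder.ErrorCorrectionLevel;

import java.io.IOException;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class QrSketch {
    public static void main(String[] args) throws WriterException, IOException {
        Map<EncodeHintType, Object> hints = new HashMap<>();
        hints.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.H); // errorCorrectionLevel "H"
        hints.put(EncodeHintType.MARGIN, 1);                                // qrMargin

        QRCodeWriter writer = new QRCodeWriter();
        // Width/height of 0 lets ZXing choose the smallest valid size (getDefaultBitMatrix).
        BitMatrix base = writer.encode("2A42UH", BarcodeFormat.QR_CODE, 0, 0, hints);

        int pixelsPerBlock = 2; // scale each QR module up, as generateBaseImage does
        BitMatrix scaled = writer.encode("2A42UH", BarcodeFormat.QR_CODE,
                base.getWidth() * pixelsPerBlock, base.getHeight() * pixelsPerBlock, hints);

        MatrixToImageWriter.writeToPath(scaled, "PNG", Paths.get("2A42UH.png"));
    }
}
```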
BufferedImage(1, 1, getImageType(colorModel)); - - Font basicFont = getFontFromStore(fontName); - - Map attributes = new HashMap(); - attributes.put(TextAttribute.TRACKING, tracking); - attributes.put(TextAttribute.WEIGHT, TextAttribute.WEIGHT_BOLD); - attributes.put(TextAttribute.SIZE, fontSize); - Font font = basicFont.deriveFont(attributes); - - Graphics2D graphics2d = image.createGraphics(); - graphics2d.setFont(font); - FontMetrics fontmetrics = graphics2d.getFontMetrics(); - int width = fontmetrics.stringWidth(text); - int height = fontmetrics.getHeight(); - graphics2d.dispose(); - - image = new BufferedImage(width, height, getImageType(colorModel)); - graphics2d = image.createGraphics(); - graphics2d.setRenderingHint(RenderingHints.KEY_ALPHA_INTERPOLATION, RenderingHints.VALUE_ALPHA_INTERPOLATION_QUALITY); - graphics2d.setRenderingHint(RenderingHints.KEY_COLOR_RENDERING, RenderingHints.VALUE_COLOR_RENDER_QUALITY); - graphics2d.setRenderingHint(RenderingHints.KEY_TEXT_ANTIALIASING, RenderingHints.VALUE_TEXT_ANTIALIAS_OFF); - graphics2d.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_OFF); - - graphics2d.setColor(Color.WHITE); - graphics2d.fillRect(0, 0, image.getWidth(), image.getHeight()); - graphics2d.setColor(Color.BLACK); - - graphics2d.setFont(font); - fontmetrics = graphics2d.getFontMetrics(); - graphics2d.drawString(text, 0, fontmetrics.getAscent()); - graphics2d.dispose(); - - return image; - } - - private static int getImageType(String colorModel) { - if (colorModel.equalsIgnoreCase("RGB")) { - return BufferedImage.TYPE_INT_RGB; - } else { - return BufferedImage.TYPE_BYTE_GRAY; - } - } - - private static Font loadFontStore(String fontName) throws IOException, FontFormatException { - //load the packaged font file from the root dir - String fontFile = "/"+fontName+".ttf"; - InputStream fontStream = QRCodeImageGeneratorUtil.class.getResourceAsStream(fontFile); - Font basicFont = Font.createFont(Font.TRUETYPE_FONT, fontStream); - fontStore.put(fontName, basicFont); - - return basicFont; - } - - private static Font getFontFromStore(String fontName) throws IOException, FontFormatException { - return null != fontStore.get(fontName) ? 
fontStore.get(fontName) : loadFontStore(fontName);
-	}
-}
diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/ZipEditorUtil.java b/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/ZipEditorUtil.java
deleted file mode 100644
index e0b72c7293..0000000000
--- a/platform-jobs/samza/qrcode-image-generator/src/main/java/org/sunbird/jobs/samza/util/ZipEditorUtil.java
+++ /dev/null
@@ -1,40 +0,0 @@
-package org.sunbird.jobs.samza.util;
-
-import java.io.File;
-import java.io.FileInputStream;
-import java.io.FileOutputStream;
-import java.io.IOException;
-import java.util.zip.ZipEntry;
-import java.util.zip.ZipOutputStream;
-import java.util.List;
-
-public class ZipEditorUtil {
-
-	private static JobLogger LOGGER = new JobLogger(ZipEditorUtil.class);
-
-	public static File zipFiles(List files, String zipName, String basePath) throws IOException {
-		File zipFile = new File(basePath + File.separator + zipName + ".zip");
-		LOGGER.info("ZipEditorUtil:zipFiles: creating file - " + zipFile.getAbsolutePath());
-		zipFile.createNewFile();
-		LOGGER.info("ZipEditorUtil:zipFiles: created file - " + zipFile.getAbsolutePath());
-		FileOutputStream fos = new FileOutputStream(zipFile);
-		ZipOutputStream zos = new ZipOutputStream(fos);
-		for (File file : files) {
-			String filePath = file.getAbsolutePath();
-			ZipEntry ze = new ZipEntry(file.getName());
-			zos.putNextEntry(ze);
-			FileInputStream fis = new FileInputStream(filePath);
-			byte[] buffer = new byte[1024];
-			int len;
-			while ((len = fis.read(buffer)) > 0) {
-				zos.write(buffer, 0, len);
-			}
-			zos.closeEntry();
-			fis.close();
-		}
-		zos.close();
-		fos.close();
-
-		return zipFile;
-	}
-}
\ No newline at end of file
diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/resources/Verdana.ttf b/platform-jobs/samza/qrcode-image-generator/src/main/resources/Verdana.ttf
deleted file mode 100755
index 18ef6e8f1f..0000000000
Binary files a/platform-jobs/samza/qrcode-image-generator/src/main/resources/Verdana.ttf and /dev/null differ
diff --git a/platform-jobs/samza/qrcode-image-generator/src/main/resources/log4j.xml b/platform-jobs/samza/qrcode-image-generator/src/main/resources/log4j.xml
deleted file mode 100644
index d2db3940cc..0000000000
--- a/platform-jobs/samza/qrcode-image-generator/src/main/resources/log4j.xml
+++ /dev/null
@@ -1,20 +0,0 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
\ No newline at end of file
diff --git a/platform-modules/actors/pom.xml b/platform-modules/actors/pom.xml
index 3f38d54eb0..2c960a5bfd 100644
--- a/platform-modules/actors/pom.xml
+++ b/platform-modules/actors/pom.xml
@@ -63,7 +63,7 @@
 			org.sunbird
 			cloud-store-sdk
-			${cloud.store.version}
+			${cloud.store.version}
diff --git a/platform-modules/actors/src/main/java/org/sunbird/learning/actor/FrameworkHierarchyActor.java b/platform-modules/actors/src/main/java/org/sunbird/learning/actor/FrameworkHierarchyActor.java
index 2d79096539..7466db10cf 100644
--- a/platform-modules/actors/src/main/java/org/sunbird/learning/actor/FrameworkHierarchyActor.java
+++ b/platform-modules/actors/src/main/java/org/sunbird/learning/actor/FrameworkHierarchyActor.java
@@ -41,7 +41,6 @@ protected void invokeMethod(Request request, ActorRef parent) {
 			Map frameworkData = fwHierarchy.getFrameworkHierarchy(frameworkId);
 			OK("framework", frameworkData, sender());
 		} else {
-			TelemetryManager.log("Unsupported operation: " + methodName);
 			throw new ClientException(LearningErrorCodes.ERR_INVALID_OPERATION.name(), "Unsupported
operation: " + methodName); } diff --git a/platform-modules/actors/src/main/java/org/sunbird/learning/framework/FrameworkHierarchy.java b/platform-modules/actors/src/main/java/org/sunbird/learning/framework/FrameworkHierarchy.java index 235bfecc88..6dd68d6dd8 100644 --- a/platform-modules/actors/src/main/java/org/sunbird/learning/framework/FrameworkHierarchy.java +++ b/platform-modules/actors/src/main/java/org/sunbird/learning/framework/FrameworkHierarchy.java @@ -21,6 +21,7 @@ import org.sunbird.graph.model.cache.CategoryCache; import org.sunbird.graph.model.node.DefinitionDTO; import org.sunbird.learning.hierarchy.store.HierarchyStore; +import org.sunbird.telemetry.logger.TelemetryManager; import java.util.ArrayList; import java.util.Collections; @@ -61,7 +62,6 @@ public void generateFrameworkHierarchy(String id) throws Exception { Map frameworkDocument = new HashMap<>(); Map frameworkHierarchy = getHierarchy(node.getIdentifier(), 0, false, true); CategoryCache.setFramework(node.getIdentifier(), frameworkHierarchy); - frameworkDocument.putAll(frameworkHierarchy); frameworkDocument.put("identifier", node.getIdentifier()); frameworkDocument.put("objectType", node.getObjectType()); diff --git a/platform-modules/actors/src/main/java/org/sunbird/learning/util/CloudStore.java b/platform-modules/actors/src/main/java/org/sunbird/learning/util/CloudStore.java index 32643d61e8..98b6ed803c 100644 --- a/platform-modules/actors/src/main/java/org/sunbird/learning/util/CloudStore.java +++ b/platform-modules/actors/src/main/java/org/sunbird/learning/util/CloudStore.java @@ -24,17 +24,22 @@ public class CloudStore { private static String cloudStoreType = Platform.config.getString("cloud_storage_type"); static { - - if(StringUtils.equalsIgnoreCase(cloudStoreType, "azure")) { - String storageKey = Platform.config.getString("azure_storage_key"); - String storageSecret = Platform.config.getString("azure_storage_secret"); - storageService = StorageServiceFactory.getStorageService(new StorageConfig(cloudStoreType, storageKey, storageSecret)); - }else if(StringUtils.equalsIgnoreCase(cloudStoreType, "aws")) { - String storageKey = Platform.config.getString("aws_storage_key"); - String storageSecret = Platform.config.getString("aws_storage_secret"); - storageService = StorageServiceFactory.getStorageService(new StorageConfig(cloudStoreType, storageKey, storageSecret)); - }else { - throw new ServerException("ERR_INVALID_CLOUD_STORAGE", "Error while initialising cloud storage"); + try + { + String storageKey = Platform.config.getString("cloud_storage_key"); + System.out.println("storageKey::"+storageKey); + String storageSecret = Platform.config.getString("cloud_storage_secret"); + System.out.println("storageSecret::"+storageSecret); + scala.Option storageEndpoint = scala.Option.apply(Platform.config.getString("cloud_storage_endpoint")); + System.out.println("storageEndpoint::"+storageEndpoint); + scala.Option storageRegion = scala.Option.apply(""); + System.out.println("storageRegion::"+storageRegion); + System.out.println("cloudStoreType::"+cloudStoreType); + storageService = StorageServiceFactory.getStorageService(new StorageConfig(cloudStoreType, storageKey, storageSecret,storageEndpoint,storageRegion)); + System.out.println("storageService::"+storageService); + }catch(Exception e) + { + e.printStackTrace(); } } @@ -43,10 +48,8 @@ public static BaseStorageService getCloudStoreService() { } public static String getContainerName() { - if(StringUtils.equalsIgnoreCase(cloudStoreType, "azure")) { - return 
Platform.config.getString("azure_storage_container"); - }else if(StringUtils.equalsIgnoreCase(cloudStoreType, "aws")) { - return S3PropertyReader.getProperty("aws_storage_container"); + if(Platform.config.hasPath("cloud_storage_container") && !Platform.config.getString("cloud_storage_container").equalsIgnoreCase("")) { + return Platform.config.getString("cloud_storage_container"); }else { throw new ServerException("ERR_INVALID_CLOUD_STORAGE", "Error while getting container name"); } diff --git a/platform-modules/actors/src/main/java/org/sunbird/learning/util/ControllerUtil.java b/platform-modules/actors/src/main/java/org/sunbird/learning/util/ControllerUtil.java index 807e0794a1..879551f51f 100644 --- a/platform-modules/actors/src/main/java/org/sunbird/learning/util/ControllerUtil.java +++ b/platform-modules/actors/src/main/java/org/sunbird/learning/util/ControllerUtil.java @@ -4,6 +4,8 @@ import org.apache.commons.collections.MapUtils; import org.apache.commons.lang.StringUtils; import org.codehaus.jackson.map.ObjectMapper; +import org.json.JSONArray; +import org.neo4j.driver.v1.Values; import org.sunbird.common.Platform; import org.sunbird.common.dto.NodeDTO; import org.sunbird.common.dto.Request; @@ -16,7 +18,11 @@ import org.sunbird.common.mgr.ConvertGraphNode; import org.sunbird.graph.common.enums.GraphHeaderParams; import org.sunbird.graph.dac.enums.GraphDACParams; +import org.sunbird.graph.dac.enums.SystemNodeTypes; +import org.sunbird.graph.dac.model.Filter; +import org.sunbird.graph.dac.model.MetadataCriterion; import org.sunbird.graph.dac.model.Node; +import org.sunbird.graph.dac.model.SearchConditions; import org.sunbird.graph.dac.model.SearchCriteria; import org.sunbird.graph.engine.mgr.impl.NodeManager; import org.sunbird.graph.engine.router.GraphEngineManagers; @@ -149,25 +155,25 @@ public DefinitionDTO getDefinition(String taxonomyId, String objectType) { } return null; } - + public DefinitionDTO getDefinition(String taxonomyId, String objectType, boolean disableAkka) { - DefinitionDTO definition = null; - if(disableAkka) { - try { - Request request = new Request(); - request.getContext().put(GraphHeaderParams.graph_id.name(), TAXONOMY_ID); - request.put(GraphDACParams.object_type.name(), objectType); - NodeManager nodeManager = new NodeManager(); - definition = nodeManager.getNodeDefinition(request); - }catch (Exception e) { - throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), e.getMessage() + ". Please Try Again After Sometime!"); - } - }else { - definition = getDefinition(taxonomyId, objectType); - } - return definition; - - } + DefinitionDTO definition = null; + if(disableAkka) { + try { + Request request = new Request(); + request.getContext().put(GraphHeaderParams.graph_id.name(), TAXONOMY_ID); + request.put(GraphDACParams.object_type.name(), objectType); + NodeManager nodeManager = new NodeManager(); + definition = nodeManager.getNodeDefinition(request); + }catch (Exception e) { + throw new ServerException(TaxonomyErrorCodes.SYSTEM_ERROR.name(), e.getMessage() + ". 
Please Try Again After Sometime!"); + } + }else { + definition = getDefinition(taxonomyId, objectType); + } + return definition; + + } /** * Gets all the definitions @@ -406,6 +412,71 @@ public List getNodes(String graphId, String objectType, int startPosition, } } + public List getNodes(String graphId, String objectType, List mimeTypes, List status, List contentIdsList, double migrationVersion, int startPosition, int batchSize) { + List filters = new ArrayList(); + if(!mimeTypes.isEmpty()) + filters.add(new Filter("mimeType", SearchConditions.OP_IN, mimeTypes)); + if(!status.isEmpty()) + filters.add(new Filter("status", SearchConditions.OP_IN, status)); + if(!contentIdsList.isEmpty()) { + filters.add(new Filter("IL_UNIQUE_ID", SearchConditions.OP_IN, contentIdsList)); + } else { + if (migrationVersion == 0) filters.add(new Filter("migrationVersion", SearchConditions.OP_IS, Values.NULL)); + else filters.add(new Filter("migrationVersion", SearchConditions.OP_EQUAL, migrationVersion)); + } + + SearchCriteria sc = new SearchCriteria(); + sc.setNodeType(SystemNodeTypes.DATA_NODE.name()); + sc.setObjectType(objectType); + sc.setResultSize(batchSize); + sc.setStartPosition(startPosition); + if(!filters.isEmpty() && filters.size()>0) + sc.addMetadata(MetadataCriterion.create(filters)); + Request req = getRequest(graphId, GraphEngineManagers.SEARCH_MANAGER, "searchNodes", + GraphDACParams.search_criteria.name(), sc); + req.put(GraphDACParams.get_tags.name(), true); + Response listRes = getResponse(req); + if (checkError(listRes)) + return null; + else { + List nodes = (List) listRes.get(GraphDACParams.node_list.name()); + return nodes; + } + } + + public List getNodes(String graphId, String objectType, List status, List contentIdsList, double migrationVersion, int startPosition, int batchSize) { + List filters = new ArrayList(); + if(!status.isEmpty()) + filters.add(new Filter("status", SearchConditions.OP_IN, status)); + if(!contentIdsList.isEmpty()) { + filters.add(new Filter("IL_UNIQUE_ID", SearchConditions.OP_IN, contentIdsList)); + } else { + if (migrationVersion == 0) { + filters.add(new Filter("qumlVersion", SearchConditions.OP_IS, Values.NULL)); + filters.add(new Filter("schemaVersion", SearchConditions.OP_IS, Values.NULL)); + } + else if (migrationVersion > 2) filters.add(new Filter("migrationVersion", SearchConditions.OP_EQUAL, migrationVersion)); + } + + SearchCriteria sc = new SearchCriteria(); + sc.setNodeType(SystemNodeTypes.DATA_NODE.name()); + sc.setObjectType(objectType); + sc.setResultSize(batchSize); + sc.setStartPosition(startPosition); + if(!filters.isEmpty() && filters.size()>0) + sc.addMetadata(MetadataCriterion.create(filters)); + Request req = getRequest(graphId, GraphEngineManagers.SEARCH_MANAGER, "searchNodes", + GraphDACParams.search_criteria.name(), sc); + req.put(GraphDACParams.get_tags.name(), true); + Response listRes = getResponse(req); + if (checkError(listRes)) + return null; + else { + List nodes = (List) listRes.get(GraphDACParams.node_list.name()); + return nodes; + } + } + public List getNodesWithInDateRange(String graphId, String objectType, String startDate, String endDate) { List nodeIds = new ArrayList<>(); @@ -553,15 +624,15 @@ public Map constructHierarchy(List> list) { Map>> currentLevelNodes = new HashMap<>(); list.stream().filter(e -> ((Number) e.get("depth")).intValue() == depth) .collect(Collectors.toList()).forEach(e -> { - String id = (String) e.get("identifier"); - List> nodes = currentLevelNodes.get(id); - if (CollectionUtils.isEmpty(nodes)) { 
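The two getNodes overloads added above page through Neo4j data nodes using SearchCriteria built from OP_IN filters, so a caller is expected to advance startPosition by batchSize until an empty batch (or null, which getNodes returns on a search error) comes back. A hypothetical caller, with illustrative filter values:

```java
// Sketch only: ControllerUtil and Node are the Sunbird classes used in this patch
// (org.sunbird.learning.util.ControllerUtil, org.sunbird.graph.dac.model.Node).
ControllerUtil util = new ControllerUtil();
int start = 0;
int batchSize = 100;
while (true) {
    List<Node> nodes = util.getNodes("domain", "Content",
            Arrays.asList("application/pdf"),  // mimeTypes -> OP_IN filter
            Arrays.asList("Live"),             // status -> OP_IN filter
            Collections.emptyList(),           // no explicit ids, so migrationVersion applies
            0,                                 // 0 -> "migrationVersion IS NULL" (not yet migrated)
            start, batchSize);
    if (nodes == null || nodes.isEmpty())
        break;
    // ... process the batch ...
    start += batchSize;
}
```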
- nodes = new ArrayList<>(); - currentLevelNodes.put((String) e.get("identifier"), nodes); - } - nodes.add(e); + String id = (String) e.get("identifier"); + List> nodes = currentLevelNodes.get(id); + if (CollectionUtils.isEmpty(nodes)) { + nodes = new ArrayList<>(); + currentLevelNodes.put((String) e.get("identifier"), nodes); + } + nodes.add(e); - }); + }); List> nextLevelNodes = list.stream().filter(e -> ((Number) e.get("depth")).intValue() == depth + 1) .collect(Collectors.toList()); @@ -663,19 +734,19 @@ public Map getHierarchyMap(String graphId, String contentId, Def // startTime = System.currentTimeMillis(); Response getList = getDataNodes(graphId, ids); // System.out.println("Time to get required data nodes: " + (System.currentTimeMillis() - startTime)); - if (null != getList && !checkError(getList)) { - List nodeList = (List) getList.get("node_list"); - Map> contentsWithMetadata = nodeList.stream().map(n -> ConvertGraphNode.convertGraphNode - (n, graphId, definition, fields)).map(contentMap -> { - contentMap.remove("collections"); - contentMap.remove("children"); - contentMap.remove("usedByContent"); - contentMap.remove("item_sets"); - contentMap.remove("methods"); - contentMap.remove("libraries"); - contentMap.remove("editorState"); - return contentMap; - }).collect(Collectors.toMap(e -> (String) e.get("identifier"), e -> e)); + if (null != getList && !checkError(getList)) { + List nodeList = (List) getList.get("node_list"); + Map> contentsWithMetadata = nodeList.stream().map(n -> ConvertGraphNode.convertGraphNode + (n, graphId, definition, fields)).map(contentMap -> { + contentMap.remove("collections"); + contentMap.remove("children"); + contentMap.remove("usedByContent"); + contentMap.remove("item_sets"); + contentMap.remove("methods"); + contentMap.remove("libraries"); + contentMap.remove("editorState"); + return contentMap; + }).collect(Collectors.toMap(e -> (String) e.get("identifier"), e -> e)); contentList = contentList.stream().map(n -> { n.putAll(contentsWithMetadata.get(n.get("identifier"))); @@ -684,16 +755,16 @@ public Map getHierarchyMap(String graphId, String contentId, Def // startTime = System.currentTimeMillis(); collectionHierarchy = contentCleanUp(constructHierarchy(contentList)); // System.out.println("Time to construct hierarchy: " + (System.currentTimeMillis() - startTime)); - } else { - if (null != getList && getList.getResponseCode() == ResponseCode.CLIENT_ERROR) { - throw new ClientException(ContentErrorCodes.ERR_INVALID_INPUT.name(), getList.getParams().getErrmsg()); - } else { - throw new ServerException(ContentAPIParams.SERVER_ERROR.name(), getList.getParams().getErrmsg()); - } - } - hierarchyCleanUp(collectionHierarchy); - return collectionHierarchy; - } + } else { + if (null != getList && getList.getResponseCode() == ResponseCode.CLIENT_ERROR) { + throw new ClientException(ContentErrorCodes.ERR_INVALID_INPUT.name(), getList.getParams().getErrmsg()); + } else { + throw new ServerException(ContentAPIParams.SERVER_ERROR.name(), getList.getParams().getErrmsg()); + } + } + hierarchyCleanUp(collectionHierarchy); + return collectionHierarchy; + } public List getPublishedCollections(String graphId, int offset, int limit) { @@ -758,4 +829,88 @@ public void hierarchyCleanUp(Map map) { } } -} + public Map getCSPMigrationObjectCount(String graphId, List objectTypes, List mimeTypeList, List statusList, List contentIdsList, double migrationVersion) { + Map counts = new HashMap(); + Request request = getRequest(graphId, GraphEngineManagers.SEARCH_MANAGER, 
"executeQueryForProps"); + StringBuilder queryString = new StringBuilder(); + queryString.append("MATCH (n:{0}) WHERE EXISTS(n.IL_FUNC_OBJECT_TYPE) AND n.IL_SYS_NODE_TYPE=\"DATA_NODE\" AND n.IL_FUNC_OBJECT_TYPE IN {1} "); + + if(mimeTypeList!=null && !mimeTypeList.isEmpty()) + queryString.append(" AND n.mimeType IN {2} "); + + if(statusList!=null && !statusList.isEmpty()) + queryString.append(" AND n.status IN {3} "); + + if(contentIdsList!=null && !contentIdsList.isEmpty()) + queryString.append(" AND n.IL_UNIQUE_ID IN {5} "); + else { + if (migrationVersion == 0) queryString.append(" AND NOT EXISTS(n.migrationVersion) "); + else queryString.append(" AND n.migrationVersion={4} "); + } + + queryString.append("RETURN n.IL_FUNC_OBJECT_TYPE AS objectType, COUNT(n) AS count;"); + + System.out.println("Count queryString:: " + MessageFormat.format(queryString.toString(), graphId, new JSONArray(objectTypes), new JSONArray(mimeTypeList), new JSONArray(statusList), migrationVersion, new JSONArray(contentIdsList))); + + request.put(GraphDACParams.query.name(), MessageFormat.format(queryString.toString(), graphId, new JSONArray(objectTypes), new JSONArray(mimeTypeList), new JSONArray(statusList), migrationVersion, new JSONArray(contentIdsList))); + + List props = new ArrayList(); + props.add("objectType"); + props.add("count"); + request.put(GraphDACParams.property_keys.name(), props); + Response response = getResponse(request); + if (!checkError(response)) { + Map result = response.getResult(); + List> list = (List>) result.get("properties"); + if (null != list && !list.isEmpty()) { + for (int i = 0; i < list.size(); i++) { + Map properties = list.get(i); + counts.put((String) properties.get("objectType"), (Long) properties.get("count")); + } + } + + } + return counts; + } + + public Map getQumlMigrationObjectCount(String graphId, List objectTypes, List statusList, List objectIdList, double migrationVersion) { + Map counts = new HashMap(); + Request request = getRequest(graphId, GraphEngineManagers.SEARCH_MANAGER, "executeQueryForProps"); + StringBuilder queryString = new StringBuilder(); + queryString.append("MATCH (n:{0}) WHERE EXISTS(n.IL_FUNC_OBJECT_TYPE) AND n.IL_SYS_NODE_TYPE=\"DATA_NODE\" AND n.IL_FUNC_OBJECT_TYPE IN {1} "); + + if(statusList!=null && !statusList.isEmpty()) + queryString.append(" AND n.status IN {2} "); + + if(objectIdList!=null && !objectIdList.isEmpty()) + queryString.append(" AND n.IL_UNIQUE_ID IN {3} "); + + if(migrationVersion != 0 && migrationVersion > 2) + queryString.append(" AND n.migrationVersion={4} "); + + queryString.append("AND NOT EXISTS(n.qumlVersion) AND NOT EXISTS(n.schemaVersion) RETURN n.IL_FUNC_OBJECT_TYPE AS objectType, COUNT(n) AS count;"); + + System.out.println("Count queryString:: " + MessageFormat.format(queryString.toString(), graphId, new JSONArray(objectTypes), new JSONArray(statusList), new JSONArray(objectIdList), migrationVersion)); + + request.put(GraphDACParams.query.name(), MessageFormat.format(queryString.toString(), graphId, new JSONArray(objectTypes), new JSONArray(statusList), new JSONArray(objectIdList), migrationVersion)); + + List props = new ArrayList(); + props.add("objectType"); + props.add("count"); + request.put(GraphDACParams.property_keys.name(), props); + Response response = getResponse(request); + if (!checkError(response)) { + Map result = response.getResult(); + List> list = (List>) result.get("properties"); + if (null != list && !list.isEmpty()) { + for (int i = 0; i < list.size(); i++) { + Map properties = list.get(i); + 
counts.put((String) properties.get("objectType"), (Long) properties.get("count")); + } + } + + } + return counts; + } + +} \ No newline at end of file diff --git a/platform-modules/content-manager/src/main/java/org/sunbird/content/validator/ContentValidator.java b/platform-modules/content-manager/src/main/java/org/sunbird/content/validator/ContentValidator.java index 7b15a87825..3567ad7aaa 100644 --- a/platform-modules/content-manager/src/main/java/org/sunbird/content/validator/ContentValidator.java +++ b/platform-modules/content-manager/src/main/java/org/sunbird/content/validator/ContentValidator.java @@ -478,6 +478,13 @@ private boolean isAllRequiredFieldsAvailable(Node node) { */ public Boolean isValidUrl(String fileURL, String mimeType) { Boolean isValid = false; + + String strBlobPrefix = Platform.config.hasPath("cloudstorage.relative_path_prefix")? Platform.config.getString("cloudstorage.relative_path_prefix"): "CONTENT_STORAGE_BASE_PATH"; + if(fileURL.contains(strBlobPrefix)) { + String absolutePath = Platform.config.getString("cloudstorage.read_base_path") + java.io.File.separator + Platform.config.getString("cloud_storage_container"); + fileURL = StringUtils.replace(fileURL,strBlobPrefix,absolutePath); + } + File file = HttpDownloadUtility.downloadFile(fileURL, BUNDLE_PATH); try { if (exceptionChecks(mimeType, file)) { diff --git a/platform-modules/content-manager/src/test/resources/application.conf b/platform-modules/content-manager/src/test/resources/application.conf index 308f99aa86..f655564d09 100644 --- a/platform-modules/content-manager/src/test/resources/application.conf +++ b/platform-modules/content-manager/src/test/resources/application.conf @@ -60,6 +60,10 @@ specialCharRegEx="^([$&+,:;=?@#|!]*)$" numberRegEx="^([+-]?\\d*\\.?\\d*)$" cloud_storage_type="azure" +cloud_storage_key="accesskeyyyy" +cloud_storage_secret="secretxxx=" +cloud_storage_container="sunbird-content-dev" + azure_storage_key="" azure_storage_secret="" azure_storage_container="sunbird-content-dev" diff --git a/platform-modules/manager/pom.xml b/platform-modules/manager/pom.xml index 55cb295a79..8c913b5120 100644 --- a/platform-modules/manager/pom.xml +++ b/platform-modules/manager/pom.xml @@ -9,18 +9,6 @@ sunbird-manager - - org.sunbird - searchindex-elasticsearch - 1.1-SNAPSHOT - jar - - - org.apache.logging.log4j - log4j-core - - - org.sunbird content-manager diff --git a/platform-modules/manager/src/main/java/org/sunbird/taxonomy/controller/ContentV3Controller.java b/platform-modules/manager/src/main/java/org/sunbird/taxonomy/controller/ContentV3Controller.java index 0495d3d738..f2fa149d18 100755 --- a/platform-modules/manager/src/main/java/org/sunbird/taxonomy/controller/ContentV3Controller.java +++ b/platform-modules/manager/src/main/java/org/sunbird/taxonomy/controller/ContentV3Controller.java @@ -184,8 +184,8 @@ public ResponseEntity bundle(@RequestBody Map map) { * Set. */ @SuppressWarnings("unchecked") - @RequestMapping(value = { "/publish/{id:.+}", "/public/publish/{id:.+}" }, method = RequestMethod.POST) - @ResponseBody +// @RequestMapping(value = { "/publish/{id:.+}", "/public/publish/{id:.+}" }, method = RequestMethod.POST) +// @ResponseBody public ResponseEntity publish(@PathVariable(value = "id") String contentId, @RequestBody Map map) { String apiId = "ekstep.learning.content.publish"; @@ -224,8 +224,8 @@ public ResponseEntity publish(@PathVariable(value = "id") String conte * Set. 
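The ContentValidator.isValidUrl change above swaps a stored relative-path placeholder for an absolute base before downloading the file; in isolation the rewrite looks like this (the config keys are the ones read in the patch, the sample values and identifier are hypothetical):

```java
// Uses org.apache.commons.lang3.StringUtils, as the patch does.
String prefix = "CONTENT_STORAGE_BASE_PATH";      // cloudstorage.relative_path_prefix default
String readBasePath = "https://cdn.example.net";  // cloudstorage.read_base_path (hypothetical)
String container = "sunbird-content-dev";         // cloud_storage_container
String fileURL = "CONTENT_STORAGE_BASE_PATH/content/do_123/artifact.pdf"; // hypothetical metadata value
if (fileURL.contains(prefix)) {
    fileURL = StringUtils.replace(fileURL, prefix, readBasePath + java.io.File.separator + container);
}
// fileURL -> https://cdn.example.net/sunbird-content-dev/content/do_123/artifact.pdf
```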
*/ @SuppressWarnings("unchecked") - @RequestMapping(value = "/unlisted/publish/{id:.+}", method = RequestMethod.POST) - @ResponseBody +// @RequestMapping(value = "/unlisted/publish/{id:.+}", method = RequestMethod.POST) +// @ResponseBody public ResponseEntity publishUnlisted(@PathVariable(value = "id") String contentId, @RequestBody Map map) { String apiId = "ekstep.learning.content.unlisted.publish"; @@ -262,8 +262,8 @@ public ResponseEntity publishUnlisted(@PathVariable(value = "id") Stri * The Content Id which needs to be published. * @return The Response entity with Content Id in its Result Set. */ - @RequestMapping(value = "/review/{id:.+}", method = RequestMethod.POST) - @ResponseBody +// @RequestMapping(value = "/review/{id:.+}", method = RequestMethod.POST) +// @ResponseBody public ResponseEntity review(@PathVariable(value = "id") String contentId, @RequestBody Map map) { String apiId = "ekstep.learning.content.review"; @@ -446,8 +446,8 @@ public ResponseEntity syncHierarchy(@PathVariable(value = "id") String * @param requestMap * @return */ - @RequestMapping(value = "/dialcode/link", method = RequestMethod.POST) - @ResponseBody +// @RequestMapping(value = "/dialcode/link", method = RequestMethod.POST) +// @ResponseBody public ResponseEntity linkDialCode(@RequestBody Map requestMap, @RequestHeader(value = CHANNEL_ID, required = true) String channelId) { String apiId = "ekstep.content.dialcode.link"; @@ -470,8 +470,8 @@ public ResponseEntity linkDialCode(@RequestBody Map re * The Content Id for whom DIAL Codes have to be reserved * @return */ - @RequestMapping(value = "/dialcode/reserve/{id:.+}", method = RequestMethod.POST) - @ResponseBody +// @RequestMapping(value = "/dialcode/reserve/{id:.+}", method = RequestMethod.POST) +// @ResponseBody public ResponseEntity reserveDialCode( @PathVariable(value = "id") String contentId, @RequestBody Map requestMap, @@ -496,8 +496,8 @@ public ResponseEntity reserveDialCode( * The Content Id of the Textbook from which DIAL Codes have to be released * @return The Response Entity with list of Released QR Codes */ - @RequestMapping(value="/dialcode/release/{id}", method = RequestMethod.PATCH) - @ResponseBody +// @RequestMapping(value="/dialcode/release/{id}", method = RequestMethod.PATCH) +// @ResponseBody public ResponseEntity releaseDialcodes(@PathVariable(value="id") String contentId, @RequestHeader(value = CHANNEL_ID) String channelId) { String apiId = "ekstep.learning.content.dialcode.release"; @@ -545,8 +545,8 @@ protected String getAPIVersion() { return API_VERSION_3; } - @RequestMapping(value="/retire/{id:.+}", method = RequestMethod.DELETE) - @ResponseBody +// @RequestMapping(value="/retire/{id:.+}", method = RequestMethod.DELETE) +// @ResponseBody public ResponseEntity retire(@PathVariable(value = "id") String contentId) { String apiId = "ekstep.content.retire"; TelemetryManager.log("Retiring content | Content Id : " + contentId); @@ -583,8 +583,8 @@ public ResponseEntity acceptFlag(@PathVariable(value = "id") String co * @return The Response entity with Content Id and Version Key in its Result * Set. 
*/ - @RequestMapping(value="/flag/reject/{id:.+}", method = RequestMethod.POST) - @ResponseBody +// @RequestMapping(value="/flag/reject/{id:.+}", method = RequestMethod.POST) +// @ResponseBody public ResponseEntity rejectFlag(@PathVariable(value = "id") String contentId){ String apiId = "ekstep.learning.content.rejectFlag"; TelemetryManager.log("Reject flagged content | Content Id : " + contentId); @@ -650,8 +650,8 @@ public ResponseEntity discard(@PathVariable(value = "id") String conte * @param contentId * @return */ - @RequestMapping(value = "/reject/{id:.+}", method = RequestMethod.POST) - @ResponseBody +// @RequestMapping(value = "/reject/{id:.+}", method = RequestMethod.POST) +// @ResponseBody public ResponseEntity rejectContent(@PathVariable(value = "id") String contentId, @RequestBody Map requestMap) { String apiId = "ekstep.learning.content.reject"; diff --git a/platform-modules/manager/src/test/java/org/sunbird/taxonomy/util/YouTubeUrlUtilTest.java b/platform-modules/manager/src/test/java/org/sunbird/taxonomy/util/YouTubeUrlUtilTest.java index c8c5feafc3..0828c77617 100644 --- a/platform-modules/manager/src/test/java/org/sunbird/taxonomy/util/YouTubeUrlUtilTest.java +++ b/platform-modules/manager/src/test/java/org/sunbird/taxonomy/util/YouTubeUrlUtilTest.java @@ -84,7 +84,7 @@ private void createYoutubeContent() throws Exception { // check license of valid youtube url. @Test public void testYouTubeService_01() throws Exception { - String artifactUrl = "https://www.youtube.com/watch?v=owr198WQpM8"; + String artifactUrl = "https://www.youtube.com/watch?v=GHmQ8euNwv8"; String result = YouTubeUrlUtil.getLicense(artifactUrl); assertEquals("creativeCommon", result); } @@ -146,7 +146,7 @@ public void testYouTubeService_07() throws Exception { public void testYouTubeService_08() throws Exception { //upload content String mimeType = "video/x-youtube"; - String fileUrl = "https://www.youtube.com/watch?v=owr198WQpM8"; + String fileUrl = "https://www.youtube.com/watch?v=eKT1IbPjH1Q"; Response response = contentManager.upload(contentId, fileUrl, mimeType); String responseCode = (String) response.getResponseCode().toString(); assertEquals("OK", responseCode); diff --git a/platform-modules/manager/src/test/resources/Contents/testEcmlMediaYoutube/index.ecml b/platform-modules/manager/src/test/resources/Contents/testEcmlMediaYoutube/index.ecml index f87791d263..0994dc56d3 100644 --- a/platform-modules/manager/src/test/resources/Contents/testEcmlMediaYoutube/index.ecml +++ b/platform-modules/manager/src/test/resources/Contents/testEcmlMediaYoutube/index.ecml @@ -29,7 +29,7 @@ - + @@ -191,7 +191,7 @@ - + diff --git a/platform-modules/pom.xml b/platform-modules/pom.xml index 94cdc3d8ee..492c211627 100644 --- a/platform-modules/pom.xml +++ b/platform-modules/pom.xml @@ -19,7 +19,7 @@ 2.3.1 1.8 1.8 - 1.2.8 + 1.4.6 diff --git a/platform-modules/service/src/main/resources/application.conf b/platform-modules/service/src/main/resources/application.conf index 5c057273cd..8e34631cff 100644 --- a/platform-modules/service/src/main/resources/application.conf +++ b/platform-modules/service/src/main/resources/application.conf @@ -221,3 +221,13 @@ content.tagging.property="subject,medium" # This is added to handle large artifacts sizes differently content.artifact.size.for_online=209715200 + + +cloudstorage { + metadata.replace_absolute_path=true + relative_path_prefix={{ cloudstorage_relative_path_prefix_content }} + metadata.list={{ cloudstorage_metadata_list }} + read_base_path="{{ cloudstorage_base_path }}" + 
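+# read_base_path combined with cloud_storage_container forms the absolute base that
+# replaces relative_path_prefix on reads (see the ContentValidator.isValidUrl change
+# above); write_base_path below presumably enumerates the base URLs accepted when
+# absolute URLs are written back, and metadata.list the fields subject to rewriting.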
write_base_path={{ valid_cloudstorage_base_urls }} +} +cloud_storage_container="{{ cloud_storage_content_bucketname }}" \ No newline at end of file diff --git a/platform-tools/spikes/content-tool/pom.xml b/platform-tools/spikes/content-tool/pom.xml index a485ae2693..626d2e53d6 100644 --- a/platform-tools/spikes/content-tool/pom.xml +++ b/platform-tools/spikes/content-tool/pom.xml @@ -66,7 +66,7 @@ org.sunbird cloud-store-sdk - 1.2.5 + 1.4.6 diff --git a/platform-tools/spikes/content-tool/src/main/java/org/sunbird/content/tool/CloudStoreManager.java b/platform-tools/spikes/content-tool/src/main/java/org/sunbird/content/tool/CloudStoreManager.java index d4f0e23e60..47e769efbd 100644 --- a/platform-tools/spikes/content-tool/src/main/java/org/sunbird/content/tool/CloudStoreManager.java +++ b/platform-tools/spikes/content-tool/src/main/java/org/sunbird/content/tool/CloudStoreManager.java @@ -21,9 +21,18 @@ public class CloudStoreManager { protected String destStorageType = Platform.config.getString("destination.storage_type"); + protected scala.Option awsEndpoint = scala.Option.apply(""); + protected scala.Option awsRegion = scala.Option.apply(""); + protected BaseStorageService awsService = StorageServiceFactory.getStorageService(new StorageConfig("aws", Platform.config.getString("aws_storage_key"), Platform.config.getString("aws_storage_secret"),awsEndpoint,awsRegion)); + + protected scala.Option azureEndpoint = scala.Option.apply(""); + protected scala.Option azureRegion = scala.Option.apply(""); + protected BaseStorageService azureService = StorageServiceFactory.getStorageService(new StorageConfig("azure", Platform.config.getString("azure_storage_key"), Platform.config.getString("azure_storage_secret"),azureEndpoint,azureRegion)); + + protected scala.Option ociEndpoint = scala.Option.apply(Platform.config.getString("oci_storage_endpoint")); + protected scala.Option ociRegion = scala.Option.apply(""); + protected BaseStorageService ociService = StorageServiceFactory.getStorageService(new StorageConfig("oci", Platform.config.getString("oci_storage_key"), Platform.config.getString("oci_storage_secret"),ociEndpoint,ociRegion)); - protected BaseStorageService awsService = StorageServiceFactory.getStorageService(new StorageConfig("aws", Platform.config.getString("aws_storage_key"), Platform.config.getString("aws_storage_secret"))); - protected BaseStorageService azureService = StorageServiceFactory.getStorageService((new StorageConfig("azure", Platform.config.getString("azure_storage_key"), Platform.config.getString("azure_storage_secret")))); private String cloudSrcBaseURL = Platform.config.getString("cloud.src.baseurl"); private String cloudDestBaseURL = Platform.config.getString("cloud.dest.baseurl"); @@ -239,6 +248,8 @@ public String getContainerName(String cloudStoreType) { return Platform.config.getString("azure_storage_container"); }else if(StringUtils.equalsIgnoreCase(cloudStoreType, "aws")) { return Platform.config.getString("aws_storage_container"); + }else if(StringUtils.equalsIgnoreCase(cloudStoreType, "oci")) { + return Platform.config.getString("oci_storage_container"); }else { throw new ServerException("ERR_INVALID_CLOUD_STORAGE", "Error while getting container name"); } @@ -249,6 +260,8 @@ public BaseStorageService getcloudService(String cloudStoreType){ return azureService; }else if(StringUtils.equalsIgnoreCase(cloudStoreType, "aws")) { return awsService; + }else if(StringUtils.equalsIgnoreCase(cloudStoreType, "oci")) { + return ociService; }else { throw new 
ServerException("ERR_INVALID_CLOUD_STORAGE", "Error while getting container name"); } diff --git a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/CSPMigrationMessageGenerator.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/CSPMigrationMessageGenerator.java new file mode 100644 index 0000000000..c285937397 --- /dev/null +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/CSPMigrationMessageGenerator.java @@ -0,0 +1,220 @@ +package org.sunbird.sync.tool.mgr; + +import org.apache.commons.collections.CollectionUtils; +import org.apache.commons.lang3.StringUtils; +import org.codehaus.jackson.map.ObjectMapper; +import org.springframework.stereotype.Component; +import org.sunbird.common.Platform; +import org.sunbird.common.exception.ClientException; +import org.sunbird.common.exception.ResourceNotFoundException; +import org.sunbird.graph.dac.enums.SystemNodeTypes; +import org.sunbird.graph.dac.model.Node; +import org.sunbird.learning.util.ControllerUtil; +import org.sunbird.sync.tool.util.KafkaUtil; +import org.sunbird.telemetry.util.LogTelemetryEventUtil; + +import javax.annotation.PostConstruct; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; + +@Component +public class CSPMigrationMessageGenerator { + + private ControllerUtil util = new ControllerUtil(); + private static int batchSize = 100; + private ObjectMapper mapper = new ObjectMapper(); + private static String actorId = "csp-migration"; + private static String actorType = "System"; + private static String pdataId = "org.sunbird.platform"; + private static String pdataVersion = "1.0"; + private static String action = "csp-migration"; + private static String migrationTopicName = Platform.config.getString("csp.migration.request.topic"); + + @PostConstruct + private void init() throws Exception { + int batch = Platform.config.hasPath("csp.migration.batch.size") ? 
Platform.config.getInt("csp.migration.batch.size") : 100; + batchSize = batch; + } + + public void generateMgrMsg(String graphId, String[] objectTypes, String[] mimeTypes, String[] status, String[] contentIds, double migrationVersion, Integer limit, Integer delay) throws Exception { + if (StringUtils.isBlank(graphId)) + throw new ClientException("ERR_INVALID_GRAPH_ID", "Graph Id is blank."); + if (null == objectTypes || objectTypes.length == 0) + throw new ClientException("ERR_EMPTY_OBJECT_TYPE", "Object Type is blank."); + List mimeTypeList = new ArrayList(); + List statusList = new ArrayList(); + List contentIdsList = new ArrayList(); + if (null != mimeTypes && mimeTypes.length > 0) + mimeTypeList = Arrays.asList(mimeTypes); + if (null != status && status.length > 0) + statusList = Arrays.asList(status); + if (null != contentIds && contentIds.length > 0) + contentIdsList = Arrays.asList(contentIds); + + Map errors = new HashMap<>(); + long startTime = System.currentTimeMillis(); + System.out.println("-----------------------------------------"); + System.out.println("\nMigration Event Generation starting at " + startTime); + Map counts = util.getCSPMigrationObjectCount(graphId, Arrays.asList(objectTypes), mimeTypeList, statusList, contentIdsList, migrationVersion); + if (counts.isEmpty()) { + System.out.println("No objects found in this graph."); + } else { + List objTypes = counts.keySet().stream().filter(key -> Arrays.asList(objectTypes).contains(key)).collect(Collectors.toList()); + for (String objectType : objTypes) { + Long count = counts.get(objectType); + System.out.println(count + " - " + objectType + " nodes available for migration"); + } + } + for (String objectType : objectTypes) { + Long count = counts.get(objectType); + if (count > 0) { + System.out.println("-----------------------------------------"); + System.out.println("\nGenerating event for object of type " + objectType + " with batch size of " + batchSize + " having delay " + delay + "ms for each batch.\n"); + int start = 0; + int current = 0; + long total = counts.get(objectType); + long stopLimit; + if (limit > 0) { + if (limit < batchSize || limit % batchSize != 0) { + System.out.println("Limit value should be minimum " + batchSize + ". The limit value should be multiple of " + batchSize + ". Setting limit to minimum value. 
i.e " + batchSize); + stopLimit = batchSize; + } else stopLimit = limit; + } else stopLimit = total; + + System.out.println("CSPMigrationMessageGenerator:: generateMgrMsg:: stopLimit: " + stopLimit + " || total: " + total); + + boolean found = true; + while (found && start < stopLimit) { + List nodes = null; + try { + nodes = util.getNodes(graphId, objectType.trim(), mimeTypeList, statusList, contentIdsList, migrationVersion, start, batchSize); + } catch (ResourceNotFoundException e) { + System.out.println("Error while fetching neo4j records for objectType=" + objectType + ", start=" + start + ",batchSize=" + batchSize); + start += batchSize; + continue; + } + if (CollectionUtils.isNotEmpty(nodes)) { + System.out.println("CSPMigrationMessageGenerator:: generateMgrMsg:: nodes: " + nodes.size()); + start += batchSize; + Map events = generateMigrationEvent(nodes, errors); + System.out.println("CSPMigrationMessageGenerator:: generateMgrMsg:: events: " + events.size()); + sendEvent(events, errors); + current += events.size(); + printProgress(startTime, total, current); + if (delay > 0) { + Thread.sleep(delay); + } + } else { + System.out.println("CSPMigrationMessageGenerator:: generateMgrMsg:: Breaking Event Generation Loop!"); + found = false; + break; + } + } + if (!errors.isEmpty()) + System.out.println("Error! while generating migration event data from nodes, below nodes are ignored. \n" + errors); + long endTime = System.currentTimeMillis(); + System.out.println("\nMigration Event Generation completed for object of type " + objectType + " in: " + (endTime - startTime) + "ms"); + } else { + System.out.println("\nSkipped Generating migration event for objectType: " + objectType); + } + } + System.out.println("-----------------------------------------"); + long endTime = System.currentTimeMillis(); + System.out.println("Migration Event Generation completed at " + endTime); + System.out.println("Time taken to generate Events: " + (endTime - startTime) + "ms"); + } + + private void sendEvent(Map events, Map errors) { + for (String id : events.keySet()) { + try { + KafkaUtil.send(events.get(id), migrationTopicName); + } catch (Exception e) { + e.printStackTrace(); + System.out.println("Error Message :"+e.getMessage() ); + errors.put(id, "Error While Sending Migration Event for " + id); + } + } + } + + private Map generateMigrationEvent(List nodes, Map errors) { + Map events = new HashMap(); + for (Node node : nodes) { + String message = getEvent(node, errors); + if (StringUtils.isNotBlank(message)) + events.put(node.getIdentifier(), message); + } + return events; + } + + private String getEvent(Node node, Map errors) { + Map actor = new HashMap() {{ + put("id", actorId); + put("type", actorType); + }}; + Map context = new HashMap() {{ + put("channel", node.getMetadata().getOrDefault("channel", "")); + put("pdata", new HashMap() {{ + put("id", pdataId); + put("ver", pdataVersion); + }}); + }}; + if (Platform.config.hasPath("cloud_storage.env")) { + String env = Platform.config.getString("cloud_storage.env"); + context.put("env", env); + } + Map object = new HashMap() {{ + put("id", node.getIdentifier()); + put("ver", node.getMetadata().get("versionKey")); + }}; + Map edata = new HashMap() {{ + put("action", action); + put("metadata", new HashMap() {{ + put("pkgVersion", node.getMetadata().get("pkgVersion")); + put("mimeType", node.getMetadata().get("mimeType")); + put("status", node.getMetadata().get("status")); + put("identifier", node.getIdentifier()); + put("objectType", node.getObjectType()); 
+ }}); + }}; + String beJobRequestEvent = LogTelemetryEventUtil.logInstructionEvent(actor, context, object, edata); + if (StringUtils.isBlank(beJobRequestEvent)) { + errors.put(node.getIdentifier(), "Error While Generating Migration Event for " + node.getIdentifier()); + } + return beJobRequestEvent; + } + + private static void printProgress(long startTime, long total, long current) { + long eta = current == 0 ? 0 : + (total - current) * (System.currentTimeMillis() - startTime) / current; + + String etaHms = current == 0 ? "N/A" : + String.format("%02d:%02d:%02d", TimeUnit.MILLISECONDS.toHours(eta), + TimeUnit.MILLISECONDS.toMinutes(eta) % TimeUnit.HOURS.toMinutes(1), + TimeUnit.MILLISECONDS.toSeconds(eta) % TimeUnit.MINUTES.toSeconds(1)); + + StringBuilder string = new StringBuilder(140); + int percent = (int) (current * 100 / total); + string + .append('\r') + .append(String.join("", Collections.nCopies(percent == 0 ? 2 : 2 - (int) (Math.log10(percent)), " "))) + .append(String.format(" %d%% [", percent)) + .append(String.join("", Collections.nCopies(percent, "="))) + .append('>') + .append(String.join("", Collections.nCopies(100 - percent, " "))) + .append(']') + .append(String.join("", Collections.nCopies((int) (Math.log10(total)) - (int) (Math.log10(current)), " "))) + .append(String.format(" %d/%d, ETA: %s", current, total, etaHms)); + + System.out.print(string); + } + + public static void filterMigrationNodes(List nodes, Integer limit) { + nodes.removeIf(n -> SystemNodeTypes.DEFINITION_NODE.name().equals(n.getNodeType())); + } +} \ No newline at end of file diff --git a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/CassandraESSyncManager.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/CassandraESSyncManager.java index c5309efedc..27c3e3e493 100644 --- a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/CassandraESSyncManager.java +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/CassandraESSyncManager.java @@ -32,6 +32,7 @@ import org.sunbird.content.entity.Media; import org.sunbird.content.entity.Plugin; import org.sunbird.content.operation.initializer.BaseInitializer; +import org.sunbird.content.util.SyncMessageGenerator; import org.sunbird.graph.cache.util.RedisStoreUtil; import org.sunbird.graph.dac.model.Node; import org.sunbird.graph.model.node.DefinitionDTO; @@ -43,7 +44,6 @@ import org.sunbird.sync.tool.util.DialcodeSync; import org.sunbird.sync.tool.util.ElasticSearchConnector; import org.sunbird.sync.tool.util.GraphUtil; -import org.sunbird.sync.tool.util.SyncMessageGenerator; import org.sunbird.telemetry.logger.TelemetryManager; import org.springframework.stereotype.Component; @@ -230,7 +230,7 @@ private void populateESDoc(Map unitsMetadata, Map nodeMap = SyncMessageGenerator.getMessage(node); - Map message = SyncMessageGenerator.getJSONMessage(nodeMap, relationMap); + Map message = SyncMessageGenerator.getJSONMessage(nodeMap, relationMap, new ArrayList()); childData = refactorUnit(child); Object variants = message.get("variants"); if(null != variants && !(variants instanceof String)) @@ -316,7 +316,7 @@ private Map getESDocuments(List> units) thro return null; }).filter(node -> null!=node).collect(Collectors.toList()); - Map esDocument = SyncMessageGenerator.getMessages(nodes, "Content", new HashMap<>()); + Map esDocument = SyncMessageGenerator.getMessages(nodes, "Content", new HashMap<>(), new HashMap<>(), false); return esDocument; } @@ -533,13 +533,13 @@ 
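printProgress in the generator above estimates time remaining by linear extrapolation, average elapsed time per processed item multiplied by the items left; the same computation in isolation (numbers illustrative):

```java
// ETA as computed in printProgress: (total - current) * elapsed / current.
long startTime = System.currentTimeMillis() - 120_000; // pretend 2 minutes have elapsed
long total = 1000;
long current = 400;
long eta = current == 0 ? 0 : (total - current) * (System.currentTimeMillis() - startTime) / current;
// 120s over 400 items ~ 0.3s/item; 600 items remain, so eta ~ 180000 ms (00:03:00)
```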
protected boolean validateAssetMediaForExternalLink(Media media){ return isExternal; } - public void syncDialcodesByIds(List dialcodes) throws Exception { - if(CollectionUtils.isEmpty(dialcodes)) { + public void syncDialcodesByIds(List dialcodes, List filenames) throws Exception { + if(CollectionUtils.isEmpty(dialcodes) && CollectionUtils.isEmpty(filenames)) { System.out.println("CassandraESSyncManager:syncDialcodesByIds:No dialcodes for syncing."); return; } System.out.println("CassandraESSyncManager:syncDialcodesByIds:No dialcodes for syncing: " + dialcodes.size()); - int dialcodeSyncedCount = dialcodeSync.sync(dialcodes); + int dialcodeSyncedCount = dialcodeSync.sync(dialcodes, filenames); System.out.println("CassandraESSyncManager:syncDialcodesByIds::dialcodeSyncedCount: " + dialcodeSyncedCount); } diff --git a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/QumlMigrationMessageGenerator.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/QumlMigrationMessageGenerator.java new file mode 100644 index 0000000000..a8e45cce24 --- /dev/null +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/mgr/QumlMigrationMessageGenerator.java @@ -0,0 +1,215 @@ +package org.sunbird.sync.tool.mgr; + +import org.apache.commons.collections.CollectionUtils; +import org.apache.commons.lang3.StringUtils; +import org.codehaus.jackson.map.ObjectMapper; +import org.springframework.stereotype.Component; +import org.sunbird.common.Platform; +import org.sunbird.common.exception.ClientException; +import org.sunbird.common.exception.ResourceNotFoundException; +import org.sunbird.graph.dac.enums.SystemNodeTypes; +import org.sunbird.graph.dac.model.Node; +import org.sunbird.learning.util.ControllerUtil; +import org.sunbird.sync.tool.util.KafkaUtil; +import org.sunbird.telemetry.util.LogTelemetryEventUtil; + +import javax.annotation.PostConstruct; +import java.util.*; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; + +@Component +public class QumlMigrationMessageGenerator { + + private ControllerUtil util = new ControllerUtil(); + private static int batchSize = 50; + private ObjectMapper mapper = new ObjectMapper(); + private static String actorId = "quml-migration"; + private static String actorType = "System"; + private static String pdataId = "org.sunbird.platform"; + private static String pdataVersion = "1.0"; + private static String action = "quml-migration"; + private static String migrationTopicName = Platform.config.getString("quml.migration.request.topic"); + + @PostConstruct + private void init() throws Exception { + int batch = Platform.config.hasPath("quml.migration.batch.size") ? 
Platform.config.getInt("quml.migration.batch.size") : 50; + batchSize = batch; + } + + public void generateMgrMsg(String graphId, String[] objectTypes, String[] status, String[] contentIds, double migrationVersion, Integer limit, Integer delay) throws Exception { + List mimeTypeList = new ArrayList(); + if (StringUtils.isBlank(graphId)) + throw new ClientException("ERR_INVALID_GRAPH_ID", "Graph Id is blank."); + if (null == objectTypes || objectTypes.length == 0) + throw new ClientException("ERR_EMPTY_OBJECT_TYPE", "Object Type is blank."); + List statusList = new ArrayList(); + List contentIdsList = new ArrayList(); + if (null != status && status.length > 0) + statusList = Arrays.asList(status); + if (null != contentIds && contentIds.length > 0) + contentIdsList = Arrays.asList(contentIds); + + Map errors = new HashMap<>(); + long startTime = System.currentTimeMillis(); + System.out.println("-----------------------------------------"); + System.out.println("\nQuML Migration Event Generation starting at " + startTime); + Map counts = util.getQumlMigrationObjectCount(graphId, Arrays.asList(objectTypes), statusList, contentIdsList, migrationVersion); + if (counts.isEmpty()) { + System.out.println("No objects found in this graph."); + } else { + List objTypes = counts.keySet().stream().filter(key -> Arrays.asList(objectTypes).contains(key)).collect(Collectors.toList()); + for (String objectType : objTypes) { + Long count = counts.get(objectType); + System.out.println(count + " - " + objectType + " nodes available for quml migration"); + } + } + for (String objectType : objectTypes) { + Long count = counts.get(objectType); + if (count > 0) { + System.out.println("-----------------------------------------"); + System.out.println("\nGenerating event for object of type " + objectType + " with batch size of " + batchSize + " having delay " + delay + "ms for each batch.\n"); + int start = 0; + int current = 0; + long total = counts.get(objectType); + long stopLimit; + if (limit > 0) { + if (limit < batchSize || limit % batchSize != 0) { + System.out.println("Limit value should be minimum " + batchSize + ". The limit value should be multiple of " + batchSize + ". Setting limit to minimum value. i.e " + batchSize); + stopLimit = batchSize; + } else stopLimit = limit; + } else stopLimit = total; + + System.out.println("QumlMigrationMessageGenerator:: generateMgrMsg:: stopLimit: " + stopLimit + " || total: " + total); + + boolean found = true; + while (found && start < stopLimit) { + List nodes = null; + try { + nodes = util.getNodes(graphId, objectType.trim(), statusList, contentIdsList, migrationVersion, start, batchSize); + } catch (ResourceNotFoundException e) { + System.out.println("Error while fetching neo4j records for objectType=" + objectType + ", start=" + start + ",batchSize=" + batchSize); + start += batchSize; + continue; + } + if (CollectionUtils.isNotEmpty(nodes)) { + System.out.println("QumlMigrationMessageGenerator:: generateMgrMsg:: nodes: " + nodes.size()); + start += batchSize; + Map events = generateMigrationEvent(nodes, errors); + System.out.println("QumlMigrationMessageGenerator:: generateMgrMsg:: events: " + events.size()); + sendEvent(events, errors); + current += events.size(); + printProgress(startTime, total, current); + if (delay > 0) { + Thread.sleep(delay); + } + } else { + System.out.println("QumlMigrationMessageGenerator:: generateMgrMsg:: Breaking Event Generation Loop!"); + found = false; + break; + } + } + if (!errors.isEmpty()) + System.out.println("Error! 
while generating migration event data from nodes, below nodes are ignored. \n" + errors); + long endTime = System.currentTimeMillis(); + System.out.println("\nQuML Migration Event Generation completed for object of type " + objectType + " in: " + (endTime - startTime) + "ms"); + } else { + System.out.println("\nSkipped Generating migration event for objectType: " + objectType); + } + } + System.out.println("-----------------------------------------"); + long endTime = System.currentTimeMillis(); + System.out.println("QuML Migration Event Generation completed at " + endTime); + System.out.println("Time taken to generate Events: " + (endTime - startTime) + "ms"); + } + + private void sendEvent(Map events, Map errors) { + for (String id : events.keySet()) { + try { + KafkaUtil.send(events.get(id), migrationTopicName); + } catch (Exception e) { + e.printStackTrace(); + System.out.println("Error Message :"+e.getMessage() ); + errors.put(id, "Error While Sending Migration Event for " + id); + } + } + } + + private Map generateMigrationEvent(List nodes, Map errors) { + Map events = new HashMap(); + for (Node node : nodes) { + String message = getEvent(node, errors); + if (StringUtils.isNotBlank(message)) + events.put(node.getIdentifier(), message); + } + return events; + } + + private String getEvent(Node node, Map errors) { + Map actor = new HashMap() {{ + put("id", actorId); + put("type", actorType); + }}; + Map context = new HashMap() {{ + put("channel", node.getMetadata().getOrDefault("channel", "")); + put("pdata", new HashMap() {{ + put("id", pdataId); + put("ver", pdataVersion); + }}); + }}; + if (Platform.config.hasPath("cloud_storage.env")) { + String env = Platform.config.getString("cloud_storage.env"); + context.put("env", env); + } + Map object = new HashMap() {{ + put("id", node.getIdentifier()); + put("ver", node.getMetadata().get("versionKey")); + }}; + Map edata = new HashMap() {{ + put("action", action); + put("metadata", new HashMap() {{ + put("pkgVersion", node.getMetadata().get("pkgVersion")); + put("mimeType", node.getMetadata().get("mimeType")); + put("status", node.getMetadata().get("status")); + put("qumlVersion", node.getMetadata().get("qumlVersion")); + put("schemaVersion", node.getMetadata().get("schemaVersion")); + put("identifier", node.getIdentifier()); + put("objectType", node.getObjectType()); + }}); + }}; + String beJobRequestEvent = LogTelemetryEventUtil.logInstructionEvent(actor, context, object, edata); + if (StringUtils.isBlank(beJobRequestEvent)) { + errors.put(node.getIdentifier(), "Error While Generating Migration Event for " + node.getIdentifier()); + } + return beJobRequestEvent; + } + + private static void printProgress(long startTime, long total, long current) { + long eta = current == 0 ? 0 : + (total - current) * (System.currentTimeMillis() - startTime) / current; + + String etaHms = current == 0 ? "N/A" : + String.format("%02d:%02d:%02d", TimeUnit.MILLISECONDS.toHours(eta), + TimeUnit.MILLISECONDS.toMinutes(eta) % TimeUnit.HOURS.toMinutes(1), + TimeUnit.MILLISECONDS.toSeconds(eta) % TimeUnit.MINUTES.toSeconds(1)); + + StringBuilder string = new StringBuilder(140); + int percent = (int) (current * 100 / total); + string + .append('\r') + .append(String.join("", Collections.nCopies(percent == 0 ? 
2 : 2 - (int) (Math.log10(percent)), " "))) + .append(String.format(" %d%% [", percent)) + .append(String.join("", Collections.nCopies(percent, "="))) + .append('>') + .append(String.join("", Collections.nCopies(100 - percent, " "))) + .append(']') + .append(String.join("", Collections.nCopies((int) (Math.log10(total)) - (int) (Math.log10(current)), " "))) + .append(String.format(" %d/%d, ETA: %s", current, total, etaHms)); + + System.out.print(string); + } + + public static void filterMigrationNodes(List nodes, Integer limit) { + nodes.removeIf(n -> SystemNodeTypes.DEFINITION_NODE.name().equals(n.getNodeType())); + } +} diff --git a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/shell/MigrateCSPDataCommand.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/shell/MigrateCSPDataCommand.java new file mode 100644 index 0000000000..b1d1c576d0 --- /dev/null +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/shell/MigrateCSPDataCommand.java @@ -0,0 +1,43 @@ +package org.sunbird.sync.tool.shell; + +import org.springframework.beans.factory.annotation.Autowired; +import org.springframework.shell.core.CommandMarker; +import org.springframework.shell.core.annotation.CliCommand; +import org.springframework.shell.core.annotation.CliOption; +import org.springframework.stereotype.Component; +import org.sunbird.sync.tool.mgr.CSPMigrationMessageGenerator; + +import java.time.LocalDateTime; +import java.time.format.DateTimeFormatter; + +@Component +public class MigrateCSPDataCommand implements CommandMarker { + + @Autowired + CSPMigrationMessageGenerator cspMsgGenerator; + + @CliCommand(value = "migratecspdata", help = "Generate CSP Data Migration Event") + public void migrateCSPData( + @CliOption(key = {"graphId"}, mandatory = false, unspecifiedDefaultValue = "domain", help = "graphId of the object") final String graphId, + @CliOption(key = {"objectType"}, mandatory = true, help = "Object Type is Required") final String[] objectType, + @CliOption(key = {"mimeType"}, mandatory = false, help = "mimeTypes can be provided") final String[] mimeType, + @CliOption(key = {"status"}, mandatory = false, help = "Specific Status can be passed") final String[] status, + @CliOption(key = {"ids"}, mandatory = false, help = "Specific content Ids can be passed") final String[] contentIds, + @CliOption(key = {"migrationVersion"}, mandatory = false, unspecifiedDefaultValue = "0", help = "Specific migration version can be passed") final double migrationVersion, + @CliOption(key = {"limit"}, mandatory = false, unspecifiedDefaultValue = "0", help = "Specific Limit can be passed") final Integer limit, + @CliOption(key = {"delay"}, mandatory = false, unspecifiedDefaultValue = "10", help = "time gap between each batch") final Integer delay) + throws Exception { + + long startTime = System.currentTimeMillis(); + DateTimeFormatter dtf = DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss"); + LocalDateTime start = LocalDateTime.now(); + cspMsgGenerator.generateMgrMsg(graphId, objectType, mimeType, status, contentIds, migrationVersion, limit, delay); + long endTime = System.currentTimeMillis(); + long exeTime = endTime - startTime; + System.out.println("Total time of execution: " + exeTime + "ms"); + LocalDateTime end = LocalDateTime.now(); + System.out.println("START_TIME: " + dtf.format(start) + ", END_TIME: " + dtf.format(end)); + } + + +} \ No newline at end of file diff --git 
a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/shell/MigrateQumlDataCommand.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/shell/MigrateQumlDataCommand.java new file mode 100644 index 0000000000..f7d5acd500 --- /dev/null +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/shell/MigrateQumlDataCommand.java @@ -0,0 +1,40 @@ +package org.sunbird.sync.tool.shell; + +import org.springframework.beans.factory.annotation.Autowired; +import org.springframework.shell.core.CommandMarker; +import org.springframework.shell.core.annotation.CliCommand; +import org.springframework.shell.core.annotation.CliOption; +import org.springframework.stereotype.Component; +import org.sunbird.sync.tool.mgr.QumlMigrationMessageGenerator; + +import java.time.LocalDateTime; +import java.time.format.DateTimeFormatter; + +@Component +public class MigrateQumlDataCommand implements CommandMarker { + + @Autowired + QumlMigrationMessageGenerator qumlMsgGenerator; + + @CliCommand(value = "migrateQuml", help = "Generate QuML Data Migration (from 1.0 to 1.1) Event") + public void migrateQuml( + @CliOption(key = {"graphId"}, mandatory = false, unspecifiedDefaultValue = "domain", help = "graphId of the object") final String graphId, + @CliOption(key = {"objectType"}, mandatory = true, help = "Object Type is Required") final String[] objectType, + @CliOption(key = {"status"}, mandatory = false, help = "Specific Status can be passed") final String[] status, + @CliOption(key = {"ids"}, mandatory = false, help = "Specific content Ids can be passed") final String[] contentIds, + @CliOption(key = {"migrationVersion"}, mandatory = false, unspecifiedDefaultValue = "0", help = "Specific migration version can be passed") final double migrationVersion, + @CliOption(key = {"limit"}, mandatory = false, unspecifiedDefaultValue = "0", help = "Specific Limit can be passed") final Integer limit, + @CliOption(key = {"delay"}, mandatory = false, unspecifiedDefaultValue = "10", help = "time gap in ms between each batch") final Integer delay) + throws Exception { + + long startTime = System.currentTimeMillis(); + DateTimeFormatter dtf = DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss"); + LocalDateTime start = LocalDateTime.now(); + qumlMsgGenerator.generateMgrMsg(graphId, objectType, status, contentIds, migrationVersion, limit, delay); + long endTime = System.currentTimeMillis(); + long exeTime = endTime - startTime; + System.out.println("Total time of execution: " + exeTime + "ms"); + LocalDateTime end = LocalDateTime.now(); + System.out.println("START_TIME: " + dtf.format(start) + ", END_TIME: " + dtf.format(end)); + } +}
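The new `filenames` option on `syncdialcodes` (next hunk) is resolved by `DialcodeSync`, which takes the second underscore-separated token of each filename as the dialcode. A small sketch of that mapping, assuming filenames shaped like `0_<DIALCODE>` (the naming scheme is inferred from the `split("_")[1]` call in the diff below, not stated anywhere in the source):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FilenameToDialcodeSketch {
    public static void main(String[] args) {
        List<String> filenames = Arrays.asList("0_A1B2C3", "0_F6J3N9");
        // Key = dialcode parsed from the filename, value = original filename,
        // mirroring the filenamesmap built in DialcodeSync.getDialcodesFromIds.
        Map<String, String> byDialcode = filenames.stream()
                .collect(Collectors.toMap(s -> s.split("_")[1], Function.identity()));
        System.out.println(byDialcode.keySet()); // dialcodes to fetch from Cassandra, e.g. [A1B2C3, F6J3N9]
    }
}
```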
node object") final String[] ids, + @CliOption(key = {"filename", "filenames"}, mandatory = false, help = "dialcode filename") final String[] filenames) throws Exception { long startTime = System.currentTimeMillis(); DateTimeFormatter dtf = DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss"); @@ -151,7 +152,7 @@ public void syncDialcodes( if(null != ids && ids.length > 0) { System.out.println("SyncShellCommands:syncDialcodes:Total dialcodes for syncing:: " + ids); - syncManager.syncDialcodesByIds(new ArrayList(Arrays.asList(ids))); + syncManager.syncDialcodesByIds(new ArrayList(Arrays.asList(ids)), new ArrayList(Arrays.asList(filenames))); } long endTime = System.currentTimeMillis(); diff --git a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/DialcodeSync.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/DialcodeSync.java index 1070a611aa..8c2220e27d 100644 --- a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/DialcodeSync.java +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/DialcodeSync.java @@ -1,10 +1,14 @@ package org.sunbird.sync.tool.util; +import java.io.IOException; import java.util.HashMap; import java.util.List; import java.util.Map; +import java.util.function.Function; +import java.util.stream.Collectors; import org.apache.commons.collections.MapUtils; +import org.apache.commons.lang3.StringUtils; import org.sunbird.cassandra.connector.util.CassandraConnector; import org.sunbird.common.Platform; import org.sunbird.common.exception.ServerException; @@ -23,7 +27,12 @@ public class DialcodeSync { private static String documentType = null; private static String keyspace = null; private static String table = null; - + private static String qrImageKeyspace = null; + private static String qrImageTable = null; + + private static boolean isReplaceString = Platform.config.getBoolean("is_replace_string"); + private static String replaceSrcStringDIALStore = Platform.config.getString("replace_src_string_DIAL_store"); + private static String replaceDestStringDIALStore = Platform.config.getString("replace_dest_string_DIAL_store"); public DialcodeSync() { indexName = Platform.config.hasPath("dialcode.index.name") @@ -34,13 +43,17 @@ public DialcodeSync() { ? Platform.config.getString("dialcode.keyspace.name") : "sunbirddev_dialcode_store"; table = Platform.config.hasPath("dialcode.table") ? Platform.config.getString("dialcode.table") : "dial_code"; + qrImageKeyspace = Platform.config.hasPath("dialcode.qrImageKeyspace.name") + ? Platform.config.getString("dialcode.qrImageKeyspace.name") : "dialcodes"; + qrImageTable = Platform.config.hasPath("dialcode.qrImageTable") + ? 
Platform.config.getString("dialcode.qrImageTable") : "dialcode_images"; ElasticSearchUtil.initialiseESClient(indexName, Platform.config.getString("search.es_conn_info")); } - public int sync(List dialcodes) throws Exception { + public int sync(List dialcodes, List filenames) throws Exception { System.out.println("DialcodeSync:sync:message:: Total number of Dialcodes to be fetched from cassandra: " + (null != dialcodes ? dialcodes.size() : 0)); // Get dialcodes data from cassandra - Map messages = getDialcodesFromIds(dialcodes); + Map messages = getDialcodesFromIds(dialcodes, filenames); if(MapUtils.isEmpty(messages)) { System.out.println("DialcodeSync:sync:message:: No dialcodes data fetched from cassandra."); return 0; } @@ -55,15 +68,27 @@ private void upsertDocument( Map messages) throws Exception { ElasticSearchUtil.bulkIndexWithIndexId(indexName, documentType, messages); } - public Map getDialcodesFromIds(List identifiers) { + public Map getDialcodesFromIds(List identifiers, List filenames) { try { - Map messages = new HashMap(); - ResultSet rs = getDialcodesFromDB(identifiers); + List updateddialcodes = null; + HashMap filenamesmap = null; + + if(identifiers != null && !identifiers.isEmpty()) + updateddialcodes = identifiers; + else { + filenamesmap = (HashMap)filenames.stream().collect(Collectors.toMap(s->s.split("_")[1], Function.identity())); + updateddialcodes = filenamesmap.keySet().stream().collect(Collectors.toList()); + } + + System.out.println("DialcodeSync:sync:message:: Total number of Dialcodes to be fetched from cassandra: " + updateddialcodes.size()); + + Map messages = new HashMap(); + ResultSet rs = getDialcodesFromDB(updateddialcodes); if (null != rs) { Map dialCodesFromDB = new HashMap(); while(rs.iterator().hasNext()) { Row row = rs.iterator().next(); - String dialcodeId = (String)row.getString("identifier"); + String dialcodeId = row.getString("identifier"); dialCodesFromDB.put(dialcodeId, row); Map syncRequest = new HashMap(){{ @@ -77,9 +102,21 @@ public Map getDialcodesFromIds(List identifiers) { put("published_on", row.getString("published_on")); put("objectType", "DialCode"); }}; - messages.put(dialcodeId, syncRequest); + + String imageUrl = ""; + if(filenamesmap!=null && !filenamesmap.isEmpty()) + imageUrl = getQRImageFromDB(filenamesmap.get(dialcodeId), true); + else imageUrl = getQRImageFromDB(dialcodeId, false); + System.out.println("Returned imageUrl: " + imageUrl); + if(isReplaceString) { + imageUrl = StringUtils.replaceEach(imageUrl, new String[]{replaceSrcStringDIALStore}, new String[]{replaceDestStringDIALStore}); + } + System.out.println("Replaced imageUrl: " + imageUrl); + if(imageUrl != null && !imageUrl.isEmpty()) syncRequest.put("imageUrl", imageUrl); + messages.put(dialcodeId, syncRequest); } System.out.println("Total dialcodes fetched from cassandra: " + dialCodesFromDB.size()); + System.out.println("messages: " + JSONUtils.serialize(messages)); return messages; } else { @@ -98,4 +135,23 @@ private ResultSet getDialcodesFromDB(List identifiers) { Session session = CassandraConnector.getSession(); return session.execute(query); } + + private String getQRImageFromDB(String dialcodeId, boolean isFileName) { + String query = ""; + if(isFileName) + query = "SELECT url FROM " + qrImageKeyspace + "." + qrImageTable + " WHERE filename ='" + dialcodeId + "';"; + else + query = "SELECT url FROM " + qrImageKeyspace + "."
+ qrImageTable + " WHERE dialcode ='" + dialcodeId + "' ALLOW FILTERING;"; + System.out.println("getQRImageFromDB query: " + query); + Session session = CassandraConnector.getSession(); + ResultSet rs = session.execute(query); + while(rs.iterator().hasNext()) { + Row row = rs.iterator().next(); + System.out.println("getQRImageFromDB url: " + row.getString("url")); + return row.getString("url"); + } + return ""; + } + + } diff --git a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/JSONUtils.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/JSONUtils.java new file mode 100644 index 0000000000..35cc7f7871 --- /dev/null +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/JSONUtils.java @@ -0,0 +1,34 @@ +package org.sunbird.sync.tool.util; + +import org.apache.commons.lang3.StringUtils; +import org.codehaus.jackson.map.ObjectMapper; + +import java.util.List; +import java.util.Map; + +public class JSONUtils { + + private static ObjectMapper mapper = new ObjectMapper();; + + public static String serialize(Object object) throws Exception { + return mapper.writeValueAsString(object); + } + + public static Object convertJSONString(String value) { + if (StringUtils.isNotBlank(value)) { + com.fasterxml.jackson.databind.ObjectMapper mapper = new com.fasterxml.jackson.databind.ObjectMapper(); + try { + Map map = mapper.readValue(value, Map.class); + return map; + } catch (Exception e) { + try { + List list = mapper.readValue(value, List.class); + return list; + } catch (Exception ex) { + //suppress error due to invalid map while converting JSON and return null + } + } + } + return null; + } +} \ No newline at end of file diff --git a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/KafkaUtil.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/KafkaUtil.java new file mode 100644 index 0000000000..c77b578809 --- /dev/null +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/KafkaUtil.java @@ -0,0 +1,85 @@ +package org.sunbird.sync.tool.util; + +import org.apache.kafka.clients.consumer.Consumer; +import org.apache.kafka.clients.consumer.ConsumerConfig; +import org.apache.kafka.clients.consumer.KafkaConsumer; +import org.apache.kafka.clients.producer.KafkaProducer; +import org.apache.kafka.clients.producer.Producer; +import org.apache.kafka.clients.producer.ProducerConfig; +import org.apache.kafka.clients.producer.ProducerRecord; +import org.apache.kafka.common.PartitionInfo; +import org.apache.kafka.common.serialization.LongDeserializer; +import org.apache.kafka.common.serialization.LongSerializer; +import org.apache.kafka.common.serialization.StringDeserializer; +import org.apache.kafka.common.serialization.StringSerializer; +import org.sunbird.common.Platform; +import org.sunbird.common.exception.ClientException; +import org.sunbird.telemetry.logger.TelemetryManager; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Properties; + +public class KafkaUtil { + + private final static String BOOTSTRAP_SERVERS = Platform.config.getString("kafka.urls"); + private static Producer producer; + private static Consumer consumer; + private static Map topicCheckResult = new HashMap(); + private static boolean isTopicCheckReq = Platform.config.hasPath("kafka.topic.send.enable") ? 
Platform.config.getBoolean("kafka.topic.send.enable") : true; + + static { + loadProducerProperties(); + loadConsumerProperties(); + } + + private static void loadProducerProperties() { + Properties props = new Properties(); + props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS); + props.put(ProducerConfig.CLIENT_ID_CONFIG, "KafkaClientProducer"); + props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName()); + props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName()); + producer = new KafkaProducer(props); + } + + private static void loadConsumerProperties() { + Properties props = new Properties(); + props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS); + props.put(ConsumerConfig.CLIENT_ID_CONFIG, "KafkaClientConsumer"); + props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, LongDeserializer.class.getName()); + props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName()); + consumer = new KafkaConsumer<>(props); + } + + private static Producer getProducer() { + return producer; + } + + private static Consumer getConsumer() { + return consumer; + } + + public static void send(String event, String topic) throws Exception { + if (topicCheckResult.getOrDefault(topic, false)) { + final Producer producer = getProducer(); + ProducerRecord record = new ProducerRecord(topic, event); + producer.send(record); + } else if (validate(topic)) { + final Producer producer = getProducer(); + ProducerRecord record = new ProducerRecord(topic, event); + producer.send(record); + } else { + System.err.println("Topic id: " + topic + ", does not exist."); + throw new ClientException("TOPIC_NOT_EXISTS_EXCEPTION", "Topic id: " + topic + ", does not exist."); + } + } + + public static boolean validate(String topic) throws Exception { + Consumer consumer = getConsumer(); + Map> topics = consumer.listTopics(); + Boolean result = topics.containsKey(topic); + topicCheckResult.put(topic, result); + return result; + } +} diff --git a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/SyncMessageGenerator.java b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/SyncMessageGenerator.java index 81b63497e8..d8b9cbc2da 100644 --- a/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/SyncMessageGenerator.java +++ b/platform-tools/spikes/sync-tool/src/main/java/org/sunbird/sync/tool/util/SyncMessageGenerator.java @@ -28,6 +28,9 @@ public class SyncMessageGenerator { private static Map definitionObjectMap = new HashMap<>(); private static ControllerUtil util = new ControllerUtil(); private static List nestedFields = Platform.config.getStringList("nested.fields"); + private static boolean isReplaceString = Platform.config.getBoolean("is_replace_string"); + private static String replaceSrcString = Platform.config.getString("replace_src_string"); + private static String replaceDestString = Platform.config.getString("replace_dest_string"); public static Map getMessages(List nodes, String objectType, Map errors) throws Exception { @@ -58,6 +61,7 @@ public static Map getMessages(List nodes, String objectTyp errors.put(node.getIdentifier(), e.getMessage()); } } + System.out.println("SyncMessageGenerator: getMessages: message:: " + messages); return messages; } @@ -112,6 +116,18 @@ public static Map getMessage(Node node) { Map transactionData = new HashMap(); if (null != node.getMetadata() && !node.getMetadata().isEmpty()) { Map propertyMap
= new HashMap(); + if(isReplaceString) { + try { + String metadataStr = JSONUtils.serialize(node.getMetadata()); + String updatedMetadataStr = StringUtils.replaceEach(metadataStr, new String[]{replaceSrcString}, new String[]{replaceDestString}); + System.out.println("SyncMessageGenerator:getMessage: updatedMetadataStr " + updatedMetadataStr); + Map updatedMetadata = (Map) JSONUtils.convertJSONString(updatedMetadataStr); + node.setMetadata(updatedMetadata); + } catch (Exception e) { + System.out.println("SyncMessageGenerator:getMessage: While replacing string " + e.getMessage()); + e.printStackTrace(); + } + } for (Entry entry : node.getMetadata().entrySet()) { String key = entry.getKey(); if (StringUtils.isNotBlank(key)) { diff --git a/platform-tools/spikes/sync-tool/src/main/resources/application.conf b/platform-tools/spikes/sync-tool/src/main/resources/application.conf index ca4f2c7f1a..2508e05479 100644 --- a/platform-tools/spikes/sync-tool/src/main/resources/application.conf +++ b/platform-tools/spikes/sync-tool/src/main/resources/application.conf @@ -93,4 +93,17 @@ contentTypeToPrimaryCategory { LessonPlanUnit: "Lesson Plan Unit" CourseUnit: "Course Unit" TextBookUnit: "Textbook Unit" -} \ No newline at end of file +} + +csp.migration.request.topic="dev.csp.migration.job.request" +csp.migration.batch.size=50 + +is_replace_string=false +replace_src_string= "" +replace_dest_string="" +replace_src_string_DIAL_store="" +replace_dest_string_DIAL_store="" + +quml.migration.request.topic="dev.quml.migration.job.request" +quml.migration.batch.size=50 + diff --git a/platform-tools/spikes/sync-tool/src/test/java/org/sunbird/sync/tool/mgr/CassandraESSyncManagerTest.java b/platform-tools/spikes/sync-tool/src/test/java/org/sunbird/sync/tool/mgr/CassandraESSyncManagerTest.java index 015118d915..28774e18c2 100644 --- a/platform-tools/spikes/sync-tool/src/test/java/org/sunbird/sync/tool/mgr/CassandraESSyncManagerTest.java +++ b/platform-tools/spikes/sync-tool/src/test/java/org/sunbird/sync/tool/mgr/CassandraESSyncManagerTest.java @@ -31,21 +31,21 @@ public class CassandraESSyncManagerTest { @Test public void testsyncDialcodesByIdsWithDialcodes() throws Exception { DialcodeSync dialcodeSync = PowerMockito.mock(DialcodeSync.class); - PowerMockito.when(dialcodeSync.sync(Mockito.anyList())).thenReturn(1); + PowerMockito.when(dialcodeSync.sync(Mockito.anyList(), Mockito.anyList())).thenReturn(1); List dialcodes = Arrays.asList("A1B2C3"); CassandraESSyncManager cassandraESSyncManager = new CassandraESSyncManager(dialcodeSync); - cassandraESSyncManager.syncDialcodesByIds(dialcodes); + cassandraESSyncManager.syncDialcodesByIds(dialcodes, null); } @Test public void testsyncDialcodesByIdsWithoutDialcodes() throws Exception { DialcodeSync dialcodeSync = PowerMockito.mock(DialcodeSync.class); - PowerMockito.when(dialcodeSync.sync(Mockito.anyList())).thenReturn(1); + PowerMockito.when(dialcodeSync.sync(Mockito.anyList(), Mockito.anyList())).thenReturn(1); List dialcodes = null; CassandraESSyncManager cassandraESSyncManager = new CassandraESSyncManager(dialcodeSync); - cassandraESSyncManager.syncDialcodesByIds(dialcodes); + cassandraESSyncManager.syncDialcodesByIds(dialcodes, null); } @Test diff --git a/platform-tools/spikes/sync-tool/src/test/java/org/sunbird/sync/tool/util/DialcodeSyncTest.java b/platform-tools/spikes/sync-tool/src/test/java/org/sunbird/sync/tool/util/DialcodeSyncTest.java index 636d01c2ea..f5abfb26e3 100644 --- 
a/platform-tools/spikes/sync-tool/src/test/java/org/sunbird/sync/tool/util/DialcodeSyncTest.java +++ b/platform-tools/spikes/sync-tool/src/test/java/org/sunbird/sync/tool/util/DialcodeSyncTest.java @@ -37,7 +37,7 @@ public void testSyncWrongDialcodes() throws Exception { PowerMockito.when(CassandraConnector.getSession()).thenReturn(session); DialcodeSync dialcodeSync = new DialcodeSync(); - Assert.isTrue(dialcodeSync.sync(Arrays.asList("A1B2C3")) == 0); + Assert.isTrue(dialcodeSync.sync(Arrays.asList("A1B2C3"), null) == 0); } @Test @@ -73,6 +73,6 @@ public void testSyncCorrectDialcodes() throws Exception { DialcodeSync dialcodeSync = new DialcodeSync(); - Assert.isTrue(dialcodeSync.sync(Arrays.asList("A1B2C3")) == 1); + Assert.isTrue(dialcodeSync.sync(Arrays.asList("A1B2C3"),null) == 1); } } diff --git a/platform-tools/spikes/sync-tool/src/test/resources/application.conf b/platform-tools/spikes/sync-tool/src/test/resources/application.conf index 4c519eea34..a90ee410a3 100644 --- a/platform-tools/spikes/sync-tool/src/test/resources/application.conf +++ b/platform-tools/spikes/sync-tool/src/test/resources/application.conf @@ -40,4 +40,8 @@ search.fields.query=["name^100","title^100","lemma^100","code^100","tags^100","d search.fields.date=["lastUpdatedOn","createdOn","versionDate","lastSubmittedOn","lastPublishedOn"] search.batch.size=500 -batch.size=100 \ No newline at end of file +batch.size=100 + + +replace_src_string_DIAL_store= "{{ sync_tool_replace_src_string_DIAL_store | default('DIAL_STORAGE_BASE_PATH') }}" +replace_dest_string_DIAL_store="{{ sync_tool_replace_dest_string_DIAL_store | default('https://sunbirddevbbpublic.blob.core.windows.net/dial') }}" \ No newline at end of file diff --git a/pom.xml b/pom.xml index 50ac54bcce..c95893d4c5 100644 --- a/pom.xml +++ b/pom.xml @@ -44,15 +44,6 @@ platform-modules - - samza-jobs - - platform-core - platform-modules/actors - platform-modules/content-manager - platform-jobs - - diff --git a/searchIndex-platform/module/mvcsearchindex-elasticsearch/.gitignore b/searchIndex-platform/module/mvcsearchindex-elasticsearch/.gitignore deleted file mode 100644 index b83d22266a..0000000000 --- a/searchIndex-platform/module/mvcsearchindex-elasticsearch/.gitignore +++ /dev/null @@ -1 +0,0 @@ -/target/ diff --git a/searchIndex-platform/module/mvcsearchindex-elasticsearch/pom.xml b/searchIndex-platform/module/mvcsearchindex-elasticsearch/pom.xml deleted file mode 100644 index 75226166bc..0000000000 --- a/searchIndex-platform/module/mvcsearchindex-elasticsearch/pom.xml +++ /dev/null @@ -1,103 +0,0 @@ - - 4.0.0 - - org.sunbird - searchindex-platform - 1.1-SNAPSHOT - ../../pom.xml - - mvcsearchindex-elasticsearch - jar - mvcsearchindex-elasticsearch - - UTF-8 - - - - - org.sunbird - unit-tests - 1.1-SNAPSHOT - test-jar - test - - - org.sunbird - searchindex-common - 1.1-SNAPSHOT - jar - - - org.codehaus.jackson - jackson-mapper-asl - 1.9.13 - - - commons-lang - commons-lang - 2.6 - - - org.apache.httpcomponents - httpclient - 4.5.2 - - - net.sf.json-lib - json-lib - 2.4 - jdk15 - - - org.elasticsearch - elasticsearch - 7.5.0 - - - org.elasticsearch.client - elasticsearch-rest-high-level-client - 7.5.0 - - - org.elasticsearch.client - transport - 7.5.0 - - - org.apache.logging.log4j - log4j-api - ${log4j.version} - - - org.apache.logging.log4j - log4j-core - ${log4j.version} - - - org.powermock - powermock-module-junit4 - 1.7.4 - test - - - org.powermock - powermock-api-mockito - 1.7.4 - test - - - org.slf4j - slf4j-api - 1.6.1 - compile - - - - - - - - - - 
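The `replace_src_string_DIAL_store`/`replace_dest_string_DIAL_store` pair configured above drives a plain string substitution on the QR-image URL before it is indexed (the same pattern `SyncMessageGenerator` applies to serialized node metadata when `is_replace_string` is true). A sketch of that rewrite using the defaults from the test configuration; `commons-lang3` is already on the sync-tool classpath per the imports in the diff:

```java
import org.apache.commons.lang3.StringUtils;

public class DialStoreUrlRewriteSketch {
    public static void main(String[] args) {
        // Values mirror the defaults in the test application.conf above.
        String src = "DIAL_STORAGE_BASE_PATH";
        String dest = "https://sunbirddevbbpublic.blob.core.windows.net/dial";
        String imageUrl = "DIAL_STORAGE_BASE_PATH/0_A1B2C3.png"; // illustrative stored URL
        // Same call DialcodeSync makes when is_replace_string=true.
        String rewritten = StringUtils.replaceEach(imageUrl, new String[]{src}, new String[]{dest});
        System.out.println(rewritten);
        // -> https://sunbirddevbbpublic.blob.core.windows.net/dial/0_A1B2C3.png
    }
}
```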
diff --git a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/elasticsearch/ElasticSearchUtil.java b/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/elasticsearch/ElasticSearchUtil.java deleted file mode 100644 index a9d29ff0d3..0000000000 --- a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/elasticsearch/ElasticSearchUtil.java +++ /dev/null @@ -1,324 +0,0 @@ -/** - * - */ -package org.sunbird.mvcsearchindex.elasticsearch; - -import java.io.IOException; -import java.util.ArrayList; -import java.util.HashMap; -import java.util.List; -import java.util.Map; -import java.util.concurrent.ExecutionException; -import org.apache.commons.lang3.StringUtils; -import org.apache.http.HttpHost; -import org.apache.http.client.config.RequestConfig; -import org.codehaus.jackson.JsonGenerationException; -import org.codehaus.jackson.map.JsonMappingException; -import org.sunbird.searchindex.util.CompositeSearchConstants; -import org.sunbird.telemetry.logger.TelemetryManager; -import org.elasticsearch.action.ActionListener; -import org.elasticsearch.action.admin.indices.alias.Alias; -import org.elasticsearch.action.support.master.AcknowledgedResponse; -import org.elasticsearch.client.indices.CreateIndexRequest; -import org.elasticsearch.client.indices.CreateIndexResponse; -import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest; -import org.elasticsearch.action.get.GetRequest; -import org.elasticsearch.action.get.GetResponse; -import org.elasticsearch.action.index.IndexRequest; -import org.elasticsearch.action.index.IndexResponse; -import org.elasticsearch.action.search.SearchRequest; -import org.elasticsearch.action.search.SearchResponse; -import org.elasticsearch.action.update.UpdateRequest; -import org.elasticsearch.action.update.UpdateResponse; -import org.elasticsearch.client.*; -import org.elasticsearch.common.settings.Settings; -import org.elasticsearch.common.xcontent.XContentType; -import org.elasticsearch.index.query.BoolQueryBuilder; -import org.elasticsearch.index.query.QueryBuilders; -import org.elasticsearch.search.aggregations.AggregationBuilders; -import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder; -import org.elasticsearch.search.builder.SearchSourceBuilder; - -import com.fasterxml.jackson.core.type.TypeReference; -import com.fasterxml.jackson.databind.ObjectMapper; - -import akka.dispatch.Futures; -import scala.concurrent.Future; -import scala.concurrent.Promise; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -/** - * @author pradyumna - * - */ -public class ElasticSearchUtil { - private static final Logger logger = LoggerFactory.getLogger(ElasticSearchUtil.class); - static { - System.setProperty("es.set.netty.runtime.available.processors", "false"); - registerShutdownHook(); - } - - private static Map esClient = new HashMap(); - - private static ObjectMapper mapper = new ObjectMapper(); - - public static void initialiseESClient(String indexName, String connectionInfo) { - if (StringUtils.isBlank(indexName)) - indexName = CompositeSearchConstants.MVC_SEARCH_INDEX; - createClient(indexName, connectionInfo); - } - - /** - * - */ - private static void createClient(String indexName, String connectionInfo) { - if (!esClient.containsKey(indexName)) { - Map hostPort = new HashMap(); - for (String info : connectionInfo.split(",")) { - hostPort.put(info.split(":")[0], 
Integer.valueOf(info.split(":")[1])); - } - List httpHosts = new ArrayList<>(); - for (String host : hostPort.keySet()) { - httpHosts.add(new HttpHost(host, hostPort.get(host))); - } - RestClientBuilder builder = RestClient.builder(httpHosts.toArray(new HttpHost[httpHosts.size()])) - .setRequestConfigCallback(new RestClientBuilder.RequestConfigCallback() { - @Override - public RequestConfig.Builder customizeRequestConfig(RequestConfig.Builder requestConfigBuilder) { - return requestConfigBuilder.setConnectionRequestTimeout(-1); - } - }); - RestHighLevelClient client = new RestHighLevelClient(builder); - if (null != client) - esClient.put(indexName, client); - } - } - - private static RestHighLevelClient getClient(String indexName) { - if (StringUtils.isBlank(indexName)) - indexName = CompositeSearchConstants.MVC_SEARCH_INDEX; - if (StringUtils.startsWith(indexName,"kp_audit_log")) - return esClient.get("kp_audit_log"); - return esClient.get(indexName); - } - - public void finalize() { - cleanESClient(); - } - - - - public static boolean isIndexExists(String indexName) { - Response response; - try { - Request request = new Request("HEAD", "/" + indexName); - response = getClient(indexName).getLowLevelClient().performRequest(request); - return (200 == response.getStatusLine().getStatusCode()); - } catch (IOException e) { - return false; - } - - } - - - public static boolean addIndex(String indexName, String documentType, String settings, String mappings,String alias , String esvalues) - throws IOException { - boolean response = false; - RestHighLevelClient client = getClient(indexName); - if (!isIndexExists(indexName)) { - CreateIndexRequest createRequest = new CreateIndexRequest(indexName); - if(esvalues != null) { - createRequest.source(esvalues,XContentType.JSON); - } - else { - if (StringUtils.isNotBlank(alias)) - createRequest.alias(new Alias(alias)); - if (StringUtils.isNotBlank(settings)) - createRequest.settings(Settings.builder().loadFromSource(settings, XContentType.JSON)); - if (StringUtils.isNotBlank(documentType) && StringUtils.isNotBlank(mappings)) - createRequest.mapping(mappings, XContentType.JSON); - } - CreateIndexResponse createIndexResponse = client.indices().create(createRequest,RequestOptions.DEFAULT); - - response = createIndexResponse.isAcknowledged(); - } - return response; - } - - - public static void addDocumentWithId(String indexName, String documentId, String document) { - try { - logger.info("Inside addDocuemntwithId"); - IndexRequest indexRequest = new IndexRequest(indexName); - indexRequest.id(documentId); - indexRequest.source(document,XContentType.JSON); - IndexResponse indexResponse = getClient(indexName).index(indexRequest,RequestOptions.DEFAULT); - logger.info("Response after inserting inside ES :: " + indexResponse.toString()); - TelemetryManager.log("Added " + indexResponse.getId() + " to index " + indexResponse.getIndex()); - } catch (IOException e) { - logger.info("Error after inserting inside ES :: " + indexName + " " + e); - TelemetryManager.error("Error while adding document to index :" + indexName, e); - } - } - - - public static void updateDocument(String indexName, String documentId, String document) - { - try {Map doc = mapper.readValue(document, new TypeReference>() { - }); - logger.info("Inside updateDocument"); - UpdateRequest updateRequest = new UpdateRequest(); - updateRequest.index(indexName); - updateRequest.id(documentId); - updateRequest.doc(doc); - UpdateResponse response = 
getClient(indexName).update(updateRequest,RequestOptions.DEFAULT); - TelemetryManager.log("Updated " + response.getId() + " to index " + response.getIndex()); - logger.info("Response after updating inside ES :: " + response.toString()); - } catch (IOException e) { - logger.info("Error after updating inside ES :: " + indexName + " " + e); - TelemetryManager.error("Error while updating document to index :" + indexName, e); - } - - } - - - - - - public static void deleteIndex(String indexName) throws InterruptedException, ExecutionException, IOException { - AcknowledgedResponse response = getClient(indexName).indices().delete(new DeleteIndexRequest(indexName),RequestOptions.DEFAULT); - esClient.remove(indexName); - TelemetryManager.log("Deleted Index" + indexName + " : " + response.isAcknowledged()); - } - - public static String getDocumentAsStringById(String indexName, String documentId) - throws IOException { - GetResponse response = getClient(indexName).get(new GetRequest(indexName, documentId),RequestOptions.DEFAULT); - return response.getSourceAsString(); - } - - public static SearchResponse search(Map matchCriterias, Map textFiltersMap, - String indexName, String indexType, List> groupBy, boolean isDistinct, int limit) - throws Exception { - SearchSourceBuilder query = buildJsonForQuery(matchCriterias, textFiltersMap, groupBy, isDistinct, indexName); - query.size(limit); - return search(indexName, indexType, query); - } - - public static SearchResponse search(String indexName, String indexType, SearchSourceBuilder query) - throws Exception { - return getClient(indexName).search(new SearchRequest(indexName).source(query),RequestOptions.DEFAULT); - } - - public static Future search(String indexName, SearchSourceBuilder searchSourceBuilder) - throws IOException { - TelemetryManager.log("searching in ES index: " + indexName); - Promise promise = Futures.promise(); - getClient(indexName).searchAsync(new SearchRequest().indices(indexName).source(searchSourceBuilder),RequestOptions.DEFAULT, - new ActionListener() { - - @Override - public void onResponse(SearchResponse response) { - promise.success(response); - } - - @Override - public void onFailure(Exception e) { - promise.failure(e); - } - }); - return promise.future(); - } - - - - - @SuppressWarnings("unchecked") - public static SearchSourceBuilder buildJsonForQuery(Map matchCriterias, - Map textFiltersMap, List> groupByList, boolean isDistinct, - String indexName) - throws JsonGenerationException, JsonMappingException, IOException { - - SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder(); - - BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery(); - if (matchCriterias != null) { - - for (Map.Entry entry : matchCriterias.entrySet()) { - if (entry.getValue() instanceof List) { - for (String matchText : (ArrayList) entry.getValue()) { - queryBuilder.should(QueryBuilders.matchQuery(entry.getKey(), matchText)); - } - } - } - } - - if (textFiltersMap != null && !textFiltersMap.isEmpty()) { - BoolQueryBuilder boolQuery = QueryBuilders.boolQuery(); - for (Map.Entry entry : textFiltersMap.entrySet()) { - ArrayList termValues = (ArrayList) entry.getValue(); - for (String termValue : termValues) { - boolQuery.must(QueryBuilders.termQuery(entry.getKey(), termValue)); - } - } - queryBuilder.filter(boolQuery); - } - - searchSourceBuilder.query(QueryBuilders.boolQuery().filter(queryBuilder)); - - if (groupByList != null && !groupByList.isEmpty()) { - if (!isDistinct) { - for (Map groupByMap : groupByList) { - String groupByParent = 
(String) groupByMap.get("groupByParent"); - List groupByChildList = (List) groupByMap.get("groupByChildList"); - TermsAggregationBuilder termBuilder = AggregationBuilders.terms(groupByParent).field(groupByParent); - if (groupByChildList != null && !groupByChildList.isEmpty()) { - for (String childGroupBy : groupByChildList) { - termBuilder.subAggregation(AggregationBuilders.terms(childGroupBy).field(childGroupBy)); - } - - } - searchSourceBuilder.aggregation(termBuilder); - } - } else { - for (Map groupByMap : groupByList) { - String groupBy = (String) groupByMap.get("groupBy"); - String distinctKey = (String) groupByMap.get("distinctKey"); - searchSourceBuilder.aggregation( - AggregationBuilders.terms(groupBy).field(groupBy).subAggregation(AggregationBuilders - .cardinality("distinct_" + distinctKey + "s").field(distinctKey))); - } - } - } - - return searchSourceBuilder; - } - - - private static void registerShutdownHook() { - Runtime.getRuntime().addShutdownHook(new Thread() { - @Override - public void run() { - try { - cleanESClient(); - } catch (Exception e) { - e.printStackTrace(); - } - } - }); - } - - public static void cleanESClient() { - if (!esClient.isEmpty()) - for (RestHighLevelClient client : esClient.values()) { - if (null != client) - try { - client.close(); - } catch (IOException e) { - } - } - } - - -} \ No newline at end of file diff --git a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/transformer/AggregationsResultTransformer.java b/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/transformer/AggregationsResultTransformer.java deleted file mode 100644 index 396bc271f7..0000000000 --- a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/transformer/AggregationsResultTransformer.java +++ /dev/null @@ -1,35 +0,0 @@ -package org.sunbird.mvcsearchindex.transformer; - -import java.util.ArrayList; -import java.util.HashMap; -import java.util.List; -import java.util.Map; - -public class AggregationsResultTransformer implements IESResultTransformer{ - - - @SuppressWarnings("unchecked") - public Object getTransformedObject(Object obj){ - Map aggObj = (Map) obj; - List> transformedObj = new ArrayList>(); - for(Map.Entry entry: aggObj.entrySet()){ - Map facetMap = new HashMap(); - String facetName = entry.getKey(); - facetMap.put("name", facetName); - Map aggKeyMap = (Map) entry.getValue(); - List> facetKeys = new ArrayList>(); - for(Map.Entry aggKeyEntry: aggKeyMap.entrySet()){ - Map facetKeyMap = new HashMap(); - String facetKeyName = aggKeyEntry.getKey(); - facetKeyMap.put("name", facetKeyName); - Map facetKeyCountMap = (Map) aggKeyEntry.getValue(); - long count = (long) facetKeyCountMap.get("count"); - facetKeyMap.put("count", count); - facetKeys.add(facetKeyMap); - } - facetMap.put("values", facetKeys); - transformedObj.add(facetMap); - } - return transformedObj; - } -} diff --git a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/transformer/IESResultTransformer.java b/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/transformer/IESResultTransformer.java deleted file mode 100644 index b8115799b6..0000000000 --- a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/main/java/org/sunbird/mvcsearchindex/transformer/IESResultTransformer.java +++ /dev/null @@ -1,5 +0,0 @@ -package org.sunbird.mvcsearchindex.transformer; - -public 
interface IESResultTransformer { - public Object getTransformedObject(Object obj); -} diff --git a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/test/java/org/sunbird/test/ElasticSearchUtilTest.java b/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/test/java/org/sunbird/test/ElasticSearchUtilTest.java deleted file mode 100644 index b74a71cefd..0000000000 --- a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/test/java/org/sunbird/test/ElasticSearchUtilTest.java +++ /dev/null @@ -1,105 +0,0 @@ -/** - * - */ - -package org.sunbird.test; - -import static org.junit.Assert.assertTrue; -import static org.mockito.Mockito.when; - -import java.util.Date; -import java.util.HashMap; -import java.util.Map; -import java.util.Random; -import org.apache.commons.lang.StringUtils; -import org.codehaus.jackson.map.ObjectMapper; -import org.sunbird.mvcsearchindex.elasticsearch.ElasticSearchUtil; -import org.sunbird.searchindex.util.CompositeSearchConstants; -import org.junit.Before; -import org.junit.Test; -import org.junit.runner.RunWith; -import org.mockito.Mockito; -import org.mockito.MockitoAnnotations; -import org.powermock.api.mockito.PowerMockito; -import org.powermock.core.classloader.annotations.PowerMockIgnore; -import org.powermock.core.classloader.annotations.PrepareForTest; -import org.powermock.modules.junit4.PowerMockRunner; - -/** - * @author pradyumna - * - */ - -@RunWith(PowerMockRunner.class) -@PrepareForTest({ElasticSearchUtil.class}) -@PowerMockIgnore({"javax.management.*", "sun.security.ssl.*", "javax.net.ssl.*" , "javax.crypto.*"}) -public class ElasticSearchUtilTest { - - - private static ObjectMapper mapper = new ObjectMapper(); - private static Random random = new Random(); - - @Before - public void setup() { - MockitoAnnotations.initMocks(this); - } - - @Test - public void testAddDocumentWithId() throws Exception { - Map content = getContentTestRecord(); - String id = (String) content.get("identifier"); - // addToIndex(id, content); - String jsonIndexDocument = mapper.writeValueAsString(content); - PowerMockito.mockStatic(ElasticSearchUtil.class); - PowerMockito.doNothing().when(ElasticSearchUtil.class); - ElasticSearchUtil.addDocumentWithId(CompositeSearchConstants.MVC_SEARCH_INDEX, - id, jsonIndexDocument); - when(ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString())).thenReturn(id); - String doc = ElasticSearchUtil.getDocumentAsStringById(CompositeSearchConstants.MVC_SEARCH_INDEX, id); - assertTrue(StringUtils.contains(doc, id)); - } - - - @Test - public void testUpdateDocument() throws Exception { - Map content = getContentTestRecord(); - String id = (String) content.get("identifier"); - String jsonIndexDocument = mapper.writeValueAsString(content); - PowerMockito.mockStatic(ElasticSearchUtil.class); - PowerMockito.doNothing().when(ElasticSearchUtil.class); - ElasticSearchUtil.addDocumentWithId(CompositeSearchConstants.MVC_SEARCH_INDEX, - id, jsonIndexDocument); - content.put("name", "Content_" + System.currentTimeMillis() + "_name"); - PowerMockito.mockStatic(ElasticSearchUtil.class); - PowerMockito.doNothing().when(ElasticSearchUtil.class); - ElasticSearchUtil.updateDocument(CompositeSearchConstants.MVC_SEARCH_INDEX, - mapper.writeValueAsString(content), id); - when(ElasticSearchUtil.getDocumentAsStringById(Mockito.anyString(),Mockito.anyString())).thenReturn(id); - String doc = ElasticSearchUtil.getDocumentAsStringById(CompositeSearchConstants.MVC_SEARCH_INDEX, id); - assertTrue(StringUtils.contains(doc, 
id)); - } - - - - - private static Map getContentTestRecord() { - String objectType = "Content"; - Date d = new Date(); - Map map = new HashMap(); - long suffix = (long) (10000000 + random.nextInt(1000000)); - map.put("identifier", "do_" + suffix); - map.put("objectType", objectType); - map.put("name", "Content_" + System.currentTimeMillis() + "_name"); - map.put("contentType", "Content"); - map.put("status", "Draft"); - return map; - } - /*private static void addToIndex(String uniqueId, Map doc) throws Exception { - String jsonIndexDocument = mapper.writeValueAsString(doc); - PowerMockito.mockStatic(ElasticSearchUtil.class); - PowerMockito.doNothing().when(ElasticSearchUtil.class); - ElasticSearchUtil.addDocumentWithId(CompositeSearchConstants.MVC_SEARCH_INDEX, - uniqueId, jsonIndexDocument); - }*/ - -} diff --git a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/test/resources/log4j2.xml b/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/test/resources/log4j2.xml deleted file mode 100644 index fd6a9ef4a9..0000000000 --- a/searchIndex-platform/module/mvcsearchindex-elasticsearch/src/test/resources/log4j2.xml +++ /dev/null @@ -1,36 +0,0 @@ - - - - - - - - - %d [%t] %-5level %logger{36} - %msg%n - - - - - - - - - %d %msg%n - - - - - - - - - - - - - - - \ No newline at end of file diff --git a/searchIndex-platform/module/searchindex-elasticsearch/src/main/java/org/sunbird/searchindex/elasticsearch/ElasticSearchUtil.java b/searchIndex-platform/module/searchindex-elasticsearch/src/main/java/org/sunbird/searchindex/elasticsearch/ElasticSearchUtil.java index 771a009476..920ab0560a 100644 --- a/searchIndex-platform/module/searchindex-elasticsearch/src/main/java/org/sunbird/searchindex/elasticsearch/ElasticSearchUtil.java +++ b/searchIndex-platform/module/searchindex-elasticsearch/src/main/java/org/sunbird/searchindex/elasticsearch/ElasticSearchUtil.java @@ -265,6 +265,7 @@ public static List getMultiDocumentAsStringByIdList(String indexName, St public static void bulkIndexWithIndexId(String indexName, String documentType, Map jsonObjects) throws Exception { if (isIndexExists(indexName)) { + System.out.println("ElasticSearchUtil: bulkIndexWithIndexId: indexName: " + indexName); RestHighLevelClient client = getClient(indexName); if (!jsonObjects.isEmpty()) { int count = 0; @@ -274,7 +275,9 @@ public static void bulkIndexWithIndexId(String indexName, String documentType, M request.add(new IndexRequest(indexName, documentType, key) .source((Map) jsonObjects.get(key))); if (count % BATCH_SIZE == 0 || (count % BATCH_SIZE < BATCH_SIZE && count == jsonObjects.size())) { + System.out.println("ElasticSearchUtil: bulkIndexWithIndexId: request: " + request); BulkResponse bulkResponse = client.bulk(request); + System.out.println("ElasticSearchUtil: bulkIndexWithIndexId: bulkResponse: " + bulkResponse.status()); if (bulkResponse.hasFailures()) { TelemetryManager .log("Failures in Elasticsearch bulkIndex : " + bulkResponse.buildFailureMessage()); diff --git a/searchIndex-platform/pom.xml b/searchIndex-platform/pom.xml index 3d4709b867..c9079ea4f7 100644 --- a/searchIndex-platform/pom.xml +++ b/searchIndex-platform/pom.xml @@ -24,7 +24,6 @@ module/searchindex-elasticsearch module/searchindex-common - module/mvcsearchindex-elasticsearch
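For context on the `bulkIndexWithIndexId` logging added above: the method accumulates index requests and flushes a bulk request every `BATCH_SIZE` documents, plus once more for the final partial batch (the `count % BATCH_SIZE < BATCH_SIZE` clause in the original condition is always true, so the condition reduces to the final-count check). A standalone sketch of that flush cadence; the batch size here is illustrative, not the configured value:

```java
import java.util.ArrayList;
import java.util.List;

public class BulkFlushCadenceSketch {
    private static final int BATCH_SIZE = 1000; // illustrative; the real value comes from configuration

    public static void main(String[] args) {
        int total = 2500; // e.g. 2500 documents -> flushes of 1000, 1000, 500
        List<String> pendingRequests = new ArrayList<>();
        int count = 0;
        for (int i = 1; i <= total; i++) {
            pendingRequests.add("doc-" + i);
            count++;
            // Flush every BATCH_SIZE docs, and flush the remainder on the last doc.
            if (count % BATCH_SIZE == 0 || count == total) {
                System.out.println("Flushing bulk request with " + pendingRequests.size() + " docs (count=" + count + ")");
                pendingRequests.clear();
            }
        }
    }
}
```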