This repository contains two Python scripts for managing external organizations in Pure. The first script identifies duplicate External Organisations and exports them to a CSV file. The second script merges duplicate External Organisations based on the CSV data.
The duplicate finder script (PureAPI_Ex_org_duplicate_finder.py) fetches data from the Pure API, identifies potential duplicate organizations based on name, country, and type, and exports the duplicates to a CSV file. Only external organisations that match exactly on all three of name, country, and type are treated as duplicates, in order to avoid merging distinct external organisations by accident. Adjust the matching logic to accommodate any local business rules.
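The exact-match rule described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the flat `uuid`, `name`, `country`, and `type` keys and the sample records are assumptions, and real Pure API responses are likely to be nested differently.

```python
from collections import defaultdict


def group_key(org):
    """Exact-match key: name, country, and type must all be identical."""
    return (org["name"], org["country"], org["type"])


def find_duplicate_groups(orgs):
    """Group organisations by key and keep only groups with 2+ members."""
    groups = defaultdict(list)
    for org in orgs:
        groups[group_key(org)].append(org)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical sample records (field names are assumptions).
sample = [
    {"uuid": "a1", "name": "Acme University", "country": "DK", "type": "University"},
    {"uuid": "b2", "name": "Acme University", "country": "DK", "type": "University"},
    {"uuid": "c3", "name": "Acme University", "country": "SE", "type": "University"},
]
duplicates = find_duplicate_groups(sample)
```

To loosen or tighten the matching, only `group_key` needs to change, for example by normalising case or whitespace in the name before comparing.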
- Fetches all external organizations from Pure via paginated API calls.
- Groups organizations by name, country, and type.
- Identifies duplicates and determines a potential merge candidate based on workflow status. If none of the external organisation duplicates have workflow status 'Approved', the first UUID will be set as target.
- Saves duplicates to duplicate_organizations.csv.
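The target-selection step in the list above might look like the sketch below. The `workflow` key is a hypothetical field name, and picking the first 'Approved' entry when several are approved is an assumption; the description only specifies the fallback to the first UUID.

```python
def pick_merge_candidate(members):
    """Return the UUID to merge into.

    Prefers the first member whose workflow status is 'Approved';
    falls back to the first UUID when none is approved.
    """
    for org in members:
        if org.get("workflow") == "Approved":
            return org["uuid"]
    return members[0]["uuid"]


# Hypothetical duplicate groups (field names are assumptions).
with_approved = [
    {"uuid": "a1", "workflow": "For approval"},
    {"uuid": "b2", "workflow": "Approved"},
]
without_approved = [
    {"uuid": "c3", "workflow": "For approval"},
    {"uuid": "d4", "workflow": "For approval"},
]
target_a = pick_merge_candidate(with_approved)
target_b = pick_merge_candidate(without_approved)
```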
- Run the script: python PureAPI_Ex_org_duplicate_finder.py
- Enter the following when prompted:
  - Base URL: The API's base URL (e.g., xyz.elsevierpure.com).
  - API Key: Your API key for authentication.
- The script will create a CSV file named duplicate_organizations.csv with the following columns:
  - Organization Name: Name of the organization.
  - Country: Country of the organization.
  - Type: Type of the organization.
  - UUIDs: Comma-separated list of UUIDs for duplicate organizations.
  - Count: Number of duplicates.
  - Merge Candidate: UUID of the suggested merge target.
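As an illustration of the column layout above, here is a minimal sketch that writes one row per duplicate group with the standard `csv` module. The `write_duplicates` helper and the shape of its input are assumptions made for this example, not the script's actual code.

```python
import csv
import io

COLUMNS = ["Organization Name", "Country", "Type", "UUIDs", "Count", "Merge Candidate"]


def write_duplicates(groups, fileobj):
    """Write a header plus one row per group.

    `groups` is a list of (name, country, org_type, uuids, merge_candidate)
    tuples; this input shape is hypothetical.
    """
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    for name, country, org_type, uuids, merge_candidate in groups:
        writer.writerow({
            "Organization Name": name,
            "Country": country,
            "Type": org_type,
            "UUIDs": ",".join(uuids),  # comma-separated inside a single field
            "Count": len(uuids),
            "Merge Candidate": merge_candidate,
        })


# Write to an in-memory buffer, then read it back to check the round trip.
buffer = io.StringIO()
write_duplicates([("Acme University", "DK", "University", ["a1", "b2"], "b2")], buffer)
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
```

When writing to a real file, open it with `newline=""` so the `csv` module controls line endings itself. Because the UUIDs field contains commas, the writer quotes it automatically.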
The merger script (PureAPI_Ex_org_merger.py) reads the duplicate_organizations.csv file generated by the first script and merges duplicate organizations using the API.
- Reads duplicates and merge candidates from a CSV file.
- Confirms the merge operation with the user before proceeding.
- Sends merge requests to the API with detailed logging of results.
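The read-and-confirm steps above can be sketched as follows. The helper names `build_merge_plan` and `confirmed` and the prompt wording are hypothetical, and the merge request itself is deliberately left out because the endpoint details are not given here.

```python
import csv
import io


def build_merge_plan(fileobj):
    """Return a list of (target_uuid, [source_uuids]) pairs from the CSV."""
    plan = []
    for row in csv.DictReader(fileobj):
        target = row["Merge Candidate"].strip()
        uuids = [u.strip() for u in row["UUIDs"].split(",") if u.strip()]
        # Everything in the group except the target is merged into the target.
        sources = [u for u in uuids if u != target]
        if target and sources:
            plan.append((target, sources))
    return plan


def confirmed(plan, ask=input):
    """Ask the user to confirm; `ask` can be swapped out for testing."""
    total = sum(len(sources) for _, sources in plan)
    answer = ask(f"Merge {total} organisations into {len(plan)} targets? (yes/no): ")
    return answer.strip().lower() == "yes"


# Hypothetical CSV content using the columns described in this README.
sample_csv = (
    "Organization Name,Country,Type,UUIDs,Count,Merge Candidate\r\n"
    'Acme University,DK,University,"a1,b2,c3",3,b2\r\n'
)
plan = build_merge_plan(io.StringIO(sample_csv))
```

Keeping the plan as plain data makes it easy to review or dry-run before any request is sent.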
- Ensure that the duplicate_organizations.csv file is present in the same directory.
- Run the script: python PureAPI_Ex_org_merger.py
- Enter the following when prompted:
  - Base URL: The API's base URL (e.g., xyz.elsevierpure.com).
  - API Key: Your API key for authentication.
- The script will log the merge results to merge_log.txt.
- requests
- csv (Standard Python Library)
- collections (Standard Python Library)
Install any missing libraries using pip:
pip install requests
The duplicate_organizations.csv file should have the following columns:
- Organization Name
- Country
- Type
- UUIDs
- Count
- Merge Candidate
Both scripts generate logs for tracking operations:
- Duplicate Finder Script: Outputs duplicate_organizations.csv.
- Merge Script: Logs all operations to merge_log.txt, including successful merges and errors.
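For orientation, one way to append success and error entries to a log file is sketched below. The line format and the helper names `format_log_line` and `append_log` are assumptions; the actual layout of merge_log.txt may differ.

```python
import os
import tempfile
from datetime import datetime, timezone


def format_log_line(target, sources, ok, detail=""):
    """Build one log line; the layout is a hypothetical example."""
    status = "SUCCESS" if ok else "ERROR"
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    line = f"{stamp} {status} target={target} sources={','.join(sources)}"
    return f"{line} {detail}".rstrip()


def append_log(path, line):
    """Append a single line so earlier runs are preserved."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")


# Demonstrate against a temporary directory rather than the working directory.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "merge_log.txt")
    append_log(log_path, format_log_line("b2", ["a1", "c3"], True))
    append_log(log_path, format_log_line("d4", ["e5"], False, "HTTP 404"))
    with open(log_path, encoding="utf-8") as fh:
        logged = fh.read().splitlines()
```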
This project is licensed under the MIT License.