Bootstrap is a CMR application that can bootstrap the CMR with data from Catalog REST. It has API methods for copying data from Catalog REST to the metadata db. It can also bulk index everything in the Metadata DB.
Used to create/remove user for bootstrap jobs.
-
Create the user
lein create-user
-
Run the migration scripts
lein migrate
You can use lein migrate -version version to restore the database to
a given version. lein migrate -version 0 will clean the database
completely.
-
Remove the user
lein drop-user
-
Create the user
CMR_DB_URL=thin:@localhost:1521/orcl
CMR_BOOTSTRAP_DB_PASSWORD=******
java -cp target/cmr-bootstrap-app-0.1.0-SNAPSHOT-standalone.jar cmr.db create-user -
Run db migration
CMR_DB_URL=thin:@localhost:1521/orcl
CMR_BOOTSTRAP_DB_PASSWORD=******
java -cp target/cmr-bootstrap-app-0.1.0-SNAPSHOT-standalone.jar cmr.db migrate
You can provider additional arguments to migrate the database to a given version as in lein migrate.
-
Remove the user
CMR_DB_URL=thin:@localhost:1521/orcl
CMR_BOOTSTRAP_DB_PASSWORD=******
java -cp target/cmr-bootstrap-app-0.1.0-SNAPSHOT-standalone.jar cmr.db drop-user
Starts moving all the granules in a specified collection from its current index to the target index. If no target is provided the default is to move the granules from the small collections index into their own separate index. The work is performed asynchronously in the background. The job should be monitored through the bootstrap application's logs and using the status endpoint detailed below. The two valid values for the target query parameter are separate-index and small-collections. This also supports a synchronous=true query parameter to cause the collection reindexing to happen synchronously with the request. This is mostly for testing purposes.
curl -i \
-X POST \
http://localhost:3006/rebalancing_collections/C5-PROV1/start?target=separate-index
HTTP/1.1 200 OK
{"message":"Rebalancing started for collection C5-PROV1"}
Fetches the status of rebalancing. It returns counts of the collection in the small collections index and in the separate index.
curl -i \
-H "Accept: application/json" \
http://localhost:3006/rebalancing_collections/C5-PROV1/status
HTTP/1.1 200 OK
{"small-collections":4,"separate-index":4, "rebalancing-status": "COMPLETE"}
Finalizes a rebalancing collection. Removes the collection from the list of rebalancing collections in the index set. If the target was a separate index this call deletes all granules from the small collections index for the specified collection. If the target was small collections the collection specific index will be removed from the index-set, but the index will not be deleted since this could cause errors until search caches have all been refreshed. The caller is expected to manually delete the index after a period of time.
curl -i \
-X POST \
"http://localhost:3006/rebalancing_collections/C5-PROV1/finalize"
HTTP/1.1 200 OK
{"message":"Rebalancing completed for collection C5-PROV1"}
In order to reshard an index to have a different number of shards in elasticsearch clusters, you will do the following:
- Start a reshard process
- Check the status for a 'COMPLETE' state
- Finalize the reshard process with the following API's below.
IMPORTANT: Only reshard one index at a time. Make sure you start, status, and finalize a reshard process COMPLETELY before resharding the next index.
Starts the resharding process to create a new index with the given number of shards and copies the data from the old index to the new index. Both indexes will be used for ingest until the resharding is finalized.
You MUST give the elastic_name parameter to tell CMR which elastic cluster your index is in that is going to be resharded.
Required params:
- num_shards = int (num of shards you want the index to have at the end)
- elastic_name = string (elastic cluster name you want to reshard in)
- Options:
gran-elasticorelastic
- Options:
curl -i \
-X POST \
"http://localhost:3006/reshard/1_small_collections/start?num_shards=50&elastic_name=gran-elastic"
HTTP/1.1 200 OK
{"message": "Resharding started for index 1_small_collections"}
Retrieves the resharding status for an index, including the original index name, target index name, and current resharding status. Returns a 404 status code if the specified index is not currently undergoing resharding.
Required params:
- elastic_name = string (elastic cluster name you want to reshard in)
- Options:
gran-elasticorelastic
- Options:
curl -i \
-H "Accept: application/json" \
http://localhost:3006/reshard/1_c1234_prov1/status?elastic_name=gran-elastic
HTTP/1.1 200 OK
{"original-index":"1_c1234_prov1","reshard-index":"1_c1234_prov1_75_shards", "reshard-status": "COMPLETE"}
Once the status of the reshard returns 'COMPLETE', you can move on to finalizing the reshard process. Finalizing an index resharding moves the ES alias to point to the new resharded index and clean up the index-set.
Required params:
- elastic_name = string (elastic cluster name you want to reshard in)
- Options:
gran-elasticorelastic
- Options:
IMPORTANT: Only finalize if the reshard status is 'COMPLETE'
curl -i \
-X POST \
"http://localhost:3006/reshard/1_small_collections/finalize?elastic_name=gran-elastic"
HTTP/1.1 200 OK
{"message": "Resharding completed for index 1_small_collections"}
Rollback the resharding of a specified index to its original state before reshard was attempted.
You MUST give the elastic_name parameter to tell CMR which cluster your index is in that is going to be resharded.
Required params:
- elastic_name (str)
- Options:
gran-elasticorelastic
- Options:
Rollback will be allowed IF the reshard has not been finalized, else it will not allow
curl -i \
-X POST \
"http://localhost:3006/reshard/1_small_collections/rollback?elastic_name=gran-elastic"
HTTP/1.1 200 OK
{"message": "Resharding rollback completed for index 1_small_collections"}
curl -i \
-X POST \
-H "Content-Type: application/json" \
-d '{"provider_id": "FIX_PROV1"}' \
"http://localhost:3006/bulk_migration/providers"
For the echo-reverb test fixture data, the following curl can be used to check metadata db to make sure the new data is available:
curl -i http://localhost:3001/concepts/G1000000033-FIX_PROV1
This should return the granule including the echo-10 xml.
curl -i \
-X POST \
-H "Content-Type: application/json" \
-d '{"provider_id": "FIX_PROV1", "collection_id": "C1000000073-FIX_PROV1"}' \
"http://localhost:3006/bulk_migration/collections"
curl -i \
-X POST \
-H "Content-Type: application/json" \
-d '{"provider_id": "FIX_PROV1"}' \
"http://localhost:3006/bulk_index/providers"
NOTE from CMR-1908 that when reindexing a provider the collections are not reindexed in the all revisions index. The workaround here is to use the indexer endpoint for reindexing collections.
Operator can use the start_index parameter to index concepts with sequence number larger than the start_index value. This parameter is used to recover from a prematurely terminated bulk index request.
curl -i -X POST "http://localhost:3006/bulk_index/providers/all"
Bulk indexing all collections and granules for all providers. Similar to the bulk index a provider endpoint, this reindexing of providers will not reindex the collections in the all revisions index. The workaround here is to use the indexer endpoint for reindexing collections.
curl -i \
-X POST \
-H "Content-Type: application/json" \
-d '{"provider_id": "FIX_PROV1", "collection_id":"C123-FIX_PROV1"}' \
"http://localhost:3006/bulk_index/collections"
Operator can use the start_index parameter to index concepts with sequence number larger than the start_index value. This parameter is used to recover from a prematurely terminated bulk index request.
curl -X POST \
-H "Content-Type: application/json" \
-d '{"provider_id": "PROV1", "concept_type":"collection", "concept_ids":["C123-PROV1","C124-PROV1"]}' \
"http://localhost:3006/bulk_index/concepts"
For all providers and all system concepts:
curl -i \
-X POST \
"http://localhost:3006/bulk_index/after_date_time?date_time=2015-02-02T10:00:00Z"
For a given list of providers (use provider CMR to index all system concepts):
curl -i \
-X POST \
-H "Content-Type: application/json" \
-d '{"provider_ids": ["PROV1", "PROV2", "CMR"]}' \
"http://localhost:3006/bulk_index/after_date_time?date_time=2015-02-02T10:00:00Z"
Similar to the bulk index a provider endpoint, this reindexing of concepts newer than a given date-time will not reindex the concepts in the all revisions index. The workaround here is to use the indexer endpoint for reindexing collections and the concept specific bootstrap bulk indexing endpoint for variables, services, tools and subscriptions.
curl -i -X POST "http://localhost:3006/bulk_index/system_concepts"
For a single provider:
curl -i -X POST "http://localhost:3006/bulk_index/variables/PROV1"
For all providers:
curl -i -X POST "http://localhost:3006/bulk_index/variables"
For a single provider:
curl -i -X POST "http://localhost:3006/bulk_index/services/PROV1"
For all providers:
curl -i -X POST "http://localhost:3006/bulk_index/services"
For a single provider:
curl -i -X POST "http://localhost:3006/bulk_index/tools/PROV1"
For all providers:
curl -i -X POST "http://localhost:3006/bulk_index/tools"
For a single provider:
curl -i -X POST "http://localhost:3006/bulk_index/subscriptions/PROV1"
For all providers:
curl -i -X POST "http://localhost:3006/bulk_index/subscriptions"
Generic document bulk index endpoints are generated dynamically based on the value of the approved generic documents array defined in common lib. All endpoints take the plural form, so for example, for the generic concept type "data-quality-summary", they would be:
For a single provider:
curl -i -X POST "http://localhost:3006/bulk_index/data-quality-summaries/PROV1"
For all providers:
curl -i -X POST "http://localhost:3006/bulk_index/data-quality-summaries/"
For a single variable:
curl -i -X POST "http://localhost:3006/fingerprint/variables/V1-PROV1"
For all variables:
curl -i -X POST "http://localhost:3006/fingerprint/variables"
For all variables of a single provider:
curl -i \
-X POST \
-H "Content-Type: application/json" \
-d '{"provider_id": "PROV1"}' \
"http://localhost:3006/fingerprint/variables"
Virtual collections contain granules derived from a source collection. Only granules specified in the source collections in the virtual product app configuration will be considered. Virtual granules will only be created in the configured destination virtual collections if they already exist. To initialize virtual granules from existing source granules, use the following command:
curl -i \
-X POST \
"http://localhost:3006/virtual_products?provider-id=PROV1&entry-title=et1"
Note that provider-id and entry-title are required.
This endpoint should only be using in an AWS environment where the Database Migration Service (DMS) is being used to replicate changes from another database to this environment. This will index any concepts that have been replicated since the last replication run.
curl -i -X POST "http://localhost:3006/index_recently_replicated"
Migrate database to the latest schema version:
curl -i \
-X POST \
-H "Echo-Token: XXXX" \
"http://localhost:3006/db-migrate"
Migrate database to a specific schema version (e.g. 3):
curl -i \
-X POST \
-H "Echo-Token: XXXX" \
"http://localhost:3006/db-migrate?version=3"
Keyword Manamegent System (KMS) Redis Cache is no longer "lazy loaded" where cache items expire and are loaded on demand from the KMS API. Bootstrap will call refresh when started and this cache will remain in memory forever. To refresh the cache, the following call can be made.
New KMS servers can be found at:
- https://cmr.sit.earthdata.nasa.gov/kms
- https://cmr.uat.earthdata.nasa.gov/kms
- https://cmr.earthdata.nasa.gov/kms
To issue a KMS Redis Cache reindex, call
curl -i \
-X POST \
-H "Authorization: token" \
http://localhost:3006/caches/refresh/kms
No output is returned on success. HTTP status code of 200 when cache has been refreshed.
This will report the current health of the application. It checks all resources and services used by the application and reports their healthes in the response body in JSON format. For resources, the report includes an "ok?" status and a "problem" field if the resource is not OK. For services, the report includes an overall "ok?" status for the service and health reports for each of its dependencies. It returns HTTP status code 200 when the application is healthy, which means all its interfacing resources and services are healthy; or HTTP status code 503 when one of the resources or services is not healthy. It also takes pretty parameter for pretty printing the response.
curl -i "http://localhost:3006/health?pretty=true"
Example healthy response body:
{
"metadata-db" : {
"ok?" : true,
"dependencies" : {
"oracle" : {
"ok?" : true
},
"echo" : {
"ok?" : true
}
}
},
"internal-metadata-db" : {
"ok?" : true,
"dependencies" : {
"oracle" : {
"ok?" : true
},
"echo" : {
"ok?" : true
}
}
},
"indexer" : {
"ok?" : true,
"dependencies" : {
"elastic_search" : {
"ok?" : true
},
"echo" : {
"ok?" : true
},
"metadata-db" : {
"ok?" : true,
"dependencies" : {
"oracle" : {
"ok?" : true
},
"echo" : {
"ok?" : true
}
}
}
}
}
}
Example un-healthy response body:
{
"metadata-db" : {
"ok?" : true,
"dependencies" : {
"oracle" : {
"ok?" : true
},
"echo" : {
"ok?" : true
}
}
},
"internal-metadata-db" : {
"ok?" : true,
"dependencies" : {
"oracle" : {
"ok?" : true
},
"echo" : {
"ok?" : true
}
}
},
"indexer" : {
"ok?" : false,
"dependencies" : {
"elastic_search" : {
"ok?" : false,
"problem" : {
"status" : "Inaccessible",
"problem" : "Unable to get elasticsearch cluster health, caught exception: Connection refused"
}
},
"echo" : {
"ok?" : true
},
"metadata-db" : {
"ok?" : true,
"dependencies" : {
"oracle" : {
"ok?" : true
},
"echo" : {
"ok?" : true
}
}
}
}
}
}
Copyright © 2007-2025 United States Government as represented by the Administrator of the National Aeronautics and Space Administration. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.