CSM BDS python scripts don't work with --starttime or --endtime #992
Description
Describe the bug
Regardless of the date passed to --starttime or --endtime, the scripts query and return all job or allocation records. The script findUserJobs.py is used here as an example. All other scripts that support the --starttime and --endtime options have the same issue.
To Reproduce
Steps to reproduce the behavior:
- Go to /opt/ibm/csm/bigdata/python/
- Run one of the scripts, for example ./findUserJobs.py --starttime 2021-01-12, to search for jobs started on that date.
- See that the command output lists all jobs, including jobs with a different start time.
- No error is reported.
Expected behavior
These scripts should return only the jobs within the time range specified with --starttime and/or --endtime.
Screenshots
[/opt/ibm/csm/bigdata/python]> export CAST_ELASTIC="...:9200"
[/opt/ibm/csm/bigdata/python]> ./findUserJobs.py -u tlam --starttime 2021-01-12
State | AID | P Job ID | S Job ID | Begin Time | End Time
failed | 12 | 112 | 0 | 2020-09-23 15:33:19.868194 | 2020-09-23 15:34:16.817299
failed | 13 | 113 | 0 | 2020-09-23 15:40:02.624777 | 2020-09-23 15:40:02.845004
failed | 14 | 114 | 0 | 2020-09-23 15:44:55.221521 | 2020-09-23 15:44:55.431779
...
complete | 96 | 195 | 0 | 2020-09-28 11:43:35.973273 | 2020-09-28 11:44:28.118919
complete | 97 | 196 | 0 | 2020-09-28 12:42:20.720108 | 2020-09-28 12:42:21.852965
complete | 98 | 197 | 0 | 2020-09-28 12:43:17.145406 | 2020-09-28 12:43:21.830506
complete | 99 | 198 | 0 | 2020-09-28 12:44:37.372068 | 2020-09-28 12:44:38.463995
failed | 100 | 299 | 0 | 2020-10-08 09:52:26.881603 | 2020-10-08 09:54:54.950164
I added some debug prints to the script to show the time range that it created from the --starttime input.
Target list: [{'range': {'data.begin_time': {'format': 'epoch_millis', 'lte': '1610487082000'}}}, {'range': {'data.history.end_time': {'format': 'epoch_millis', 'gte': '1610427600000'}}}, {'bool': {'must_not': {'exists': {'field': 'data.history.end_time'}}}}] Match min: 2
And the query the script sends to Elasticsearch:
Query body: {'query': {'bool': {'must': [{'match': {'data.user_name': 'tlam'}}], 'should': [{'range': {'data.begin_time': {'format': 'epoch_millis', 'lte': '1610487082000'}}}, {'range': {'data.history.end_time': {'format': 'epoch_millis', 'gte': '1610427600000'}}}, {'bool': {'must_not': {'exists': {'field': 'data.history.end_time'}}}}], 'minimum_should_match': 2}}, '_source': ['data.primary_job_id', 'data.secondary_job_id', 'data.allocation_id', 'data.user_name', 'data.begin_time', 'data.history.end_time', 'data.state'], 'size': 1000}
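The overlap logic in the query above can be sketched in Python as follows. This is a minimal illustration, not the code from findUserJobs.py: the helper names `to_epoch_millis` and `build_user_jobs_query` are hypothetical, the `_source` list is omitted for brevity, and the UTC-5 offset and the 16:31:22 end time are assumptions inferred from the epoch values in the debug output.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_millis(dt):
    """Return a timezone-aware datetime as an epoch-milliseconds string."""
    return str(int(dt.timestamp() * 1000))


def build_user_jobs_query(user, start, end):
    """Build a query for jobs of `user` that overlap the window [start, end].

    A job overlaps the window when it began before the window ends AND
    (it ended after the window starts OR it has no end time yet).
    Requiring 2 of the 3 'should' clauses approximates that condition.
    """
    fmt = "epoch_millis"
    should = [
        {"range": {"data.begin_time": {"format": fmt, "lte": to_epoch_millis(end)}}},
        {"range": {"data.history.end_time": {"format": fmt, "gte": to_epoch_millis(start)}}},
        {"bool": {"must_not": {"exists": {"field": "data.history.end_time"}}}},
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {"data.user_name": user}}],
                "should": should,
                "minimum_should_match": 2,
            }
        },
        "size": 1000,
    }


# Assumption: the cluster's local time zone was UTC-5 when the script ran.
tz = timezone(timedelta(hours=-5))
query = build_user_jobs_query(
    "tlam",
    datetime(2021, 1, 12, tzinfo=tz),              # --starttime 2021-01-12
    datetime(2021, 1, 12, 16, 31, 22, tzinfo=tz),  # assumed "now" at run time
)
```

Under these assumptions the sketch yields the same bounds as the debug output, which supports the observation that the time range itself is computed as intended.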
The time ranges and the query look correct. The problem is that the two fields data.begin_time and data.history.end_time are mapped as type "text" (string) rather than as a date type, so the range clauses are not evaluated as date comparisons. For example:
GET /cast-allocation/_mapping/field/data.begin_time
{
  "cast-allocation" : {
    "mappings" : {
      "data.begin_time" : {
        "full_name" : "data.begin_time",
        "mapping" : {
          "begin_time" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
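One plausible mechanism for the "everything matches" result is sketched below in Python. It assumes that a range clause on a "text" field compares individual analyzed terms as strings instead of comparing dates. The tokenizer here is a rough stand-in (split on non-alphanumeric characters); the real analyzer's token boundaries may differ, and the helper names `rough_terms` and `text_range_matches` are hypothetical. This is an illustration of the suspected behavior, not something confirmed against the cluster.

```python
import re


def rough_terms(value):
    """Rough stand-in for text analysis: split on non-alphanumeric characters."""
    return [t for t in re.split(r"[^0-9A-Za-z]+", value.lower()) if t]


def text_range_matches(value, gte=None, lte=None):
    """Return True if any single term lies in [gte, lte] under string comparison."""
    for term in rough_terms(value):
        if gte is not None and term < gte:
            continue
        if lte is not None and term > lte:
            continue
        return True
    return False


# Timestamps taken from the first row of the command output above.
begin_time = "2020-09-23 15:33:19.868194"
end_time = "2020-09-23 15:34:16.817299"

# Bounds taken from the debug output (epoch milliseconds, as strings).
begin_ok = text_range_matches(begin_time, lte="1610487082000")
end_ok = text_range_matches(end_time, gte="1610427600000")
```

Under these assumptions both clauses match a job from September 2020: a term such as "09" sorts before "1610487082000", and "2020" sorts after "1610427600000". That would satisfy two of the three "should" clauses for every record, which is consistent with the output shown above.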
Environment (please complete the following information):
- Machine [IST CSM BDS mini CORAL cluster]
- Version [CSM 1.8.2]
Additional context
Issue Source:
CSM regression tests