This repository was archived by the owner on Jul 22, 2024. It is now read-only.

CSM BDS python scripts don't work with --starttime or --endtime #992

@thanh-lam

Describe the bug
Regardless of the date passed to --starttime or --endtime, the script queries and returns all job or allocation records. The script findUserJobs.py is used here as an example. All other scripts that support the --starttime and --endtime options have the same issue.

To Reproduce
Steps to reproduce the behavior:

  1. Go to /opt/ibm/csm/bigdata/python/
  2. Run one of the scripts, for example ./findUserJobs.py --starttime 2021-01-12, to search for jobs started on that date.
  3. Observe that the command output lists all jobs, including those with a different start time.
  4. No error is reported.

Expected behavior
For these scripts to work correctly, they should return only the jobs within the time range specified with --starttime and/or --endtime.

Screenshots

[/opt/ibm/csm/bigdata/python]> export CAST_ELASTIC="...:9200"
[/opt/ibm/csm/bigdata/python]> ./findUserJobs.py -u tlam --starttime 2021-01-12
     State |   AID | P Job ID | S Job ID | Begin Time                 | End Time                  
    failed |    12 |      112 | 0        | 2020-09-23 15:33:19.868194 | 2020-09-23 15:34:16.817299
    failed |    13 |      113 | 0        | 2020-09-23 15:40:02.624777 | 2020-09-23 15:40:02.845004
    failed |    14 |      114 | 0        | 2020-09-23 15:44:55.221521 | 2020-09-23 15:44:55.431779
...
  complete |    96 |      195 | 0        | 2020-09-28 11:43:35.973273 | 2020-09-28 11:44:28.118919
  complete |    97 |      196 | 0        | 2020-09-28 12:42:20.720108 | 2020-09-28 12:42:21.852965
  complete |    98 |      197 | 0        | 2020-09-28 12:43:17.145406 | 2020-09-28 12:43:21.830506
  complete |    99 |      198 | 0        | 2020-09-28 12:44:37.372068 | 2020-09-28 12:44:38.463995
    failed |   100 |      299 | 0        | 2020-10-08 09:52:26.881603 | 2020-10-08 09:54:54.950164

I added some debug prints to the script to show the time range that it builds from the --starttime input.

Target list:  [{'range': {'data.begin_time': {'format': 'epoch_millis', 'lte': '1610487082000'}}}, {'range': {'data.history.end_time': {'format': 'epoch_millis', 'gte': '1610427600000'}}}, {'bool': {'must_not': {'exists': {'field': 'data.history.end_time'}}}}] Match min:  2
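To make the two bounds in the target list easier to read, here is a minimal sketch that decodes them into UTC timestamps. The helper name epoch_millis_to_utc is hypothetical and not part of the CSM scripts. The lower bound decodes to 05:00 UTC on 2021-01-12, which would be midnight in a UTC-5 time zone; that interpretation, and the guess that the upper bound is simply the time the script was run, are assumptions.

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(value):
    """Convert an epoch_millis string or int to a timezone-aware UTC datetime.

    Hypothetical helper for reading the debug output; not part of findUserJobs.py.
    """
    return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)


# Bounds copied from the "Target list" debug output.
range_start = epoch_millis_to_utc("1610427600000")  # 'gte' on data.history.end_time
range_end = epoch_millis_to_utc("1610487082000")    # 'lte' on data.begin_time

print(range_start.isoformat())  # 2021-01-12T05:00:00+00:00
print(range_end.isoformat())    # 2021-01-12T21:31:22+00:00
```

Both bounds fall on 2021-01-12, so the range derived from --starttime matches the requested date.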

This is the query the script sends to Elasticsearch:

Query body:  {'query': {'bool': {'must': [{'match': {'data.user_name': 'tlam'}}], 'should': [{'range': {'data.begin_time': {'format': 'epoch_millis', 'lte': '1610487082000'}}}, {'range': {'data.history.end_time': {'format': 'epoch_millis', 'gte': '1610427600000'}}}, {'bool': {'must_not': {'exists': {'field': 'data.history.end_time'}}}}], 'minimum_should_match': 2}}, '_source': ['data.primary_job_id', 'data.secondary_job_id', 'data.allocation_id', 'data.user_name', 'data.begin_time', 'data.history.end_time', 'data.state'], 'size': 1000}
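For reference, the query body above can be rebuilt with a short sketch like the following. The function name build_overlap_query and its parameters are hypothetical; only the resulting dictionary is taken from the debug output, and the real script may construct it differently. The second and third "should" clauses cannot both match the same document (one needs data.history.end_time to exist, the other needs it to be absent), so minimum_should_match of 2 effectively means "began before the range end AND (ended after the range start OR has not ended)".

```python
def build_overlap_query(user, start_ms, end_ms, size=1000):
    """Rebuild the query body shown in the debug output (illustrative only)."""
    targets = [
        # Job began at or before the end of the requested range.
        {"range": {"data.begin_time": {"format": "epoch_millis", "lte": str(end_ms)}}},
        # Job ended at or after the start of the requested range ...
        {"range": {"data.history.end_time": {"format": "epoch_millis", "gte": str(start_ms)}}},
        # ... or the job has no end time yet (still running).
        {"bool": {"must_not": {"exists": {"field": "data.history.end_time"}}}},
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {"data.user_name": user}}],
                "should": targets,
                "minimum_should_match": 2,
            }
        },
        "_source": [
            "data.primary_job_id",
            "data.secondary_job_id",
            "data.allocation_id",
            "data.user_name",
            "data.begin_time",
            "data.history.end_time",
            "data.state",
        ],
        "size": size,
    }


body = build_overlap_query("tlam", 1610427600000, 1610487082000)
print(body["query"]["bool"]["minimum_should_match"])  # 2
```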

The time ranges and the query look correct. However, the problem is that the two fields data.begin_time and data.history.end_time are mapped as type "text" (string) rather than as a date type, so the epoch_millis range clauses do not filter on them as intended. For example:

GET /cast-allocation/_mapping/field/data.begin_time
{
  "cast-allocation" : {
    "mappings" : {
      "data.begin_time" : {
        "full_name" : "data.begin_time",
        "mapping" : {
          "begin_time" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    }
  }
}
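To illustrate the expected behavior independently of the index mapping, here is a minimal client-side sketch that applies the same overlap test to the timestamp strings as they appear in the command output. It is not a fix for the mapping and is not taken from the CSM scripts. The names TIME_FORMAT and overlaps are hypothetical, the timestamp format is inferred from the printed table, and the strings are assumed to be in the same time zone as the range bounds.

```python
from datetime import datetime

# Format inferred from the printed Begin Time / End Time columns (assumption).
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def overlaps(begin_time, end_time, range_start, range_end):
    """Return True if a job's [begin, end] interval overlaps the requested range.

    end_time may be None for a job that has not finished yet.
    """
    begin = datetime.strptime(begin_time, TIME_FORMAT)
    if begin > range_end:
        return False
    if end_time is None:
        return True
    return datetime.strptime(end_time, TIME_FORMAT) >= range_start


# A few rows copied from the command output: (allocation id, begin, end).
rows = [
    (12, "2020-09-23 15:33:19.868194", "2020-09-23 15:34:16.817299"),
    (96, "2020-09-28 11:43:35.973273", "2020-09-28 11:44:28.118919"),
    (100, "2020-10-08 09:52:26.881603", "2020-10-08 09:54:54.950164"),
]

# Naive local-time bounds for --starttime 2021-01-12 (assumed, see lead-in).
range_start = datetime(2021, 1, 12, 0, 0, 0)
range_end = datetime(2021, 1, 12, 21, 31, 22)

matching = [aid for aid, begin, end in rows if overlaps(begin, end, range_start, range_end)]
print(matching)  # []
```

None of the sampled 2020 allocations overlap the requested date, which is what the script would be expected to report if the range clauses were applied to real date values.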

Environment (please complete the following information):

  • Machine [IST CSM BDS mini CORAL cluster]
  • Version [CSM 1.8.2]

Issue Source:
CSM regression tests
