Government RAG Example: Incorrect location extraction in basemodel.py breaks province filtering #379

@ARYANPATEL-BIT

Background

When running the government_rag benchmark example via the standard command:

ianvs -f ./examples/government_rag/singletask_learning_bench/benchmarkingjob.yaml

the test attempts to extract the location/province string to correctly filter the RAG target dataset.

However, the location extraction logic relies entirely on the basename of the current working directory. When ianvs is launched from the project root, that basename is the root directory's name, not the target province.


The Bug

In examples/government_rag/singletask_learning_bench/testalgorithms/basemodel.py around line 211, the script assigns:

current_dir = os.path.basename(os.getcwd())

Because ianvs is usually executed from the root of the project, os.getcwd() returns the path to the root folder. As a result, current_dir becomes "ianvs" (or whatever the root workspace is named) rather than an actual province string (like "Beijing" or "Shanghai").
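The behavior can be reproduced in isolation. The following is a minimal sketch, assuming a project root folder named "ianvs"; the temporary directory layout and the helper name basename_of_cwd are made up for illustration and are not part of basemodel.py:

```python
import os
import tempfile


def basename_of_cwd() -> str:
    """Return the last path component of the current working directory."""
    return os.path.basename(os.getcwd())


# Simulate launching from a project root named "ianvs" (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    project_root = os.path.join(tmp, "ianvs")
    os.makedirs(project_root)
    previous = os.getcwd()
    try:
        os.chdir(project_root)
        observed = basename_of_cwd()
    finally:
        # Always restore the original working directory.
        os.chdir(previous)

print(observed)  # prints "ianvs", which is not a province name
```

The result depends only on where the process was started, so it carries no information about the dataset being evaluated.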


Impact

This wrong location string (loc) is then passed as an argument to all subsequent tasks in the threading pipeline. Consequently, [local] and [other] RAG queries use the literal string "ianvs" as the province name filter, which:

  • Completely breaks province-level target filtering
  • Produces no exception or warning — the failure is silent
  • Compromises the integrity of all benchmark results
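To illustrate why the failure is silent, here is a hedged sketch of an exact-match province filter together with a guard that fails loudly instead. The issue does not show the real self.rag.query() implementation, so filter_by_province, require_known_province, and the sample documents are all hypothetical:

```python
def filter_by_province(documents, province):
    """Keep only documents whose 'province' field matches exactly."""
    return [doc for doc in documents if doc.get("province") == province]


def require_known_province(province, known_provinces):
    """Raise instead of silently accepting an unknown province string."""
    if province not in known_provinces:
        raise ValueError(
            f"Unknown province {province!r}; expected one of {sorted(known_provinces)}"
        )
    return province


# Illustrative data only, not taken from the benchmark dataset.
documents = [
    {"province": "Beijing", "text": "policy A"},
    {"province": "Shanghai", "text": "policy B"},
]
known = {doc["province"] for doc in documents}

# The bogus value passes through the filter without any error.
silent_result = filter_by_province(documents, "ianvs")
print(silent_result)  # prints []

# The guard turns the same mistake into an explicit failure.
try:
    require_known_province("ianvs", known)
except ValueError as exc:
    print(exc)
```

A check of this kind is optional, but it would have surfaced the bad location string on the first query rather than in the final benchmark numbers.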

Steps to Reproduce

  1. Run the benchmarking job from the project root:
   ianvs -f ./examples/government_rag/singletask_learning_bench/benchmarkingjob.yaml
  2. At basemodel.py:211, os.getcwd() resolves to the project root directory, so current_dir becomes "ianvs".
  3. The invalid location string is forwarded to self.rag.query(), causing silent filtering failures across all province-level queries.

Proposed Solution

Replace the os.getcwd()-based approach with a method that derives the target province from the dataset directly. The dataset's JSONL file contains a level_4_dim field that holds the correct province name and should be used as the source of truth:

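# Requires `import json` and `import os` at module level.
def _load_locations_from_dataset(self, data):
    """Map each query to its province using the dataset's level_4_dim field."""
    query_to_location = {}
    all_locations = set()

    # data_file is expected to point at the JSONL dataset; fall back to
    # empty results if it is missing or not a regular file.
    dataset_path = getattr(data, 'data_file', None)

    if dataset_path and os.path.isfile(dataset_path):
        with open(dataset_path, 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                entry = json.loads(line)
                query = entry.get('query', '')
                location = entry.get('level_4_dim', 'Unknown')
                query_to_location[query] = location
                all_locations.add(location)

    # Sorted so the list order is deterministic across runs.
    return query_to_location, sorted(all_locations)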

Then in predict(), replace the os.getcwd() call with:

query_to_location, all_locations = self._load_locations_from_dataset(data)
self.all_locations = all_locations

for i in range(len(data.x)):
    query = data.x[i]
    location = query_to_location.get(query, "Unknown")

This ensures the province is always derived from the dataset metadata, regardless of where ianvs is invoked from.
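As a quick sanity check, the loading logic can be exercised outside the class on a tiny JSONL file. This is a sketch under assumptions: load_locations is a standalone variant written for illustration, and the sample rows are made up; only the query and level_4_dim field names come from the proposal above.

```python
import json
import os
import tempfile


def load_locations(dataset_path):
    """Standalone variant of the proposed loader, for illustration only."""
    query_to_location = {}
    all_locations = set()
    with open(dataset_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            entry = json.loads(line)
            location = entry.get("level_4_dim", "Unknown")
            query_to_location[entry.get("query", "")] = location
            all_locations.add(location)
    return query_to_location, sorted(all_locations)


# Made-up rows; the last one has no level_4_dim and falls back to "Unknown".
rows = [
    {"query": "q1", "level_4_dim": "Beijing"},
    {"query": "q2", "level_4_dim": "Shanghai"},
    {"query": "q3"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
        f.write("\n")  # trailing blank line is ignored by the loader
    mapping, locations = load_locations(path)

print(mapping)
print(locations)
```

One caveat of keying on the query text: if two entries share the same query string but different provinces, the later entry overwrites the earlier one.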


Additional Issues Found

While investigating this bug, the following additional issues were identified and fixed in this PR:

| Issue | Fix |
|---|---|
| Hardcoded API keys in get_model_response_* methods | Replaced with environment variables |
| Hardcoded model path /home/icyfeather/models/bge-m3 | Made configurable via __init__ kwargs or Context |
| Hardcoded persist_directory="./chroma_db" | Made configurable via __init__ kwargs |
| self.rag mutated across threads causing race condition | Changed to local variable rag inside process_query |
| Unused imports (tempfile, time, zipfile, numpy, etc.) | Removed |
| json imported twice | Removed duplicate import |
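The race-condition fix can be sketched as follows. This is not the code from basemodel.py: FakeRag is a hypothetical stand-in for the real RAG object, and the (question, location) item shape is an assumption. The point is only that each call to process_query builds and uses its own local rag rather than reassigning a shared self.rag:

```python
from concurrent.futures import ThreadPoolExecutor


class FakeRag:
    """Hypothetical stand-in for the real RAG object."""

    def __init__(self, location):
        self.location = location

    def query(self, question):
        return f"[{self.location}] {question}"


def process_query(item):
    question, location = item
    # Local to this call, so concurrent workers cannot overwrite
    # each other's instance the way a shared self.rag could.
    rag = FakeRag(location)
    return rag.query(question)


items = [("q1", "Beijing"), ("q2", "Shanghai"), ("q3", "Beijing")]

with ThreadPoolExecutor(max_workers=3) as pool:
    # Executor.map returns results in input order.
    results = list(pool.map(process_query, items))

print(results)
```

With a shared attribute, one thread could swap the object between another thread's assignment and its query call; keeping the reference on the call stack removes that window.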

Files Changed

  • examples/government_rag/singletask_learning_bench/testalgorithms/basemodel.py
  • .env.example (added — documents required environment variables)
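
A minimal sketch of reading such settings from the environment is shown below. The variable names GOV_RAG_API_KEY, GOV_RAG_MODEL_PATH, and GOV_RAG_PERSIST_DIR and the load_settings helper are hypothetical; the actual names documented in .env.example are not given in this issue:

```python
import os


def load_settings(environ=os.environ):
    """Read configuration from environment variables (hypothetical names)."""
    api_key = environ.get("GOV_RAG_API_KEY")
    if not api_key:
        # Fail early rather than sending requests with a missing key.
        raise RuntimeError("GOV_RAG_API_KEY is not set")
    return {
        "api_key": api_key,
        # No default model path: the caller must supply one.
        "model_path": environ.get("GOV_RAG_MODEL_PATH"),
        # Falls back to the previously hardcoded directory.
        "persist_directory": environ.get("GOV_RAG_PERSIST_DIR", "./chroma_db"),
    }


# Usage with an explicit mapping, so the example does not depend on the
# real process environment.
settings = load_settings({"GOV_RAG_API_KEY": "example-key"})
print(settings["persist_directory"])  # prints "./chroma_db"
```

Passing the mapping as a parameter keeps the helper easy to test and avoids touching os.environ in unit tests.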

Labels: kind/bug (categorizes issue or PR as related to a bug)
