update new aliasing code, add constants and test cases #124

Open
AmeyHengle wants to merge 3 commits into main from new-aliasing-structure

AmeyHengle commented Jan 29, 2023

This PR addresses issues in the current codebase related to intent aliasing and preprocessing.
Associated Ticket: https://vernacular-ai.atlassian.net/browse/CMS-2524

Issues:

  1. In the current code structure, intent aliases are defined as a dictionary mapping {intent: intent_alias}. This differs from the structure commonly used in eevee or production (ref), which makes it difficult to use the alias.yaml files directly in dialogy.
  2. Aliasing is applied as a preprocessing step in train.py but not in test.py.
  3. There is no way to define independent train and test aliases, which is necessary for multiple clients.
  4. Certain preprocessing steps are absent from test.py (make_label_column_uniform, make_data_column_uniform, etc.).
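
To make issue 1 concrete, here is a minimal sketch of the two layouts. It assumes the eevee/production alias file groups raw intents under each alias, i.e. `{intent_alias: [intent, ...]}`. That layout, the intent names, and the helper `flatten_grouped_aliases` are assumptions for illustration and are not taken from this repository.

```python
# Hypothetical example data; the intent and alias names are illustrative only.

# Layout described in issue 1: one entry per raw intent.
flat_aliases = {
    "confirm_yes": "_confirm_",
    "confirm_ok": "_confirm_",
    "cancel_order": "_cancel_",
}

# Assumed eevee-style layout: raw intents grouped under each alias.
grouped_aliases = {
    "_confirm_": ["confirm_yes", "confirm_ok"],
    "_cancel_": ["cancel_order"],
}


def flatten_grouped_aliases(grouped):
    """Convert {alias: [intents]} into the flat {intent: alias} form."""
    flat = {}
    for alias, intents in grouped.items():
        for intent in intents:
            flat[intent] = alias
    return flat
```

Under that assumption, a grouped alias.yaml would need a conversion step like this one before the current code could use it.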

Updates:

  1. Introduced a separate function for intent aliasing.
  2. Added a provision to define separate train and eval aliases.
  3. Added checks for duplicate and/or malformed values in alias files.
  4. Moved the preprocessing functions common to train.py and test.py into a single utility file.
  5. Added test cases covering the new aliasing code.
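
A rough sketch of how updates 1 to 3 could fit together is shown here. It is not the code in this PR: the function names `validate_aliases` and `apply_aliases`, the `{alias: [intents]}` file layout, and the sample intents are all hypothetical, chosen only to illustrate validation plus independent train and eval aliasing.

```python
from collections import Counter


def validate_aliases(grouped):
    """Raise ValueError for malformed or duplicated entries.

    Assumes the (hypothetical) layout {alias: [intent, ...]}.
    """
    if not isinstance(grouped, dict):
        raise ValueError("alias data must map each alias to a list of intents")
    seen = Counter()
    for alias, intents in grouped.items():
        if not isinstance(alias, str) or not alias.strip():
            raise ValueError(f"malformed alias name: {alias!r}")
        if not isinstance(intents, list) or not intents:
            raise ValueError(f"alias {alias!r} must map to a non-empty list")
        for intent in intents:
            if not isinstance(intent, str) or not intent.strip():
                raise ValueError(f"malformed intent under {alias!r}: {intent!r}")
            seen[intent] += 1
    # An intent listed twice (under one alias or several) is ambiguous.
    duplicates = sorted(name for name, count in seen.items() if count > 1)
    if duplicates:
        raise ValueError(f"intents listed more than once: {duplicates}")


def apply_aliases(labels, grouped):
    """Return labels with aliased intents replaced; others pass through."""
    validate_aliases(grouped)
    lookup = {
        intent: alias
        for alias, intents in grouped.items()
        for intent in intents
    }
    return [lookup.get(label, label) for label in labels]


# Separate alias sets for training and evaluation (illustrative values).
train_aliases = {"_confirm_": ["confirm_yes", "confirm_ok"]}
eval_aliases = {"_confirm_": ["confirm_yes"]}

labels = ["confirm_yes", "confirm_ok", "greeting"]
train_labels = apply_aliases(labels, train_aliases)
eval_labels = apply_aliases(labels, eval_aliases)
```

Keeping validation inside the aliasing function means train.py and test.py would both reject a bad alias file at the same point, rather than one script silently skipping the check.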

@AmeyHengle AmeyHengle requested a review from janaab11 January 29, 2023 04:45
@dakshvar22 dakshvar22 removed the request for review from janaab11 February 24, 2023 06:50
@dakshvar22

@AmeyHengle Can you please describe the problem better? It's not clear what exactly you are solving here.

@AmeyHengle AmeyHengle self-assigned this Mar 10, 2023