🚅Search before asking
I have searched for issues similar to this one.
🚅Description
This GFI lets sdgx build a column description from raw_data, or from data sampled from raw_data, and return it as a text string that an LLM can understand. The text should include, but not be limited to:
- Column type: float type, int type, category type, datetime type, etc.;
- For numeric type columns: maximum value, minimum value, mean, standard deviation, distribution, etc.;
- For datetime type: start and end dates, datetime type format, etc.;
- For category types: the distinct category values, the number of distinct values, etc.;
- For ID type: ID category, format, etc.;
- Any other information that seems necessary; developers are encouraged to add it at their own discretion.
🏕Solution
Implement the _form_columns_description method of sdgx.models.LLM.single_table.base.LLMBaseModel.
```python
def _form_columns_description(self):
```
This method returns a string.
For implementation ideas, developers can refer to the existing _form_message_with_offtable_features and _form_dataset_description methods.
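To make the request more concrete, here is a minimal sketch of how such a description could be assembled. It is not the sdgx implementation. It assumes the raw data is available as a pandas DataFrame, and the names `describe_column`, `form_columns_description`, `sample_size`, `seed` and `max_categories` are hypothetical, chosen only for illustration. ID detection and distribution summaries are left out.

```python
import pandas as pd


def describe_column(series: pd.Series, max_categories: int = 10) -> str:
    """Return a one-line text summary of one column (hypothetical helper)."""
    values = series.dropna()
    if values.empty:
        return f"- {series.name}: no non-null values"

    # Datetime columns: report the covered date range.
    if pd.api.types.is_datetime64_any_dtype(values):
        start, end = values.min().isoformat(), values.max().isoformat()
        return f"- {series.name}: datetime type, from {start} to {end}"

    # Numeric columns (booleans excluded): report basic statistics.
    is_numeric = pd.api.types.is_numeric_dtype(values)
    if is_numeric and not pd.api.types.is_bool_dtype(values):
        kind = "int" if pd.api.types.is_integer_dtype(values) else "float"
        return (
            f"- {series.name}: {kind} type, "
            f"min {float(values.min()):.4g}, max {float(values.max()):.4g}, "
            f"mean {float(values.mean()):.4g}, std {float(values.std()):.4g}"
        )

    # Everything else is treated as categorical in this sketch.
    categories = values.astype(str).unique().tolist()
    shown = ", ".join(categories[:max_categories])
    return (
        f"- {series.name}: category type, "
        f"{len(categories)} distinct values (e.g. {shown})"
    )


def form_columns_description(
    raw_data: pd.DataFrame, sample_size: int = 1000, seed: int = 0
) -> str:
    """Build a plain-text description of every column, sampling large tables."""
    data = raw_data
    if len(raw_data) > sample_size:
        data = raw_data.sample(n=sample_size, random_state=seed)
    lines = [describe_column(data[col]) for col in data.columns]
    return "The table has the following columns:\n" + "\n".join(lines)


if __name__ == "__main__":
    demo = pd.DataFrame(
        {
            "age": [20, 30, 40],
            "city": ["a", "b", "a"],
            "joined": pd.to_datetime(["2020-01-01", "2021-06-01", "2022-12-31"]),
        }
    )
    print(form_columns_description(demo))
```

Inside the real method, the DataFrame would presumably come from the model instance rather than from an argument; how raw_data is stored on the class is not shown in this issue, so that wiring is left open.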
Code reference: synthetic-data-generator/sdgx/models/LLM/base.py, line 89 in ec31560