We want to add a tool function for interacting with haystack_service.get_component_schemas.
This tool will use the HaystackServiceResource to fetch the component schemas.
It then needs to parse the schema and extract the component families and their descriptions.
Finally, it needs to format the families and their descriptions into a string that an LLM can readily consume.
We expect the following response structure from get_component_schemas:
{
"component_schema": {
"definitions": {"Components": {<component_name>: <component_definition>, ...}}
}
}
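Given that structure, a minimal sketch of pulling out the component mapping might look like the following. The helper name get_component_definitions and the trimmed example_response are hypothetical, and the call to HaystackServiceResource itself is not shown because its interface is not described here.

```python
from typing import Any


def get_component_definitions(response: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Return the {component_name: component_definition} mapping.

    Falls back to an empty dict if any level of the nesting is missing or None.
    """
    schema = response.get("component_schema") or {}
    definitions = schema.get("definitions") or {}
    return definitions.get("Components") or {}


# Hypothetical response, trimmed to the keys described above.
example_response = {
    "component_schema": {
        "definitions": {"Components": {"XLSXToDocument": {"title": "XLSXToDocument"}}}
    }
}
components = get_component_definitions(example_response)
print(list(components))
```

Returning an empty mapping instead of raising keeps the later parsing step simple; whether a missing key should instead be treated as an error is a design choice left open here.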
Here is an example of a component definition:
{'description': 'Converts XLSX (Excel) files into Documents.
Supports reading data from specific sheets or all sheets in the Excel file. If all sheets are read, a Document is
created for each sheet. The content of the Document is the table which can be saved in CSV or Markdown format.
### Usage example
```python
from haystack.components.converters.xlsx import XLSXToDocument
converter = XLSXToDocument()
results = converter.run(sources=["sample.xlsx"], meta={"date_added": datetime.now().isoformat()})
documents = results["documents"]
print(documents[0].content)
# ",A,B
1,col_a,col_b
2,1.5,test
"
```', 'properties': {'init_parameters': {'properties': {'read_excel_kwargs': {'_annotation': 'typing.Optional[typing.Dict[str, typing.Any]]', 'default': None, 'description': 'Additional arguments to pass to `pandas.read_excel`.
See https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html#pandas-read-excel', 'properties': {'_python_type': {...}}, 'type': ['null', 'object']}, 'sheet_name': {'_annotation': 'typing.Union[str, int, typing.List[typing.Union[str, int]], NoneType]', 'anyOf': [{...}, {...}, {...}, {...}], 'default': None, 'description': 'The name of the sheet to read. If None, all sheets are read.'}, 'store_full_path': {'_annotation': '<class 'bool'>', 'default': False, 'description': 'If True, the full path of the file is stored in the metadata of the document.
If False, only the file name is stored.', 'type': ['boolean']}, 'table_format': {'_annotation': 'typing.Literal['csv', 'markdown']', 'default': 'csv', 'description': 'The format to convert the Excel file to.', 'enum': ['csv', 'markdown'], 'type': 'string'}, 'table_format_kwargs': {'_annotation': 'typing.Optional[typing.Dict[str, typing.Any]]', 'default': None, 'description': 'Additional keyword arguments to pass to the table format function.
- If `table_format` is "csv", these arguments are passed to `pandas.DataFrame.to_csv`.
See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html#pandas-dataframe-to-csv
- If `table_format` is "markdown", these arguments are passed to `pandas.DataFrame.to_markdown`.
See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_markdown.html#pandas-dataframe-to-markdown', 'properties': {'_python_type': {...}}, 'type': ['null', 'object']}}, 'required': [], 'type': 'object'}, 'type': {'component_frequency': 29, 'const': 'haystack.components.converters.xlsx.XLSXToDocument', 'family': 'converters', 'family_description': 'Convert data into a format your pipeline can query. Use a converter that matches your data type.', 'readme_link': 'https://docs.haystack.deepset.ai/docs/xlsxtodocument', 'type': 'string'}}, 'title': 'XLSXToDocument', 'type': 'object'}
From each definition we need to extract family and family_description, both of which sit under properties -> type.
Many components can belong to the same family, so the extracted families need to be deduplicated.
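One possible sketch of the extraction and formatting, assuming every definition keeps family and family_description under properties -> type as in the example above. The function names extract_families and format_families, the output wording, and the second component SomeOtherConverter are hypothetical and only serve to illustrate deduplication.

```python
from typing import Any


def extract_families(components: dict[str, dict[str, Any]]) -> dict[str, str]:
    """Map each family name to its description, keeping the first description seen."""
    families: dict[str, str] = {}
    for definition in components.values():
        type_info = definition.get("properties", {}).get("type", {})
        family = type_info.get("family")
        # Skip definitions without a family and families already recorded.
        if family and family not in families:
            families[family] = type_info.get("family_description", "")
    return families


def format_families(families: dict[str, str]) -> str:
    """Render the families as an alphabetically sorted bullet list."""
    if not families:
        return "No component families found."
    lines = ["Available Haystack component families:"]
    for name in sorted(families):
        description = families[name] or "No description available."
        lines.append(f"- {name}: {description}")
    return "\n".join(lines)


# Two components sharing one family; the second entry is made up for illustration.
converter_type = {
    "family": "converters",
    "family_description": (
        "Convert data into a format your pipeline can query. "
        "Use a converter that matches your data type."
    ),
}
example_components = {
    "XLSXToDocument": {"properties": {"type": converter_type}},
    "SomeOtherConverter": {"properties": {"type": converter_type}},
}
summary = format_families(extract_families(example_components))
print(summary)
```

Sorting the family names makes the string stable across calls, which can help when the output is embedded in an LLM prompt.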
Steps: