Add get_component_definition tool #26

@mathislucka

Overview

We want to add a tool that returns the definition of a single component, looked up by its fully qualified component type (e.g. haystack.components.routers.conditional_router.ConditionalRouter). The tool should extract the requested definition from the component schema and return it as a neatly formatted string.

Component definition

Here is an example of a component definition:

{'description': 'Converts XLSX (Excel) files into Documents.

    Supports reading data from specific sheets or all sheets in the Excel file. If all sheets are read, a Document is
    created for each sheet. The content of the Document is the table which can be saved in CSV or Markdown format.

    ### Usage example

    ```python
    from haystack.components.converters.xlsx import XLSXToDocument

    converter = XLSXToDocument()
    results = converter.run(sources=["sample.xlsx"], meta={"date_added": datetime.now().isoformat()})
    documents = results["documents"]
    print(documents[0].content)
    # ",A,B
1,col_a,col_b
2,1.5,test
"
    ```', 'properties': {'init_parameters': {'properties': {'read_excel_kwargs': {'_annotation': 'typing.Optional[typing.Dict[str, typing.Any]]', 'default': None, 'description': 'Additional arguments to pass to `pandas.read_excel`.
            See https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html#pandas-read-excel', 'properties': {'_python_type': {...}}, 'type': ['null', 'object']}, 'sheet_name': {'_annotation': 'typing.Union[str, int, typing.List[typing.Union[str, int]], NoneType]', 'anyOf': [{...}, {...}, {...}, {...}], 'default': None, 'description': 'The name of the sheet to read. If None, all sheets are read.'}, 'store_full_path': {'_annotation': '<class 'bool'>', 'default': False, 'description': 'If True, the full path of the file is stored in the metadata of the document.
            If False, only the file name is stored.', 'type': ['boolean']}, 'table_format': {'_annotation': 'typing.Literal['csv', 'markdown']', 'default': 'csv', 'description': 'The format to convert the Excel file to.', 'enum': ['csv', 'markdown'], 'type': 'string'}, 'table_format_kwargs': {'_annotation': 'typing.Optional[typing.Dict[str, typing.Any]]', 'default': None, 'description': 'Additional keyword arguments to pass to the table format function.
            - If `table_format` is "csv", these arguments are passed to `pandas.DataFrame.to_csv`.
              See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html#pandas-dataframe-to-csv
            - If `table_format` is "markdown", these arguments are passed to `pandas.DataFrame.to_markdown`.
              See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_markdown.html#pandas-dataframe-to-markdown', 'properties': {'_python_type': {...}}, 'type': ['null', 'object']}}, 'required': [], 'type': 'object'}, 'type': {'component_frequency': 29, 'const': 'haystack.components.converters.xlsx.XLSXToDocument', 'family': 'converters', 'family_description': 'Convert data into a format your pipeline can query. Use a converter that matches your data type.', 'readme_link': 'https://docs.haystack.deepset.ai/docs/xlsxtodocument', 'type': 'string'}}, 'title': 'XLSXToDocument', 'type': 'object'}

We would fetch the component using: haystack.components.converters.xlsx.XLSXToDocument

The neatly formatted version of the definition should have:

  • fully qualified component path
  • component name
  • component description
  • init_parameters (including name, type and description)
  • family and description

Make sure it is compact yet readable.
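
As a rough illustration of the layout described above, here is a minimal sketch of a formatter. The function name `format_component_definition` and the exact output layout are assumptions made for illustration; only the dictionary keys (`title`, `description`, `properties.init_parameters`, `properties.type`, `_annotation`) come from the example definition.

```python
from typing import Any


def format_component_definition(definition: dict[str, Any]) -> str:
    """Render one component definition as a compact, readable string.

    Hypothetical helper: the name and output layout are illustrative only.
    """
    props = definition.get("properties", {})
    type_info = props.get("type", {})
    params = props.get("init_parameters", {}).get("properties", {})

    family = type_info.get("family", "Unknown")
    family_description = type_info.get("family_description")
    family_line = f"Family: {family}"
    if family_description:
        family_line += f" ({family_description})"

    lines = [
        f"Component: {definition.get('title', 'Unknown')}",
        f"Type: {type_info.get('const', 'Unknown')}",
        family_line,
        "",
        "Description:",
        definition.get("description", "No description available.").strip(),
        "",
        "Init parameters:",
    ]

    if not params:
        lines.append("  (none)")
    for name, spec in params.items():
        # Prefer the Python annotation; fall back to the JSON schema type.
        annotation = spec.get("_annotation", spec.get("type", "Unknown"))
        # Collapse the multi-line, indented descriptions into a single line.
        description = " ".join(spec.get("description", "").split())
        lines.append(f"  - {name} ({annotation}): {description}")

    return "\n".join(lines)
```

Collapsing whitespace in the parameter descriptions keeps each parameter on one line, while the component description is left intact so the usage example inside it stays readable.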

Steps

  • check src/deepset_mcp/tools/haystack_service.py for a comparable tool and add the get_component_definition tool to the same file
  • use the same call to fetch the component schema, then extract the requested component by its fully qualified path
  • errors should be returned as strings
  • add tests in test/unit/tools/test_haystack_service.py, using the same testing style (fake resource)
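
The steps above could be sketched roughly as follows. This is not the code in haystack_service.py: `FakeSchemaResource`, `get_component_schema`, `find_component`, and the assumed schema layout (`definitions` → `Components`) are hypothetical stand-ins, so the real resource interface and schema structure should be taken from the existing tool. The final formatting step is reduced to a one-line summary to keep the sketch short.

```python
import asyncio
from typing import Any, Optional


class FakeSchemaResource:
    """Hypothetical fake resource that returns a canned schema or raises."""

    def __init__(
        self,
        schema: Optional[dict[str, Any]] = None,
        error: Optional[Exception] = None,
    ) -> None:
        self._schema = schema or {}
        self._error = error

    async def get_component_schema(self) -> dict[str, Any]:
        if self._error is not None:
            raise self._error
        return self._schema


def find_component(schema: dict[str, Any], component_type: str) -> Optional[dict[str, Any]]:
    """Return the definition whose properties.type.const equals component_type."""
    # Assumption: definitions live under schema["definitions"]["Components"].
    components = schema.get("definitions", {}).get("Components", {})
    for definition in components.values():
        const = definition.get("properties", {}).get("type", {}).get("const")
        if const == component_type:
            return definition
    return None


async def get_component_definition(resource: Any, component_type: str) -> str:
    """Fetch the schema and describe one component; errors come back as strings."""
    try:
        schema = await resource.get_component_schema()
    except Exception as exc:  # the tool reports failures instead of raising
        return f"Failed to retrieve component schema: {exc}"

    definition = find_component(schema, component_type)
    if definition is None:
        return f"Component not found: {component_type}"

    # Placeholder for the full formatted output requested in the issue.
    return f"{definition.get('title', 'Unknown')} ({component_type})"
```

A test in the fake-resource style would then build a `FakeSchemaResource` with a small schema (or an exception) and assert on the returned string, for example via `asyncio.run(get_component_definition(fake, "haystack.components.converters.xlsx.XLSXToDocument"))`.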
