@wbsg-uni-mannheim

Web-based Systems Group @ University of Mannheim

The Web-based Systems Group at the University of Mannheim conducts research on methods for integrating data from large numbers of data sources in the context of the open Web and in corporate data lakes. Our research includes areas such as entity matching, schema matching, table annotation, information extraction, and data discovery. Our current work focuses on utilizing large language models and LLM-based agents for data integration tasks. We apply the developed methods to integrate product data from large numbers of e-shops and to construct knowledge graphs such as DBpedia. The empirical research of the group includes monitoring the adoption of schema.org annotations on the public Web by regularly extracting structured data from large Web corpora.

Pinned

  1. TabAnnGPT Public

    This repository contains code and data for reproducing the experiments of three papers that focus on two subtasks of table annotation: column type annotation (CTA), the task of annotating table colu…

    Python 12 2

  2. MatchGPT Public

    This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Entity Matching" and "Entity Matching using Large Language Models".

    Jupyter Notebook 65 13

  3. ExtractGPT Public

    Attribute Value Extraction using Large Language Models

    Python 28 11

  4. wdcproducts Public

    This repository contains the code and data download links to reproduce building the WDC Products Benchmark.

    Python 14 4

  5. WebMall Public

    This repository contains the code and data of the WebMall benchmark for evaluating the capability of Web agents to find and compare product offers from multiple e-shops.

    HTML 8 4

  6. PyDI Public

    The PyDI framework provides methods for end-to-end data integration. The framework covers all steps of the integration process, including schema matching, data translation, entity matching, and dat…

    HTML 10

Repositories

Showing 10 of 34 repositories
