Douglas Blank edited this page Oct 31, 2022 · 8 revisions

Kangas: Explore multimedia datasets at scale

Kangas is a tool for exploring, analyzing, and visualizing large-scale multimedia data. It provides a straightforward Python API for logging large tables of data, along with an intuitive visual interface for performing complex queries against your dataset.

The key features of Kangas include:

  • Scalability. Kangas DataGrid, the fundamental class for representing datasets, can easily store millions of rows of data.
  • Performance. Group, sort, and filter across millions of data points in seconds with a simple, fast UI.
  • Interoperability. Any data, any environment. Kangas can run in a notebook or as a standalone app, both locally and remotely.
  • Integrated computer vision support. Visualize and filter bounding boxes, labels, and metadata without any extra setup.

You can access a live demo of Kangas at kangas.comet.com.

Getting started

Kangas is available as a Python library and can be installed with pip:

pip install kangas

Once installed, there are many ways to load or create a DataGrid:

import kangas as kg

# Load an existing DataGrid
dg = kg.read_datagrid("https://github.com/caleb-kaiser/kangas_examples/raw/master/coco-500.datagrid")
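
Besides loading a saved file, a DataGrid can also be built row by row. The following is a minimal sketch, assuming the `kg.DataGrid(name=..., columns=...)` constructor and its `append` method. The column names, the sample rows, and the `build_datagrid` helper are hypothetical and only serve as illustration. The import sits inside the helper so the sample data can be inspected even where Kangas is not installed.

```python
# Hypothetical sample data; the column names and values are illustrative only.
COLUMNS = ["Category", "Loss", "Fitness"]
ROWS = [
    ["dog", 0.42, 0.91],
    ["cat", 0.37, 0.88],
    ["bird", 0.55, 0.79],
]


def build_datagrid(columns, rows, name="Example"):
    """Create a DataGrid and append each row to it (hypothetical helper)."""
    # Imported lazily so this module loads without Kangas installed.
    import kangas as kg

    dg = kg.DataGrid(name=name, columns=columns)
    for row in rows:
        # Each row is a list with one value per column.
        dg.append(row)
    return dg
```

With Kangas installed, `dg = build_datagrid(COLUMNS, ROWS)` should return a DataGrid that can be rendered in the same way as one loaded from a file.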

After your DataGrid is initialized, you can render it within the Kangas Viewer directly from Python:

dg.show()

From the Kangas Viewer, you can group, sort, and filter data. In addition, Kangas will do its best to parse any metadata attached to your assets. For example, if you're using the COCO-500 DataGrid from the quickstart above, Kangas will automatically parse the labels and scores for each image.

And voilà! You're now up and running with Kangas.
