    Repositories list

    • anserini

      Public
      Anserini is a Lucene toolkit for reproducible information retrieval research
      Java
      5681.1k2411 Updated Feb 13, 2026
    • NanoKnow

      Public
      Python
      0000 Updated Feb 13, 2026
    • pyserini

      Public
      Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
      Python
      4882k6113 Updated Feb 13, 2026
    • This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]
      Python
      50000 Updated Feb 12, 2026
    • rank_llm

      Public
      RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
      Python
      845762312 Updated Feb 12, 2026
    • HTML
      5101 Updated Feb 6, 2026
    • Python
      32301 Updated Feb 2, 2026
    • Evaluation tools shared across anserini, pyserini, and pygaggle
      Python
      303500 Updated Jan 28, 2026
    • quackir

      Public
      QuackIR is an IR toolkit built on DuckDB
      Python
      11313 Updated Nov 6, 2025
    • Onboarding guide to Jimmy Lin's research group at the University of Waterloo
      274110 Updated Nov 1, 2025
    • Creates a wrapper around the original UniIR and releases a PyPI package for Pyserini integration
      Python
      18002 Updated Oct 31, 2025
    • Python
      0102 Updated Oct 9, 2025
    • ragnarok

      Public
      Retrieval-Augmented Generation battle!
      Python
      96221 Updated Jul 31, 2025
    • 1600 Updated Jul 24, 2025
    • umbrela

      Public
      Python
      85343 Updated Jul 20, 2025
    • visa

      Public
      Python
      0300 Updated Jun 2, 2025
    • rlhn

      Public
      Identifying and relabeling false negatives in IR training datasets
      0920 Updated May 22, 2025
    • UniRAG

      Public
      Python
      01300 Updated Mar 17, 2025
    • TypeScript
      0000 Updated Feb 9, 2025
    • UniIR

      Public
      Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers"
      Python
      18101 Updated Feb 4, 2025
    • w1kp

      Public
      w1kp: Toolkit for analyzing perceptual variability in text-to-image generation.
      Python
      0200 Updated Nov 14, 2024
    • Python
      71100 Updated Oct 15, 2024
    • An in-memory, everything-on-GPU retrieval system.
      C++
      0000 Updated Oct 14, 2024
    • LiT5

      Public
      Python
      31800 Updated Aug 9, 2024
    • howl

      Public
      Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice.
      Python
      31215293 Updated Jul 25, 2024
    • 2300 Updated Jul 15, 2024
    • A reproduction study of the Touché 2020 dataset in the BEIR benchmark
      Python
      2700 Updated Jul 11, 2024
    • Python
      8000 Updated Jun 25, 2024
    • LaVIT

      Public
      LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
      Jupyter Notebook
      31000 Updated Jun 9, 2024
    • Python
      4200 Updated Apr 24, 2024