
🕺 RankySwanky 🕺

NOTE: RankySwanky is under active development and the API may change.

Effortless, annotation-free evaluation for AI search, retrieval, and ranking systems. Let LLMs do the judging — so you can get swanky with your ranks!


What is RankySwanky?

RankySwanky makes comparing and benchmarking search engines, retrievers, and rankers a breeze — with zero manual annotation. Just connect your engine and hand over a list of questions; RankySwanky automatically queries, scores results using state-of-the-art LLMs (Large Language Models), and computes all the vital metrics you need to optimize your RAG (Retrieval Augmented Generation) pipelines or search products.

No data labeling, no configs, no headaches. Instantly see how different engines/configurations stack up!


Features

  • 🪄 Plug & Play: Simple interface to connect any search engine or ranker.
  • 🤖 LLM-Powered Judging: Harness generative AI (LLMs) to score search/retrieval results, no human annotation required!
  • 🏅 Comprehensive Metrics: Computes NDCG (Normalized Discounted Cumulative Gain), NCG (Normalized Cumulative Gain), relevance, and novelty, with order-agnostic scoring suited to RAG and AI scenarios.
  • 📊 Clear Comparisons: Instantly compare engines/settings side-by-side with easy-to-understand summaries & visuals.
  • 🥳 Zero Hassle: Just supply your engine function and a question list. RankySwanky does the rest!
  • 🌱 Open Source & Friendly: Perfect for projects large and small, hobbyists or startups.

Installation

Not yet published on PyPI, but you can install directly from GitHub:

git clone https://github.com/martinmoldrup/RankySwanky.git
cd RankySwanky
pip install .
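
Alternatively, pip can install straight from the repository URL (this assumes the project ships standard packaging metadata, which the pip install . step above implies):

pip install git+https://github.com/martinmoldrup/RankySwanky.git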

Quickstart

from rankyswanky import RankySwanky

# 1. Wrap your search engine or retriever in a simple Python function:
def my_search_engine(query: str) -> list[str]:
    # Here you would implement your search logic, e.g. using an API or search library.
    # For demonstration, we just return dummy results.
    return ["result1", "result2", "result3"]

# 2. Create your questions list:
questions = [
    "Who invented the lightbulb?",
    "What is the capital of Norway?",
    "Explain quantum entanglement simply.",
]

# 3. Run evaluation!
ranker = RankySwanky(my_search_engine)
results = ranker.evaluate(questions)

# 4. Print or visualize results
results.print_summary()
results.print_comparison()

Tip: Easily compare two or more engines by evaluating each and comparing summaries!
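
For example, here is a minimal sketch of comparing two configurations using only the calls shown in the Quickstart. Both engines are hypothetical stand-ins; swap in your real retrievers:

from rankyswanky import RankySwanky

# Two hypothetical engine variants to compare.
def engine_a(query: str) -> list[str]:
    return ["result1", "result2", "result3"]

def engine_b(query: str) -> list[str]:
    return ["result3", "result2", "result1"]

questions = ["Who invented the lightbulb?", "What is the capital of Norway?"]

# Evaluate each engine separately and compare the printed summaries side by side.
for name, engine in [("engine_a", engine_a), ("engine_b", engine_b)]:
    print(f"=== {name} ===")
    RankySwanky(engine).evaluate(questions).print_summary()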


Metrics

  • NDCG (Normalized Discounted Cumulative Gain)
  • NCG (Normalized Cumulative Gain — order does NOT matter, RAG-friendly!)
  • Relevance (LLM-judged relevance for any query-result pair)
  • Novelty

All metrics are automatically scored by the LLM judge — no labeled data needed!
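
To make the difference between NDCG and NCG concrete, here is a small standalone sketch (not RankySwanky's internal implementation; its exact normalization may differ) that computes both from a list of LLM-judged relevance scores, using the usual log2 discount for DCG:

import math

def dcg(relevances: list[float]) -> float:
    # Discounted cumulative gain: later positions contribute less (log2 discount).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances: list[float]) -> float:
    # Normalize by the DCG of the ideal (descending) ordering of the same scores.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def ncg(relevances: list[float], max_rel: float = 1.0) -> float:
    # Order-agnostic: sum of gains normalized by the maximum possible gain
    # for this number of results (one common convention).
    return sum(relevances) / (max_rel * len(relevances)) if relevances else 0.0

scores = [0.9, 0.2, 0.7]  # e.g. LLM-judged relevance of the top 3 results
print(ndcg(scores))  # penalized because the 0.7 result sits at position 3
print(ncg(scores))   # ignores ordering entirely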


Why RankySwanky?

  • No more manual annotation: Small team? Big impact. No labeling needed.
  • Compare any engine: From in-house retrievers to search SaaS APIs.
  • Instant insights: See strengths, weaknesses, and pick the best configuration for your AI/RAG projects.

Docs & Examples

Documentation is coming soon!


Contributing

Pull requests, issues, and ideas are all welcome!
Whether you're adding metrics, new LLM providers, UI improvements, or docs — join the swanky crew!

Check CONTRIBUTING.md for more info.


License

MIT License. See LICENSE for details.


Made with 💖 by Martin Møldrup
Swank up your search & AI evaluation!

