From 535501c3788191a563ac7c9375551b94fbab4942 Mon Sep 17 00:00:00 2001
From: Diego Souza <8016841+diegosouzapw@users.noreply.github.com>
Date: Fri, 20 Feb 2026 13:42:46 -0300
Subject: [PATCH] =?UTF-8?q?Add=20OmniRoute=20=E2=80=94=20self-hostable=20A?=
 =?UTF-8?q?I=20gateway=20with=204-tier=20fallback?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 083ff999..c3ece483 100644
--- a/README.md
+++ b/README.md
@@ -455,6 +455,7 @@
 - [vLLM](https://github.com/vllm-project/vllm) - A high-throughput and memory-efficient inference and serving engine for LLMs.
 - [llama.cpp](https://github.com/ggerganov/llama.cpp) - LLM inference in C/C++.
 - [ollama](https://github.com/ollama/ollama) - Get up and running with Llama 3, Mistral, Gemma, and other large language models.
+- [OmniRoute](https://github.com/diegosouzapw/OmniRoute) - A self-hostable AI gateway with 4-tier cascading fallback, multi-provider load balancing, and OpenAI-compatible API. Supports 200+ models across OpenAI, Anthropic, Google, and local providers.
 - [TGI](https://huggingface.co/docs/text-generation-inference/en/index) - a toolkit for deploying and serving Large Language Models (LLMs).
 - [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) - Nvidia Framework for LLM Inference