🏥 Quick Return to Emergency Room
You are at a specialist desk.
For full triage and doctors on duty, return here:
- WFGY Global Fix Map — main Emergency Room, 300+ structured fixes
- WFGY Problem Map 1.0 — 16 reproducible failure modes
Think of this page as a sub-room.
If you want full consultation and prescriptions, go back to the Emergency Room lobby.
This page helps you choose among LLM providers and fix provider-looking bugs that are actually schema, retrieval, orchestration, or eval-drift problems. If you are new, start with the Orientation table and the FAQ. If you are debugging, jump to the Fix Hub.
| Provider | What it is | Typical use case | Link |
|---|---|---|---|
| OpenAI | GPT-4/4o from OpenAI Inc. | Direct API, fastest model access | openai.md |
| Azure OpenAI | Microsoft enterprise wrapper for OpenAI models | VNet, compliance, enterprise billing | azure_openai.md |
| Anthropic | The company behind Claude | Safety-focused platform | anthropic.md |
| Claude (Anthropic) | The model family from Anthropic | Long context, tool use, JSON control | anthropic_claude.md |
| Google Gemini | Google DeepMind multimodal models | Multimodal chat, reasoning | gemini.md |
| Google Vertex AI | Google Cloud AI platform that hosts Gemini and more | Pipelines, deployment, governance | google_vertex_ai.md |
| Mistral | EU startup with efficient open-weight models (e.g., Mixtral MoE) | Cost/perf, open ecosystem | mistral.md |
| Meta LLaMA | Meta open-weight model family | Local or private deployment, llama.cpp | meta_llama.md |
| Cohere | Enterprise NLP API and embeddings | RAG stacks, enterprise NLP | cohere.md |
| DeepSeek | CN player with infra-optimized long-context models | Cost-efficient, long windows | deepseek.md |
| Kimi (Moonshot) | CN chat-first models, very large parameter claims | Consumer chat focus | kimi.md |
| Groq | Hardware vendor: LPUs for transformer inference | Ultra-low latency serving (not a model) | groq.md |
| xAI Grok | xAI model family | X/Twitter integration, general chat | grok_xai.md |
| AWS Bedrock | AWS gateway to many models via one API | Enterprises already on AWS | aws_bedrock.md |
| OpenRouter | Community model aggregator, OpenAI-style endpoint | Try many models via one API key | openrouter.md |
| Together AI | Aggregator + infra for open weights and fine-tunes | Fast hosting, tuning services | together.md |
| MiniMax | CN AI lab with long-context models (204K), OpenAI-compatible API | Cost-efficient chat, RAG, agent workflows | minimax.md |
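Several providers in the table above (OpenRouter, Mistral, DeepSeek, MiniMax, and others) expose OpenAI-compatible chat endpoints, so switching providers often means swapping only the base URL and model name. The sketch below builds the shared request shape with the standard library only; the base URLs and model names are illustrative assumptions, so check each provider's docs before use.

```python
# Minimal sketch: route one chat-completion request shape to different
# OpenAI-compatible endpoints by swapping the base URL and model name.
# Base URLs and model names below are assumptions; verify against provider docs.

PROVIDERS = {
    "openai":     ("https://api.openai.com/v1",    "gpt-4o-mini"),
    "openrouter": ("https://openrouter.ai/api/v1", "mistralai/mixtral-8x7b-instruct"),
    "deepseek":   ("https://api.deepseek.com",     "deepseek-chat"),
}

def chat_request(provider: str, api_key: str, prompt: str) -> dict:
    """Build the URL, headers, and JSON body for one chat completion call."""
    base_url, model = PROVIDERS[provider]
    return {
        "url": f"{base_url}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```

Because only the tuple in `PROVIDERS` changes, a retrieval or agent pipeline can A/B two providers without touching its prompt or parsing code.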
OpenAI vs Azure OpenAI — are they the same?
Same models, different packaging. OpenAI = direct API and fastest releases. Azure OpenAI = Microsoft billing, VNet, compliance, data residency.
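The packaging difference shows up directly in the request shape: OpenAI addresses models by name with a Bearer token, while Azure OpenAI routes to a named deployment on your own resource and authenticates with an `api-key` header. A sketch of the two shapes, with the `api-version` value as an example only (check current Azure docs):

```python
# Sketch of the request-shape difference between the two platforms.
# The api-version string is an illustrative assumption.

def openai_endpoint() -> tuple:
    """Direct OpenAI: model chosen in the request body, Bearer auth."""
    url = "https://api.openai.com/v1/chat/completions"
    headers = {"Authorization": "Bearer <OPENAI_API_KEY>"}
    return url, headers

def azure_endpoint(resource: str, deployment: str,
                   api_version: str = "2024-02-01") -> tuple:
    """Azure OpenAI: routed by deployment name on your resource, api-key auth."""
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/chat/completions?api-version={api_version}")
    headers = {"api-key": "<AZURE_API_KEY>"}
    return url, headers
```

This is why "same model, different platform" bugs exist: a working OpenAI call fails on Azure until the deployment name, endpoint, and auth header are all switched.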
Anthropic vs Claude — why two pages?
Anthropic is the company. Claude is the model family. We separate because “platform issues” and “model quirks” often need different fixes.
Gemini vs Vertex AI — what is the relation?
Gemini is a model. Vertex AI is Google Cloud’s platform that runs Gemini and provides pipelines, eval, and deployment features.
What makes Mistral special?
Efficient open-weights and MoE designs. Good cost/perf. Easy to host in your own infra.
Meta LLaMA vs local LLaMA
Meta releases the weights. Community tools like llama.cpp let you run them locally on CPU or GPU.
Groq LPU vs GPU
GPU is general purpose. LPU is a chip specialized for transformer inference. You get very low latency for chat workloads.
Bedrock vs OpenRouter vs Together
Bedrock is an AWS enterprise gateway. OpenRouter is a community aggregator with OpenAI-style API. Together is an infra host for open weights with training and fine-tune options.
- Visual map and recovery: RAG Architecture & Recovery
- End to end retrieval knobs: Retrieval Playbook
- Why this snippet (traceability schema): Retrieval Traceability
- Ordering control: Rerankers
- Embedding vs meaning: Embedding ≠ Semantic
- Hallucination and chunk boundaries: Hallucination
- Long chains and entropy: Context Drift, Entropy Collapse
- Structural collapse and recovery: Logic Collapse
- Snippet and citation schema: Data Contracts
- Live ops: Live Monitoring for RAG, Debug Playbook
- Boot order issues: Bootstrap Ordering, Deployment Deadlock, Pre-Deploy Collapse
- ΔS(question, retrieved) ≤ 0.45
- Coverage ≥ 0.70 for the target section
- λ remains convergent across three paraphrases and two seeds
- E_resonance stays flat on long windows
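To make the ΔS target checkable, here is a minimal sketch that assumes ΔS is computed as one minus the cosine similarity of two embedding vectors, and maps the result onto the stability zones used in the Fix Hub below. The definition of ΔS and the exact thresholds should be taken from the WFGY engine docs; this is only an illustration.

```python
import math

def delta_s(vec_a, vec_b) -> float:
    """ΔS as 1 - cosine similarity (an assumed definition for this sketch)."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    return 1.0 - dot / (norm_a * norm_b)

def zone(ds: float) -> str:
    """Map a ΔS value onto the stability zones: <0.40 stable, 0.40-0.60 transitional."""
    if ds < 0.40:
        return "stable"
    if ds < 0.60:
        return "transitional"
    return "risk"
```

With this gate in place, a retrieval result that embeds far from the question (ΔS ≥ 0.60) can be rejected or re-retrieved before it ever reaches the generation step.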
| Symptom | Likely cause | Open this |
|---|---|---|
| JSON mode breaks, invalid objects | Schema too loose or nested tool calls | Data Contracts, Logic Collapse |
| Tool calls loop or stall | Agent role drift, missing timeouts | Multi-Agent Problems, Role-drift deep dive |
| High similarity yet wrong snippet | Metric mismatch or fragmented store | Embedding ≠ Semantic, Vectorstore Fragmentation |
| Answers flip between runs | Prompt headers reorder and λ flips | Context Drift, Retrieval Traceability |
| Hybrid retrievers worse than single | Query parsing split, mis-weighted rerank | Query Parsing Split, Rerankers |
| Jailbreaks or bluffing | Overconfidence and missing fences | Bluffing Controls, Retrieval Traceability |
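For the first symptom row, a data contract in its simplest form is a gate that rejects a model reply unless it parses as JSON and carries the required fields. A minimal sketch, with the field names (`answer`, `citations`) as illustrative assumptions rather than the actual Data Contracts schema:

```python
import json

# Minimal data-contract gate: reject a model reply unless it is valid JSON
# and a dict carrying the required snippet/citation fields.
REQUIRED_FIELDS = {"answer", "citations"}  # illustrative field names

def enforce_contract(raw: str):
    """Return the parsed object if it passes the contract, else None."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None  # invalid JSON: retry or re-ask with a tighter schema
    if not isinstance(obj, dict) or not REQUIRED_FIELDS <= obj.keys():
        return None  # parses, but structurally wrong: fails the contract
    return obj
```

Running every reply through such a gate turns "JSON mode breaks sometimes" from a silent downstream failure into an explicit, retryable error at the boundary.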
1. Measure ΔS
   Compute ΔS(question, retrieved) and ΔS(retrieved, expected anchor). Stable < 0.40, transitional 0.40–0.60, risk ≥ 0.60.
2. Probe λ_observe
   Vary top-k and prompt headers. If λ flips, lock the schema and apply a BBAM variance clamp.
3. Apply the module
   - Retrieval drift → BBMC + Data Contracts
   - Reasoning collapse → BBCR bridge + BBAM
   - Dead ends in long runs → BBPF alternate paths
4. Verify
   Coverage ≥ 0.70 on three paraphrases. λ convergent on two seeds.
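The final Verify step above can be sketched as a single accept/reject function: coverage must clear 0.70 on all three paraphrases, and λ must stay convergent on both seeds. The input shapes (a list of coverage scores, a list of per-seed convergence booleans) are assumptions made for this sketch.

```python
# Sketch of the Verify step: accept only when every paraphrase clears the
# coverage bar and every seed's λ is convergent. Input shapes are assumed.

def verify(coverage_by_paraphrase, lambda_convergent_by_seed) -> bool:
    """True only if all probes pass; insufficient probes also fail."""
    if len(coverage_by_paraphrase) < 3 or len(lambda_convergent_by_seed) < 2:
        return False  # not enough probes to trust the result
    return (all(c >= 0.70 for c in coverage_by_paraphrase)
            and all(lambda_convergent_by_seed))
```

Requiring all probes to pass, rather than an average, is deliberate: a single divergent seed or low-coverage paraphrase is exactly the instability the checklist is trying to catch.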
| Tool | Link | 3-step setup |
|---|---|---|
| WFGY 1.0 PDF | Engine Paper | 1) Download 2) Upload to your LLM 3) Ask “Answer using WFGY + ” |
| TXT OS (plain text OS) | TXTOS.txt | 1) Download 2) Paste into any LLM chat 3) Type “hello world” to boot |
| Layer | Page | What it’s for |
|---|---|---|
| ⭐ Proof | WFGY Recognition Map | External citations, integrations, and ecosystem proof |
| ⚙️ Engine | WFGY 1.0 | Original PDF tension engine and early logic sketch (legacy reference) |
| ⚙️ Engine | WFGY 2.0 | Production tension kernel for RAG and agent systems |
| ⚙️ Engine | WFGY 3.0 | TXT based Singularity tension engine (131 S class set) |
| 🗺️ Map | Problem Map 1.0 | Flagship 16 problem RAG failure taxonomy and fix map |
| 🗺️ Map | Problem Map 2.0 | Global Debug Card for RAG and agent pipeline diagnosis |
| 🗺️ Map | Problem Map 3.0 | Global AI troubleshooting atlas and failure pattern map |
| 🧰 App | TXT OS | .txt semantic OS with fast bootstrap |
| 🧰 App | Blah Blah Blah | Abstract and paradox Q&A built on TXT OS |
| 🧰 App | Blur Blur Blur | Text to image generation with semantic control |
| 🏡 Onboarding | Starter Village | Guided entry point for new users |
If this repository helped you, a star improves discovery so more builders can find the docs and tools.