Skip to content

Navigation Menu

Appearance settings

inferless

Explore
By company size
By use case
By industry
View all solutions
Topics
- AI
- DevOps
- Security
- Software Development
- View all
Explore
- GitHub Sponsors
  Fund open source developers
- The ReadME Project
  GitHub community articles
Repositories
- Enterprise platform
  AI-powered developer platform
Available add-ons
Pricing

Search code, repositories, users, issues, pull requests...

Search

Clear

Search syntax tips

Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Saved searches

Use saved searches to filter your results more quickly

Name

Query

To see all available qualifiers, see our documentation.

Appearance settings

You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session.

Dismiss alert

Inferless

Building World's most reliable Serverless GPU Inference Offering. In Private Beta

46 followers
India
https://www.inferless.com/
company/inferless
@Inferless_
https://ypogria3pl4.typeform.com/to/nzuhQtba?typeform-source=www.inferless.com
https://docs.inferless.com/
contact@inferless.com

Overview
Repositories
Projects
Packages
People

More

Overview
Repositories
Projects
Packages
People

Popular repositories Loading

rmbg-1.4 Public template

State-of-the-art background removal model, designed to effectively separate foreground from background. <metadata> gpu: T4 | collections: ["HF Transformers"] </metadata>

Python 23 12
triton-co-pilot Public

Generate Glue Code in seconds to simplify your Nvidia Triton Inference Server Deployments

Python 20 3
smaug-72b Public

Smaug-72B topped the Hugging Face LLM leaderboard and it’s the first model with an average score of 80, making it the world’s best open-source foundation model. <metadata> gpu: A100 | collections: …

Python 17 5
qwq-32b-preview Public template

A 32B experimental reasoning model for advanced text generation and robust instruction following. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

Python 17 7
whisper-large-v3 Public template

State‑of‑the‑art speech recognition model for English, delivering transcription accuracy across diverse audio scenarios. <metadata> gpu: T4 | collections: ["CTranslate2"] </metadata>

Python 15 16
deepseek-r1-distill-qwen-32b Public template

A distilled DeepSeek-R1 variant built on Qwen2.5-32B, fine-tuned with curated data for enhanced performance and efficiency. <metadata> gpu: A100 | collections: ["vLLM"] </metadata>

Python 15 37

Repositories

Loading

Type

Select type

All Public Sources Forks Archived Mirrors Templates

Language

Select language

All Dockerfile Python

Sort

Select order

Last updated Name Stars

Showing 10 of 174 repositories

jina-embeddings-v4 Public template
A 3.8B multimodal-multilingual embedding that unifies text and image understanding in a single late-interaction space, delivers both dense and multi-vector outputs. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

Python 0 0 0 0 Updated Jul 13, 2025
flux-1-kontext-dev Public template
12B model from Black Forest Labs that allows in‑context image editing with character and style consistency; supporting iterative, instruction-guided edits. <metadata> gpu: A100 | collections: ["HF_Transformers"] </metadata>

Python 0 0 0 0 Updated Jul 13, 2025
gemma-3n-e4b-it Public template
8B variant of the lightweight Gemma 3n series that operates with a 4B‑parameter memory footprint, enabling full multimodal inference (text, image, audio, video) on resource‑constrained hardware. <metadata> gpu: A100 | collections: ["HF_Transformers"] </metadata>

Python 0 0 0 0 Updated Jul 13, 2025
qwen3-embedding-0.6b Public template
600M parameter, 100 language embedding model that turns up to 32k token inputs into instruction-aware vectors. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

Python 0 0 0 0 Updated Jun 23, 2025
devstral-small Public template
An agentic LLM for software engineering tasks, excels at using tools to explore codebases, editing multiple files and power software engineering agents. <metadata> gpu: A100 | collections: ["HF_Transformers"] </metadata>

Python 0 0 0 0 Updated Jun 23, 2025
deepseek-r1-qwen3-8b Public template
A distilled 8B parameter reasoning powerhouse, leveraging deep chain‑of‑thought from the DeepSeek R1‑0528—delivering SOTA open‑source performance. <metadata> gpu: A100 | collections: ["HF Transformers"] </metadata>

Python 0 0 0 0 Updated Jun 23, 2025
nanonets-ocr-s Public template
Nanonets-OCR-s that turns images or PDFs into structured Markdown capturing tables, LaTeX, captions and tags—for fast, powerful, human-readable OCR. <metadata> gpu: A10 | collections: ["HF_Transformers"] </metadata>

Python 0 2 0 0 Updated Jun 23, 2025
Open-NotebookLM Public

Python 0 0 0 0 Updated Jun 11, 2025
yolo11m-detect Public

Python 0 1 0 0 Updated May 20, 2025
kokoro Public template
82M parameters lightweight text-to-speech (TTS) model that delivers high-quality voice synthesis. <metadata> gpu: T4 | collections: ["SSE Events"] </metadata>

Python 1 1 0 0 Updated May 19, 2025

View all repositories

People

Top languages

Python Dockerfile

Most used topics

Loading…

Uh oh!

There was an error while loading. Please reload this page.

Footer

© 2025 GitHub, Inc.

Footer navigation

Terms
Privacy
Security
Status
Docs
Contact

You can’t perform that action at this time.