tt-inference-server

tt-inference-server is the fastest way to deploy and test models for serving inference on Tenstorrent hardware.

Quickstart guide

On first run, please see the prerequisites guide for general Tenstorrent hardware and software setup.

For the quickstart guide and details specific to your model, select your model and hardware configuration in the Model Support pages and tables below. Alternatively, you can see all models supported for your Tenstorrent hardware.
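
Once a model server is running, you can smoke-test it from any HTTP client. The sketch below assumes the deployed server exposes an OpenAI-compatible completions endpoint; the host, port, endpoint path, and model name shown here are placeholders, so substitute the values from your model's quickstart guide.

```python
# Minimal smoke test against a running inference server.
# Assumes an OpenAI-compatible /v1/completions endpoint; the URL and
# model name below are placeholders for your actual deployment.
import requests

SERVER_URL = "http://localhost:8000/v1/completions"  # placeholder address

payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    "prompt": "Tenstorrent hardware is",
    "max_tokens": 32,
    "temperature": 0.7,
}

response = requests.post(SERVER_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["text"])
```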

Models by Model Type

Browse models by type:

Models by Hardware Configuration

Browse models by hardware:

Workflow automation in tt-inference-server

For details on the workflow automation for:

  • deploying inference servers
  • running E2E performance benchmarks
  • running accuracy evals

see the sections below.

Benchmarking

For more details, see benchmarking/README.md.
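
The supported benchmarking workflow is documented in benchmarking/README.md. Purely as an illustration of what an end-to-end performance benchmark measures, the sketch below times a batch of requests against an OpenAI-compatible endpoint and reports mean latency and token throughput; the endpoint, model name, and prompt are placeholders, and this is not the repository's benchmarking harness.

```python
# Illustrative E2E latency/throughput measurement (not the repo's harness).
# Assumes an OpenAI-compatible /v1/completions endpoint that reports
# token usage in its response.
import time
import requests

SERVER_URL = "http://localhost:8000/v1/completions"  # placeholder address
NUM_REQUESTS = 8

latencies = []
total_tokens = 0
for _ in range(NUM_REQUESTS):
    start = time.perf_counter()
    resp = requests.post(
        SERVER_URL,
        json={
            "model": "meta-llama/Llama-3.1-8B-Instruct",  # placeholder
            "prompt": "Write a haiku about silicon.",
            "max_tokens": 64,
        },
        timeout=120,
    )
    resp.raise_for_status()
    latencies.append(time.perf_counter() - start)
    total_tokens += resp.json()["usage"]["completion_tokens"]

elapsed = sum(latencies)
print(f"mean latency: {elapsed / NUM_REQUESTS:.2f} s")
print(f"throughput:   {total_tokens / elapsed:.1f} tokens/s")
```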

Evals

For more details, see evals/README.md.

Development

Developer documentation: docs/README.md

Release documentation: scripts/release/README.md

If you encounter setup or stability problems with any model, please file an issue and our team will address it.
