This tutorial demonstrates how to deploy a pretrained Hugging Face sentiment analysis model using FastAPI. It is intended for developers, machine learning (ML) engineers, and data scientists seeking to expose ML models through a RESTful API using Python-based tools.
This guide follows best practices in documentation as outlined in the Google Developer Style Guide, prioritizing clarity, precision, and reader-first communication.
By the end of this guide, you will learn how to:
- Load a pretrained model using Hugging Face Transformers.
- Serve model predictions through an HTTP endpoint using FastAPI.
- Test the endpoint using `curl`, Postman, or Swagger UI.
- Understand the structure of a production-friendly ML API deployment.
```text
fastapi-ml-api/
├── app.py                # Core application file with model and endpoint
├── requirements.txt      # Python package dependencies
├── README.md             # Project documentation (this file)
├── LICENSE               # MIT License for project use
└── .github/
    └── ISSUE_TEMPLATE/
        └── bug_report.md # Template for structured bug reporting
```
- Python 3.8 or later
- Familiarity with Python, REST APIs, and the command line
- A virtual environment (recommended)
```bash
git clone https://github.com/MeelahMe/fastapi-ml-api.git
cd fastapi-ml-api
python -m venv venv
```
Activate the environment:
- On macOS/Linux:

  ```bash
  source venv/bin/activate
  ```

- On Windows:

  ```bash
  venv\Scripts\activate
  ```
You should now see your terminal prompt prefixed with `(venv)`.

```bash
pip install -r requirements.txt
```
This will install:
- FastAPI for creating the web application.
- Uvicorn as the ASGI server.
- Transformers for accessing pretrained models.
- Torch as the backend deep learning framework.
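If you are assembling the project from scratch, a minimal `requirements.txt` covering these four packages might look like the following. Versions are intentionally left unpinned here; the repository's own file may pin specific releases.

```text
fastapi
uvicorn
transformers
torch
```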
- `app.py`
- `requirements.txt`
- `README.md`
- `.github/ISSUE_TEMPLATE/bug_report.md`
- `LICENSE`
```bash
uvicorn app:app --reload
```

The `--reload` flag enables automatic server restart on code changes.
Expected output:

```text
INFO: Uvicorn running on http://127.0.0.1:8000
```
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
Once your application server is running, you can interact with the API using the Swagger UI, command-line tools like curl, or API clients such as Postman.
- Open your browser and navigate to: `http://127.0.0.1:8000/docs`
- Locate the `POST /predict` endpoint in the list.
- Click Try it out.
- In the request body input, enter sample text:
  ```json
  {
    "text": "This is a fantastic project!"
  }
  ```
- Click Execute.
- The response section below will display the prediction result.
This is ideal for quickly verifying functionality and exploring the API.
To send a request directly from the terminal:
```bash
curl -X POST http://127.0.0.1:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Deploying models is easier than ever."}'
```
Expected output:
```json
{
  "result": [
    {
      "label": "POSITIVE",
      "score": 0.9997
    }
  ]
}
```
This method is useful for scripting and quick tests.
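For scripted checks, the same response can be consumed with Python's standard library. This sketch parses the example response body shown above:

```python
import json

# Example response body from the /predict endpoint, as returned
# by the curl request above.
raw = '{"result": [{"label": "POSITIVE", "score": 0.9997}]}'

response = json.loads(raw)
top = response["result"][0]  # first (and only) prediction
print(f'{top["label"]} ({top["score"]:.2%})')  # POSITIVE (99.97%)
```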
- Open Postman and create a new POST request.
- Set the request URL to: `http://127.0.0.1:8000/predict`
- Go to the Headers tab and add:
  - Key: `Content-Type`
  - Value: `application/json`
- Navigate to the Body tab.
- Select raw and choose JSON as the format.
- Paste the following:

  ```json
  {
    "text": "FastAPI is great for ML deployment."
  }
  ```
- Click Send to see the response.
Postman is especially helpful for testing with varying inputs or headers.
If you prefer a different layout, FastAPI also auto-generates a ReDoc UI at `http://127.0.0.1:8000/redoc`.
This is a read-only reference for exploring endpoints and data structures.
This project is licensed under the MIT License.