Containerizing a Hugging Face Model with FastAPI and Docker for Efficient Deployment

Efficiently Package and Deploy your Hugging Face Model with FastAPI and Docker for Scalable Production

Dipankar Medhi
3 min read · Jan 11, 2023


Hey everyone! 👋

Today we will see how we can serve a Hugging Face model and containerize it using Docker.

We are going to use a classification model that labels tweets as “Positive”, “Negative”, or “Neutral”, along with a confidence score.

Here’s the folder structure of the project —

root
├── ml-service/
│ ├── app.py
│ ├── model.py
│ ├── classifier.py
│ ├── nlp.py
│ ├── download_model.ipynb
│ └── Dockerfile
└── docker-compose.yml

Our very first step is to download the model. We will set up a Jupyter notebook to download the Hugging Face model and save it to a directory of our choice.

Let’s get going!

Download & Save the Hugging Face model

  1. Create a ml-service directory.
  2. Create a download_model.ipynb file inside the directory and save the model.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# download the model and tokenizer from the Hugging Face Hub
MODEL = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

# save both to a local directory (path is relative to the project root)
save_dir = "ml-service/models/roberta-base"
tokenizer.save_pretrained(save_dir)
model.save_pretrained(save_dir)
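
As a quick sanity check (assuming the notebook runs from the project root, so save_dir resolves correctly), you can reload everything from the local directory instead of the Hub:

# reload from disk to confirm the files were saved correctly
from transformers import AutoModelForSequenceClassification, AutoTokenizer

save_dir = "ml-service/models/roberta-base"
tokenizer = AutoTokenizer.from_pretrained(save_dir)
model = AutoModelForSequenceClassification.from_pretrained(save_dir)
print(model.config.num_labels)  # 3 labels: Negative, Neutral, Positive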

Loading the model and classifier

  1. Create a ml-service/model.py file for loading the model and tokenizer.
# model.py
from transformers import AutoModelForSequenceClassification, AutoTokenizer

class Model:
    """A helper class to load the model and tokenizer"""

    @staticmethod
    def load_model():
        # path is relative to ml-service, where the app runs
        model = AutoModelForSequenceClassification.from_pretrained("./models/roberta-base/")
        return model

    @staticmethod
    def load_tokenizer():
        tokenizer = AutoTokenizer.from_pretrained("./models/roberta-base/")
        return tokenizer
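
Since both loaders are static methods, callers can use them without creating a Model instance:

from model import Model

model = Model.load_model()        # reads ./models/roberta-base/
tokenizer = Model.load_tokenizer()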

2. Next, create a ml-service/classifier.py file that will handle the sentiment scores and labels.

# classifier.py
import numpy as np
from scipy.special import softmax

from model import Model

class Classifier:
    def __init__(self):
        self.model = Model.load_model()
        self.tokenizer = Model.load_tokenizer()

    def get_sentiment_label_and_score(self, text: str):
        result = {}
        labels = ["Negative", "Neutral", "Positive"]
        encoded_input = self.tokenizer(text, return_tensors="pt")
        output = self.model(**encoded_input)
        # logits for the single input, converted to probabilities
        scores = output[0][0].detach().numpy()
        scores = softmax(scores)
        # rank the labels from most to least probable
        ranking = np.argsort(scores)[::-1]
        result["label"] = str(labels[ranking[0]])
        result["score"] = np.round(float(scores[ranking[0]]), 4)
        return result
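
To try the classifier on its own (run from inside ml-service, since the model path is relative; the exact score will vary):

from classifier import Classifier

clf = Classifier()
print(clf.get_sentiment_label_and_score("I love this!"))
# e.g. {'label': 'Positive', 'score': 0.9811}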

Sentiment Analysis

We create a ml-service/nlp.py module to handle sentiment analysis.

# nlp.py
from classifier import Classifier

class NLP:
    """A thin wrapper that runs sentiment analysis with a given classifier"""

    def sentiment_analysis(self, classifier: Classifier, text: str):
        sentiment = classifier.get_sentiment_label_and_score(text)
        return sentiment
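
Passing the classifier in as an argument keeps NLP free of model-loading logic, so the API layer can create a single Classifier at startup and reuse it for every request.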

FastAPI server

  1. The first step is to create a ml-service/app.py file and import the necessary libraries.
# app.py
import logging

import uvicorn
from fastapi import FastAPI, APIRouter

from classifier import Classifier
from nlp import NLP

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

2. Create the required class instances.

app = FastAPI()
nlp = NLP()
router = APIRouter()
classifier = Classifier()

3. Let’s create the required routes.

@router.get("/")
async def home():
return {"message": "Machine Learning service"}

@router.post("/sentiment")
async def data(data: dict):
try:
input_text = data["text"]
res = nlp.sentiment_analysis(classifier, input_text)
return res
except Exception as e:
log.error("Something went wrong")

app.include_router(router)

if __name__ == "__main__":
    uvicorn.run("app:app", reload=True, port=6000, host="0.0.0.0")
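
Before containerizing, you can smoke-test the server locally (assuming you run it from inside ml-service with the dependencies installed):

$ python app.py

Then, from another shell, hit the home route:

$ curl localhost:6000/
{"message":"Machine Learning service"}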

Now that our API is ready, we need to create the Dockerfile and the requirements.txt file.

Getting Docker ready

  1. First, we create the requirements.txt file inside the ml-service dir.
$ pip freeze > requirements.txt
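
pip freeze pins everything installed in your current environment. If you prefer a hand-written file with only the direct dependencies, it would look roughly like this (package list inferred from the imports above, versions left unpinned):

fastapi
uvicorn
transformers
torch
scipy
numpy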

2. Next we create the Dockerfile.

FROM python:3.10.8-slim
LABEL description="Sentiment classifier of tweets service"
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
COPY . /app/
EXPOSE 6000
CMD ["python", "app.py"]
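
Since COPY . /app/ copies the whole ml-service directory (including the model weights, which the app needs at ./models/roberta-base), you may optionally add a small .dockerignore next to the Dockerfile to keep non-essentials out of the image, for example:

# .dockerignore (optional)
download_model.ipynb
__pycache__/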

3. Now build the image and run the container, either with docker build and docker run or with a docker-compose.yml file.

Let’s look at how to use docker-compose to create the container.

4. Create a docker-compose.yml file in the root folder and add the following to it.

version: '3'

services:
  ml-service:
    build: ./ml-service
    ports:
      - "6000:6000"

5. Now just run the docker-compose command.

$ docker-compose up --build

6. Open a separate shell and run docker ps to list all running containers. The ml-service container should be up and running.

Make a POST request to localhost:6000/sentiment with the body {"text": "Hi, Thanks"}.
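
For example, with curl:

$ curl -X POST localhost:6000/sentiment -H "Content-Type: application/json" -d '{"text": "Hi, Thanks"}'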

The response to the POST request should be something like this:

{
  "label": "Positive",
  "score": 0.78
}

That’s all for now👋

Thank you and Happy Coding 💙
