Containerizing a Hugging Face Model with FastAPI and Docker for Efficient Deployment
Efficiently Package and Deploy your Hugging Face Model with FastAPI and Docker for Scalable Production
Hey everyone! 👋
Today we will see how to serve a Hugging Face model with FastAPI and containerize it using Docker.
We are going to use a classification model that labels tweets as “Positive”, “Negative”, or “Neutral”, along with a confidence score for each prediction.
Here’s the folder structure of the project —
root
├── ml-service/
│ ├── app.py
│ ├── model.py
│ ├── classifier.py
│ ├── nlp.py
│ ├── download_model.ipynb
│ └── Dockerfile
└── docker-compose.yml
Our very first step is to download the model. We will set up a Jupyter notebook to download the Hugging Face model and save it into a directory of our preference.
Let’s get going!
Download & Save the Hugging Face model
1. Create an ml-service directory.
2. Create a download_model.ipynb file inside the directory and save the model.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
# download the model
MODEL = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
# save the model (path is relative to the notebook, so the files
# end up at ml-service/models/roberta-base)
save_dir = "models/roberta-base"
tokenizer.save_pretrained(save_dir)
model.save_pretrained(save_dir)
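As a quick sanity check, you can reload the model and tokenizer from the directory you just saved to; assuming the cells above ran without errors, this should work:

# optional: reload from disk to confirm the save worked
tokenizer = AutoTokenizer.from_pretrained(save_dir)
model = AutoModelForSequenceClassification.from_pretrained(save_dir)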
Loading the model and classifier
1. Create an ml-service/model.py file for loading the model and tokenizer.
# model.py
from transformers import AutoModelForSequenceClassification, AutoTokenizer

class Model:
    """A model class to load the model and tokenizer"""

    @staticmethod
    def load_model():
        # load the weights saved by download_model.ipynb
        model = AutoModelForSequenceClassification.from_pretrained("./models/roberta-base/")
        return model

    @staticmethod
    def load_tokenizer():
        tokenizer = AutoTokenizer.from_pretrained("./models/roberta-base/")
        return tokenizer
2. Next, create an ml-service/classifier.py file that will handle the sentiment scores and labels.
# classifier.py
import numpy as np
from scipy.special import softmax

from model import Model

class Classifier:
    def __init__(self):
        self.model = Model.load_model()
        self.tokenizer = Model.load_tokenizer()

    def get_sentiment_label_and_score(self, text: str):
        result = {}
        # label order matches cardiffnlp/twitter-roberta-base-sentiment
        labels = ["Negative", "Neutral", "Positive"]
        encoded_input = self.tokenizer(text, return_tensors="pt")
        output = self.model(**encoded_input)
        scores = output[0][0].detach().numpy()  # logits for the single input
        scores = softmax(scores)
        # rank labels from most to least likely
        ranking = np.argsort(scores)
        ranking = ranking[::-1]
        result["label"] = str(labels[ranking[0]])
        result["score"] = np.round(float(scores[ranking[0]]), 4)
        return result
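Before wiring this into an API, you can sanity-check the classifier from a Python shell inside ml-service (this assumes the model was saved to ./models/roberta-base as above):

# quick manual test of the classifier
from classifier import Classifier

clf = Classifier()
print(clf.get_sentiment_label_and_score("I love this!"))
# expected: something like {'label': 'Positive', 'score': 0.98}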
Sentiment Analysis
We create an ml-service/nlp.py module to handle sentiment analysis.
# nlp.py
from classifier import Classifier

class NLP:
    """Thin wrapper exposing sentiment analysis to the API layer"""

    def sentiment_analysis(self, classifier: Classifier, text: str):
        sentiment = classifier.get_sentiment_label_and_score(text)
        return sentiment
FastAPI server
1. The first step is to create an ml-service/app.py file and import the necessary libraries.
# app.py
import logging

import uvicorn
from fastapi import FastAPI, APIRouter

from classifier import Classifier
from nlp import NLP

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)
2. Create the required class instances.
app = FastAPI()
nlp = NLP()
router = APIRouter()
classifier = Classifier()
3. Let’s create the required routes.
@router.get("/")
async def home():
    return {"message": "Machine Learning service"}

@router.post("/sentiment")
async def data(data: dict):
    try:
        input_text = data["text"]
        res = nlp.sentiment_analysis(classifier, input_text)
        return res
    except Exception as e:
        log.error("Something went wrong: %s", e)
        return {"error": "Something went wrong"}

app.include_router(router)
if __name__ == "__main__":
    uvicorn.run("app:app", reload=True, port=6000, host="0.0.0.0")
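Before containerizing, you can sanity-check the server by running it locally from the ml-service directory (assuming the model has already been saved to ml-service/models/roberta-base):

$ cd ml-service
$ python app.py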
Now that our API is ready, we need to create the Dockerfile and the requirements.txt file.
Getting Docker ready
1. First, we create the requirements.txt file inside the ml-service directory.
$ pip freeze > requirements.txt
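Note that pip freeze dumps every package installed in your environment. For a leaner image, you can hand-curate the file instead; a minimal sketch for this project (versions omitted, pin the ones from your own environment):

fastapi
uvicorn
transformers
torch
scipy
numpy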
2. Next we create the Dockerfile.
FROM python:3.10.8-slim
LABEL description="Sentiment classifier of tweets service"
WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
COPY . /app/
EXPOSE 6000
CMD ["python", "app.py"]
3. Now build the Docker image, either directly with docker build or by creating a docker-compose.yml file. A plain-Docker sketch is shown below; after that, let’s look at how to use docker-compose to create the container.
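If you go the plain-Docker route, something like this should work from the root folder (the image tag ml-service is an arbitrary choice):

$ docker build -t ml-service ./ml-service
$ docker run -p 6000:6000 ml-service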
4. Create a docker-compose.yml file in the root folder and add the following to it.
version: '3'
services:
ml-service:
build: ./ml-service
ports:
- 6000:6000
5. Now just run the docker-compose command.
$ docker-compose up --build
6. Open a separate shell and run docker ps, which lists all running containers. The ml-service container should be up and running.
Make a POST request to localhost:6000/sentiment with the body {"text": "Hi, Thanks"}.
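For example, with curl:

$ curl -X POST http://localhost:6000/sentiment \
    -H "Content-Type: application/json" \
    -d '{"text": "Hi, Thanks"}'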
The response to the POST request should be something like this:
{
"label": "Positive",
"score": 0.78
}
That’s all for now👋
Thank you and Happy Coding 💙