pytorch-grpc-serving

Serving PyTorch Models for Inference as a gRPC API

Directory Structure

  • model - example TorchScript module with a sample image and a class-to-label mapping file
  • protos - Protocol Buffer definition for the gRPC service
  • client.py - example gRPC client that connects to the gRPC server and runs inference on the sample file in the model directory
  • Dockerfile - containerizes the gRPC server for deployment
  • Makefile - commands for generating stubs, running the gRPC service, building the Docker image, and other chores
  • requirements.txt - Python requirements for running this service
  • server.py - gRPC server that loads the model and runs inference

Getting Started

Releasing Model

  • Refer to model/torchscript.py for converting a torchvision model into a TorchScript module
  • The TorchScript module is structured as follows (see the sketch after this list for a fleshed-out version):
class YourModule(nn.Module):
    def __init__(self):
        super().__init__()
        # initialize the quantized model with pretrained weights
        # load the class-to-label dictionary
    def forward(self, input):
        # run the forward pass and compute classes with their probabilities
        # map classes to labels
        # return the result
  • The example file uses ImageNet Classes for mapping each ImageNet class index to its human-readable label
  • Run the following command to create the TorchScript module:
cd model
python torchscript.py save
  • Run the following command to load the TorchScript module and run inference on the sample image:
cd model
python torchscript.py run
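
A minimal sketch of what the save and run steps boil down to is shown below. The quantized ResNet50, the imagenet_classes.txt and sample.jpg file names, and the top-5 dictionary output are assumptions (based on the hosted example further down); the actual model/torchscript.py may differ in the details.

from typing import Dict

import torch
import torch.nn.functional as F
from PIL import Image
from torch import nn
from torchvision import models, transforms

class YourModule(nn.Module):
    def __init__(self):
        super().__init__()
        # quantized ResNet50 with pretrained ImageNet weights (assumed here)
        self.model = models.quantization.resnet50(pretrained=True, quantize=True)
        # class index -> human-readable label, read from an ImageNet classes file
        with open("imagenet_classes.txt") as f:
            self.labels = [line.strip() for line in f]

    def forward(self, input: torch.Tensor) -> Dict[str, float]:
        # forward pass followed by softmax to get class probabilities
        probabilities = F.softmax(self.model(input), dim=1)
        values, indices = torch.topk(probabilities, 5)
        # map the top-5 class indices to their human-readable labels
        result: Dict[str, float] = {}
        for i in range(5):
            result[self.labels[int(indices[0][i])]] = float(values[0][i])
        return result

# `python torchscript.py save` roughly does this:
torch.jit.script(YourModule()).save("model.pt")

# `python torchscript.py run` roughly does this:
module = torch.jit.load("model.pt")
preprocess = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
image = preprocess(Image.open("sample.jpg")).unsqueeze(0)
print(module(image))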

Setting up gRPC Service

  • Refer to protos/inference.proto for the Protocol Buffer definition of the gRPC service
  • Run the following command to generate the stubs:
make proto
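
The make proto target presumably wraps an invocation of grpc_tools.protoc along these lines (the output locations are an assumption); it produces inference_pb2.py for the messages and inference_pb2_grpc.py for the service stubs:

from grpc_tools import protoc

# compile protos/inference.proto into Python message classes and gRPC stubs
protoc.main([
    "grpc_tools.protoc",
    "-Iprotos",
    "--python_out=.",
    "--grpc_python_out=.",
    "protos/inference.proto",
])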

Running gRPC Server

  • Run the following command to start the gRPC server:
python server.py
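
A rough sketch of how such a server is typically wired up with grpcio is shown below. The generated module names (inference_pb2, inference_pb2_grpc), the Inference/Predict service and RPC names, and the request/response fields are assumptions; the real names come from protos/inference.proto.

import io
from concurrent import futures

import grpc
import torch
from PIL import Image
from torchvision import transforms

import inference_pb2
import inference_pb2_grpc

PREPROCESS = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

class InferenceService(inference_pb2_grpc.InferenceServicer):
    def __init__(self):
        # load the released TorchScript module once at startup
        self.module = torch.jit.load("model/model.pt")

    def Predict(self, request, context):
        # message and field names here are assumptions; see protos/inference.proto
        image = Image.open(io.BytesIO(request.image)).convert("RGB")
        prediction = self.module(PREPROCESS(image).unsqueeze(0))
        return inference_pb2.PredictResponse(prediction=str(prediction))

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    inference_pb2_grpc.add_InferenceServicer_to_server(InferenceService(), server)
    server.add_insecure_port("[::]:8000")
    server.start()
    server.wait_for_termination()

if __name__ == "__main__":
    serve()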

Running gRPC Client

  • Run the following command to start the gRPC client:
python client.py <url>
# python client.py localhost:8000
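
The client is essentially a thin wrapper around an insecure channel and the generated stub. A rough sketch, again assuming the Inference/Predict names and an image bytes field (the health check that produces the "health: UP" line in the example output below is omitted):

import sys

import grpc

import inference_pb2
import inference_pb2_grpc

def main(url):
    # open a plaintext channel to the server, e.g. localhost:8000
    with grpc.insecure_channel(url) as channel:
        stub = inference_pb2_grpc.InferenceStub(channel)
        # send the sample image as bytes (the field name is an assumption)
        with open("model/sample.jpg", "rb") as f:
            request = inference_pb2.PredictRequest(image=f.read())
        response = stub.Predict(request)
        print("prediction:", response.prediction)

if __name__ == "__main__":
    main(sys.argv[1])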

Example

  • An example server with a quantized ResNet50 is hosted on Fly.
  • Configure the client to run inference against the hosted server:
python client.py pytorch-serving.fly.dev:8000
  • Example Output:
health: UP
prediction: {"pizza": 0.44644981622695923, "potpie": 0.009865873493254185, "hot pot": 0.007180684246122837, "consomme": 0.005226321052759886, "spatula": 0.0047011710703372955}

License

This project is licensed under the MIT License.