Serving ViT Models with TorchServe
Serving ViT Model for Feature Embedding using TorchServe Framework
Serving Steps:
Step 1: Install TorchServe and Its Requirements
Please use the following URL to install and configure the TorchServe service on your system: https://github.com/pytorch/serve/blob/master/README.md
Step 2: Saving the Model
Run : python3 model/model.py
Note: the model name is hardcoded in this file; set it before running.
TorchServe takes the following model artifacts: a model checkpoint file in the case of TorchScript, or a model definition file plus a state_dict file in the case of eager mode. TorchScript is a way to create serializable and optimizable models from PyTorch code: any TorchScript program can be saved from a Python process and loaded in a process with no Python dependency. It is an intermediate representation of a PyTorch model that can be run in Python as well as in a high-performance environment such as C++, and it is a common way to do inference with a trained model. TorchScript is the recommended model format for scaled inference and deployment.
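As a hedged sketch of what model/model.py might do: the real script would load the hardcoded CLIP ViT-L/14 checkpoint (e.g. via the transformers library), whereas the stand-in module below only demonstrates the TorchScript export mechanics — the class and output dimension here are assumptions, not the actual model.

```python
import torch

# Stand-in for the real vision encoder; the actual model/model.py would load
# the hardcoded CLIP ViT-L/14 checkpoint instead (assumption, not shown here).
class ToyViT(torch.nn.Module):
    def __init__(self, dim=8):
        super().__init__()
        self.proj = torch.nn.Linear(3, dim)

    def forward(self, pixel_values):
        # Placeholder "embedding": average over spatial dims, then project.
        return self.proj(pixel_values.mean(dim=(2, 3)))

model = ToyViT().eval()
example = torch.rand(1, 3, 224, 224)      # CLIP ViT-L/14 expects 224x224 inputs
traced = torch.jit.trace(model, example)  # record ops into a TorchScript module
traced.save("clip-vit-large-patch14.pt")  # filename matching Step 4's serialized file
reloaded = torch.jit.load("clip-vit-large-patch14.pt")  # loads with no model class
```

Note that tracing fixes the code path taken for the example input; a model with data-dependent control flow would need torch.jit.script instead.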
Step 3: Creating the Handler File
Customize the behavior of TorchServe by writing a Python script that you package with the model when you use the model archiver; TorchServe executes this code when it runs. Please check the handler/vit_handler.py file.
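As a hedged sketch of the handler's shape: a real TorchServe handler would subclass BaseHandler from ts.torch_handler.base_handler and receive a context object in initialize(), but the preprocess → inference → postprocess contract can be illustrated stand-alone. The class name and method bodies below are assumptions, not the contents of the actual file.

```python
import torch

class ViTEmbeddingHandler:
    """Stand-alone sketch; the real handler would subclass
    ts.torch_handler.base_handler.BaseHandler and load the model
    from the .mar via the context TorchServe passes to initialize()."""

    def initialize(self, model):
        # For illustration, accept an already-loaded TorchScript module.
        self.model = model.eval()

    def preprocess(self, batch):
        # TorchServe delivers a list of requests; a real handler would decode
        # the JPEG bytes and apply the CLIP image transforms here. We assume
        # each request already carries a 3x224x224 float tensor.
        return torch.stack(batch)

    def inference(self, inputs):
        with torch.no_grad():
            return self.model(inputs)

    def postprocess(self, outputs):
        # One JSON-serializable embedding per request.
        return [emb.tolist() for emb in outputs]

# Demo with a tiny traced stand-in encoder producing 8-dim "embeddings".
class ToyEncoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(3, 8)

    def forward(self, x):
        return self.proj(x.mean(dim=(2, 3)))

handler = ViTEmbeddingHandler()
handler.initialize(torch.jit.trace(ToyEncoder(), torch.rand(1, 3, 224, 224)))
embeddings = handler.postprocess(
    handler.inference(handler.preprocess([torch.rand(3, 224, 224)]))
)
```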
Step 4: Exporting the .mar File (Torch Model Archiver)
Run : torch-model-archiver --model-name ViT --version 1.0 --serialized-file checkpoints/clip-vit-large-patch14.pt --export-path model_store --handler handler/vit_handler.py
A key feature of TorchServe is the ability to package all model artifacts into a single model archive file. A separate command-line interface (CLI), torch-model-archiver, takes a model checkpoint (or a model definition file with a state_dict) together with any other optional assets required to serve the model, and packages them into a .mar file. This file can then be redistributed and served by anyone using TorchServe: the TorchServe server CLI uses the .mar file to serve the model.
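One detail worth knowing when redistributing the archive: a .mar file is an ordinary zip archive, so its contents can be inspected with the standard library. The sketch below builds a stand-in archive (the file names are illustrative; a real ViT.mar would contain the serialized model, the handler, and a MAR-INF/MANIFEST.json written by the archiver):

```python
import zipfile

def list_mar_contents(path):
    """A .mar written by torch-model-archiver is a plain zip archive."""
    with zipfile.ZipFile(path) as mar:
        return mar.namelist()

# Stand-in archive for demonstration (the real file would be model_store/ViT.mar).
with zipfile.ZipFile("demo.mar", "w") as mar:
    mar.writestr("MAR-INF/MANIFEST.json", "{}")
    mar.writestr("vit_handler.py", "# handler code")
    mar.writestr("clip-vit-large-patch14.pt", b"")

names = list_mar_contents("demo.mar")
```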
Step 5: Start TorchServe to serve the model
Run : torchserve --start --ncs --model-store model_store/ --models ViT.mar
Step 6: Get predictions from a model
To test the model server, send a request to the server's predictions API. TorchServe supports all inference and management APIs through both gRPC and HTTP/REST.
Download a sample image : curl -O https://raw.githubusercontent.com/pytorch/serve/master/docs/images/kitten_small.jpg
Run: curl http://127.0.0.1:8080/predictions/ViT -T kitten_small.jpg
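The same request can be issued from Python with the standard library. The sketch below assumes TorchServe's default inference port 8080 and the model name ViT registered in Step 5; the function names are illustrative.

```python
import urllib.request

def endpoint_url(host, port, model_name):
    """Inference API route: /predictions/{model_name}."""
    return f"http://{host}:{port}/predictions/{model_name}"

def predict(image_path, host="127.0.0.1", port=8080, model_name="ViT"):
    # Send the raw image bytes as the request body, as curl does with the file.
    with open(image_path, "rb") as f:
        req = urllib.request.Request(
            endpoint_url(host, port, model_name), data=f.read(), method="POST"
        )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# Usage (requires a running server): predict("kitten_small.jpg")
```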
To stop the server, use the following command : torchserve --stop