Tutorial on TensorFlow Serving
Introduction
TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments. It lets you deploy models created with TensorFlow and serve them efficiently over a network API. This tutorial walks you through setting up TensorFlow Serving, deploying a Keras model, and making predictions against it.
Prerequisites
Before you start, ensure you have the following installed:
- Docker: the simplest way to run TensorFlow Serving is inside a Docker container.
- TensorFlow (which ships with Keras): used to create and export the model we will serve.
Creating a Keras Model
First, we need to create a simple Keras model. Below is an example of how to create a basic model and save it for serving.
Example: Create and Save a Keras Model
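The snippet below is a minimal sketch: the architecture, the random training data, and the export path /tmp/saved_models/my_model/1 are illustrative choices, not requirements. TensorFlow Serving only requires that the exported SavedModel sit inside a numbered version subdirectory under the model's base directory.

```python
import numpy as np
from tensorflow import keras

# Build a small model: one dense hidden layer over a 4-feature input.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Train briefly on random data just so the model has weights to serve.
x = np.random.rand(100, 4).astype("float32")
y = np.random.rand(100, 1).astype("float32")
model.fit(x, y, epochs=2, verbose=0)

# TensorFlow Serving expects <model_base_dir>/<version>/, e.g. .../my_model/1/.
export_path = "/tmp/saved_models/my_model/1"

# Keras 3 / recent TF: export() writes a SavedModel with a serving signature.
# On older tf.keras versions, model.save(export_path) produces the same layout.
model.export(export_path)
```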
Setting Up TensorFlow Serving
TensorFlow Serving can be run with Docker: pull the official TensorFlow Serving image and start a container that loads your exported model. Here’s how you can do that:
Example: Run TensorFlow Serving with Docker
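A typical invocation is sketched below; the host path /tmp/saved_models/my_model and the model name my_model are assumptions that match the export path used in the previous step.

```bash
# Pull the official TensorFlow Serving image.
docker pull tensorflow/serving

# Serve the exported model. Port 8501 is the REST API port;
# the model base directory is mounted under /models/<MODEL_NAME>.
docker run --rm -p 8501:8501 \
  --name tf_serving \
  -v /tmp/saved_models/my_model:/models/my_model \
  -e MODEL_NAME=my_model \
  tensorflow/serving
```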
In this command, we mount the local directory containing the exported model into the container, publish the REST API port, and tell TensorFlow Serving which model to load by setting the model name.
Making Predictions
Once TensorFlow Serving is running, you can make requests to the model's REST API to get predictions. Here’s an example of how to do that using Python:
Example: Make Predictions
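The following sketch uses the requests library; the URL assumes the model name my_model and the default REST port 8501 from the Docker command above, and the single 4-feature input row matches the example model defined earlier.

```python
import requests

# One input row with 4 features, matching the model's input shape.
payload = {"instances": [[0.1, 0.2, 0.3, 0.4]]}

# TensorFlow Serving's REST predict endpoint: /v1/models/<MODEL_NAME>:predict
url = "http://localhost:8501/v1/models/my_model:predict"

response = requests.post(url, json=payload)
response.raise_for_status()

# The response contains one prediction per input row.
predictions = response.json()["predictions"]
print(predictions)
```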
This code snippet sends a JSON payload with an "instances" list to the TensorFlow Serving REST API and reads the model's prediction from the "predictions" field of the response.
Conclusion
In this tutorial, you have learned how to create and export a Keras model, set up TensorFlow Serving using Docker, and make predictions through the model's REST API. TensorFlow Serving provides a robust way to deploy machine learning models in production, including loading new model versions without restarting the server.