How to Deploy a Machine Learning Model on AWS SageMaker?

Machine learning has made its fair share of ‘heroic’ contributions to solving problems for businesses, society, and the environment that previously felt next to impossible.

It has to be one of the greatest gifts from humankind to humankind.

In business, machine learning plays a huge role in amplifying companies’ success, helping them make smarter choices and decide how (and why) software should be designed.

To get the most out of machine learning, though, it must be usable at scale and accessible to developers, expert practitioners, and data scientists alike.

Thanks to AWS SageMaker, this is now possible. 

So grab a cup of coffee and sit back, because we’re about to walk you through what AWS SageMaker is and how to deploy an ML model on it.

Let’s begin by briefly discussing what AWS SageMaker is.

AWS SageMaker is a fully managed service provided by Amazon Web Services to train and deploy a machine learning model in production environments.

The best thing about it is that you can use it for a wide range of use cases while relying on its fully managed infrastructure, tools, and workflows.

To deploy a model on it, follow the steps below:

3 Steps to Deploy a Machine Learning Model on AWS SageMaker

Deploying a machine learning model on AWS SageMaker involves the following steps:

  • Saving model artifacts to an AWS S3 bucket in tar.gz format.
  • Creating an inference handler .py file.
  • Deploying the inference file and model artifacts.

Let’s discuss each in detail:

1. Saving model artifacts to an AWS S3 bucket in tar.gz format.

We usually save trained model artifacts as pickle, .h5, or .pth files, depending on the library in use (for example sklearn, TensorFlow, or PyTorch), so that they can be reused for prediction without retraining the model every time.

First, we need to package these model artifact files into a .tar.gz archive. Once this is done, we can upload the compressed package to any S3 bucket in our AWS account, either through awswrangler or with the following AWS CLI command:

aws s3 cp ./model.tar.gz s3://mybucket
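The packaging step can also be scripted. The following is a minimal sketch using Python's standard tarfile module; the helper names make_model_archive and upload_archive are hypothetical, the placeholder artifact stands in for a real trained model, and the upload helper assumes boto3 is installed and AWS credentials are configured.

```python
import os
import tarfile
import tempfile


def make_model_archive(artifact_paths, archive_path="model.tar.gz"):
    """Pack the given artifact files into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in artifact_paths:
            # arcname keeps only the file name, so the file sits at the archive root
            tar.add(path, arcname=os.path.basename(path))
    return archive_path


def upload_archive(archive_path, bucket, key="model.tar.gz"):
    """Upload the archive to S3 (assumes boto3 and configured AWS credentials)."""
    import boto3  # imported lazily so the packaging step works without boto3

    boto3.client("s3").upload_file(archive_path, bucket, key)


# Placeholder artifact so the sketch runs without a trained model.
workdir = tempfile.mkdtemp()
artifact = os.path.join(workdir, "model_0.pth")
with open(artifact, "wb") as f:
    f.write(b"placeholder weights")

archive = make_model_archive([artifact], os.path.join(workdir, "model.tar.gz"))
with tarfile.open(archive, "r:gz") as tar:
    print(tar.getnames())
```

Calling upload_archive(archive, "mybucket") would then be the scripted counterpart of the CLI command above.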

2. Creating an inference handler .py file.

The next step is to create an inference handler script. This inference script consists of four functions, which deserialize your saved model package, preprocess the model input, run the prediction, and return the required output:

def model_fn(model_dir):
def input_fn(request_body, request_content_type):
def predict_fn(input_data, model):
def output_fn(prediction, content_type):

model_fn deserializes your machine learning model: it fetches the model from model_dir, initializes the model object, loads the model artifacts from the specified path, and returns the model. Here is an example of loading a PyTorch model:

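import os
import torch
import torch.nn as nn

def model_fn(model_dir):
    # Rebuild the architecture used in training; inputsize must be defined
    # to match the input dimension of the trained model.
    model = nn.Sequential(
        nn.Linear(inputsize, 2048),
        nn.Linear(2048, 10),
    )
    # Load the saved weights from the model directory.
    with open(os.path.join(model_dir, 'model_0.pth'), 'rb') as f:
        model.load_state_dict(torch.load(f))
    print('Done loading model')
    return model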

3. Deploying inference file and model artifacts 

Before deploying, the inference file needs its remaining three functions. With model loading in place, the next step is to handle input requests. For example, when calling a model API, we pass the required variables as JSON. input_fn is a dedicated function that preprocesses this input JSON and converts it into a format that can be fed to the loaded model.

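import json
import requests
from PIL import Image

def input_fn(request_body, content_type='application/json'):
    if content_type == 'application/json':
        input_data = json.loads(request_body)
        url = input_data['url']
        # Download the image behind the URL and open it with PIL.
        image_data = Image.open(requests.get(url, stream=True).raw)
        return image_data
    raise ValueError(f'Unsupported content type: {content_type}')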

This function simply parses the input JSON, which carries an image URL, fetches the image from that URL, and returns it. You can also apply any transformation to the input data here.

def predict_fn(input_data, model):
    with torch.no_grad():
        out = model(input_data)
        ps = torch.exp(out)
    return ps

predict_fn receives the outputs of input_fn and model_fn. The model runs on the input data and returns the prediction, which is then passed to output_fn.
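The torch.exp call in predict_fn turns the model output into probabilities only if the model emits log-probabilities, for example through a log-softmax final layer. That is an assumption here, since the example model in model_fn shows no such layer. The arithmetic can be sketched in plain Python, with hypothetical scores and helper names:

```python
import math


def log_softmax(logits):
    """Return log-probabilities for a list of raw scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def to_probabilities(log_probs):
    """Exponentiate log-probabilities, the role torch.exp plays in predict_fn."""
    return [math.exp(lp) for lp in log_probs]


logits = [2.0, 1.0, 0.1]  # hypothetical raw scores for three classes
probs = to_probabilities(log_softmax(logits))
print(probs, sum(probs))
```

If the model returns raw scores instead, exponentiating them does not give values that sum to one, so a softmax would be the more fitting choice in that case.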

In the case of a classifier, for example, the prediction is a class number. You can map this class number to a class name and return it as JSON, so that it can be integrated into another application in a standard format.

def output_fn(prediction_output, accept='application/json'):
    # Every class index the model can output needs an entry in this mapping.
    classes = {0: 'Covid Patient', 1: 'Not a Covid Patient'}

    topk, topclass = prediction_output.topk(3, dim=1)
    result = []

    for i in range(3):
        pred = {'prediction': classes[topclass.cpu().numpy()[0][i]], 'score': f'{topk.cpu().numpy()[0][i] * 100}%'}
        result.append(pred)
    if accept == 'application/json':
        return json.dumps(result), accept
    raise ValueError(f'Unsupported accept type: {accept}')

That is all there is to the inference script. We developed four standard functions that load the model, preprocess the input, predict, and output the predictions, and they execute in that order.
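To make the execution order concrete, here is a small sketch that chains four toy handlers the same way. The stand-in model, the ./model path, and the handle_request helper are all hypothetical, and this is not how the SageMaker serving container is implemented internally; it only illustrates the call sequence.

```python
import json


def model_fn(model_dir):
    # Stand-in "model": doubles every number it is given.
    return lambda values: [2 * v for v in values]


def input_fn(request_body, content_type="application/json"):
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    return json.loads(request_body)["values"]


def predict_fn(input_data, model):
    return model(input_data)


def output_fn(prediction, accept="application/json"):
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps({"prediction": prediction}), accept


def handle_request(model, request_body, content_type, accept):
    """Chain the handlers in order: input_fn, predict_fn, output_fn."""
    input_data = input_fn(request_body, content_type)
    prediction = predict_fn(input_data, model)
    return output_fn(prediction, accept)


model = model_fn("./model")  # the model is loaded first, then reused per request
body, returned_type = handle_request(
    model, json.dumps({"values": [1, 2, 3]}), "application/json", "application/json"
)
print(body)  # {"prediction": [2, 4, 6]}
```

Running the handlers locally like this, with the real functions swapped in, is a cheap way to catch mistakes before paying for an endpoint.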

Put all of these functions in a single inference handler .py file.

Now, create another Jupyter notebook and deploy this inference file with the code below.

You can add compute and other configuration here, pointing to your model artifacts file and using the inference handler file as the entry point.

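from sagemaker.pytorch import PyTorchModel
from sagemaker import get_execution_role

role = get_execution_role()
# entry_point must be set to the file name of your inference handler script.
# py_version is required alongside framework_version in SageMaker Python SDK v2.
model = PyTorchModel(model_data='s3://mybucket/model.tar.gz', role=role, entry_point='', framework_version='1.3.1', py_version='py3')
predictor = model.deploy(instance_type='ml.t2.medium', initial_instance_count=1)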
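Once the endpoint is up, it can be called from any application. Below is a minimal sketch using the boto3 sagemaker-runtime client; the helper names build_request_body and invoke and the example URL are hypothetical, and the call itself assumes boto3 is installed, AWS credentials are configured, and you know the name of your deployed endpoint.

```python
import json


def build_request_body(image_url):
    """Serialize the JSON payload that the input_fn handler expects."""
    return json.dumps({"url": image_url})


def invoke(endpoint_name, image_url):
    """Send one request to a deployed endpoint and decode the JSON reply."""
    import boto3  # lazy import: only needed when actually calling AWS

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=build_request_body(image_url),
    )
    return json.loads(response["Body"].read())


# Hypothetical URL, used only to show the payload shape.
print(build_request_body("https://example.com/xray.png"))
```

Keep in mind that a running endpoint is billed until it is removed, so calling predictor.delete_endpoint() after testing is usually a good idea.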

And that’s about it. 

What does AWS SageMaker offer? 

Now that you’ve learned how to deploy a machine learning model on AWS SageMaker, let’s talk about why you should opt for it.

You can use it to build AI-based applications that analyze images and videos, detect faults and defects, control quality, and more.

Thanks to AWS SageMaker, you’ll be able to automate mundane tasks and make workflow processes more efficient and seamless.

AWS SageMaker can be a game changer for businesses, because it brings model training and deployment together in a single fully managed service.

And to ensure that users continue to enjoy innovative and cost-efficient solutions, Data Pilot leverages AWS SageMaker to provide the best solutions in the market. Data Pilot’s services using AWS SageMaker will enable users to automate numerous mundane tasks and invest their time and energy in more profitable ideas or strategies. Explore Data Pilot’s use cases to learn how you and your business can benefit from our services at the best pricing.

Written by: Irfan Umar and Rida Ali Khan
