How to Save & Load a Trained ML Model

In machine learning, once a model has been trained and tested, we often need to save it to a file and restore it later: to reuse it, to compare it with other models, or to deploy it somewhere else on new data. Saving an object to a file is called serialization, and restoring it is called deserialization.

Datasets also come in many forms and sizes. Some models train quickly, but when the dataset is large (1 GB or more), training on a local machine can take a very long time, even with a GPU. If we need the same trained model in another project or at a later date, storing it avoids spending that training time again.

We will cover the following two approaches to saving and reloading an ML model.

  1. Pickle Approach
  2. Joblib Approach

For demonstration, let's create a simple KNN model using the scikit-learn library and the Iris dataset, which ships with scikit-learn.

Defining and training a KNeighborsClassifier with the Iris dataset
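A minimal sketch of this step. The 70/30 train/test split, the random seed, and n_neighbors=3 are assumptions chosen for illustration, not values taken from the article.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the Iris dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out part of the data for evaluation.
# The split ratio and seed are assumptions made for this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Define and train the classifier; n_neighbors=3 is an arbitrary choice.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Mean accuracy on the held-out data.
accuracy = knn.score(X_test, y_test)
print("Test accuracy:", accuracy)
```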

First Approach

Let's save and load our KNN model using the pickle approach. The pickle module serializes and deserializes Python object structures. It provides the following functions:

– pickle.dump(): For the serialization of an object.

– pickle.load(): To deserialize data.

Saving and loading the trained KNN model with the pickle module
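The following sketch shows one way this could look. To keep it self-contained it trains a small KNN model on the full Iris dataset first. The file name knn_model.pkl and the use of the system temp directory are assumptions for illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Train a small model so the example runs on its own.
X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Hypothetical file location chosen for this sketch.
model_path = os.path.join(tempfile.gettempdir(), "knn_model.pkl")

# Serialize: pickle.dump() writes the object to a file opened in binary mode.
with open(model_path, "wb") as f:
    pickle.dump(knn, f)

# Deserialize: pickle.load() reads the object back from the file.
with open(model_path, "rb") as f:
    loaded_knn = pickle.load(f)

# The restored model should predict exactly like the original.
same_predictions = np.array_equal(knn.predict(X), loaded_knn.predict(X))
print("Predictions match:", same_predictions)
```

Only unpickle files from sources you trust, because pickle.load() can execute arbitrary code while reading a file.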

Second Approach

joblib is a replacement for pickle that works more efficiently on objects carrying large NumPy arrays. Its functions also accept file-like objects instead of filenames. The joblib module provides the following functions:

– joblib.dump(): To serialize an object.

– joblib.load(): To deserialize an object.

Saving and loading the trained model with the joblib module
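A minimal sketch of the joblib approach, again training a small KNN model first so that it runs on its own. It shows both forms: passing a filename and passing an open file object. The file names and the temp directory are assumptions for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Train a small model so the example runs on its own.
X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# Hypothetical file locations chosen for this sketch.
tmp_dir = tempfile.gettempdir()
path_by_name = os.path.join(tmp_dir, "knn_model.joblib")
path_by_file = os.path.join(tmp_dir, "knn_model_fileobj.joblib")

# Form 1: pass a filename directly.
joblib.dump(knn, path_by_name)
loaded_from_name = joblib.load(path_by_name)

# Form 2: pass a file-like object opened in binary mode.
with open(path_by_file, "wb") as f:
    joblib.dump(knn, f)
with open(path_by_file, "rb") as f:
    loaded_from_file = joblib.load(f)

# Both restored models should predict exactly like the original.
expected = knn.predict(X)
name_matches = np.array_equal(expected, loaded_from_name.predict(X))
file_matches = np.array_equal(expected, loaded_from_file.predict(X))
print("Filename round trip matches:", name_matches)
print("File object round trip matches:", file_matches)
```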

Summary

In this article, you saw how to save and load a machine learning model with the pickle and joblib packages. You learned two techniques:

1. The pickle API for basic Python serialization.

2. The joblib API for powerful Python object serialization with NumPy arrays.

Written by

I’m a Data Science student. I love to create, learn, and share my skills, whether that means learning a new technology, brushing up on current skills, or writing Data Science articles.
