Pickles in Python are tasty in the sense that they represent a Python object as a string of bytes. Many things can actually be done with those bytes. For instance, you can store them in a file or database, or transfer them over a network.
The pickled representation of a Python object is called a pickle, and a file that stores it is called a pickle file. A pickle file can thus be used for different purposes, like storing results to be used by another Python program or writing backups. To get the original Python object back, you simply unpickle that string of bytes.
To pickle in Python, we will be using the pickle module. As stated in the documentation:
The pickle module implements binary protocols for serializing and
de-serializing a Python object structure. “Pickling” is the process
whereby a Python object hierarchy is converted into a byte stream, and
“unpickling” is the inverse operation, whereby a byte stream (from a
binary file or bytes-like object) is converted back into an object
hierarchy. Pickling (and unpickling) is alternatively known as
“serialization”, “marshalling,” or “flattening”; however, to avoid
confusion, the terms used here are “pickling” and “unpickling”.
The pickle module allows us to store almost any Python object directly to a file or string without the need to perform any conversions. What the pickle module actually performs is so-called object serialization, that is, converting objects to and from strings of bytes. The object to be pickled is serialized into a stream of bytes that can be written to a file, for instance, and restored at a later point.
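As a minimal sketch of that round trip, assuming Python 3: the inventory dictionary below is a made-up example object, and the dumps and loads functions it uses are covered later in this tutorial. The last few lines illustrate why the text says "almost any" object: a generator is one example of something that cannot be pickled.

```python
import pickle

# A made-up nested object mixing several built-in types.
inventory = {
    'items': ['pen', 'paper'],
    'counts': (3, 12),
    'tags': {'office', 'school'},
    'price': 4.5,
    'discontinued': None,
}

# Serialize the object into a stream of bytes, then rebuild it.
pickled = pickle.dumps(inventory)
restored = pickle.loads(pickled)

print(type(pickled))          # <class 'bytes'>
print(restored == inventory)  # True
print(restored is inventory)  # False: the restored object is a copy

# "Almost any" is not "any": a generator, for example, cannot be pickled.
try:
    pickle.dumps(n * n for n in range(3))
    generator_pickled = True
except TypeError:
    generator_pickled = False

print(generator_pickled)  # False
```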
Installing pickle
The pickle module comes bundled with your Python installation. To get a list of the installed modules, you can type the following command in the Python prompt: help('modules').
So all you need to do to work with the pickle module is to import pickle!
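If you would rather check this from a script than from the interactive prompt, one possible sketch is shown below. It assumes Python 3 and uses importlib from the standard library to confirm that the module can be found before importing it.

```python
import importlib.util

# find_spec returns None when a module cannot be located.
spec = importlib.util.find_spec('pickle')
pickle_available = spec is not None
print(pickle_available)  # True

# No installation step is needed; a plain import is enough.
import pickle

# The default and the newest pickle protocol this interpreter supports.
print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)
```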
Creating a Pickle File
From this section onwards, we’ll take a look at some examples of pickling to understand the concept better. Let’s start by creating a pickle file from an object. Our object here will be the todo list we made in the Python lists tutorial.
todo = ['write blog post', 'reply to email', 'read in a book']
In order to pickle our list object (todo), we can do the following:
import pickle

todo = ['write blog post', 'reply to email', 'read in a book']

# Pickles are binary data, so the file is opened in binary write mode ('wb').
with open('todo.pickle', 'wb') as pickle_file:
    pickle.dump(todo, pickle_file)
Notice that we import pickle to be able to use the pickle module. We have also created a pickle file to store the pickled object in, namely todo.pickle. The dump function writes a pickled representation of todo to the open file object pickle_file. In other words, the dump function here takes two arguments: the object to pickle, which is the todo list, and a file object to write the pickle to, which is pickle_file, opened on todo.pickle.
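The sketch below repeats the same dump call in a form that is safe to run anywhere, assuming Python 3. It writes into a temporary directory (the pickle_path variable is introduced here only for illustration) so that no existing file is overwritten, and then reports how many bytes ended up on disk.

```python
import os
import pickle
import tempfile

todo = ['write blog post', 'reply to email', 'read in a book']

# Write into a throwaway directory so no existing file is overwritten.
pickle_path = os.path.join(tempfile.mkdtemp(), 'todo.pickle')

# 'wb' opens the file for binary writing; the with block closes the file
# even if dump raises an exception.
with open(pickle_path, 'wb') as pickle_file:
    pickle.dump(todo, pickle_file)

print(os.path.getsize(pickle_path), 'bytes written to', pickle_path)
```

Using a with block here is a design choice rather than a requirement: it guarantees that the file is closed, and therefore fully written, before anything tries to read it back.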
Unpickling (Restoring) the Pickled Data
Say that we would like to unpickle (restore) the pickled data; in our case, this is the todo list. To do that, we can write the following script:
import pickle

# Open the pickle file in binary read mode ('rb').
with open('todo.pickle', 'rb') as pickle_file:
    todo = pickle.load(pickle_file)

print(todo)
The above script will output the todo list items:
['write blog post', 'reply to email', 'read in a book']
As mentioned in the documentation, the load(file) function does the following:
Read a string from the open file object file and interpret it as a pickle data stream, reconstructing and returning the original object hierarchy. This is equivalent to Unpickler(file).load().
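The equivalence mentioned in the documentation can be checked with a short sketch, assuming Python 3. An in-memory io.BytesIO buffer stands in for a file opened in binary mode, so nothing is written to disk. Keep in mind that unpickling can execute arbitrary code while rebuilding objects, so load should only be given data from a source you trust.

```python
import io
import pickle

todo = ['write blog post', 'reply to email', 'read in a book']

# An in-memory binary buffer that behaves like a file opened with 'rb'.
buffer = io.BytesIO(pickle.dumps(todo))

# The convenience function...
via_function = pickle.load(buffer)

# ...and the explicit Unpickler, after rewinding the buffer to the start.
buffer.seek(0)
via_class = pickle.Unpickler(buffer).load()

print(via_function == via_class == todo)  # True
```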
Pickles as Strings
In the above section, we saw how we can write/load pickles to/from a file. This is not necessary, however: if we want to write/load pickles, we don’t always need to deal with files. We can instead work with pickles as strings of bytes. We can thus do the following:
import pickle

todo = ['write blog post', 'reply to email', 'read in a book']
pickled_data = pickle.dumps(todo)
print(pickled_data)
Notice that we have used the dumps function (with an “s” at the end), which, according to the documentation:
Returns the pickled representation of the object as a string, instead of writing it to a file.
Note that in Python 3, this “string” is a bytes object rather than a str. In order to restore the pickled data above, we can use the loads(string) function, as follows:
restored_data = pickle.loads(pickled_data)
According to the documentation, the loads function:
Reads a pickled object hierarchy from a string. Characters in the string past the pickled object’s representation are ignored.
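Both points from the documentation can be tried out in a few lines. The sketch below assumes Python 3, where the pickled data is a bytes object; the padded variable is introduced here only to demonstrate that data after the end of the pickle is ignored.

```python
import pickle

todo = ['write blog post', 'reply to email', 'read in a book']

pickled_data = pickle.dumps(todo)
print(type(pickled_data))  # <class 'bytes'>

# Anything after the end of the pickle is ignored by loads.
padded = pickled_data + b'trailing bytes that are not part of the pickle'
restored_data = pickle.loads(padded)

print(restored_data == todo)  # True
```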
Pickling More Than One Object
In the above examples, we have dealt with pickling and restoring (loading) only one object at a time. In this section, I’m going to show you how we can do that for more than one object. Say that we have the following objects:
name = 'Abder'
website = 'http://abder.io'
english_french = {'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}  # dictionary
tup = (31, 'abder', 4.0)  # tuple
If you would like to learn more about Python dictionaries and tuples, check the following tutorials:
- A Smooth Refresher on Python’s Dictionaries
- A Smooth Refresher on Python’s Tuples
We can simply pickle the above objects by calling the dump function once per object, as follows:
import pickle

name = 'Abder'
website = 'http://abder.io'
english_french = {'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}  # dictionary
tup = (31, 'abder', 4.0)  # tuple

# Binary write mode; each dump call appends one pickle to the file.
with open('pickled_file.pickle', 'wb') as pickled_file:
    pickle.dump(name, pickled_file)
    pickle.dump(website, pickled_file)
    pickle.dump(english_french, pickled_file)
    pickle.dump(tup, pickled_file)
This will pickle all four objects in the pickle file pickled_file.pickle.
There is another wonderful way to write the above script using the Pickler class in the pickle module, as follows:
from pickle import Pickler

name = 'Abder'
website = 'http://abder.io'
english_french = {'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}  # dictionary
tup = (31, 'abder', 4.0)  # tuple

# One Pickler bound to the file handles all four dump calls.
with open('pickled_file.pickle', 'wb') as pickled_file:
    p = Pickler(pickled_file)
    p.dump(name)
    p.dump(website)
    p.dump(english_french)
    p.dump(tup)
To restore (load) the original data, we can simply call the load function once per pickled object, as follows:
import pickle

# Binary read mode; each load call returns the next pickled object.
with open('pickled_file.pickle', 'rb') as pickled_file:
    name = pickle.load(pickled_file)
    website = pickle.load(pickled_file)
    english_french = pickle.load(pickled_file)
    tup = pickle.load(pickled_file)

print('Name: ')
print(name)
print('Website:')
print(website)
print('English to French:')
print(english_french)
print('Tuple data:')
print(tup)
The output of the above script (on Python 3.7 and later, where dictionaries keep their insertion order) is:
Name: 
Abder
Website:
http://abder.io
English to French:
{'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}
Tuple data:
(31, 'abder', 4.0)
As with the Pickler class, we can rewrite the above script using the Unpickler class in the pickle module, as follows:
from pickle import Unpickler

# One Unpickler bound to the file handles all four load calls.
with open('pickled_file.pickle', 'rb') as pickled_file:
    u = Unpickler(pickled_file)
    name = u.load()
    website = u.load()
    english_french = u.load()
    tup = u.load()

print('Name: ')
print(name)
print('Website:')
print(website)
print('English to French:')
print(english_french)
print('Tuple data:')
print(tup)
Note that the objects have to be read in the same order in which they were written to get the desired output. To avoid any issues here, we can bundle the data in a dictionary and pickle it as a single object, as follows:
import pickle

name = 'Abder'
website = 'http://abder.io'
english_french = {'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}  # dictionary
tup = (31, 'abder', 4.0)  # tuple

# Bundle everything into one dictionary so that a single dump is enough.
data = {
    'name': name,
    'website': website,
    'english_french_dictionary': english_french,
    'tuple': tup,
}

with open('pickled_file.pickle', 'wb') as pickled_file:
    pickle.dump(data, pickled_file)
To restore (load) the data pickled in the above script, we can do the following:
import pickle

with open('pickled_file.pickle', 'rb') as pickled_file:
    data = pickle.load(pickled_file)

# Values are looked up by key, so the order no longer matters.
name = data['name']
website = data['website']
english_french = data['english_french_dictionary']
tup = data['tuple']

print('Name: ')
print(name)
print('Website:')
print(website)
print('English to French:')
print(english_french)
print('Tuple data:')
print(tup)
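When several objects have been written with successive dump calls and their number is not known in advance, one common pattern is to keep calling load until it raises EOFError. The sketch below assumes Python 3; load_all is a hypothetical helper name introduced here, not part of the pickle module, and an in-memory buffer stands in for the pickle file.

```python
import io
import pickle


def load_all(file_obj):
    """Yield every object pickled into file_obj, in the order written."""
    while True:
        try:
            yield pickle.load(file_obj)
        except EOFError:
            # load raises EOFError once no pickled data is left.
            return


# Write three objects one after another, as in the multi-dump scripts.
buffer = io.BytesIO()
for obj in ('Abder', 'http://abder.io', (31, 'abder', 4.0)):
    pickle.dump(obj, buffer)
buffer.seek(0)  # rewind before reading

objects = list(load_all(buffer))
print(objects)  # ['Abder', 'http://abder.io', (31, 'abder', 4.0)]
```

The same helper works unchanged with a real file opened in 'rb' mode, since it only relies on the file object being readable by pickle.load.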
Pickles and Pandas
Well, this seems an interesting combination. If you are wondering what pandas is, you can learn more about it from the Introducing Pandas tutorial. The basic data structure of pandas is called DataFrame, a tabular data structure composed of ordered columns and rows.
Let’s take an example of a DataFrame from the Pandas tutorial:
import pandas as pd

name_age = {'Name': ['Ali', 'Bill', 'David', 'Hany', 'Ibtisam'],
            'Age': [32, 55, 20, 43, 30]}
data_frame = pd.DataFrame(name_age)
In order to pickle our DataFrame, we can use its to_pickle() method, as follows:
data_frame.to_pickle('my_panda.pickle')
To restore (load) the pickled DataFrame, we can use the read_pickle() function, as follows:
restore_data_frame = pd.read_pickle('my_panda.pickle')
Putting what we have mentioned in this section all together, this is what the script that pickles and loads a pandas object looks like:
import pandas as pd

name_age = {'Name': ['Ali', 'Bill', 'David', 'Hany', 'Ibtisam'],
            'Age': [32, 55, 20, 43, 30]}
data_frame = pd.DataFrame(name_age)

# Pickle the DataFrame to disk, then load it back.
data_frame.to_pickle('my_panda.pickle')
restore_data_frame = pd.read_pickle('my_panda.pickle')
print(restore_data_frame)
Conclusion
In this tutorial, I have covered an interesting module called pickle. We have seen how easily this module enables us to store Python objects for different purposes, such as using the object with another Python program, transferring the object across a network, saving the object for later use, etc. We can simply pickle the Python object, and unpickle (load) it when we want to restore the original object.