==========
Plotting
==========
The objective of this lab is to give you practice plotting data
and, more specifically, working with the Matplotlib library.
Getting started
---------------
.. include:: includes/getting-started-labs.txt
Once you have collected the lab materials, navigate to the ``lab8``
directory.
If you are on a VM, make sure that Matplotlib is installed on the VM.
An earlier Piazza post asked you to do this, but it was only required
if you wanted to see the plots in Lab #2. If you did not install Matplotlib at
that time, run the following::
sudo apt-get update
sudo apt-get install python3-matplotlib
Matplotlib
----------
Matplotlib is a popular plotting library
for Python. It supports a variety of plots
that can be tweaked and customized in many different ways. So, while
producing simple plots with Matplotlib is very easy, getting
all the details right (and figuring out the exact Matplotlib
code to do so) can sometimes be challenging.
Thus, when working with Matplotlib, it is common to follow two steps:
1. Start by experimenting with Matplotlib interactively from a Python interpreter.
When doing this, each call to a Matplotlib function will usually alter
a plot interactively, which is very convenient when figuring out
the exact Matplotlib code for our program.
2. Once we have figured out the code to produce our plot, we save it
to a Python program which, when run, produces the full plot in one go
(either displaying it in a window or saving it to a file).
In this lab, we will first go through these two steps in detail with a simple
example. Then, we will show you two plots which you should produce following
the same methodology. This way of working will also be very useful in PA #7.
Plotting interactively with IPython
-----------------------------------
Matplotlib can be used interactively from any Python interpreter, but IPython
in particular has a "pylab" mode that pre-loads all the Matplotlib functions,
allowing us to easily use them from the IPython interpreter. To start the
interpreter in this mode, run the following:
::
ipython3 --pylab
We have provided a ``plotting.py`` file which includes the data that
we will plot in this part of the lab. Run the following to import this
data into the interpreter:
::
from plotting import TEMPS_MIN, TEMPS_AVG, TEMPS_MAX
Each of these variables is a list with 31 floating point numbers,
representing temperatures in Chicago during each day of the month
of January 2014. ``TEMPS_MIN`` contains the minimum temperatures,
``TEMPS_AVG`` the average temperatures, and ``TEMPS_MAX`` the
maximum temperatures. The first element of each list is the
temperature for January 1st, the second element corresponds to January 2nd, etc.
We can plot the average temperatures just by running this::
plot(TEMPS_AVG)
This should open up a Matplotlib window (titled "Figure 1") with a graph
that looks roughly like this:
.. image:: img/simple1.png
:target: _images/simple1.png
Don't worry if the graphs you see don't look *exactly* like the
ones you see on this page; they just have to look roughly the same.
Before continuing, close the Matplotlib window (i.e., the window with
the graph; never close the window that is running IPython).
Now, let's try plotting multiple lines. Start by running just this::
plot(TEMPS_AVG)
The same graph as before should appear. If possible, move the Matplotlib window
in such a way that you can see both the graph and
the IPython interpreter. Now, **without closing the Matplotlib window**
run the following::
plot(TEMPS_MIN)
plot(TEMPS_MAX)
Two additional lines should appear in the
Matplotlib window. The result should look like this:
.. image:: img/simple2.png
:target: _images/simple2.png
As you can see, given a list of values, we can very easily create a line
plot just by calling the ``plot()`` function. However, the resulting graph
is very basic: it has no title, no legend, no axis labels, etc.
Let's produce a more complete version of this graph. Close the Matplotlib window
and run the following on the IPython interpreter. The first call to ``plot()`` will open a
Matplotlib window. Notice how every call after that (not just the other
two ``plot()`` calls) modifies the plot interactively::
plot(TEMPS_MAX, color="orange", label="Max Temp")
plot(TEMPS_AVG, color="green", label="Avg Temp")
plot(TEMPS_MIN, color="blue", label="Min Temp")
title("Temperatures in Chicago from 1/1/14 to 1/31/14")
xlabel("Day")
ylabel("Temperature (F)")
axhline(32, color="gray", linestyle="--")
legend()
The resulting graph should look something like this:
.. image:: img/simple3.png
:target: _images/simple3.png
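When ``plot()`` receives a single list, Matplotlib uses the list indices (0 through 30 here) as the x values, so January 1st is drawn at x = 0. The following is a minimal sketch of passing explicit day numbers instead. It is written in the ``plt.`` style of a standalone program, which the next section explains. The temperature values are made-up placeholders rather than the lab's data, and the ``Agg`` backend is selected only so that the sketch runs without opening a window:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no window is needed for this sketch
import matplotlib.pyplot as plt

# Placeholder values standing in for a list such as TEMPS_AVG (not real data).
temps = [20.5, 18.0, 25.5, 30.0, 12.5]

# Day numbers starting at 1, one per temperature.
days = list(range(1, len(temps) + 1))

fig = plt.figure()
# With two sequences, the first supplies the x values and the second the y values.
(line,) = plt.plot(days, temps, color="green", label="Avg Temp")
plt.xlabel("Day")
plt.legend()
```

In the ``--pylab`` interpreter the equivalent call would be ``plot(days, temps)``.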
Writing plotting code in a Python program
-----------------------------------------
Now, let's see how the plotting code we wrote works when we
include it in a Python program. Edit the ``plotting.py``
file to include this function::
    def simple_plot():
        plt.plot(TEMPS_MAX, color="orange", label="Max Temp")
        plt.plot(TEMPS_AVG, color="green", label="Avg Temp")
        plt.plot(TEMPS_MIN, color="blue", label="Min Temp")
        plt.title("Temperatures in Chicago from 1/1/14 to 1/31/14")
        plt.xlabel("Day")
        plt.ylabel("Temperature (F)")
        plt.axhline(32, color="gray", linestyle="--")
        plt.legend()
Notice how the calls to the Matplotlib functions start with ``plt.``.
This is because the Matplotlib functions are not loaded the same
way as in the IPython interpreter. We need to import them explicitly like this::
import matplotlib.pyplot as plt
Now, exit the IPython interpreter, and run ``ipython3`` *without* the ``--pylab`` option::
ipython3
Now, run the ``plotting.py`` file::
run plotting.py
And call the ``simple_plot()`` function we just added::
simple_plot()
At this point, nothing should happen. The reason for this is that, when
we're not running code interactively, we need to *explicitly* tell
Matplotlib to show us the plot. We can do so like this::
plt.show()
Notice that, when running Matplotlib non-interactively, your code will
*block* whenever you call ``plt.show()``. I.e., you need to close the
Matplotlib window for your program to continue running (or, in this case,
to return to the IPython interpreter).
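Whether a plotting call updates a window immediately depends on Matplotlib's interactive mode, which the ``--pylab`` option turns on. A small sketch of inspecting and toggling that mode with ``plt.isinteractive()``, ``plt.ion()`` and ``plt.ioff()`` follows. The ``Agg`` backend is an assumption made so the sketch runs without a display; with it, no window is ever opened:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt

# Turn interactive mode off: plotting calls only build the figure,
# and nothing is displayed until show() is called.
plt.ioff()
off_state = plt.isinteractive()

# Turn interactive mode on: with a GUI backend, each plotting call
# would redraw the window right away.
plt.ion()
on_state = plt.isinteractive()

plt.ioff()  # leave the mode off, as in a plain script
print(off_state, on_state)
```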
Instead of calling ``show()`` from IPython, let's add it to our
``simple_plot()`` function::
    def simple_plot():
        plt.plot(TEMPS_MAX, color="orange", label="Max Temp")
        plt.plot(TEMPS_AVG, color="green", label="Avg Temp")
        plt.plot(TEMPS_MIN, color="blue", label="Min Temp")
        plt.title("Temperatures in Chicago from 1/1/14 to 1/31/14")
        plt.xlabel("Day")
        plt.ylabel("Temperature (F)")
        plt.axhline(32, color="gray", linestyle="--")
        plt.legend()
        plt.show()
However, there is a case where the above code will behave badly.
Run this::
plt.plot(range(31), color="red")
Nothing should happen (i.e., there should be no new window with this
plot). Now, run this::
run plotting.py
simple_plot()
You will see the temperature plot, but also a red line running through it.
The reason this happened is that Matplotlib draws every plotting command
on the *current* figure, so the plot includes *all* the plotting commands
we run before ``show()``. One
way of ensuring that we produce a plot only with the elements we want
is to call the ``figure()`` function, which creates a new, empty figure
and makes it the current one, so that
all the Matplotlib code that follows that call (and until ``show()``
is called) is part of the same "figure"::
    def simple_plot():
        plt.figure()
        plt.plot(TEMPS_MAX, color="orange", label="Max Temp")
        plt.plot(TEMPS_AVG, color="green", label="Avg Temp")
        plt.plot(TEMPS_MIN, color="blue", label="Min Temp")
        plt.title("Temperatures in Chicago from 1/1/14 to 1/31/14")
        plt.xlabel("Day")
        plt.ylabel("Temperature (F)")
        plt.axhline(32, color="gray", linestyle="--")
        plt.legend()
        plt.show()
Verify this is working correctly by running this::
run plotting.py
simple_plot()
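The effect of ``figure()`` can also be checked without looking at a window. The sketch below reproduces the stray red line, then calls ``plt.figure()`` and inspects which lines each figure holds. The variable names are illustrative, and the ``Agg`` backend is an assumption made so that the sketch runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt

# A stray plotting call: it draws on an implicitly created current figure.
plt.plot(range(31), color="red")
stray_fig = plt.gcf()

# figure() creates a new, empty figure and makes it the current one.
new_fig = plt.figure()
plt.plot([1, 2, 3], color="green")

# Each figure holds only the lines drawn while it was the current figure.
stray_colors = [ln.get_color() for ln in stray_fig.axes[0].lines]
new_colors = [ln.get_color() for ln in new_fig.axes[0].lines]
print(stray_colors, new_colors)
```

The red line stays on the first figure, so it no longer leaks into the plot built after ``figure()``.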
The last modification we will make to our code is to add the ability
to save the plot to a file instead of showing it in a window. We can
do this by using the figure's ``savefig()`` method. Modify the
``simple_plot()`` function so it looks like this::
    def simple_plot(save_to=None):
        fig = plt.figure()
        plt.plot(TEMPS_MAX, color="orange", label="Max Temp")
        plt.plot(TEMPS_AVG, color="green", label="Avg Temp")
        plt.plot(TEMPS_MIN, color="blue", label="Min Temp")
        plt.title("Temperatures in Chicago from 1/1/14 to 1/31/14")
        plt.xlabel("Day")
        plt.ylabel("Temperature (F)")
        plt.axhline(32, color="gray", linestyle="--")
        plt.legend()
        if save_to is None:
            plt.show()
        else:
            fig.savefig(save_to)
Notice how we've added a ``save_to`` parameter that defaults to ``None``.
When we supply a string argument, the figure is saved to the file named by
that argument::
run plotting.py
simple_plot("temperatures.png")
There should now be a ``temperatures.png`` file in the same directory
as ``plotting.py``. If you open this file, it should contain the same
graph that was displayed previously in a Matplotlib window.
When ``save_to`` is not specified, we simply see the plot in a window
as before::
simple_plot()
Including a parameter like this can make your function easier
to debug, since you can easily switch from saving to a file to viewing
the plot in a new window.
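When a function only saves figures, it can be useful to close each figure after writing it, so that figures do not pile up in memory across calls. The following is a minimal sketch of that pattern. ``save_line_plot`` is a hypothetical helper and not part of ``plotting.py``, the output goes to a temporary directory chosen for illustration, and the ``Agg`` backend is an assumption suited to writing files without a display:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: file output only, no window
import matplotlib.pyplot as plt


def save_line_plot(values, save_to):
    """Plot values as a line and write the figure to save_to (hypothetical helper)."""
    fig = plt.figure()
    plt.plot(values)
    fig.savefig(save_to)  # the file format is inferred from the extension
    plt.close(fig)        # release the figure once it is on disk


# Write a small placeholder plot to a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "example.png")
save_line_plot([3.0, 1.5, 4.0], out_path)
print(os.path.exists(out_path))
```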
Plotting Weather and Crime Data
-------------------------------
Now that you've worked with some simple Matplotlib code, it's time to
produce a slightly more elaborate graph. The data for this graph
is contained in a CSV file called ``weather_crime.csv`` that contains
the average temperature in Chicago and the number of reported thefts
in Chicago for every day between 1/1/2012 and 12/31/2013. The first
few rows of the file look like this::
year,month,day,thefts,temp
2012,1,1,666,37.0
2012,1,2,267,24.0
2012,1,3,338,20.0
2012,1,4,345,33.0
2012,1,5,382,36.0
2012,1,6,396,47.0
Your code must read this data in (you may want to review the CSV examples from
the Data Formats lecture)
and produce a graph that plots
the number of thefts and the temperature over those two years:
.. image:: img/weather_crime.png
:target: _images/weather_crime.png
This graph informally shows a well-known correlation between certain types
of crimes and the weather. In particular, there are fewer thefts in the winter;
one common explanation is that people (including criminals) tend to stay indoors.
Although this plot is more elaborate than the simple plot we saw earlier,
you should be able to produce it with ``plot`` and other Matplotlib
functions covered in class.
As you attempt to reproduce this graph, we encourage you to follow the same two steps
we followed earlier: first play around with Matplotlib in the IPython interpreter
(remember to run it with the ``--pylab`` option), and then write a function
in ``plotting.py`` that produces this graph.
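As a starting point for the reading step only, here is a minimal sketch that uses ``csv.DictReader`` from the standard library. The first rows of the file, as shown above, are inlined so the sketch runs without ``weather_crime.csv``. ``read_columns`` is a hypothetical helper name, and the plotting itself is left to you:

```python
import csv
import io

# The first rows of weather_crime.csv as shown above, inlined so the sketch
# runs without the file; with the real file, pass open("weather_crime.csv").
SAMPLE = """year,month,day,thefts,temp
2012,1,1,666,37.0
2012,1,2,267,24.0
2012,1,3,338,20.0
"""


def read_columns(f):
    """Return (thefts, temps) lists from an open CSV file (hypothetical helper)."""
    thefts = []
    temps = []
    for row in csv.DictReader(f):
        # DictReader yields every field as a string, so convert explicitly.
        thefts.append(int(row["thefts"]))
        temps.append(float(row["temp"]))
    return thefts, temps


thefts, temps = read_columns(io.StringIO(SAMPLE))
print(thefts)  # [666, 267, 338]
print(temps)   # [37.0, 24.0, 20.0]
```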
Creating an Error Bar Graph
---------------------------
This final exercise is a bit more challenging, because it involves producing
a type of graph we did *not* see in class:
.. image:: img/errorbar.png
:target: _images/errorbar.png
This graph shows the average, maximum, and minimum temperatures for each month
between January 2012 and December 2012. To produce this graph, you must take
the data in ``weather_crime.csv`` and compute the average, maximum, and minimum
temperatures before you can call the Matplotlib functions. You may find it
helpful to read the Matplotlib documentation on creating error bars
and the Matplotlib gallery page on error bars.
As before, first write your code in IPython, and then write a function
in ``plotting.py`` that produces this graph.
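One detail that is easy to miss: ``errorbar()`` expects the *lengths* of the bars (distances from the plotted point), not the absolute minimum and maximum values. The sketch below shows one possible way to compute those lengths and pass them as asymmetric bars. It is not necessarily how the reference graph was produced. The readings are made-up placeholders rather than values from the CSV file, and the ``Agg`` backend is an assumption made so that the sketch runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: run without a display
import matplotlib.pyplot as plt

# Made-up (month, temperature) pairs; placeholders, not values from the CSV.
readings = [(1, 20.0), (1, 30.0), (1, 46.0), (2, 25.0), (2, 35.0)]

# Group the temperatures by month.
by_month = {}
for month, temp in readings:
    by_month.setdefault(month, []).append(temp)

months = sorted(by_month)
avgs = [sum(by_month[m]) / len(by_month[m]) for m in months]

# Bar lengths below and above each average.
below = [avg - min(by_month[m]) for m, avg in zip(months, avgs)]
above = [max(by_month[m]) - avg for m, avg in zip(months, avgs)]

fig = plt.figure()
# A two-row yerr gives separate lower and upper bar lengths for each point.
plt.errorbar(months, avgs, yerr=[below, above], fmt="o")
plt.xlabel("Month")
plt.ylabel("Temperature (F)")
```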
When finished
-------------
.. include:: includes/finished-labs-1.txt
.. code::
git add plotting.py
git commit -m "Finished with lab8"
git push
.. include:: includes/finished-labs-2.txt