Matplotlib: Difference between revisions

From OpenHatch wiki
imported>Brittag
(rv spam)
[[File:grid.png|right|300px]]


== Project ==


Learn how to plot data with the matplotlib plotting library. Ditch Excel forever!


== Goals ==


* practice reading data from a file
* practice using the matplotlib Python plotting library to analyze data and generate graphs



== Project setup ==

=== 1. Install the project dependencies ===

Installing matplotlib is unfortunately rather complicated on OS X and Windows. Please ask for help if you get stuck or don't know where to start!

First, install <code>numpy</code>. On Linux, you can use your package manager to install the <code>python-numpy</code> package. On Windows and OS X, you can download the appropriate binary from http://sourceforge.net/projects/numpy/files/NumPy/1.7.1/.

Then, install <code>matplotlib</code>. On Linux, you can use your package manager to install the <code>python-matplotlib</code> package. On Windows and OS X, you can download the appropriate binary from https://github.com/matplotlib/matplotlib/downloads/.

If either of these installs fails, fear not! Wave over a staff member and we'll help.

=== 2. Download and un-archive the Matplotlib project skeleton code ===

* http://web.mit.edu/jesstess/www/IntermediatePythonWorkshop/Matplotlib.zip

Un-archiving will produce a <code>Matplotlib</code> folder containing several Python and text files.

=== 3. Test your setup ===

Run the <code>basic_plot.py</code> script in your <code>Matplotlib</code> directory. A window with a graph should pop up.
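If no window appears, it can help to check whether Python can find both dependencies at all. The following is a minimal sketch, not part of the project skeleton; the helper name <code>missing_dependencies</code> is made up for this example, and it assumes Python 3.4 or later (where <code>importlib.util.find_spec</code> is available).

```python
import importlib.util


def missing_dependencies(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies(["numpy", "matplotlib"])
    if missing:
        print("Not installed: " + ", ".join(missing))
    else:
        print("numpy and matplotlib can both be found.")
```

If either name is reported as not installed, revisit the install step above or ask a staff member for help.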

== Project steps ==

=== 1. Create a basic plot ===

<ol>
<li>
Run <code>python basic_plot.py</code>. This will pop up a window with a dot plot of some data.
</li>
<li>
Open <code>basic_plot.py</code>. Read through the code in this file. The meat of the file is in one line:

<pre>pyplot.plot([0, 2, 4, 8, 16, 32], "o")</pre>

In this example, the first argument to <code>pyplot.plot</code> is the list of y values, and the second argument describes how to plot the data. If two lists had been supplied, <code>pyplot.plot</code> would consider the first list to be the x values and the second list to be the y values.
</li>
<li>Change the plot to display lines between the data points by changing

<pre>pyplot.plot([0, 2, 4, 8, 16, 32], "o")</pre>

to

<pre>pyplot.plot([0, 2, 4, 8, 16, 32], "o-")</pre>

and re-run the script. What changed?
</li>
<li>
Add x-values to the data by changing

<pre>pyplot.plot([0, 2, 4, 8, 16, 32], "o-")</pre>

to

<pre>x_values = [0, 4, 7, 20, 22, 25]
y_values = [0, 2, 4, 8, 16, 32]
pyplot.plot(x_values, y_values, "o-")</pre>

and re-run the script. What changed?

Note how matplotlib automatically resizes the graph to fit all of the points in the figure for you.
</li>
<li>
Read about how to generate random integers on http://docs.python.org/library/random.html#random.randint.

Then, instead of hard-coding y values in <code>basic_plot.py</code>, generate a list of random y values and plot them.

An example plot using random y values might look like this:
<br />
[[File:Basic_plot.png|300px]]
</li>
</ol>
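One possible way to approach the last step is sketched below. It is only an illustration, not the contents of <code>basic_plot.py</code>: the function names, the default range of 0 to 100, and the <code>from matplotlib import pyplot</code> import style are all assumptions. Try the step yourself before reading it.

```python
import random


def random_y_values(count, low, high):
    """Return `count` random integers, each between low and high inclusive."""
    return [random.randint(low, high) for _ in range(count)]


def plot_random_points(count=10, low=0, high=100):
    """Plot `count` random y values as dots joined by lines."""
    # Imported here so random_y_values can be tried without matplotlib.
    from matplotlib import pyplot

    y_values = random_y_values(count, low, high)
    # Only y values are supplied, so matplotlib chooses the x values.
    pyplot.plot(y_values, "o-")
    pyplot.show()
    return y_values
```

Calling <code>plot_random_points()</code> should pop up a different graph on every run, because the y values are regenerated each time.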

<b>Read these short documents</b>:
* Pyplot tutorial (just this one section; stop before the next section "Controlling line properties"): http://matplotlib.sourceforge.net/users/pyplot_tutorial.html#pyplot-tutorial
* List of line options, including line style and marker shapes and colors: http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.plot

<b>Check your understanding</b>:
* What does matplotlib pick as the x values if you don't supply them yourself?
* What options would you pass to <code>pyplot.plot</code> to generate a plot with red triangles and dotted lines?

=== 2. Plotting the world population over time ===

<ol>
<li>
Run <code>python world_population.py</code>. This will pop up a window with a dot plot of the world population over the last 10,000 years.
</li>
<li>
Open <code>world_population.py</code>. Read through the code in this file.

In this example, we read our data from a file. Open the data file <code>world_population.txt</code> and examine the format of the file.
</li>
<li>
Find the documentation on http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.plot for customizing the linewidth of plots. Then change the world population plot to use a magenta, down-triangle marker and a linewidth of 2.
</li>
</ol>
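The steps above can be sketched as follows. This is not the code in <code>world_population.py</code>; it assumes that each line of <code>world_population.txt</code> holds two whitespace-separated numbers (a year and a population), and the function names are invented for this example. It uses <code>open</code> rather than <code>file</code>, since <code>file</code> is not available in Python 3.

```python
def parse_points(lines):
    """Split each line into an (x, y) pair and return two parallel lists."""
    x_values = []
    y_values = []
    for line in lines:
        fields = line.split()  # split on any run of whitespace
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        x_values.append(float(fields[0]))
        y_values.append(float(fields[1]))
    return x_values, y_values


def plot_population(filename="world_population.txt"):
    """Plot the data file with magenta down-triangles and a line width of 2."""
    # Imported here so parse_points can be tried without matplotlib.
    from matplotlib import pyplot

    with open(filename, "r") as data_file:
        years, populations = parse_points(data_file.readlines())

    # "m" = magenta, "v" = down-triangle marker, "-" = solid line
    pyplot.plot(years, populations, "mv-", linewidth=2)
    pyplot.show()
```

Keeping the parsing in its own function makes it easy to try on a few hand-written lines before pointing it at the real data file.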

<b>World population resources</b>:
<ul>
<li>
File input and output: http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files.
</li>
<li>
Splitting strings into parts based on a delimiter: http://www.hacksparrow.com/python-split-string-method-and-examples.html
</li>
</ul>

<b>Check your understanding</b>:
* In <code>world_population.py</code>, what does <code>file("world_population.txt", "r").readlines()</code> return?
* In <code>world_population.py</code>, what does <code>point.split()</code> return?


=== 3. Plotting life expectancy over time ===

In a new file, write code to plot the data in <code>life_expectancies_usa.txt</code>. Each line in this file has the format <code>&lt;year&gt;,&lt;male life expectancy&gt;,&lt;female life expectancy&gt;</code>.

You can call <code>pyplot.plot</code> multiple times to draw multiple lines on the same figure. For example:

<pre>pyplot.plot(my_data_1, "mo-", label="my data 1")
pyplot.plot(my_data_2, "bo-", label="my data 2")</pre>

will plot <code>my_data_1</code> in magenta and <code>my_data_2</code> in blue on the same figure.

Supply labels for your plots, like above. Then use <code>pyplot.legend</code> to give your graph a legend. Just plain <code>pyplot.legend()</code> will work, but providing more options may give a better effect.

Your graph should look something like this:

[[File:Life_expectancies.png|300px]]

To save your graph to a file instead of or in addition to displaying it, call <code>pyplot.savefig</code>.
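Putting these pieces together, a rough sketch of one possible solution is shown below. It assumes the data file has no header line and that every data line has exactly three comma-separated fields; the function names, the line colors, and the <code>loc="upper left"</code> legend placement are choices made for this example only.

```python
def parse_life_expectancies(lines):
    """Return (years, male, female) lists parsed from comma-separated lines."""
    years, male, female = [], [], []
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        years.append(int(fields[0]))
        male.append(float(fields[1]))
        female.append(float(fields[2]))
    return years, male, female


def plot_life_expectancies(data_filename, image_filename):
    """Draw both series on one figure, add a legend, and save to a file."""
    # Imported here so parse_life_expectancies can be tried without matplotlib.
    from matplotlib import pyplot

    with open(data_filename, "r") as data_file:
        years, male, female = parse_life_expectancies(data_file)

    pyplot.plot(years, male, "bo-", label="male")
    pyplot.plot(years, female, "mo-", label="female")
    pyplot.legend(loc="upper left")
    pyplot.savefig(image_filename)
```

For example, <code>plot_life_expectancies("life_expectancies_usa.txt", "life_expectancies.png")</code> would write the graph to a PNG file without opening a window; add a call to <code>pyplot.show()</code> if you also want to see it on screen.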

<b>Life expectancy resources</b>:
<ul>
<li>
File input and output: http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files.
</li>
<li>
Splitting strings into parts based on a delimiter: http://www.hacksparrow.com/python-split-string-method-and-examples.html
</li>
<li>
Examples of legends: http://matplotlib.sourceforge.net/examples/pylab_examples/legend_auto.html
</li>
<li>
Ways to configure your legend: http://matplotlib.sourceforge.net/api/legend_api.html
</li>
<li>
Saving your graph to a file: http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.savefig
</li>
</ul>

==Bonus exercises==

=== 1. Letter frequency analysis of the US Constitution ===

# Run <code>python constitution.py</code>. It will generate a bar chart showing the frequency of each letter in the alphabet in the US Constitution.
# Open and read through <code>constitution.py</code>. The code for gathering and displaying the frequencies is a bit more complicated than the previous scripts in this project, but try to trace the general strategy for plotting the data. Be sure to read the comments!
# Try to answer the following questions:
## On line 11, what is <code>string.ascii_lowercase</code>?
## On line 18, what is the purpose of <code>char = char.lower()</code>?
## What are the contents of <code>labels</code> after the <code>for</code> loop on line 30 completes?
## On line 41, what are the two arguments passed to <code>pyplot.xticks</code>?
## On line 44, we use <code>pyplot.bar</code> instead of our usual <code>pyplot.plot</code>. What are the 3 arguments passed to <code>pyplot.bar</code>?
# We've included a mystery text file <code>mystery.txt</code>: an excerpt from an actual novel. Alter <code>constitution.py</code> to process the data in <code>mystery.txt</code> instead of <code>constitution.txt</code>, and re-run the script. What do you notice that is odd about this file? You can read more about this odd novel [http://en.wikipedia.org/wiki/Gadsby_(novel) here].
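
For comparison, the general strategy of counting letters and drawing a bar chart can be sketched in a few lines. This is not the code in <code>constitution.py</code> (which, among other differences, passes three arguments to <code>pyplot.bar</code>); the function names here are invented, and the sketch uses <code>collections.Counter</code> as one convenient way to do the counting.

```python
import string
from collections import Counter


def letter_frequencies(text):
    """Return 26 counts, one per letter a-z, ignoring case and non-letters."""
    counts = Counter(char for char in text.lower() if char in string.ascii_lowercase)
    return [counts[letter] for letter in string.ascii_lowercase]


def plot_letter_frequencies(text):
    """Draw a bar chart with one bar per letter of the alphabet."""
    # Imported here so letter_frequencies can be tried without matplotlib.
    from matplotlib import pyplot

    letters = list(string.ascii_lowercase)
    positions = list(range(len(letters)))
    pyplot.bar(positions, letter_frequencies(text))
    # Label each bar position with its letter.
    pyplot.xticks(positions, letters)
    pyplot.show()
```

To chart a file, read its contents into a string first, for example with <code>open("mystery.txt").read()</code>, and pass that string to <code>plot_letter_frequencies</code>.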


=== 2. Tour the matplotlib gallery ===

You can truly make any kind of graph with matplotlib. You can even create animated graphs. Check out some of the amazing possibilities, including their source code, at the matplotlib gallery: http://matplotlib.sourceforge.net/gallery.html.

[[File:matplotlib_gallery.png|750px]]


===Congratulations!===

You've read, modified, and created scripts that plot and analyze data using matplotlib. Keep practicing!

[[File:Fireworks.png|150px]]
[[File:Balloons.png|150px]]

Revision as of 21:54, 19 June 2016
