Pandas and Python…..Oh My.

21 Dec

Crunching and rearranging data in Python is cool, but I really need to visualize it. Nothing fancy, just a bar or line chart. I recently saw a D3 implementation in Python, which is awesome, but for now I just want to stick to Matplotlib. I grabbed a few books on scientific Python and data in Python. They seem to love IPython, and the web notebook is pretty cool. Another tool that is often mentioned is Pandas. This is what I want to use, since it uses Matplotlib for plotting. The one feature that caught my attention right away was the DataFrame. Think of it as an Excel spreadsheet; then, if anyone asks about it, tell them it's like data.frame() in R. When I learn something, I like to start bare bones, then build it up with extra options and variations. I have put together some very minimal examples of plotting a Series and a DataFrame in Pandas. From here you should have a good grasp of how to do more.

Plotting a Series.


Plotting a Series requires a Series and a type of chart. Here is my code:

from pandas import Series
import matplotlib.pyplot as plt

# Five values labeled 'a' through 'e'
b = [2, 4, 6, 8, 10]
a = Series(b, index=['a', 'b', 'c', 'd', 'e'])
a.plot(kind='bar')  # use 'barh' for horizontal bars or 'line' for a line chart
plt.show()
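To compare the three chart types named in the code comment, here is a rough sketch that draws the same Series three times, side by side. The subplot layout, the `ax=` and `title=` arguments, and the non-interactive `Agg` backend are my own choices for illustration, not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
from pandas import Series

# Same data as the minimal example
s = Series([2, 4, 6, 8, 10], index=["a", "b", "c", "d", "e"])

# One subplot per chart type; ax= tells pandas which axes to draw on
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, kind in zip(axes, ["bar", "barh", "line"]):
    s.plot(kind=kind, ax=ax, title=kind)

fig.tight_layout()
# To view the result, remove the Agg line and call plt.show(),
# or write it to disk with fig.savefig("series_kinds.png").
```

Only the `kind` argument changes between the three charts, so it is the one option to play with first.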

Plotting a DataFrame


Plotting a DataFrame is what I need the most in my work. Earlier I compared a DataFrame to an Excel spreadsheet. Here is what a DataFrame looks like:

Simple DataFrame


Looking at the DataFrame and the chart, notice that each row plots as a group of bars. The group is labeled by the row's index value, and each bar in the group corresponds to one column. The DataFrame is created by passing a NumPy array:

a = np.array([[3,6,8,9,6],[2,3,4,5,6],[4,5,6,7,8],[3,6,5,8,6],[5,8,8,6,5]])
df = DataFrame(a, columns=['a','b','c','d','e'], index=[2,4,6,8,10])
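If you want to see the spreadsheet-like layout for yourself, printing the DataFrame is enough. The snippet below is a small sketch that rebuilds the same DataFrame and looks at it a few ways. The `print` calls and the `.shape`, `.loc`, and column lookups are additions of mine for illustration.

```python
import numpy as np
from pandas import DataFrame

a = np.array([[3, 6, 8, 9, 6],
              [2, 3, 4, 5, 6],
              [4, 5, 6, 7, 8],
              [3, 6, 5, 8, 6],
              [5, 8, 8, 6, 5]])
df = DataFrame(a, columns=["a", "b", "c", "d", "e"], index=[2, 4, 6, 8, 10])

print(df)         # the whole table: index labels down the left, column names across the top
print(df.shape)   # (rows, columns)
print(df.loc[4])  # the row whose index label is 4
print(df["c"])    # column 'c' as a Series
```

Each inner list of the array becomes one row, which is why the row labeled 4 holds 2, 3, 4, 5, 6.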

To plot the chart, just call plot and pass a type. Here is the complete code:

from pandas import DataFrame
import matplotlib.pyplot as plt
import numpy as np

a = np.array([[3,6,8,9,6],[2,3,4,5,6],[4,5,6,7,8],[3,6,5,8,6],[5,8,8,6,5]])
df = DataFrame(a, columns=['a','b','c','d','e'], index=[2,4,6,8,10])

df.plot(kind='bar')
plt.show()
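Once the basic bar chart works, two small variations are worth trying. This is only a sketch: `stacked=True` piles each row's values into a single bar, and `df.T` transposes the table so the groups follow the columns instead of the rows. The side-by-side subplots, the titles, and the `Agg` backend are assumptions I made to keep the example self-contained.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
from pandas import DataFrame

a = np.array([[3, 6, 8, 9, 6],
              [2, 3, 4, 5, 6],
              [4, 5, 6, 7, 8],
              [3, 6, 5, 8, 6],
              [5, 8, 8, 6, 5]])
df = DataFrame(a, columns=["a", "b", "c", "d", "e"], index=[2, 4, 6, 8, 10])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Left: one stacked bar per row, with a segment for each column
df.plot(kind="bar", stacked=True, ax=ax1, title="stacked by row")

# Right: transpose first, so each group is a column and each bar is a row
df.T.plot(kind="bar", ax=ax2, title="grouped by column")

fig.tight_layout()
# To view the result, remove the Agg line and call plt.show(),
# or write it to disk with fig.savefig("dataframe_variations.png").
```

Both charts still draw 25 rectangles, one per cell of the table; only the arrangement changes.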

This is how I learned to use Pandas DataFrame and to plot my data. Knowing this, I felt much more comfortable looking at more advanced examples online.


