Archive | September, 2013

More pymongo

13 Sep

I am back in MongoDB mode. I grabbed some pollen data from the City of Albuquerque. Each record has a date, a location, a pollen type and a count, and the data runs from 2004 to 2013. I loaded it into MongoDB from a CSV with this script:
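from pymongo import MongoClient

client = MongoClient()  # connects to localhost:27017 by default
db = client.abq
c = db.pollen

with open("pollen.csv") as infile:
    for line in infile:
        temp = line.strip().split(",")
        # the first column is a timestamp; keep only the part before the "T"
        date = temp[0].split("T")[0]
        location = temp[1]
        pollen_type = temp[2]
        count = temp[3]  # stored as a string
        # insert_one replaces the old insert(), which PyMongo 4 removed
        c.insert_one({"date": date, "location": location, "type": pollen_type, "count": count})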
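
Splitting on commas by hand breaks as soon as a field is quoted or contains a comma. Here is a minimal sketch of the same parsing with the standard csv module, assuming the same four columns (timestamp, location, type, count) and no header row. The name parse_pollen_rows and the sample row are made up for illustration; the real file may look different.

```python
import csv
import io


def parse_pollen_rows(lines):
    """Turn CSV lines into dicts shaped like the documents inserted above."""
    docs = []
    for row in csv.reader(lines):
        if len(row) < 4:
            continue  # skip blank or short lines
        timestamp, location, pollen_type, count = (field.strip() for field in row[:4])
        docs.append({
            "date": timestamp.split("T")[0],  # keep only the date part
            "location": location,
            "type": pollen_type,
            "count": count,  # left as a string, as in the script above
        })
    return docs


# Hypothetical sample row, not taken from the real data set.
sample = io.StringIO("2004-03-01T00:00:00,EASTSIDE,Elm,12\n")
docs = parse_pollen_rows(sample)
print(docs)
```

With a live connection, the resulting list could be loaded in one call with something like c.insert_many(docs).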

I can then query for elm data sorted by date.
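from pymongo import MongoClient

client = MongoClient()
db = client.abq
c = db.pollen

# every Elm record, ordered by the date string
x = c.find({"type": "Elm"}).sort("date")
with open("elm.txt", "w") as outfile:
    for entry in x:
        s = entry["count"]
        outfile.write(s.strip() + "\n")  # one count per line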
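
Sorting on date works here because the dates are stored as strings and MongoDB compares strings character by character. That only matches calendar order when the dates are zero-padded year-month-day, which is an assumption based on the split on "T" in the loading script. A quick check of that assumption in plain Python, with made-up dates:

```python
from datetime import datetime

# Made-up dates in the assumed YYYY-MM-DD layout.
dates = ["2013-03-02", "2004-12-30", "2004-03-15", "2010-01-01"]

as_strings = sorted(dates)
as_dates = sorted(dates, key=lambda d: datetime.strptime(d, "%Y-%m-%d"))
print(as_strings == as_dates)  # string order and calendar order agree

# A layout such as M/D/YYYY does not sort chronologically as text.
us_style = ["3/2/2004", "12/30/2013"]
us_by_date = sorted(us_style, key=lambda d: datetime.strptime(d, "%m/%d/%Y"))
print(sorted(us_style) == us_by_date)
```

If the source file used a different date layout, storing real datetime values instead of strings would be the safer choice.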

Great! But I want all elm data on the east side of ABQ sorted by date and plotted. Easy! Matplotlib and Pandas help out here:

from pandas import Series
import matplotlib.pyplot as plt
from pymongo import MongoClient

client = MongoClient()
db = client.abq
c = db.pollen
x = c.find({"$and": [{"type": "Elm"}, {"location": "EASTSIDE"}]}).sort("date")

b = []       # pollen counts
labels = []  # matching dates

for w in x:
    b.append(int(w["count"]))  # counts were stored as strings
    labels.append(w["date"])

a = Series(b, index=labels)
a.plot(kind="bar")

plt.show()
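
The loop that builds the counts and labels can be pulled into a small helper, which makes it easy to try without a database. This is only a sketch: counts_and_labels is a name made up here, and the sample documents are invented stand-ins for query results.

```python
def counts_and_labels(docs):
    """Split query results into integer counts and their date labels."""
    counts = []
    labels = []
    for doc in docs:
        counts.append(int(doc["count"]))  # counts were stored as strings
        labels.append(doc["date"])
    return counts, labels


# Invented stand-ins for what c.find(...) would yield.
fake_docs = [
    {"date": "2004-03-01", "count": "12"},
    {"date": "2004-03-02", "count": "7\n"},  # int() tolerates the stray newline
]
counts, labels = counts_and_labels(fake_docs)
print(counts, labels)
```

The two lists then go straight into Series(counts, index=labels) for plotting.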

MongoDB, pymongo and GridFS

11 Sep

It has been a while since I’ve done anything with MongoDB. I changed jobs and don’t get to code much anymore. I had the urge to learn more and was interested in storing files in MongoDB using GridFS. I googled, read StackOverflow and MongoDB in Action. The problem was that most of the examples stored text rather than a real file, and even the ones that stored a file grabbed the _id when executing the put. So of course get is easy: you already have the id. Figuring out how to get a file after the fact was where I got stuck. I also had to switch the file operations to binary mode. Here is what I have for putting a file in MongoDB and retrieving it later.

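from pymongo import MongoClient
import gridfs

client = MongoClient()
db = client.mytest
fs = gridfs.GridFS(db)

# read the image as bytes; GridFS stores binary data
with open("image.png", "rb") as data:
    thedata = data.read()

# put returns the _id of the new file document
stored = fs.put(thedata, filename="inmongoimage")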

To get it back in the same code you call:
out=fs.get(stored).read()

This works because stored holds the _id returned by the put operation. But what if I need to retrieve the file from a different script than the one that inserted it? Here is how I got it out, with some extra code for info:
imports….
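client = MongoClient()
db = client.mytest
fs = gridfs.GridFS(db)

filelist = fs.list()
# returns the file names stored

fileone = filelist[0]
# the first file name, already a str in Python 3, so no encode() is needed

# get_version takes a version number as its second argument, not a file mode;
# leaving it out returns the newest file stored under that name
outdata = fs.get_version(fileone).read()
with open("somefile.png", "wb") as output:
    output.write(outdata)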
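
The retrieval steps can be wrapped in a function that takes the GridFS object and a filename. This is a minimal sketch, assuming fs is a gridfs.GridFS instance, whose get_last_version method returns the newest file stored under a name. The helper export_file is hypothetical, and the stub class only stands in for a real GridFS so the example runs without a server.

```python
import io
import os
import tempfile


def export_file(fs, filename, out_path):
    """Write the newest stored version of `filename` to `out_path`."""
    grid_out = fs.get_last_version(filename)  # newest file with that name
    with open(out_path, "wb") as output:      # binary mode, as above
        output.write(grid_out.read())
    return out_path


class StubFS:
    """Stand-in for gridfs.GridFS(db); not part of PyMongo."""

    def get_last_version(self, filename):
        return io.BytesIO(b"fake image bytes")


out_path = os.path.join(tempfile.mkdtemp(), "somefile.png")
export_file(StubFS(), "inmongoimage", out_path)
with open(out_path, "rb") as check:
    print(check.read())
```

Against a real database the call would look like export_file(fs, "inmongoimage", "somefile.png").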

Put allows for more metadata than just the filename:
fs.put(thedata, filename="file.jpg", field="string of text", anumberfield=52)

To find a file by a different field:
fs.get_version(anumberfield=52)
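
get_version hands back only the newest match. To see every file carrying a given field value, GridFS also has find, which takes an ordinary query document. A rough sketch follows; filenames_matching is a made-up helper, and the stub merely imitates a gridfs.GridFS object so the example runs without a server.

```python
from types import SimpleNamespace


def filenames_matching(fs, query):
    """Return the filename of every stored file that matches `query`."""
    return [grid_out.filename for grid_out in fs.find(query)]


class StubFS:
    """Imitates gridfs.GridFS just enough for this example."""

    def __init__(self, files):
        self._files = files

    def find(self, query):
        # equality matching only; the real find accepts any MongoDB query
        return [
            SimpleNamespace(**f) for f in self._files
            if all(f.get(k) == v for k, v in query.items())
        ]


stub = StubFS([
    {"filename": "file.jpg", "anumberfield": 52},
    {"filename": "other.jpg", "anumberfield": 7},
])
names = filenames_matching(stub, {"anumberfield": 52})
print(names)
```

With a real GridFS object the call would be filenames_matching(fs, {"anumberfield": 52}).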