Saving Tweepy output to MongoDB

August 24th, 2012

Tweepy is a great, easy-to-use Twitter library for Python. MongoDB is a great way to save your data. However, Tweepy doesn’t give you the raw JSON from Twitter, so you need to add a monkeypatch (love that term). Include the code from http://misja.posterous.com/getting-json-out-of-tweepy at the top of your project file:

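import tweepy
import json

@classmethod
def parse(cls, api, raw):
    # Let Tweepy build the Status object as usual...
    status = cls.first_parse(api, raw)
    # ...then attach the raw payload to it as a JSON string.
    setattr(status, 'json', json.dumps(raw))
    return status

# Keep a reference to the original parser, then swap in the patched one.
tweepy.models.Status.first_parse = tweepy.models.Status.parse
tweepy.models.Status.parse = parse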
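
If you want to see what the patch is doing without hitting Twitter, here is a minimal sketch of the same trick on a stand-in class. The Status class and the parse_with_json name below are made up for illustration (they are not Tweepy's real code); only the wrapping pattern is the same as above.

```python
import json


class Status(object):
    """Hypothetical stand-in for tweepy.models.Status (not the real class)."""

    @classmethod
    def parse(cls, api, raw):
        # Copy every key of the raw payload onto the object as an attribute.
        status = cls()
        for key, value in raw.items():
            setattr(status, key, value)
        return status


@classmethod
def parse_with_json(cls, api, raw):
    # Call the original parser, then attach the raw payload as a JSON string.
    status = cls.first_parse(api, raw)
    status.json = json.dumps(raw)
    return status


# Same two-step swap as the real patch: keep the old parser, install the new one.
Status.first_parse = Status.parse
Status.parse = parse_with_json

# Made-up payload; a real tweet has many more fields.
tweet = Status.parse(None, {"id": 1, "text": "hello"})
print(tweet.id)    # 1
print(tweet.text)  # hello
print(tweet.json)  # the same payload, as a JSON string
```

The normal attributes still work, and the extra json attribute rides along on every parsed object.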

 

Now you can access tweet.json just like tweet.id or tweet.text. However, if you try to fire this straight into MongoDB it will fail, because tweet.json is a string (the output of json.dumps) and MongoDB expects a document, which in Python means a dict. You need to parse it back into a dict first. The magic bit of code you need is:

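import json
# Parse the JSON string back into a dict, which is what MongoDB stores.
# Newer PyMongo: save() was removed in 4.0; use insert_one() instead.
db[str(collectionname)].save(json.loads(status.json))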

 

and it goes in. Bonza.
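
To make the string-versus-dict point concrete, here is a small sketch of the round trip using only the standard library. The payload is made up for illustration, and there is no MongoDB connection involved; it just shows what goes in and what comes out of json.dumps and json.loads.

```python
import json

# Made-up tweet payload for illustration; real ones have many more fields.
raw = {"id": 1, "text": "hello", "user": {"screen_name": "example"}}

# What the monkeypatch stores on the status object: a plain string.
status_json = json.dumps(raw)

# What MongoDB wants to be handed: a dict (nested dicts are fine).
document = json.loads(status_json)

print(type(status_json).__name__)        # str
print(type(document).__name__)           # dict
print(document["user"]["screen_name"])   # example
```

The dict you get back is the same structure Twitter sent, so nested fields like the user details stay queryable once the document is stored.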
