Friday, 12 August 2016

Twitter To MongoDB Via Python



Last night I managed to get my weekend project finished. I decided it would be a good exercise to figure out how to read data from the Twitter API through a Python script and write the results into MongoDB (getting them into Hive instead is next on the list). This was my first real Python script that wasn’t typed at the command line or written to complete an exercise from a book. Working through use cases I make up for myself is proving to be the best way for me to learn, and I would recommend it to anyone who likes to jump in.

Getting into Python

The first thing I realized while figuring out how I was going to do this was that my goal of using Vim to write all my code was causing me trouble. It’s hard enough to learn one thing at a time, but learning two things on top of each other is quite a challenge. With that realized, I downloaded PyCharm to use as my IDE and couldn’t have been happier. The code completion, shortcuts, layout, and theme all helped me focus on the code and less on the mechanics of using Vim. I definitely liked it more than using Eclipse. I can imagine a day in the future when I use Vim exclusively, but for now, in spite of all the hardcore programmer advice on StackOverflow, I’m going to use an IDE and worry about the rest later.

Pip those Packages

Next up was getting the necessary libraries to read from Twitter and write to MongoDB. I ended up using the python-twitter package. I tried it first, then read something that said tweepy was better, tried that, and quickly made my way back after feeling a bit lost in the documentation. After that, I grabbed the pymongo driver and was off and running. (btw, I already had a MongoDB instance running on my MacBook, so that part took care of itself)
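For anyone following along, here is a rough sketch of how those two libraries can fit together. It is not the script from this post, just a minimal illustration under a few assumptions: the Twitter side uses python-twitter's `Api.GetUserTimeline` and `Status.AsDict`, the MongoDB side uses pymongo's `replace_one` with `upsert=True`, and the database name, collection name, screen name, and stored fields are all made up for the example.

```python
def tweet_to_document(tweet):
    """Reduce a tweet dict to the handful of fields kept in MongoDB.

    The field selection is an assumption for this sketch.
    """
    return {
        "_id": tweet["id"],  # reuse the tweet id as the MongoDB primary key
        "created_at": tweet.get("created_at"),
        "user": tweet.get("user", {}).get("screen_name"),
        "text": tweet.get("text"),
    }


def fetch_timeline(api, screen_name, count=20):
    """Return recent tweets for screen_name as plain dicts."""
    statuses = api.GetUserTimeline(screen_name=screen_name, count=count)
    return [status.AsDict() for status in statuses]


def store_tweets(collection, tweets):
    """Upsert each tweet by id and return how many were written.

    Upserting means running the script twice does not duplicate tweets.
    """
    written = 0
    for tweet in tweets:
        doc = tweet_to_document(tweet)
        collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
        written += 1
    return written


def main():
    # Third-party imports live here so the helpers above can be used
    # (and tested) without either package installed.
    import twitter
    from pymongo import MongoClient

    # Placeholder credentials: substitute the values from your Twitter App.
    api = twitter.Api(
        consumer_key="YOUR_CONSUMER_KEY",
        consumer_secret="YOUR_CONSUMER_SECRET",
        access_token_key="YOUR_ACCESS_TOKEN_KEY",
        access_token_secret="YOUR_ACCESS_TOKEN_SECRET",
    )

    # Assumes a local MongoDB instance on the default port; the database
    # and collection names are hypothetical.
    client = MongoClient("mongodb://localhost:27017/")
    collection = client["twitter_demo"]["tweets"]

    tweets = fetch_timeline(api, screen_name="some_account", count=20)
    print("stored", store_tweets(collection, tweets), "tweets")
```

Calling `main()` runs the whole thing against a local MongoDB. Keeping the fetch and store steps as separate functions that take the `api` and `collection` objects as arguments makes it easy to poke at each half on its own.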

The Result

After a bit (or a lot) of hacking around, setting up a Twitter App on my account to get the credentials, and reading a lot of documentation, I ended up with the code shown below, which I also put up on GitHub for fun. Don’t worry, the code over there reads my Twitter keys from a file instead of hardcoding them as in the picture below. For the amount of time it took and all the trial and error, the result may look like “not much” to an outside observer, but that’s okay by me. It’s the journey and not the destination.
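On the point about keeping keys out of the code, here is one way that could look. This is only a sketch and not the version on GitHub: the JSON layout, the file name, and the `load_twitter_keys` helper are all assumptions for illustration. The four key names were picked to match the keyword arguments that python-twitter's `Api` constructor accepts.

```python
import json

# The four credentials a Twitter App hands out, named to match the
# keyword arguments of python-twitter's Api constructor.
REQUIRED_KEYS = (
    "consumer_key",
    "consumer_secret",
    "access_token_key",
    "access_token_secret",
)


def load_twitter_keys(path):
    """Read Twitter credentials from a JSON file (hypothetical layout).

    Raises ValueError if any of the four credentials is missing or empty.
    """
    with open(path, encoding="utf-8") as handle:
        keys = json.load(handle)

    missing = [name for name in REQUIRED_KEYS if not keys.get(name)]
    if missing:
        raise ValueError("missing Twitter credentials: " + ", ".join(missing))

    # Return only the expected entries so stray fields are ignored.
    return {name: keys[name] for name in REQUIRED_KEYS}
```

With a file like that in place (and left out of version control), the credentials can be passed straight through with something like `twitter.Api(**load_twitter_keys("twitter_keys.json"))`.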


Original Article Written By: Fred LaCrosse
