All Stories

Project 3 - Can we predict if an Amazon review will be helpful or not?

For project 3, code-named “McNulty,” the goal was to gain exposure to classification methods, understanding of their use, and practice implementing them using scikit-learn. For my project, I chose to...

Quick-start Apache Spark Environment Using Docker Containers

Are you learning or experimenting with Apache Spark? Do you want to quickly use Spark with a Jupyter IPython Notebook and PySpark, but don’t want to go through a lot...

Project 2 - Predicting Oscar Nominations

In project 1, we focused on learning the fundamental components of the Data Science “toolkit” by analyzing NYC MTA Subway data. In project 2, code-named Project Luther, we built on...

Project 1 - Using NYC Subway data to determine the best location to hand out event flyers

TL;DR: In project 1, we learned the fundamental components of the Data Science “toolkit” by analyzing NYC MTA Subway data to recommend the optimal time/subway stop for a non-profit to...

Jupyter Python Notebook Keyboard Shortcuts and Text Snippets for Beginners

Here are some of the keyboard shortcuts and text snippets that I’ve shared with others during pair programming sessions and that have been well received. They’ve saved me countless hours programming and...

Faster Python Data Scraping with gevent.pool

In my second project at Metis, I wanted to develop a linear regression model to predict the number of Academy Award nominations for a movie. In order to collect the...