All Stories

Massive-Scale Entity Resolution Using Spark + Graph

I presented the following content on Massive-Scale Entity Resolution (ER) Using Spark + Graph at the 2019 Spark + AI Summit. Check out the video and slides on the conference...

An Introduction to Apache Cassandra for Architects, Ops, and Developers

tl;dr: Apache Cassandra is a NoSQL database with flexible deployment options that’s highly performant (especially for writes), scalable, fault-tolerant, and proven in production. Common use cases include IoT, messaging, and fraud...

Key Apache Spark Trends from Spark Summit East 2017

I was lucky enough to attend Spark Summit East 2017 on February 8-9. I had to brave the 12” of snow that blizzard Nico brought to Boston, but overall I learned a lot...

Final Project - Improving Brand Analytics with an Image Logo Detection Convolutional Neural Net in TensorFlow

For my final Metis project, I developed an application that can improve brand analytics through logo detection in images. The core of my solution leverages a Deep Convolutional Neural Network...

Reflecting on my Metis Data Science Bootcamp Experience

I just graduated from the Spring 2016 NYC Metis Data Science Bootcamp (DS7 cohort), so I figured it would be a great opportunity to reflect on the experience. If you’re...

Using TensorBoard to Visualize Image Classification Retraining in TensorFlow

tl;dr: I contributed code to the Google TensorFlow project on GitHub that adds TensorBoard visualizations to the existing TensorFlow “How to Retrain Inception’s Final Layer for New Categories” tutorial. My...