Max's Musings

DC >> NYC to learn data science

tl;dr: Apache Cassandra is a NoSQL database with flexible deployment options that’s highly performant (especially for writes), scalable, fault-tolerant, and proven in production. Common use cases include IoT, messaging, and fraud detection. You probably shouldn’t use Cassandra if you have a small dataset, have highly transactional data, or need to do...

I was lucky enough to attend Spark Summit East 2017 on February 8-9. I had to brave the 12” of snow that blizzard Nico brought to Boston, but overall I learned a lot about the strategic direction of the Apache Spark open source project and ecosystem. In this post I’ll fill you in...

I just graduated from the Spring 2016 NYC Metis Data Science Bootcamp (DS7 cohort), so I figured it would be a great opportunity to reflect on the experience. If you’re interested in learning more about Metis (i.e., you’re researching, applying, or preparing to attend), this post will provide you with...

tl;dr I contributed code to the Google TensorFlow project on GitHub that adds TensorBoard visualizations to the existing TensorFlow “How to Retrain Inception’s Final Layer for New Categories” tutorial. My additions make it easier to understand, debug, and optimize the retraining process. Check it out by walking through the updated...