Scala Data Pipelines for Music Recommendations
Chris Johnson’s presentation from Data Day Texas:
From: Erik Bernhardsson
I just made it to Sweden suffering from jet-lag-induced insomnia, but this blog post will not cover that. Instead, I will talk a little bit about technical debt. The concept of technical debt has always resonated with me, partly because I like the analogy with “real” debt.
From: Erik Bernhardsson
Just search for “hackers gif”. There you go. Fun for your work emails for the next 500 years. From the awesome movie Hackers. That movie, together with The Warriors, convinced me that I wanted to live in NYC when I was like… 14 years old.
From: Erik Bernhardsson
I was talking with some data engineers at Spotify and had a moment of nostalgia. In 2008 I was writing my master's thesis at Spotify and had to run a Hadoop job to extract some data from the logs.
From: Erik Bernhardsson
More Luigi presentations!
From: Erik Bernhardsson
At NYC Data Science meetup! Unfortunately the space is full but the talk will be livestreamed – check out the meetup web page for a link tomorrow.
From: Erik Bernhardsson
This is the last post about deep learning for chess/go/whatever. But this really cool paper by Christopher Clark and Amos Storkey was forwarded to me by Michael Eickenberg. It's about using convolutional neural networks to play Go.
From: Erik Bernhardsson
My previous blog post about deep learning for chess blew up and made it to Hacker News and a couple of other places. One pretty amazing thing was that the GitHub repo got 150 stars overnight.
From: Erik Bernhardsson
I've been meaning to learn Theano for a while and I've also wanted to build a chess AI at some point. So why not combine the two? That's what I thought, and I ended up spending way too much time on it.
From: Erik Bernhardsson
Say you build a machine learning model, like a movie recommender system. You need to optimize for something. You have 1-5 stars as ratings so let's optimize for mean squared error. Great. Then let's say you build a new model.
From: Erik Bernhardsson
I keep forgetting to buy a costume for Halloween every year, so this year I prepared and got myself a Luigi costume a month in advance. Only to realize I was going to be out of town the whole weekend.
From: Erik Bernhardsson
I spent a couple of hours this weekend going through some pull requests and issues to Annoy, which is an open source C++/Python library for Approximate Nearest Neighbor search. I set up Travis-CI integration and spent some time on one of the issues that multiple people had reported.
From: Erik Bernhardsson
I'm at RecSys 2014, meeting a lot of people and hanging out at talks. Some of the discussions here were about the filter bubble, which prompted me to formalize my own thoughts. I firmly believe that it's the role of a system to respect the user's intent.
From: Erik Bernhardsson
Note: This is a silly application.
From: Erik Bernhardsson
Inspired by Sander Dieleman's internship at Spotify, I've been playing around with deep learning using Theano.
From: Erik Bernhardsson
Many years ago, I used to think that A/B tests were foolproof and that all you needed to do was compare the metrics for the two groups. The group with the highest conversion rate wins, right?
From: Erik Bernhardsson
I’ve been spending quite some time lately playing around with RNNs for collaborative filtering.
From: Erik Bernhardsson
One obvious thing to anyone living in NYC is how tourists cluster in certain areas. I was curious about the larger patterns around this, so I spent some time looking at data. The thing I wanted to understand is: what areas are dominated by tourists?
From: Erik Bernhardsson
During my time at Spotify, I've reviewed thousands of resumes and interviewed hundreds of people. Lots of them were rejected but lots of them also got offers. Finally, I've also had my share of offers rejected by the candidate.
From: Erik Bernhardsson
From my presentation at MLConf, one of the points I think is worth stressing again is how extremely well combining different algorithms works.
From: Erik Bernhardsson