Implicit data and collaborative filtering
A lot of people these days know about collaborative filtering.
From: Erik Bernhardsson
If you have a few minutes, you should check out the panel proposal from Chris Johnson and me.
From: Erik Bernhardsson
IntelliJ Spring Bean Injection Notification
From: Dan Vega
Groovy collections vs My Current Thought Process
From: Dan Vega
I just answered a Quora question about what differences, if any, there are between the algorithms behind recommendations for music and for movies.
From: Erik Bernhardsson
Andy Sloane decided to call my 2D visualization and raise it to 3D. (Looks a little weird in the iframe but check out the link). It's based on an LDA model with 200 topics, so the artists tend to stick to clusters where each cluster is a topic.
From: Erik Bernhardsson
I've turned into a lazy bastard and I'm just posting presentations on this blog, but here's one from Rohan Singh at Spotify talking about the backend infrastructure of the Discover page.
From: Erik Bernhardsson
I was just at the NYC Predictive Analytics meetup talking about how we build machine learning algorithms using Hadoop to power music recommendations. Great meetup, where we had two speakers, me and Blake Shaw from Foursquare.
From: Erik Bernhardsson
I thought this article about the company culture at HubSpot is kind of funny. “HubSpot's Awesome Presentation Shows how to Create a 21st Century Culture”. Just FYI: You're not different. You're a bunch of white hipsters aged 25-30 dressed up in the same theme.
From: Erik Bernhardsson
I was in Portland, OR for a few days hanging out at OSCON. Was fun. I also talked a bit about Luigi. Next week I'm presenting at the NYC Predictive Analytics meetup together with Blake Shaw from Foursquare.
From: Erik Bernhardsson
Sometimes you have to maximize some function $$ f(w_1, w_2, \ldots, w_n) $$ where $$ w_1 + w_2 + \ldots + w_n = 1 $$ and $$ 0 \le w_i \le 1 $$ . Usually, $$ f $$ is concave and differentiable, so there's one unique global maximum and you can solve it by applying gradient ascent.
From: Erik Bernhardsson
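A minimal sketch of the constrained maximization described in the excerpt above, assuming one common trick that the excerpt itself does not confirm: write w = softmax(x) so that the weights stay positive and sum to 1 automatically, then run plain gradient ascent on the unconstrained x. The function names, the step size, and the example objective f(w) = sum_i a_i log(w_i) are all illustrative assumptions. Note that softmax keeps every w_i strictly between 0 and 1, so a maximum sitting exactly on the boundary is only approached, never reached.

```python
import math

def softmax(x):
    # Subtract the max before exponentiating for numerical stability.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def maximize_on_simplex(grad_f, n, step=0.5, iters=2000):
    """Gradient ascent on x, where w = softmax(x) always lies on the simplex.

    grad_f(w) must return the gradient of f with respect to w.
    (Hypothetical helper, not taken from the original post.)
    """
    x = [0.0] * n  # start from the uniform distribution
    for _ in range(iters):
        w = softmax(x)
        g = grad_f(w)
        # Chain rule through softmax: df/dx_j = w_j * (g_j - sum_i g_i * w_i)
        avg = sum(gi * wi for gi, wi in zip(g, w))
        x = [xj + step * wj * (gj - avg) for xj, wj, gj in zip(x, w, g)]
    return softmax(x)

# Example objective (an assumption for illustration): f(w) = sum_i a_i * log(w_i),
# which is concave and, when a sums to 1, is maximized at w = a.
a = [0.5, 0.3, 0.2]

def grad_log_objective(w):
    return [ai / wi for ai, wi in zip(a, w)]

w_star = maximize_on_simplex(grad_log_objective, len(a))
```

Here `w_star` should land close to `a`. For other objectives the step size and iteration count would likely need tuning, and concavity in w does not by itself guarantee concavity in x.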
Continuing in the same spirit of shameless self-promotion, here's some recent Luigi press: a Reddit thread; A Guide to Python Frameworks for Hadoop (slides from the NYC Hadoop User Group); this presentation from the Open Analytics NYC meetup about how Foursquare uses Luigi. Luigi is in the middle of a...
From: Erik Bernhardsson
Just open sourced hdfs2cass which is a Hadoop job (written in Java) to do efficient Cassandra bulkloading. The nice thing is that it queries Cassandra for its topology and uses that to partition the data so that each reducer can upload data directly to a Cassandra node.
From: Erik Bernhardsson
We had an unconference at Spotify last Thursday and I added a semi-trolling semi-serious topic about abolishing documentation. Or NoDoc, as I'm going to call this movement. This was meant to be mostly a thought experiment, but I don't see it as complete madness.
From: Erik Bernhardsson
I've been obsessed with Wikipedia for the past ten years. Occasionally I find some good articles worth sharing and that's why I created the wikiphilia Twitter handle. Just a long stream of stuff that for one reason or another may be interesting.
From: Erik Bernhardsson
The Discovery page, the new start page in Spotify, is finally out to a fairly significant percentage of all users. Really happy since we have worked on it for the past six months. Here's a screenshot:
From: Erik Bernhardsson
I was browsing around on the Internet and the physics geek in me started reading about Fermat's principle. And suddenly something came back to me that I've been trying to suppress for many years – how I never understood why there's anything fundamental about the principle of least time.
From: Erik Bernhardsson