What I learned from processing big data with Spark

For my semester project, I had to process a large data set (6 TB) containing every revision of the English Wikipedia up to October 2016. We chose Apache Spark as our cluster-computing framework, so I ended up spending a lot of time working with... [Read More]

Of sheep and beer

A story of herding effects in beer reviews

Imagine you’re visiting a town for the first time, and after walking around all morning seeing the sights, you find yourself in a square in front of a bakery that sells delicious-smelling croissants. Actually, there are two bakeries in the square, one on either side of it. In front of... [Read More]