Matt Sinton-Hewitt
Blog

I'm a Senior Developer at Scott Logic. I like Functional Programming, Big Data and Machine Learning.

Data Engineering
Spark is well known in Big Data for its incredible performance and expressive API. However, it takes just one small misstep to transform a massively parallel powerhouse into a pathetically poor performer. This post presents an example and the steps that can be taken to identify the problem.