A substantial part of Big Data work consists of analyzing high-volume event data, such as user clicks or server logs, over an extended period of time. However, performing even basic analyses like counting in real time with low latency can be difficult and costly with approaches like stream processing, which rely on scaling alone. In this talk I'll discuss a number of algorithmic approaches based mainly on ideas from stream mining. These methods work with a fixed, finite amount of memory at the expense of exactness, yielding approximate results that are perfectly fine for many application areas. That way, one can handle huge event streams with just a single server.
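To make the idea of trading exactness for bounded memory concrete, here is a minimal sketch of one well-known stream mining technique, the Space-Saving algorithm for approximate top-k counting. The abstract does not name specific algorithms, so the choice of Space-Saving is an assumption made for illustration, and the class name SpaceSavingCounter and its methods are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple


class SpaceSavingCounter:
    """Approximate counter that tracks at most `capacity` distinct items.

    Illustrative sketch of the Space-Saving algorithm; not taken from the talk.
    Reported counts may overestimate the true count, but never underestimate it.
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts: Dict[Hashable, int] = {}

    def add(self, item: Hashable) -> None:
        if item in self.counts:
            # Item is already tracked: plain increment.
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            # Free slot available: start tracking the item.
            self.counts[item] = 1
        else:
            # Memory is full: evict the item with the smallest count and let
            # the newcomer inherit that count plus one. The linear scan keeps
            # the sketch short; a heap or bucket list would be faster.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            self.counts[item] = floor + 1

    def top(self, k: int) -> List[Tuple[Hashable, int]]:
        """Return the k tracked items with the largest (approximate) counts."""
        ranked = sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:k]
```

Memory use depends only on `capacity`, not on the length of the stream or the number of distinct events. Frequent items keep their slots, while rare items cycle through the remaining ones, which is why the counts of rare items are the least reliable.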
Talk objectives: Get an overview of the different approaches and the considerations one has to take into account when applying these algorithms.
Target audience: People with some analytics background, data scientists looking for new tools, people working with user interaction data, social media, log analytics, Internet of Things sensor data, and other kinds of real-time data.
Mikio Braun is a postdoc in Machine Learning at the TU Berlin and co-founder of streamdrill, a startup focussing on efficient real-time analytics. He is interested in techniques beyond brute-force scaling for managing high-volume event streams, in particular approaches that use approximate algorithms, and in applying these techniques to social media analysis and user interaction data.
Github: mikiobraun
Twitter: @mikiobraun