Scalable Stream Processing, or Addressing Velocity, the 2nd V
Big Data is said to have three main dimensions: Volume, Velocity, and Variety. The 2nd V, Velocity, is the focus of this talk. The presentation will cover the basics of large-scale, scalable stream processing engines in general and, in greater detail, the well-known open source implementation Storm, which is now part of the Hadoop ecosystem. Stream processing borrows mechanisms from high performance computing, such as message passing, and extends these concepts by allowing the computational resources to be scaled to fit the task at hand. The talk will conclude with real-world implementation examples from studies that analysed Twitter data in near real time for users' sentiment on certain topics and for disaster detection.