We use a host of tricks these days for handling data at scale. Disk structures are tuned to specific workloads. Streams are used to create continuous pipelines of processing. Hardware offers incredible diversity in terms of latency and throughput.
The available tools, Cassandra, Postgres, Hadoop, Kafka, Hazelcast, Storm, and others, each come with their own tradeoffs. We'll look at these as individual elements. We'll also look at compositions that leverage their individual sweet spots to create more powerful, holistic platforms.
Talk objectives:
- To gain a better understanding of how data technology works and the options you have for scaling out solutions as they grow, particularly for common Finance use cases.
Target audience:
- Developers interested in how data technology works and how larger data problems may be solved.
Ben is an engineer and architect working on the Apache Kafka Core Team at Confluent Inc (the company behind Apache Kafka). He's worked with distributed data infrastructure for the last ten years, switching between engineering products and helping companies use them. His early career spanned a variety of projects at Thoughtworks and UK-based enterprise companies.
He writes at benstopford.com.
GitHub: benstopford
Twitter: @benstopford