Opinion: With Apache Beam, Tread Carefully

Machine learning teams today often want to process data in real time, and they often standardize on Python as their programming language. Apache Beam supports Python and looks promising, but in my opinion this combination is, in practice, too complex and unreliable to put into production.

[Read More]

Tutorial: Understanding Beam with a Local Beam, Flink and Kafka Environment

This is a tutorial-style article. I wrote it in June/July 2022 but only found time to clean it up and publish it as a blog post in September 2022. This tutorial is relevant to software engineers and data scientists who work with Apache Beam on top of Apache Flink. Our goal is to set up a local Beam and Flink environment that can run cross-language Beam pipelines. Specifically, in this tutorial, I will discuss how to set up your laptop to run a Beam pipeline that: [Read More]