Lakehouse: Databricks vs. AWS EMR

Disclaimer: The decision on which ETL tool to use was made in May. At that time, EMR Serverless was still in preview and did not support Delta Lake natively.

About this blog series

At claimsforce, our initial approach to big data was a two-tier architecture consisting of a Data Lake stage in Amazon S3 and a Data Warehouse stage in Amazon Redshift (outline here). Over time, we realized that maintaining two stages comes with disadvantages such as engineering and maintenance effort, infrastructure costs, and data staleness. We aim to replace the combination of a Data Lake and a Data Warehouse with a unified system: the Lakehouse. In this blog series, we document our journey toward a Lakehouse setup.
