5 reasons to choose Delta format (on Databricks)

In this blog post, I will explain 5 reasons to prefer the Delta format over Parquet or ORC when you use Databricks for your analytics workloads.

But before we start, let’s have a look at what the Delta format is.

Delta … an introduction

Delta is a data format based on Apache Parquet. It is an open source project (https://github.com/delta-io/delta) that ships with the Databricks runtimes, and it has been the default table format since runtime 8.0.

You can use the Delta format from notebooks and applications running in Databricks through various APIs (Python, Scala, SQL, etc.), and also from Databricks SQL.
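
To make this concrete, here is a minimal sketch of writing a small DataFrame in Delta format and reading it back, once through the Python DataFrame API and once through SQL. It assumes an existing SparkSession with Delta support (in a Databricks notebook, the predefined `spark` object). The function name `write_and_read_delta` and the path used in the usage note are hypothetical and chosen only for illustration.

```python
def write_and_read_delta(spark, path):
    """Write a tiny DataFrame as Delta, then read it back two ways.

    `spark` is assumed to be a SparkSession with Delta support;
    `path` is a hypothetical storage location for the demo table.
    """
    # Build a 5-row DataFrame with a single `id` column.
    df = spark.range(5)

    # Write it in Delta format, replacing any previous content at `path`.
    df.write.format("delta").mode("overwrite").save(path)

    # Read it back with the DataFrame API.
    via_api = spark.read.format("delta").load(path)

    # Read it back with SQL, addressing the table by its path.
    via_sql = spark.sql(f"SELECT * FROM delta.`{path}`")

    return via_api, via_sql
```

In a Databricks notebook, this could be called as `write_and_read_delta(spark, "/tmp/delta_demo")`, and both returned DataFrames should then expose the same rows.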

Delta is made up of several components:

