Data Engineering End-to-End Project: PostgreSQL, Airflow, Docker, Pandas

In this article, we are going to fetch a CSV file from a remote repo, download it to the local working directory, create a PostgreSQL table, and write the CSV data to that table with the write_csv_to_postgres.py script.
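A minimal sketch of what write_csv_to_postgres.py could look like. The URL, file name, table name, and connection string below are placeholders, not the project's actual values, and the sketch assumes pandas with SQLAlchemy for the database write:

```python
import urllib.request

import pandas as pd
from sqlalchemy import create_engine

# Hypothetical source URL and local path -- substitute your own.
CSV_URL = "https://raw.githubusercontent.com/user/repo/main/data.csv"
LOCAL_PATH = "data.csv"


def download_csv(url: str, dest: str) -> str:
    """Download the remote CSV into the local working directory."""
    urllib.request.urlretrieve(url, dest)
    return dest


def write_csv_to_postgres(csv_path: str, table_name: str, engine) -> int:
    """Create (or replace) a table and load the CSV into it; returns the row count."""
    df = pd.read_csv(csv_path)
    # to_sql handles both CREATE TABLE and the INSERTs in one call
    df.to_sql(table_name, engine, if_exists="replace", index=False)
    return len(df)


if __name__ == "__main__":
    # Placeholder credentials for a local (e.g. Dockerized) PostgreSQL instance
    engine = create_engine("postgresql://user:password@localhost:5432/mydb")
    write_csv_to_postgres(download_csv(CSV_URL, LOCAL_PATH), "raw_data", engine)
```

Because `to_sql` accepts any SQLAlchemy engine, the same function works against SQLite for local testing before pointing it at PostgreSQL.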

Then, we will read the data back from the table. After some modifications and pandas practice, we will create three separate data frames with the create_df_and_modify.py script.
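A sketch of how create_df_and_modify.py could derive three data frames from the raw table. The table name, the cleaning steps, and the three groupings here are illustrative assumptions, since the real script depends on the CSV's actual schema:

```python
import pandas as pd
from sqlalchemy import create_engine


def create_dataframes(engine):
    """Read the raw table back and derive three data frames from it.

    The groupings below (summary stats, numeric columns, text columns)
    are hypothetical -- adapt them to the real CSV schema.
    """
    df = pd.read_sql("SELECT * FROM raw_data", engine)

    # Typical pandas practice: deduplicate and reset the index
    df = df.drop_duplicates().reset_index(drop=True)

    df_summary = df.describe(include="all")          # summary statistics
    df_numeric = df.select_dtypes(include="number")  # numeric columns only
    df_text = df.select_dtypes(include="object")     # text columns only
    return df_summary, df_numeric, df_text
```

Returning the frames (rather than writing them here) keeps the read/transform step separate from the load step, which maps cleanly onto distinct Airflow tasks.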

In the end, we will take these three data frames, create the corresponding tables in the PostgreSQL database, and insert the data frames into them with the write_df_to_postgres.py script.
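The write-back step can be sketched as a single loop over name/data-frame pairs. Again, the table names and connection string are placeholders, not the project's real values:

```python
import pandas as pd
from sqlalchemy import create_engine


def write_dfs_to_postgres(named_dfs: dict, engine) -> None:
    """Create one table per data frame and insert its rows.

    `named_dfs` maps table names to pandas data frames.
    """
    for table_name, df in named_dfs.items():
        # if_exists="replace" drops and recreates the table on each run,
        # so the DAG stays idempotent across repeated executions
        df.to_sql(table_name, engine, if_exists="replace", index=False)


if __name__ == "__main__":
    # Placeholder credentials -- point this at the local PostgreSQL instance
    engine = create_engine("postgresql://user:password@localhost:5432/mydb")
    write_dfs_to_postgres({"example": pd.DataFrame({"a": [1, 2]})}, engine)
```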

All these scripts will run as Airflow DAG tasks with the DAG script.
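A DAG wiring the three scripts together might look like the sketch below. It assumes each script exposes a `main()` entry point and uses a daily schedule; the DAG id, start date, and schedule are placeholder choices, not taken from the original project:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Assumes each script exposes a main() callable for Airflow to invoke
from write_csv_to_postgres import main as write_csv_to_postgres_main
from create_df_and_modify import main as create_df_and_modify_main
from write_df_to_postgres import main as write_df_to_postgres_main

with DAG(
    dag_id="csv_to_postgres_pipeline",  # placeholder id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    download_and_load = PythonOperator(
        task_id="write_csv_to_postgres",
        python_callable=write_csv_to_postgres_main,
    )
    modify = PythonOperator(
        task_id="create_df_and_modify",
        python_callable=create_df_and_modify_main,
    )
    write_back = PythonOperator(
        task_id="write_df_to_postgres",
        python_callable=write_df_to_postgres_main,
    )

    # Run the three scripts strictly in order
    download_and_load >> modify >> write_back
```

The `>>` chaining enforces the same order as the article: load the CSV first, transform it, then write the derived tables back.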

Think of this project as a pandas exercise and an alternative way of storing data on the local machine.

