In this article, we are going to download a CSV file from a remote repo to the local working directory, create a local PostgreSQL table, and write the CSV data into that table with the write_csv_to_postgres.py script.
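A minimal sketch of what such a script could look like is below. The raw CSV URL, the connection settings, the table name (`raw_data`), and the column list are all placeholders; adapt them to your own repo and schema.

```python
import requests
import psycopg2

# Hypothetical raw CSV URL and local path -- replace with your own.
CSV_URL = "https://raw.githubusercontent.com/<user>/<repo>/main/data.csv"
LOCAL_PATH = "data.csv"


def download_csv():
    """Fetch the CSV from the remote repo and save it to the working directory."""
    response = requests.get(CSV_URL, timeout=30)
    response.raise_for_status()
    with open(LOCAL_PATH, "wb") as f:
        f.write(response.content)


def write_csv_to_postgres():
    """Create the target table (if missing) and bulk-load the CSV into it."""
    # Connection details are assumptions; point them at your local instance.
    conn = psycopg2.connect(
        host="localhost", dbname="airflow_db", user="postgres", password="postgres"
    )
    cur = conn.cursor()
    # The column list is illustrative; match it to the actual CSV header.
    cur.execute(
        """
        CREATE TABLE IF NOT EXISTS raw_data (
            id INTEGER,
            name TEXT,
            value NUMERIC
        );
        """
    )
    with open(LOCAL_PATH, "r") as f:
        # COPY ... FROM STDIN is much faster than row-by-row INSERTs for bulk loads.
        cur.copy_expert("COPY raw_data FROM STDIN WITH CSV HEADER", f)
    conn.commit()
    cur.close()
    conn.close()


if __name__ == "__main__":
    download_csv()
    write_csv_to_postgres()
```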
Then we will read the data back from the table. After some modifications and a bit of pandas practice, we will create 3 separate data frames with the create_df_and_modify.py script.
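Below is one way that pandas step might look. The specific transformations (deduplication, a groupby aggregate, a top-10 slice) are illustrative stand-ins for whatever modifications you want to practice, and the connection string and column names carry over the assumptions from the sketch above.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string -- adjust to your local PostgreSQL setup.
engine = create_engine("postgresql+psycopg2://postgres:postgres@localhost/airflow_db")


def create_dataframes():
    """Read the raw table back and derive three modified data frames."""
    df = pd.read_sql("SELECT * FROM raw_data", engine)

    # Example transformations; the real script would apply its own logic.
    df_clean = df.dropna().drop_duplicates()
    df_summary = df_clean.groupby("name", as_index=False)["value"].mean()
    df_top = df_clean.nlargest(10, "value")

    return df_clean, df_summary, df_top
```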
Finally, we will take these 3 data frames, create corresponding tables in the PostgreSQL database, and insert the data frames into them with the write_df_to_postgres.py script.
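A corresponding sketch of the load step: pandas' `to_sql` can create each table from the data frame's dtypes, so no explicit CREATE TABLE is needed here. The table names are again hypothetical.

```python
import pandas as pd
from sqlalchemy import create_engine

# Same assumed connection string as in the previous sketch.
engine = create_engine("postgresql+psycopg2://postgres:postgres@localhost/airflow_db")


def write_dfs_to_postgres(df_clean, df_summary, df_top):
    """Create one table per data frame and insert the rows."""
    # if_exists="replace" drops any old copy of the table before writing.
    df_clean.to_sql("clean_data", engine, if_exists="replace", index=False)
    df_summary.to_sql("summary_data", engine, if_exists="replace", index=False)
    df_top.to_sql("top_data", engine, if_exists="replace", index=False)
```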
All these scripts will run as tasks in an Airflow DAG, wired together by the DAG script.
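One possible wiring of that DAG is sketched below, assuming the callables from the earlier sketches are importable from modules next to the DAG file. To avoid shipping data frames between tasks through XCom, the pandas step and the final load are combined into a single task here; that is a design choice, not a requirement.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Assumes the three scripts sit next to the DAG file (or on PYTHONPATH)
# and expose the callables sketched earlier in this article.
from write_csv_to_postgres import download_csv, write_csv_to_postgres
from create_df_and_modify import create_dataframes
from write_df_to_postgres import write_dfs_to_postgres


def transform_and_load():
    """Run the pandas step and the final load together so no data frames
    have to travel between tasks via XCom."""
    df_clean, df_summary, df_top = create_dataframes()
    write_dfs_to_postgres(df_clean, df_summary, df_top)


with DAG(
    dag_id="csv_to_postgres_pipeline",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule=None,  # Airflow 2.4+ syntax; trigger manually while practicing
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="download_csv", python_callable=download_csv)
    t2 = PythonOperator(
        task_id="write_csv_to_postgres", python_callable=write_csv_to_postgres
    )
    t3 = PythonOperator(
        task_id="create_dfs_and_write", python_callable=transform_and_load
    )

    t1 >> t2 >> t3
```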
Think of this project as pandas practice and as an alternative way of storing data on your local machine.