GCP Cloud Scheduler triggers this serverless pipeline automatically every day.
This application provides a dynamic analytics dashboard for outbound flights from Frankfurt Airport (FRA). A fully automated, serverless pipeline built on the Medallion architecture fetches data hourly from the Aviationstack API. Rather than processing raw data in the dashboard itself, a containerized PySpark job running on Google Cloud Run transforms the data through Bronze, Silver, and Gold layers. The job fills in missing delay values using probability-based data augmentation and stores the final structured data in Google BigQuery. This setup keeps the data consistent, the visualizations realistic, and dashboard queries fast.
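To make the Bronze, Silver, and Gold flow more concrete, here is a minimal sketch in plain Python rather than PySpark. It is not the project's actual code: the record fields (`flight`, `destination`, `delay_min`), the helper names `to_silver` and `to_gold`, and the imputation rule (sampling each missing delay from the empirical distribution of observed delays) are all assumptions chosen for illustration. The real job may use a different probability model and schema.

```python
import random
from collections import defaultdict

# Hypothetical Bronze layer: raw rows as they might arrive from the API.
# Field names are assumptions, not the real Aviationstack schema.
bronze = [
    {"flight": "LH400", "destination": "JFK", "delay_min": 12},
    {"flight": "LH900", "destination": "LHR", "delay_min": None},
    {"flight": "LH454", "destination": "SFO", "delay_min": 0},
    {"flight": "LH716", "destination": "HND", "delay_min": None},
    {"flight": "LH400", "destination": "JFK", "delay_min": 12},  # duplicate row
]


def to_silver(rows, rng):
    """Deduplicate rows and fill missing delays (assumed imputation rule)."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["flight"], row["destination"], row["delay_min"])
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # Delays that were actually reported form the empirical distribution.
    observed = [r["delay_min"] for r in unique if r["delay_min"] is not None]

    for row in unique:
        missing = row["delay_min"] is None
        row["delay_imputed"] = missing and bool(observed)
        if missing and observed:
            # Sampling from observed values preserves their relative frequencies.
            row["delay_min"] = rng.choice(observed)
    return unique


def to_gold(rows):
    """Aggregate to the dashboard-ready shape: average delay per destination."""
    delays = defaultdict(list)
    for row in rows:
        if row["delay_min"] is not None:
            delays[row["destination"]].append(row["delay_min"])
    return {dest: sum(vals) / len(vals) for dest, vals in delays.items()}


# A seeded generator makes the imputation reproducible between runs.
silver = to_silver(bronze, random.Random(42))
gold = to_gold(silver)
print(gold)
```

The `delay_imputed` flag keeps augmented values distinguishable from reported ones, so a dashboard could show or exclude them explicitly.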
The source code for this project is available on GitHub: Aviation Pipeline.
Figure: Serverless Aviation Pipeline with PySpark & GCP CI/CD