Seamless CI/CD Pipelines with GitHub Actions on GCP: Your Tools for Effective MLOps

THE FULL STACK 7-STEPS MLOPS FRAMEWORK

This tutorial is lesson 7 of a 7-lesson course that will walk you step-by-step through how to design, implement, and deploy an ML system using MLOps best practices. During the course, you will build a production-ready model to forecast energy consumption levels for the next 24 hours across multiple consumer types from Denmark.

By the end of this course, you will understand all the fundamentals of designing, coding, and deploying an ML system using a batch-serving architecture.

This course targets mid/advanced machine learning engineers who want to level up their skills by building their own end-to-end projects.

Certificates are everywhere nowadays. Building advanced end-to-end projects that you can later show off is the best way to earn recognition as a professional engineer.


Table of Contents:

  • Course Introduction
  • Course Lessons
  • Data Source
  • Lesson 7: Deploy All the ML Components to GCP. Build a CI/CD Pipeline Using GitHub Actions.
  • Lesson 7: Code
  • Conclusion
  • References

Course Introduction

By the end of this 7-lesson course, you will know how to:

  • design a batch-serving architecture
  • use Hopsworks as a feature store
  • design a feature engineering pipeline that reads data from an API
  • build a training pipeline with hyperparameter tuning
  • use W&B as an ML Platform to track your experiments, models, and metadata
  • implement a batch prediction pipeline
  • use Poetry to build your own Python packages
  • deploy your own private PyPI server
  • orchestrate everything with Airflow
  • use the predictions to code a web app using FastAPI and Streamlit
  • use Docker to containerize your code
  • use Great Expectations to ensure data validation and integrity
  • monitor the performance of the predictions over time
  • deploy everything to GCP
  • build a CI/CD pipeline using GitHub Actions

If that sounds like a lot, don't worry. Once you finish the course, you will understand everything mentioned above. Most importantly, you will know WHY I used all these tools and how they work together as a system.

If you want to get the most out of this course, I suggest you access the GitHub repository containing all the lessons' code. The course is designed so you can quickly read the articles and replicate the code alongside them.

By the end of the course, you will know how to implement the diagram below. Don't worry if something doesn't make sense to you. I will explain everything in detail.

Diagram of the architecture you will build during the course [Image by the Author].

By the end of Lesson 7, you will know how to manually deploy the 3 ML pipelines and the web app to GCP. Also, you will build a CI/CD pipeline that will automate the deployment process using GitHub Actions.


Course Lessons:

  1. Batch Serving. Feature Stores. Feature Engineering Pipelines.
  2. Training Pipelines. ML Platforms. Hyperparameter Tuning.
  3. Batch Prediction Pipeline. Package Python Modules with Poetry.
  4. Private PyPI Server. Orchestrate Everything with Airflow.
  5. Data Validation for Quality and Integrity Using Great Expectations. Continuous Monitoring of Model Performance.
  6. Consume and Visualize your Model's Predictions using FastAPI and Streamlit. Dockerize Everything.
  7. Deploy All the ML Components to GCP. Build a CI/CD Pipeline Using GitHub Actions.
  8. [Bonus] Behind the Scenes of an 'Imperfect' ML Project – Lessons and Insights

Lesson 7 focuses on teaching you how to deploy all the components to GCP and build a CI/CD pipeline around them. For the full experience, we recommend you go through the other lessons of the course as well.

Check out Lesson 4 to learn how to orchestrate the 3 ML pipelines using Airflow and Lesson 6 to see how to consume the model's predictions using FastAPI and Streamlit.


Data Source

We used a free & open API that provides hourly energy consumption values for all the energy consumer types within Denmark [1].

They provide an intuitive interface where you can easily query and visualize the data. You can access the data here [1].

The data has 4 main attributes:

  • Hour UTC: the UTC datetime when the data point was observed.
  • Price Area: Denmark is divided into two price areas: DK1 and DK2 – divided by the Great Belt. DK1 is west of the Great Belt, and DK2 is east of the Great Belt.
  • Consumer Type: The consumer type is the Industry Code DE35, owned and maintained by Danish Energy.
  • Total Consumption: total electricity consumption in kWh.

Note: The observations have a lag of 15 days! For our demo use case, that is not a problem, as we can simulate the same steps as we would in real time.
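
To make these attributes concrete, here is a minimal sketch of how you might pull a few records from the API in Python. The endpoint, dataset name (ConsumptionDE35Hour on Energi Data Service), and field names are assumptions based on the description above, so verify them against the API docs [1]:

    import requests

    # Assumed endpoint: Energi Data Service's hourly DE35 consumption dataset.
    # Verify the dataset name and field names against the API docs [1].
    API_URL = "https://api.energidataservice.dk/dataset/ConsumptionDE35Hour"

    response = requests.get(API_URL, params={"limit": 5}, timeout=10)
    response.raise_for_status()

    for record in response.json()["records"]:
        # Each record mirrors the 4 attributes described above.
        print(
            record["HourUTC"],            # UTC datetime of the observation
            record["PriceArea"],          # DK1 or DK2
            record["ConsumerType_DE35"],  # Industry Code DE35
            record["TotalCon"],           # total consumption in kWh
        )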

A screenshot from our web app showing how we forecasted the energy consumption for area = 1 and consumer_type = 212 [Image by the Author].

The data points have an hourly resolution. For example: "2023-04-15 21:00Z", "2023-04-15 20:00Z", "2023-04-15 19:00Z", etc.

We will model the data as multiple time series: each unique (price area, consumer type) tuple represents its own time series.

Thus, we will build a model that independently forecasts the energy consumption for the next 24 hours for every time series.
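
As a minimal sketch of this modeling strategy, assume a pandas DataFrame with columns named datetime_utc, area, consumer_type, and energy_consumption (the actual column names in the course code may differ). Each (area, consumer_type) group is treated as an independent series and forecasted on its own; the fit_and_forecast helper below is a hypothetical stand-in for the real model trained earlier in the course:

    import pandas as pd

    def fit_and_forecast(series: pd.Series, horizon: int = 24) -> pd.Series:
        # Hypothetical stand-in for the real model: naively repeats the
        # last observed value for the next `horizon` hours.
        future_index = pd.date_range(
            series.index[-1], periods=horizon + 1, freq="h"
        )[1:]
        return pd.Series(series.iloc[-1], index=future_index)

    # Assumed schema: datetime_utc, area, consumer_type, energy_consumption.
    df = pd.read_parquet("energy_consumption.parquet")  # hypothetical local copy

    forecasts = {}
    for (area, consumer_type), group in df.groupby(["area", "consumer_type"]):
        series = group.set_index("datetime_utc")["energy_consumption"].sort_index()
        # One independent 24-hour forecast per (area, consumer_type) series.
        forecasts[(area, consumer_type)] = fit_and_forecast(series, horizon=24)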

Check out the video below to better understand what the data looks like.
