Google Cloud Alternatives to Cloud Composer

Author:Murphy  |  View: 27379  |  Time: 2025-03-23 18:56:20

Opinion

Image By Author, Fire Logo Designed By Freepik¹

Did you know that as a Google Cloud user, there are many services to choose from to orchestrate your jobs? For batch jobs, Cloud Composer has long been the natural choice. However, it does not have to stay that way. This article introduces two alternatives to Cloud Composer for job orchestration in Google Cloud.

The main topics of this article are as follows:

Use Cases where Cloud Composer Shines

Two Alternatives to Cloud Composer

Strengths And Weaknesses Benchmark

Summing Up

Use Cases where Cloud Composer Shines

A job orchestrator needs to satisfy a few requirements to qualify as such. Personally, I expect to see at least 3 things in a job orchestrator:

  • Firstly, an orchestrator must be able to orchestrate any group of tasks with dependencies between them, no matter what job the tasks perform
  • Secondly, an orchestrator must support sharing data between the tasks of a job
  • Thirdly, an orchestrator must allow both recurring and on-demand job execution
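To make the first two criteria concrete, here is a minimal sketch in plain Python. It is not how Cloud Composer or any other product works internally; the task names (`extract`, `transform`, `load`) and the `run_job` helper are hypothetical and exist only for illustration. Tasks run in dependency order, and each task can read the results of the tasks before it. The third criterion (scheduling) is deliberately left out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_job(tasks, dependencies):
    """Run every task after its upstream tasks and share results between them.

    tasks:        task name -> callable taking the dict of results so far
    dependencies: task name -> set of upstream task names
    """
    results = {}
    order = []
    # static_order() yields each task only after all of its upstream tasks.
    for name in TopologicalSorter(dependencies).static_order():
        results[name] = tasks[name](results)
        order.append(name)
    return order, results


# Hypothetical three-step job: each task reads what the previous one produced.
tasks = {
    "extract": lambda results: [1, 2, 3],
    "transform": lambda results: [x * 10 for x in results["extract"]],
    "load": lambda results: sum(results["transform"]),
}
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

order, results = run_job(tasks, dependencies)
print(order)
print(results["load"])
```

A real orchestrator adds retries, logging, scheduling and distributed execution on top of this core idea.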

Cloud Composer satisfies the 3 aforementioned criteria and more. It is a powerful, fully fledged orchestrator based on Apache Airflow, which supports features such as backfill, catchup, task rerun, and dynamic task mapping.

Power is dangerous. Power attracts the worst and corrupts the best (Edward Abbey)

Power is dangerous. The statement holds true for Cloud Composer. I'd always advise trying simpler solutions first (more on them in the next sections) and keeping Cloud Composer for complex cases. In my opinion, the following are some situations where using Cloud Composer is completely justified:

  1. You need to run a large scale job orchestration system with hundreds or thousands of jobs
  2. You have jobs with complex and/or dynamic dependencies between the tasks. For instance, the final structure of your jobs depends on the outputs of the first tasks in the job.
  3. You have tasks with non-trivial trigger rules and constraints. For instance, you want a task to trigger as soon as any of its upstream tasks has failed.
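The trigger rule in the third situation can be sketched in a few lines of plain Python. This is an illustration only, not Airflow code: the `State` enum and the `should_trigger` function are hypothetical, and the rule names simply mirror Airflow's `all_success` and `one_failed` trigger rules.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILED = "failed"


def should_trigger(rule, upstream_states):
    """Decide whether a task may start, given the states of its upstream tasks."""
    if rule == "all_success":
        # The usual default: wait until every upstream task has succeeded.
        return all(state is State.SUCCESS for state in upstream_states)
    if rule == "one_failed":
        # Fire as soon as any upstream task has failed, even if others are still pending.
        return any(state is State.FAILED for state in upstream_states)
    raise ValueError(f"Unknown trigger rule: {rule}")


# One upstream task failed while another is still pending.
upstream = [State.PENDING, State.FAILED]
print(should_trigger("one_failed", upstream))
print(should_trigger("all_success", upstream))
```

Rules like `one_failed` are typically used for alerting or clean-up tasks, which is the kind of logic that simpler orchestrators do not always express as directly.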

Two Alternatives to Cloud Composer

There are simpler solutions to consider when looking for a job orchestrator in Google Cloud.

Alternative 1: Vertex AI Pipelines

Example of Pipeline Run in Vertex AI Pipeline, Image By Author

Vertex AI Pipelines is a job orchestrator based on Kubeflow Pipelines (which is itself based on Kubernetes). It is a serverless product, meaning that there are no virtual machines or clusters to create. Although the orchestrator was originally used for Machine Learning (ML) pipelines, it is generic enough to adapt to any type of job. In my opinion, binding Vertex AI Pipelines (and more generally Kubeflow Pipelines) to ML is a cliché that adversely affects the popularity of the solution.

Alternative 2: Cloud Workflows (+ Cloud Scheduler)

Example of Workflow Run in Cloud Workflows, Image By Author

Cloud Workflows is a serverless, lightweight service orchestrator. It has 2 major requirements:

  1. The tasks to orchestrate must be HTTP based services (Cloud Functions or Cloud Run are used most of the time)
  2. The scheduling of the jobs is externalized to Cloud Scheduler

People often use it to orchestrate APIs or microservices, thus avoiding monolithic architectures.
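As a rough illustration of the first requirement, the sketch below builds a two-step workflow definition in Python and prints it as JSON, a format that Cloud Workflows accepts alongside YAML. The service URLs, step names and the `build_workflow` helper are hypothetical; the `auth` block assumes the target services require OIDC authentication, which may not match your setup.

```python
import json

# Hypothetical endpoints: replace with your own Cloud Run or Cloud Functions URLs.
EXTRACT_URL = "https://extract-service.example.com/run"
LOAD_URL = "https://load-service.example.com/run"


def build_workflow(extract_url, load_url):
    """Return a workflow definition that chains two HTTP calls."""
    return {
        "main": {
            "steps": [
                # Step 1: call the first service and keep its response.
                {"extract": {
                    "call": "http.get",
                    "args": {"url": extract_url, "auth": {"type": "OIDC"}},
                    "result": "extract_result",
                }},
                # Step 2: pass the first response body to the second service.
                {"load": {
                    "call": "http.post",
                    "args": {
                        "url": load_url,
                        "auth": {"type": "OIDC"},
                        "body": "${extract_result.body}",
                    },
                    "result": "load_result",
                }},
                # Step 3: return the final response as the workflow output.
                {"finish": {"return": "${load_result.body}"}},
            ]
        }
    }


workflow = build_workflow(EXTRACT_URL, LOAD_URL)
workflow_json = json.dumps(workflow, indent=2)
print(workflow_json)
```

The printed JSON can be saved to a file and used as the workflow source. Note that nothing in the definition says when the workflow runs: as per the second requirement, that part belongs to Cloud Scheduler.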


Strengths And Weaknesses Benchmark

When the time comes to choose between many options, it is usually a good idea to rank them according to well-defined success criteria. I've chosen 4 criteria here (0: bad – 2: average – 5: good):

  1. Simplicity: How simple is it for a team to learn and use the solution?
  2. Maintainability: How easy is it to make changes to the workflows after they are created?
  3. Scalability: How stable does the solution remain when the number of workflows increases?
  4. Cost

Note: Please be aware that the criteria and the evaluations are subjective and only represent my point of view.

Comparison of the solutions, Image By Author

With its steep learning curve, Cloud Composer is not the easiest solution to pick up. That being said, Cloud Workflows does not have any processing capability of its own, which is why it is always used in combination with other services like Cloud Functions or Cloud Run. In addition, scheduling has to be taken care of by Cloud Scheduler.

As for maintainability and scalability, Cloud Composer comes out on top because of how far it scales and because the system is very observable, with detailed logs and metrics available for all components. On these two criteria, Cloud Composer is closely followed by Vertex AI Pipelines.

As far as cost is concerned, Cloud Composer is the most expensive, with Cloud Workflows easily winning the battle as the cheapest solution among the three.


Summing Up

Depending on your job orchestration needs, there might be a more appropriate solution in Google Cloud than Cloud Composer. On the one hand, Cloud Workflows is much cheaper and meets all the basic requirements for a job orchestrator. On the other hand, Vertex AI Pipelines is more closely tied to Kubernetes and will probably be easier to pick up for teams that already have a good knowledge of Kubernetes. Thank you for your time and stay tuned for more.

[1] www.freepik.com

Tags: Airflow Google Cloud Platform Google Cloud Workflows Kubeflow Pipelines
