A Complete Guide to Master Step Functions on AWS
Motivation
Building and maintaining a complex system remains a common challenge in industry, especially when working with cloud infrastructure.
Fortunately, a variety of tools such as AWS Step Functions have been developed to streamline the orchestration of the workflows behind such systems, regardless of their complexity.
This article provides a step-by-step guide to using Step Functions, from setting up a cloud account to implementing a real-life scenario.
What is AWS Step Functions?
AWS Step Functions is a serverless orchestration service whose main purpose is to create visual workflows. It enables seamless coordination of AWS Lambda functions and other AWS resources.
Step Functions can be integrated with Amazon EC2, Amazon ECS, on-premises servers, Amazon API Gateway, and Amazon SQS queues, to name a few. This versatility makes AWS Step Functions suitable for a wide range of applications.
Getting Started with AWS Step Functions
This section covers the steps to get started with AWS Step Functions, from understanding the building blocks to navigating the AWS Step Functions interface.
Building Blocks of Step Functions
Step Functions is based on two building blocks: (1) state machines and (2) tasks. Let's understand these concepts through an example.
State Machines
: A workflow that defines the sequence of events, conditional logic, and the overall flow of execution of tasks.
Tasks
: A unit of work that takes an input and generates an output (e.g., querying a database or making an API call).
Let's consider a use case that performs a daycare registration process using AWS Step Functions.
Before diving into the process of leveraging Step Functions, let's understand the overall steps:
Collect Registration data
: this first step collects registration information from the parents. Once submitted, a Lambda function is triggered to send that information to the next step in the workflow.
Verify Registration data
: this step checks that all required information is provided by the parents. If the verification is successful, the workflow proceeds to the next step. Otherwise, a meaningful error message is returned to the parents.
Check Availability
: this step checks the availability of spots in the daycare. If there is availability, the workflow proceeds to the next step. Otherwise, a meaningful error message is returned to the parents.
Confirm Registration
: this final step confirms the registration and sends a confirmation message to the parents, including relevant details about the start date and fees.
This underlying workflow is illustrated below. It has a State Machine with four main tasks, which are all self-explanatory.
checkInformation
checkAgeRange
checkSpotsAvailability
confirmRegistration

Building Your First AWS Step Function
Now, let's dive into the technical implementation, from setting up the prerequisites to implementing an end-to-end workflow and deploying it.
Prerequisites to Implementing Step Functions
Before diving into the details of the use case, let's first go over the prerequisites required for a successful implementation:
AWS account
: needed to access AWS services; one can be created from the AWS website.
Knowledge of JSON
: a basic understanding of JSON is required to understand the input and output data format of Step Functions.
Navigating the AWS Step Functions Interface
Before diving into the core features of our use case, let's get familiar with the Step Functions interface. After logging into your AWS account, follow these four steps:
- Type the "Step Functions" keyword in the search bar from the top.
- Choose the corresponding icon from the results.
- Hit the "Get Started" icon to start creating the first step function.
- Finally, since we want to create our own state machine, select the "Create your own" tab.

After the fourth step, we can start designing the state machine with the help of the "Design" tab, which contains three main functionalities: "Actions", "Flow", and "Patterns."

Actions
: individual operations that can be performed within a workflow. Each one corresponds to a specific AWS service.
Flow
: the control flow that specifies the execution path of the state machine. It includes "Choice" for branching logic, "Parallel" for concurrent execution paths, "Map" for iterating over a collection, "Pass" as a no-operation or state data enricher, "Wait" for time delays, "Success" to end a workflow successfully, and "Fail" to end it due to an error.
Patterns
: pre-defined templates that make it easier to build complex state machines.
Creating the Workflow
Four Lambda functions are required to implement the above daycare workflow.
The following seven steps show how to create a Lambda function. The process is the same for all four; only the content of the functions differs.
The overall code of the article is available on the GitHub page. Even though the code is easy to understand, it is highly recommended to follow the whole content of this article for a better learning experience.

After the completion of the seven steps, the following window should appear, showing important information such as:
- The name of the function
- Its Amazon Resource Name (ARN) link, and
- The integrated area to implement the function's logic.

checkInformation Lambda function
Now, repeat the same process for the remaining three tasks (functions): checkAgeRange, checkSpotsAvailability, and confirmRegistration.
An example of the input JSON is given below. It's important to understand it since it affects the way the functions are implemented.
- The JSON contains information about the child being registered, including its first name, last name, and date of birth.
- It also includes details about the parents, the days of the week the child will be attending the daycare, and any additional information.
{
  "registration_info": {
    "child": {
      "firstName": "Mohamed",
      "lastName": "Diallo",
      "dateOfBirth": "2016-07-01"
    },
    "parents": {
      "mother": {
        "firstName": "Aicha",
        "lastName": "Cisse",
        "email": "[email protected]",
        "phone": "123-456-7890"
      },
      "father": {
        "firstName": "Ibrahim",
        "lastName": "Diallo",
        "email": "[email protected]",
        "phone": "098-765-4321"
      }
    },
    "daysOfWeek": [
      "Monday",
      "Tuesday",
      "Wednesday",
      "Thursday",
      "Friday"
    ],
    "specialInstructions": "Mohamed has a peanut allergy."
  }
}
Each lambda function is described below:

The underlying implementation of each function is provided below:
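checkInformation function
import json


def checkInformation(event, context):
    # The raw registration form arrives under the 'registration_info' key.
    registration_info = event['registration_info']
    required_fields = ['child', 'parents', 'daysOfWeek']

    # Reject the registration as soon as a required field is missing.
    for field in required_fields:
        if field not in registration_info:
            return {
                'statusCode': 400,
                'body': f'Missing required field: {field}'
            }

    # Pass the registration on as a JSON string for the next task.
    return {
        'statusCode': 200,
        'body': json.dumps(registration_info)
    }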
checkAgeRange function
import json
import datetime


def checkAgeRange(event, context):
    # The previous task returned the registration as a JSON string in 'body'.
    registration_info = json.loads(event['body'])
    dob = registration_info['child']['dateOfBirth']
    today = datetime.date.today()
    dob_date = datetime.datetime.strptime(dob, '%Y-%m-%d').date()

    # Subtract one year if this year's birthday has not happened yet.
    age = today.year - dob_date.year - ((today.month, today.day) < (dob_date.month, dob_date.day))

    # The daycare accepts children aged 2 to 5 inclusive.
    if age < 2 or age > 5:
        return {
            'statusCode': 400,
            'body': json.dumps('Child is not within the acceptable age range for this daycare.')
        }

    # Store the computed age so that later tasks can reuse it.
    registration_info['child']['age'] = age
    return {
        'statusCode': 200,
        'body': json.dumps(registration_info)
    }
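The age formula above relies on a small trick: the tuple comparison evaluates to True (counted as 1) when this year's birthday has not happened yet, so one year is subtracted. Here is a minimal sketch that isolates this logic. The helper name age_on is hypothetical, and it takes the reference date as a parameter (instead of calling datetime.date.today()) so that the result is reproducible.

```python
import datetime


def age_on(dob: datetime.date, today: datetime.date) -> int:
    """Return the age in completed years on `today` (hypothetical helper)."""
    # True (counted as 1) when the birthday has not yet occurred this year.
    birthday_pending = (today.month, today.day) < (dob.month, dob.day)
    return today.year - dob.year - birthday_pending


dob = datetime.date(2016, 7, 1)
print(age_on(dob, datetime.date(2023, 6, 30)))  # the day before the 7th birthday: 6
print(age_on(dob, datetime.date(2023, 7, 1)))   # on the 7th birthday: 7
```

Passing the reference date explicitly also makes the age check easy to unit test, which is harder when the function reads the current date itself.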
checkSpotsAvailability function
import json


def checkSpotsAvailability(event, context):
    # The previous task returned the registration as a JSON string in 'body'.
    registration_info = json.loads(event['body'])
    spots_available = 20  # This should be dynamically determined, not hardcoded

    if spots_available <= 0:
        return {
            'statusCode': 400,
            'body': json.dumps('No spots available in the daycare.')
        }

    return {
        'statusCode': 200,
        'body': json.dumps(registration_info)
    }
confirmRegistration function
import json
import datetime


def confirmRegistration(event, context):
    registration_info = json.loads(event['body'])
    age = registration_info['child']['age']  # This was added in the checkAgeRange function

    # Monthly fees decrease as the child gets older.
    if 2 <= age < 3:
        fees = 800
    elif 3 <= age < 4:
        fees = 750
    elif 4 <= age < 5:
        fees = 700
    else:  # age >= 5
        fees = 650

    # The child starts two weeks after the registration is confirmed.
    start_date = datetime.date.today() + datetime.timedelta(weeks=2)
    confirmation_details = {
        'fees': fees,
        'start_date': start_date.isoformat()
    }

    # Merge the confirmation details into the registration data.
    response = {**registration_info, **confirmation_details}
    return {
        'statusCode': 200,
        'body': json.dumps(response)
    }
With all this in place, we can start creating our daycare state machine using the Step Functions graphical interface.
The final state machine is given below. Let's walk through the major steps that led to this workflow:

Before we dive in, it is important to note that the statusCode field from the output of a Lambda function is used to determine the next state in the state machine.
- If the value is 200, it means that the check was successful, and we proceed to the next step.
- If the statusCode is 400, then the check failed, in which case we return the relevant message depending on the function that executed the underlying task.
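This routing rule can be tried locally before touching the console. The sketch below chains handlers in order and stops at the first one whose statusCode is not 200, which is roughly how the state machine behaves. The names run_workflow, always_ok, and always_rejects are hypothetical and serve only as an illustration; in practice the list would hold the four functions defined earlier.

```python
import json


def run_workflow(handlers, event):
    """Call each handler in order; stop at the first non-200 statusCode."""
    result = event
    for handler in handlers:
        # The output of one task becomes the input of the next one.
        result = handler(result, None)  # the Lambda context is unused here
        if result["statusCode"] != 200:
            return {"status": "FAILED", "failed_at": handler.__name__, "output": result}
    return {"status": "SUCCEEDED", "output": result}


# Hypothetical stand-ins for the real Lambda functions.
def always_ok(event, context):
    return {"statusCode": 200, "body": json.dumps({"step": "ok"})}


def always_rejects(event, context):
    return {"statusCode": 400, "body": json.dumps("rejected")}


print(run_workflow([always_ok, always_ok], {})["status"])
print(run_workflow([always_ok, always_rejects, always_ok], {})["failed_at"])
```

A local runner like this only approximates the service: it does not model retries, timeouts, or the input and output path processing that Step Functions applies between states.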
Check Information
- The state machine starts at this step.
- A lambda function is invoked to check if all the required information is present in the registration form.
- If the information is complete, the process moves to the next step. If not, it ends with a fail state notifying that the information is incomplete.
Check Age Range
- This step is reached only if the information check is successful.
- Another lambda function is invoked to check if the child's age falls within the acceptable range for the daycare.
- If the age is within the range, the process moves to the next step. If not, it ends with a fail state notifying that the age is invalid.
Check Spots Availability
- This step is reached only if the age check was successful.
- A lambda function is invoked to check if there are available spots in the daycare.
- If there are spots available, the process moves to the next step. If not, it ends with a fail state notifying that there are no spots available.
Confirm Registration
- This is the final step and is reached only if there are spots available in the daycare.
- A Lambda function is invoked to confirm the registration and calculate the fees based on the child's age.
- The process ends after this step with a success state, confirming the registration.
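The four steps above can also be written down as an Amazon States Language definition, which is what the graphical interface produces behind the scenes. The sketch below builds one as a Python dictionary and prints it as JSON. It is an approximation rather than the exact definition used in this article: the state names, error names, and the REGION and ACCOUNT_ID placeholders in the ARNs are assumptions to replace with your own values.

```python
import json

# Placeholder ARN pattern: REGION and ACCOUNT_ID must be replaced with real values.
LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:{name}"


def task(function_name, next_state):
    """Task state that invokes a Lambda function and keeps only its return value."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {
            "FunctionName": LAMBDA_ARN.format(name=function_name),
            "Payload.$": "$",  # forward the whole state input to the function
        },
        "OutputPath": "$.Payload",  # unwrap the function's return value
        "Next": next_state,
    }


def choice(on_success, on_failure):
    """Choice state that routes on the statusCode returned by the previous task."""
    return {
        "Type": "Choice",
        "Choices": [
            {"Variable": "$.statusCode", "NumericEquals": 200, "Next": on_success}
        ],
        "Default": on_failure,
    }


def fail(error, cause):
    """Fail state that ends the execution with an error name and a cause."""
    return {"Type": "Fail", "Error": error, "Cause": cause}


definition = {
    "Comment": "Daycare registration workflow (illustrative sketch)",
    "StartAt": "Check Information",
    "States": {
        "Check Information": task("checkInformation", "Information Complete?"),
        "Information Complete?": choice("Check Age Range", "Incomplete Information"),
        "Incomplete Information": fail("IncompleteInformation", "Required fields are missing."),
        "Check Age Range": task("checkAgeRange", "Age In Range?"),
        "Age In Range?": choice("Check Spots Availability", "Invalid Age"),
        "Invalid Age": fail("InvalidAge", "Child is outside the accepted age range."),
        "Check Spots Availability": task("checkSpotsAvailability", "Spots Available?"),
        "Spots Available?": choice("Confirm Registration", "No Spots Available"),
        "No Spots Available": fail("NoSpotsAvailable", "The daycare is full."),
        "Confirm Registration": task("confirmRegistration", "Registration Confirmed"),
        "Registration Confirmed": {"Type": "Succeed"},
    },
}

print(json.dumps(definition, indent=2))
```

The printed JSON can be compared with the definition shown in the code view of the state machine editor. Each Choice state sends any statusCode other than 200 to its Fail state through "Default".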
To learn more about Lambda functions, Streaming Data with AWS Kinesis and Lambda teaches how to work with streaming data using serverless technologies on AWS.
Create IAM Roles
The next step is to define the IAM role that allows the state machine to invoke our Lambda functions. This is done by following these steps:
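As a rough sketch of what such a role contains, assuming the common setup of a trust policy plus an invoke permission, the two policy documents can be written as Python dictionaries. The role created through the console may differ (for example, it may attach a managed policy instead), and the REGION and ACCOUNT_ID placeholders in the ARNs are assumptions to replace with your own values.

```python
import json

# Trust policy: allows the Step Functions service to assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "states.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# The four Lambda functions used by the daycare workflow.
FUNCTION_NAMES = [
    "checkInformation",
    "checkAgeRange",
    "checkSpotsAvailability",
    "confirmRegistration",
]

# Placeholder ARN pattern: REGION and ACCOUNT_ID must be replaced with real values.
LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:{name}"

# Permissions policy: allows the role to invoke only these four functions.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": [LAMBDA_ARN.format(name=name) for name in FUNCTION_NAMES],
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(permissions_policy, indent=2))
```

Listing the function ARNs explicitly, rather than using a wildcard resource, keeps the role limited to what this workflow needs.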

This IAM role can be assigned to the state machine as follows, starting from the "Config" tab.

After saving, the following message should appear, confirming that everything went well.

Once we are satisfied with the state machine, the next step is to create it using the "Create" button located at the top right.

Deploying and Testing Your Workflow
Our workflow has been deployed, and now it is time to test the state machine. We will test two scenarios:
- A failure case with an invalid age range, in which the child we are trying to register is more than 5 years old. This corresponds to the initial JSON.
- A success case where the child is 3 years old.
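The two test inputs can be prepared with a short script. The sketch below is only illustrative: build_input, dob_for_age, and start_test_execution are hypothetical helper names, and the parents' details are trimmed to their names for brevity. The last function shows how an execution could be started with boto3's start_execution call instead of the console. It is not called here because it needs AWS credentials and the ARN of your state machine.

```python
import datetime
import json


def build_input(date_of_birth):
    """Build a registration payload with the given date of birth (YYYY-MM-DD)."""
    return {
        "registration_info": {
            "child": {
                "firstName": "Mohamed",
                "lastName": "Diallo",
                "dateOfBirth": date_of_birth,
            },
            "parents": {
                "mother": {"firstName": "Aicha", "lastName": "Cisse"},
                "father": {"firstName": "Ibrahim", "lastName": "Diallo"},
            },
            "daysOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
        }
    }


def dob_for_age(age, today=None):
    """Return a date of birth for which the child is `age` years old today."""
    today = today or datetime.date.today()
    # A January 1st birthday has always passed, whatever today's date is.
    return datetime.date(today.year - age, 1, 1).isoformat()


failure_input = build_input("2016-07-01")     # older than 5 years
success_input = build_input(dob_for_age(3))   # 3 years old


def start_test_execution(state_machine_arn, payload):
    """Start an execution of the state machine (requires AWS credentials)."""
    import boto3  # third-party AWS SDK, imported lazily so the rest runs without it

    client = boto3.client("stepfunctions")
    return client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),  # the input must be a JSON string
    )


print(json.dumps(success_input, indent=2))
```

Computing the date of birth from today's date keeps the success case valid over time, whereas a fixed date would eventually fall outside the 2 to 5 age range.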


Conclusion
This short tutorial provided a brief overview of AWS Step Functions and how it can help in orchestrating a real world workflow.
I hope this short tutorial helped you acquire new skill sets.