Collecting Data with Apache Airflow on a Raspberry Pi

Often, we need to collect data over an extended period of time. It can be readings from an IoT sensor, statistics from social networks, or something else. For example, the YouTube Data API lets us get the current number of views and subscribers for any channel, but analytics and historical data are available only to the channel owner. Thus, if we want weekly or monthly summaries for these channels, we need to collect the data ourselves. In the case of an IoT sensor, there may be no API at all, so we again need to collect and save the data on our own. In this article, I will show how to configure Apache Airflow on a Raspberry Pi, which lets us run tasks over a long period of time without involving any cloud provider.
Obviously, if you work for a large company, you probably won't need a Raspberry Pi. In that case, if you need an extra cloud instance, just create a Jira ticket for your MLOps department.