Real-Time Analytics Solution for Usage-Based API Billing and Metering

Disclaimer: The author of this article is a Developer Advocate at Redpanda, which is a critical component of the solution discussed. The author also brings prior expertise in API Management and Apache Pinot to the table. Hence, the proposed solution is a combination of these technologies aimed at solving a prevalent problem.
An API business refers to a company that packages its services or functionalities as a set of API (Application Programming Interface) products. These APIs can be sold to new and existing customers, who can then integrate these functionalities into their own applications. The company can generate revenue by charging these customers based on their usage of the APIs.
A company operating an API business needs a data infrastructure component to track API call volume and bill consumers accordingly.
In this post, I present a reference architecture for building a real-time API usage tracking solution using Apache APISIX, Redpanda, and Apache Pinot. This post focuses more on the "why" rather than the "how". Consider this as a solution design exercise, not a detailed tutorial. I aim to help you extract the solution pattern as a blueprint and make it reusable in future projects.
Without further ado, let's begin.
APIs and API Management
If you're new to APIs and API Management concepts, let me provide a quick introduction first.
An API is a business capability delivered to consumers over a network.
In a digital business, APIs allow for programmatic access to business operations, eliminating humans in the loop. These business operations can range from creating orders to transferring funds to updating customer addresses in the CRM.
A typical API environment in a business has three parties involved:
- Backend systems – These are the systems that run business operations.
- Consumers – First-party and third-party applications that wish to consume business operations.
- API proxy – The intermediary that sits between consumers and backend systems.

The role of the API is to separate internal business systems from consumers, thereby eliminating the need for consumers to handle the complexity of backend systems. In this way, it acts as an abstraction layer. APIs work across various communication protocols, with HTTP being the most commonly used. This is where you get to see RESTful and GraphQL API styles.
In production, organizations often use a full lifecycle API Management platform to manage the different stages of an API's lifecycle, such as API proxy design, deployment, runtime policy enforcement, and monitoring. An API Management platform bundles dedicated components for each stage. In the context of this article, we will use Apache APISIX, an open-source API Management platform distributed under the Apache License.

That being said, the solution we are building here is not tightly coupled with APISIX. It can be integrated with most full lifecycle API Management products available in the market, provided they offer a suitable interface for integration.
Making money with APIs
Next, let's see how we can make money with APIs. That is, figuring out a monetization strategy to charge consumers based on their usage.
Let me take a realistic example for a better explanation.
Consider a real estate valuation company that offers instant property valuations to home buyers and sellers. These valuations are based on simple factors such as postcode, property type, and area. Currently, the company only provides a web-based user interface.

To scale business operations, attract more customers, and break into new market segments, this company decides to become an API business. Meaning, they will package the valuation engine as a set of API products, sell them to new and existing consumers, and bill them based on their API invocation usage.
This is achieved by first isolating the valuation engine and placing it behind an API Management platform. This allows consumers to access its functionality through a set of APIs.

The valuation API will attract potential consumers from various domains, including but not limited to:
- Real estate companies – Provide accurate valuation figures to home buyers.
- Banks – Value a house before approving a mortgage.
- Insurance providers – Produce more accurate quotes for home and contents insurance.
- Government agencies – Calculate property taxes more easily.
API monetization model
So, how does this company make money? Package the valuation API as a set of API products and sell them under subscription tiers!
A subscription tier grants a consumer a quota of API calls per month. If the quota is exceeded, the consumer is either throttled or charged for the excess usage.
For example, the valuation API can be offered with three subscription tiers as follows.
- Bronze with 10k requests per month
- Gold with 100k requests per month
- Platinum with unlimited requests per month

Consumers can subscribe to the API based on their anticipated usage, choosing from various tiers. At the end of the month, the company will bill consumers based on their actual usage.
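To make the tier arithmetic concrete, here is a minimal sketch of how an end-of-month charge could be computed from a consumer's measured call volume. The prices and overage rates are entirely hypothetical placeholders; the point is simply that billing reduces to comparing usage against the subscribed quota.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Tier:
    monthly_quota: Optional[int]   # None means unlimited (Platinum)
    base_price: float              # flat monthly fee (hypothetical figure)
    overage_per_1k: float          # charge per 1,000 calls beyond the quota

# Hypothetical pricing; real figures would come from the product team.
TIERS = {
    "bronze": Tier(monthly_quota=10_000, base_price=49.0, overage_per_1k=5.0),
    "gold": Tier(monthly_quota=100_000, base_price=299.0, overage_per_1k=3.0),
    "platinum": Tier(monthly_quota=None, base_price=999.0, overage_per_1k=0.0),
}

def monthly_charge(tier_name: str, invocations: int) -> float:
    """Base fee plus overage for calls beyond the tier's monthly quota."""
    tier = TIERS[tier_name]
    if tier.monthly_quota is None or invocations <= tier.monthly_quota:
        return tier.base_price
    excess = invocations - tier.monthly_quota
    return tier.base_price + (excess / 1_000) * tier.overage_per_1k

print(monthly_charge("bronze", 12_500))   # 49.0 + 2.5 * 5.0 = 61.5
```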
Our goal is to find an efficient and reliable way to measure API usage for each consumer.
Planning the solution
Now that we understand the problem we are trying to solve, let me walk you through a couple of design decisions before diving into the implementation.
KPI metrics
The first step is identifying the KPIs or metrics we expect from the solution. I'm particularly interested in the following five.
- API usage – API invocations per consumer over time
- API latency – End-to-end latency to spot slow APIs
- Unique users – How many distinct consumers invoke each API?
- Geographical usage distribution – Where are most API users coming from?
- Error count – A rise in invocation faults means something is wrong with the backend
Ideally, I'd love to see all these visualized on a dashboard like this.

Stakeholders
The second design decision is the solution stakeholders – to whom the solution should deliver these metrics. There are primarily three parties.
Customers and partners – Consumers would love to see their quota usage on a real-time dashboard, along with a billing estimate.
API operations team – This team manages the API Management infrastructure and is particularly interested in the health of the APIs, such as latency, throughput, and errors.
API product team – This team owns the API products. They want to run ad-hoc experiments to see which APIs are most popular, consumer demographics, etc.

Batch or real-time?
As the final design decision, I would make an 80:20 split between real-time and batch metrics.
Data loses its value over time. The sooner we process it, the sooner we can take appropriate action. So we will make the API traffic and health metrics real-time.
Consider a situation where a consumer API key is compromised. A rogue API client uses the stolen key or authentication token to invoke APIs on behalf of the consumer. The system could detect this sudden spike in traffic from the API key, identify it as an anomaly, and block the key while alerting the consumer. Upon receiving the alert, the consumer can immediately regenerate the API keys to minimize costs.
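As a rough illustration of that detection logic, the sketch below flags an API key whose current per-minute call count is far above its trailing average. The window size and multiplier are arbitrary placeholders; a production setup would tune these per consumer or use a proper anomaly detection model downstream of the analytics store.

```python
from collections import deque
from typing import Deque

def is_traffic_spike(recent_counts: Deque[int], current_count: int,
                     multiplier: float = 5.0, min_calls: int = 100) -> bool:
    """Flag an API key if the current window's call count is far above
    its trailing average. Thresholds are arbitrary placeholders."""
    if not recent_counts or current_count < min_calls:
        return False
    baseline = sum(recent_counts) / len(recent_counts)
    return current_count > multiplier * max(baseline, 1.0)

# Per-minute call counts for one API key over the last 10 minutes.
history: Deque[int] = deque([120, 95, 110, 130, 105, 98, 115, 125, 101, 118], maxlen=10)
print(is_traffic_spike(history, 2_400))   # True: alert the consumer and block the key
```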
However, not every use case requires real-time processing. Some use cases naturally fit well for batch processing, such as:
- Monthly usage-based billing reports for customers.
- Weekly API health reports for the operations team.
- Daily API traffic reports for the product team.
Implementation
We have now reached the midpoint of this article, where I present the following solution architecture based on our discussion so far.

I know the diagram is crowded and there are many unknown technologies in there. So, let me break it down into three layers and discuss each in detail.
Data collection
As I mentioned before, an API Management system has several moving parts: lifecycle operations such as API design, plus runtime aspects like traffic shaping, authentication, and subscription management, among others.

However, the component we are most interested in is the API gateway, since all API traffic flows through it on its way to the backend.
Our first task is to identify a touch point in the API gateway. This will allow us to collect API requests and responses going back and forth. We will then build a data pipeline to move this information to an analytical data store in real-time, facilitating future queries.

However, in this write path, writing data directly to the underlying datastore could introduce several problems:
- Tight coupling between the APIM system and the analytics infrastructure – You could end up rewriting a significant portion of the analytics infrastructure when switching to a new APIM vendor in the future.
- Synchronous writes – Both systems must be available at the same time, making it difficult to take the analytics system down for maintenance.
- Coupled scaling – Sudden traffic spikes on the API gateway would overwhelm the analytics system, forcing both systems to scale in unison.
This prompts us to put some kind of buffer in between, decoupling APIM from the analytics infrastructure. A streaming data platform like Apache Kafka is a great fit here, as it can ingest high-throughput data streams from the API gateway with low latency.
We will use Redpanda, a Kafka API-compatible streaming data platform, as it offers better performance and operational simplicity than Kafka. However, if you prefer to stick with Kafka, that's fine; the solution works with either technology.
With Redpanda in the middle, our data pipeline looks like this:

The addition of Redpanda decoupled both systems and made the write path asynchronous. This enables the analytics system to go offline for maintenance and resume from where it left off. Furthermore, Redpanda can absorb sudden traffic spikes, preventing the analytics system from becoming overwhelmed and needing to scale to match the API gateway.
Now the question is how to connect APISIX to Redpanda. Luckily, APISIX provides a built-in data sink for Kafka. Whenever an API request is made to the gateway and the response is returned, this sink publishes a record to a Kafka topic in real-time. We can use this sink with Redpanda since Redpanda is Kafka API compatible.
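As a rough sketch, enabling that sink on a route boils down to attaching the kafka-logger plugin through the APISIX Admin API and pointing it at a Redpanda broker. The hosts, ports, admin key, topic name, and upstream below are placeholders, and the plugin's fields differ slightly across APISIX versions (older releases use broker_list instead of brokers), so check the documentation for your version.

```python
import requests

ADMIN_API = "http://127.0.0.1:9180/apisix/admin"   # Admin API address; port varies by deployment
HEADERS = {"X-API-KEY": "your-admin-key"}          # placeholder admin key

route = {
    "uri": "/valuation/*",
    "upstream": {
        "type": "roundrobin",
        "nodes": {"valuation-backend:8080": 1},    # hypothetical backend service
    },
    "plugins": {
        # Publish a record to Redpanda for every request/response on this route.
        "kafka-logger": {
            "brokers": [{"host": "redpanda", "port": 9092}],
            "kafka_topic": "apisix-api-events",
            "batch_max_size": 1,                   # flush each event immediately; tune for throughput
        }
    },
}

resp = requests.put(f"{ADMIN_API}/routes/1", json=route, headers=HEADERS)
resp.raise_for_status()
print("Route configured:", resp.status_code)
```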

APISIX formats each API invocation as a JSON event and includes critical metrics like latency, HTTP status, and the timestamp. These will be mapped to relevant dimensions in the analytics data store.
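For illustration, a single invocation event looks roughly like the Python dictionary below. The exact field names depend on the APISIX version and on any custom log format you configure, so verify against a real payload before designing the schema.

```python
# Approximate shape of one API invocation event published by the gateway;
# field names are illustrative and should be confirmed against real output.
sample_event = {
    "route_id": "1",
    "consumer": "acme-bank",                          # hypothetical subscribing consumer
    "client_ip": "203.0.113.42",
    "request": {"method": "GET", "uri": "/valuation/v1/estimate"},
    "response": {"status": 200},
    "latency": 42.7,                                  # end-to-end latency in milliseconds
    "start_time": 1718000000000,                      # request timestamp, epoch milliseconds
}
```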

What if an API Management platform doesn't ship with a Kafka sink? Well, you could stream the HTTP access log of the API gateway to Kafka as an alternative, using a tool like Filebeat or an equivalent.
Analytics database
Now that we have API invocation events landing in Redpanda, the next step is to identify a suitable analytics database technology.
Could it be an OLTP database, a key-value store, or a data warehouse? Let's evaluate them based on the following expectations.
- Streaming data ingestion – Must be able to ingest from real-time data sources, like Kafka. There shouldn't be any batch data loading here. Streaming ingestion ensures a higher data freshness.
- Low-latency queries – Query latency must be within the sub-second range to satisfy user-facing analytics dashboards.
- High query throughput – Must be able to handle concurrent queries coming from user-facing analytics dashboards without compromising latency.
We pick Apache Pinot as the analytics database as it meets all the above criteria.
Apache Pinot is a real-time distributed OLAP database, designed to serve OLAP workloads on streaming data with extremely low latency and high concurrency. Pinot natively integrates with Kafka, allowing real-time ingestion from a Kafka topic as data is generated. The ingested data is indexed and stored in columnar format, enabling efficient query execution on them.
With Pinot in the architecture, our end-to-end data pipeline looks like this. Note that Pinot integrates seamlessly with Redpanda thanks to Redpanda's Kafka API compatibility.
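To sketch what that ingestion setup could look like, the snippet below registers a schema and a real-time table with the Pinot controller's REST API and points the table at the Redpanda topic. Column names follow the illustrative event above (and assume the nested request/response fields have been flattened upstream or via Pinot ingestion transforms); the controller address, topic, and replication settings are placeholders, so consult the Pinot docs for the full set of streamConfigs options.

```python
import requests

PINOT_CONTROLLER = "http://pinot-controller:9000"   # placeholder controller address

schema = {
    "schemaName": "api_events",
    "dimensionFieldSpecs": [
        {"name": "consumer", "dataType": "STRING"},
        {"name": "route_id", "dataType": "STRING"},
        {"name": "client_ip", "dataType": "STRING"},
        {"name": "status", "dataType": "INT"},
    ],
    "metricFieldSpecs": [{"name": "latency", "dataType": "DOUBLE"}],
    "dateTimeFieldSpecs": [
        {"name": "start_time", "dataType": "LONG",
         "format": "1:MILLISECONDS:EPOCH", "granularity": "1:MILLISECONDS"},
    ],
}

table = {
    "tableName": "api_events",
    "tableType": "REALTIME",
    "segmentsConfig": {
        "timeColumnName": "start_time",
        "schemaName": "api_events",
        "replication": "1",
    },
    "tableIndexConfig": {
        "loadMode": "MMAP",
        "streamConfigs": {
            "streamType": "kafka",
            "stream.kafka.topic.name": "apisix-api-events",
            "stream.kafka.broker.list": "redpanda:9092",   # Redpanda speaks the Kafka protocol
            "stream.kafka.consumer.type": "lowlevel",
            "stream.kafka.consumer.factory.class.name":
                "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
            "stream.kafka.decoder.class.name":
                "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
        },
    },
    "tenants": {},
    "metadata": {},
}

requests.post(f"{PINOT_CONTROLLER}/schemas", json=schema).raise_for_status()
requests.post(f"{PINOT_CONTROLLER}/tables", json=table).raise_for_status()
```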

Do we need a stream processor in the pipeline? Not really. But it depends on the intended use case.
Instead of using a stream processor, you can use Redpanda's Wasm-based in-broker data transformations to remove sensitive fields from the API event payload. However, a stateful stream processor like Apache Flink will add more value to the pipeline when:
- Real-time joins and enrichment are needed – You need more dimensions to be propagated to Pinot, which can be derived by joining several streams together. E.g. IP geocoding.
- Alerting – Trigger alerts and fire up downstream event-driven workflows based on anomalies in the usage.
Serving layer
Our analytics data pipeline is now complete. All pipeline components reside in the data infrastructure layer. If needed, one can access the Pinot query console to run ad-hoc SQL queries to generate metrics.
However, not every stakeholder of the solution wants to do that. We need to present metrics to each user group in a manner they find intuitive and comfortable. This is where we implement the serving layer – the last mile of analytics.

Our priority is the consumers. They need real-time dashboards visualizing their usage and billing estimates. For that, you could build Python-based data applications with a framework like Streamlit. The Pinot Python driver, pinotdb, bridges the application and the Pinot query environment.
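As a sketch of such a data application, the snippet below queries the hypothetical api_events table through pinotdb and renders a consumer's 30-day usage in Streamlit. The broker address, table, and column names carry over from the earlier examples and are assumptions, not a prescribed layout.

```python
import time

import streamlit as st
from pinotdb import connect

# Connect to a Pinot broker (host and port are placeholders).
conn = connect(host="pinot-broker", port=8099, path="/query/sql", scheme="http")
curs = conn.cursor()

consumer = st.selectbox("Consumer", ["acme-bank", "homely-insurance"])  # hypothetical consumers
cutoff_ms = int((time.time() - 30 * 24 * 3600) * 1000)                  # last 30 days, epoch millis

# Invocation count and average latency for the selected consumer.
curs.execute(f"""
    SELECT COUNT(*) AS invocations, AVG(latency) AS avg_latency_ms
    FROM api_events
    WHERE consumer = '{consumer}' AND start_time >= {cutoff_ms}
""")
invocations, avg_latency = curs.fetchone()

st.metric("API calls (last 30 days)", f"{int(invocations):,}")
st.metric("Average latency", f"{avg_latency:.1f} ms")
```

A billing estimate is then a matter of feeding that invocation count into the tier arithmetic sketched earlier.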
User groups requiring BI and ad-hoc exploration, especially the API product owners, can connect their preferred BI tools, such as Tableau and Apache Superset, through Pinot's JDBC/ODBC interfaces and connectors.
For batch workloads, Pinot can be plugged into a query federation engine, like Presto or Trino, via relevant Pinot connectors.
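For example, with a Pinot catalog mounted in Trino, the monthly billing extract could be a scheduled job along these lines, issued through the Trino Python client. The coordinator address, catalog, schema, and table names are assumptions carried over from the earlier sketches.

```python
import datetime as dt

import trino

# Connect to Trino with a Pinot catalog configured (names are placeholders).
conn = trino.dbapi.connect(
    host="trino-coordinator", port=8080, user="billing-job",
    catalog="pinot", schema="default",
)
cur = conn.cursor()

# Start of the current calendar month as epoch milliseconds (UTC).
month_start = dt.datetime.now(dt.timezone.utc).replace(
    day=1, hour=0, minute=0, second=0, microsecond=0)
month_start_ms = int(month_start.timestamp() * 1000)

# Per-consumer usage for the month, ready to hand to the billing system.
cur.execute(f"""
    SELECT consumer, COUNT(*) AS invocations
    FROM api_events
    WHERE start_time >= {month_start_ms}
    GROUP BY consumer
    ORDER BY invocations DESC
""")
for consumer, invocations in cur.fetchall():
    print(f"{consumer}: {invocations} calls this month")
```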
Wrap up
Let's wrap up the post by listing the implementation steps for the pipeline, in order:
- Provision a Redpanda cluster, create topics and set ACLs.
- Configure the Kafka sink in APISIX.
- Create Pinot schemas and tables.
- Massage data as needed.
- Create/plug dashboards.
This solution assumes a self-hosted deployment model where the tools mentioned in the architecture are hosted and managed by the business itself. However, it's important to note that the same design principles can be applied even if you choose the hosted versions of these tools. Each component in the architecture could potentially be replaced by a hosted service, making the solution highly adaptable to various deployment strategies.
As mentioned earlier, this post primarily addresses "why" instead of "how to". The goal is to understand the underlying solution pattern rather than the precise implementation. Consider this post as a blueprint for your next real-time analytics project. You can adjust it to incorporate different technologies as needed.