Predictive Power Score: Calculation, Pros, Cons, and JavaScript Code
The Predictive Power Score (which I will abbreviate as PPS hereafter) is a statistical metric used to measure the strength of a predictive relationship between two variables. Unlike traditional correlation measures such as Pearson's correlation coefficient r, which only work well for linear relationships between two continuous variables, the PPS is designed to handle a wider variety of relationships, including non-linear ones and categorical data.
PPS and its key points, with a first example
The PPS ranges from 0 to 1, where 0 means there's just no predictive power (the variable is unable to predict the target) and 1 means perfect predictive power (the variable perfectly predicts the target).
Notice that, since it is always equal to or higher than zero, the PPS gives no information about the direction of the relationship, as you can get with, say, Pearson's correlation coefficient r, which spans from -1 for full anticorrelation to +1 for full positive correlation. The PPS only measures how well one variable can predict another.
Probably the most important advantage of the PPS is its ability to capture non-linear relationships between variables. But this comes with a caveat that isn't explicitly disclosed in many presentations and articles about this score: the PPS works by applying some kind of model, typically a machine learning model, to check how well one variable can predict another. I took this explicitly into account when writing this article, to make it different from others you can find online.
Notice as well that the PPS is not symmetric, unlike, for example, Pearson's correlation coefficient. If you have a scatter plot with a quadratic dependence of a variable Y on X, the Pearson correlation coefficient of Y vs. X will be the same as that of X vs. Y, and it will be around 0 if, for example, the data is centered around the minimum of the parabola:

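To put a number on that claim, here is a minimal, self-contained sketch (plain JavaScript, separate from the Glitch apps discussed below) that generates perfectly quadratic data centered around its minimum and computes Pearson's r for it, which comes out essentially equal to 0 despite the perfectly deterministic relationship:

```javascript
// Minimal sketch: Pearson's r is close to 0 for a perfectly quadratic relationship
// (plain JavaScript, not taken from the apps discussed in this article).

// Generate data with a quadratic dependence of Y on X, centered around the minimum
const xs = [];
const ys = [];
for (let i = 0; i <= 100; i++) {
  const x = -5 + i * 0.1;
  xs.push(x);
  ys.push(x * x);
}

// Pearson correlation coefficient of two arrays
function pearson(a, b) {
  const n = a.length;
  const meanA = a.reduce((s, v) => s + v, 0) / n;
  const meanB = b.reduce((s, v) => s + v, 0) / n;
  let cov = 0, varA = 0, varB = 0;
  for (let i = 0; i < n; i++) {
    cov += (a[i] - meanA) * (b[i] - meanB);
    varA += (a[i] - meanA) ** 2;
    varB += (b[i] - meanB) ** 2;
  }
  return cov / Math.sqrt(varA * varB);
}

console.log(pearson(xs, ys).toFixed(4)); // ~0.0000: no *linear* relationship detected
```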
But a simple neural network with a single hidden layer finds that a relationship can indeed be modeled, like this (done with brain.js in one of the apps you can try and edit from this article):

The network returns a PPS of 0.9447 in this case.
At this point I feel I should emphasize and reflect on the following:
The result of the PPS will depend on the model used.
If the model is too simple, it may fail to capture the relationship and return a low PPS. In the extreme case of using linear regression as the model, the PPS will behave essentially like a (squared) Pearson correlation, capturing only linear trends.
Conversely, if the model overfits the data, the PPS will be unrealistically high, so this must be considered carefully.
Into the PPS
Here I present some apps and examples of the PPS in action through two ML libraries that run in web browsers, which means (you guessed it, especially if you follow me and my posts) that all my code is written in JavaScript.
I prepared the examples on Glitch.com, which allows you to easily "remix" the code and do all the tests and edits you want with a single click!
How PPS is computed
Before we get to see some code, let's see how the PPS is computed:
- First, we need a predictive model that predicts the target variable from the predictor variable. In this case, since we'll be using artificial neural networks, we must train them, run them on the predictor variable, and store the predictions.
- Then, the performance of the model is evaluated using metrics like accuracy (for classification) or the coefficient of determination R² (for regression problems like those I present in my examples).
- The PPS is then calculated by comparing the performance of the model to that of a baseline model, such as random guessing or a constant prediction:

It is important to note that, as you will see in the code examples below, the predictor and target values are normalized before being fed into the neural networks. This might not be mandatory, but I've done so because neural networks typically perform better with normalized input data.
Another note regarding the next examples is that I use the mean square error to compute the PPS. That is, we compare the mean square error of the networks' predictions against that of a baseline model that simply returns the mean of the target values Y.
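To make these steps concrete before diving into the apps, here is a minimal sketch of the PPS calculation described above. It is not copied verbatim from the Glitch code; in particular, clipping the score at 0 is my assumption, consistent with the 0-to-1 range discussed earlier. The two helper functions are reused in the sketches further below:

```javascript
// Minimal sketch of the PPS calculation used in this article's examples:
// compare the model's mean square error to that of a constant baseline
// that always predicts the mean of the target values.
// (Clipping at 0 is an assumption, consistent with the 0-to-1 range of the PPS.)

// Min-max normalization to [0, 1], since neural networks prefer normalized inputs
function normalize(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  return values.map(v => (v - min) / (max - min));
}

// Mean square error between predictions and true values
function mse(predicted, actual) {
  return predicted.reduce((sum, p, i) => sum + (p - actual[i]) ** 2, 0) / actual.length;
}

// PPS from the model's predictions and the target values
function predictivePowerScore(predictions, targets) {
  const meanY = targets.reduce((s, v) => s + v, 0) / targets.length;
  const baseline = targets.map(() => meanY);       // constant "predict the mean" model
  const mseModel = mse(predictions, targets);
  const mseBaseline = mse(baseline, targets);
  return Math.max(0, 1 - mseModel / mseBaseline);  // 1 = perfect, 0 = no better than baseline
}
```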
Example 1, using Brain.js as the predictive model
We will first use a basic neural network with brain.js to model the relationship between the two variables X and Y whose PPS we intend to obtain.
The coding steps are straightforward: we load the brain.js library (via a CDN, so there isn't even a need to download it first); then we train a neural network using X and Y, which are encoded as arrays split out of a string; and finally we calculate the PPS by comparing the neural network's performance with a baseline model, which here I chose to be a constant-response model that returns the mean of the training target for all test cases.
You can see this brain.js-based example 1 on Glitch here, which you can "remix" to edit at will. The example also includes some very basic plotting of the input X and Y arrays together with the predictions of Y made by brain.js once trained. Here's an example run with some hard-coded example data (which you can change right away), with a neural network that has a single hidden layer with 10 neurons (which you can change on line 53):

As you can see in the example, I also added to the apps the capability to plot the data and the predictions from the network.
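If you prefer to see the gist before opening Glitch, here is a condensed sketch of example 1. It is not the exact code of the app (the training options and example data are just illustrative choices), but it follows the same steps with brain.js's documented API, using one hidden layer of 10 neurons, and it reuses the normalize() and predictivePowerScore() helpers sketched above:

```javascript
// Condensed sketch of example 1 (not the exact Glitch code): train a brain.js
// network on X to predict Y, then compute the PPS against the mean-of-Y baseline.
// Assumes brain.js is loaded via its CDN script tag, and reuses the normalize()
// and predictivePowerScore() helpers sketched earlier.

// Example data with a quadratic relationship (in the app these come from a string)
const X = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5];
const Y = X.map(x => x * x);

const xNorm = normalize(X);
const yNorm = normalize(Y);

// One hidden layer with 10 neurons, as in the example run shown above
const net = new brain.NeuralNetwork({ hiddenLayers: [10] });

// brain.js expects an array of {input, output} pairs
const trainingData = xNorm.map((x, i) => ({ input: [x], output: [yNorm[i]] }));
net.train(trainingData, { iterations: 2000, errorThresh: 0.005 });

// Run the trained network on each X value and collect the predictions
const predictions = xNorm.map(x => net.run([x])[0]);

console.log('PPS:', predictivePowerScore(predictions, yNorm).toFixed(4));
```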
Example 2, using TensorFlow.js as the predictive model
Unfortunately, brain.js is quite limited compared to other machine learning packages, including some that offer web-compatible libraries, such as TensorFlow with its TensorFlow.js.
Since this library is more powerful and also a bit more complex, I thought this was a nice opportunity to start playing around with it, thinking about deeper future tests, maybe to be reported here too.
Here's the code for this second TensorFlow.js-based example, which I don't show running because the app looks visually the same as the brain.js example.
By looking at the code you will identify some important differences, which I marked with comments in the code on Glitch. First, the data goes into TensorFlow's tensor2d objects. Second, when you set up the TensorFlow model (line 53), you can very easily add neuron layers, building them with quite some flexibility: for example, you can set the number of units, the kind of activation function, etc. Third, note how the model has to be "compiled", and that's when you input the optimizer, the loss, and other parameters. Training then proceeds with no surprises, except that I tried many ways to get verbose output in order to follow how the training takes place, but never managed to do this (I also tried inserting callbacks as suggested on StackOverflow, but this didn't work either… so, to be researched further!)
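For reference, here is a condensed sketch of the same idea with TensorFlow.js. Again, this is not the exact Glitch code: the layer sizes, activation, optimizer, and number of epochs are just illustrative choices, and the normalize() and predictivePowerScore() helpers from the earlier sketch are reused. It does show the three differences mentioned above: tensor2d objects, flexible layer setup, and the compile step:

```javascript
// Condensed sketch of example 2 (not the exact Glitch code): the same PPS
// calculation, but with a small TensorFlow.js model instead of brain.js.
// Assumes the TensorFlow.js library is loaded via its CDN script tag, and reuses
// the normalize() and predictivePowerScore() helpers sketched earlier.

async function runExample(X, Y) {
  const xNorm = normalize(X);
  const yNorm = normalize(Y);

  // Data goes into tensor2d objects: one column, one row per sample
  const xs = tf.tensor2d(xNorm, [xNorm.length, 1]);
  const ys = tf.tensor2d(yNorm, [yNorm.length, 1]);

  // Layers can be stacked with control over units, activations, etc.
  const model = tf.sequential();
  model.add(tf.layers.dense({ units: 10, activation: 'sigmoid', inputShape: [1] }));
  model.add(tf.layers.dense({ units: 1 }));

  // The model must be "compiled" with an optimizer and a loss before training
  model.compile({ optimizer: 'adam', loss: 'meanSquaredError' });

  await model.fit(xs, ys, { epochs: 500 });

  // Get the predictions back as a plain array and compute the PPS
  const predictions = Array.from(model.predict(xs).dataSync());
  console.log('PPS:', predictivePowerScore(predictions, yNorm).toFixed(4));
}

runExample([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
           [25, 16, 9, 4, 1, 0, 1, 4, 9, 16, 25]);
```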
What the PPS is useful for
Before ending this post, I put forward some ideas about what the PPS might be useful for:
- Feature Selection: PPS could be used to identify which features have strong predictive power over a target variable in a dataset. For this, you could compute the PPS for every possible pair of features and show the resulting matrix of PPS values as a heatmap (see the sketch right after this list). Remember that the PPS is not symmetric, so you really have to test all ordered pairs of variables, not just half of them.
- Exploratory Data Analysis: Similar to feature selection, PPS should be useful for understanding which variables are worth focusing on when building predictive models, for example by excluding those that are too correlated with others.
- Categorical relationships: In principle, PPS is flexible enough that it can analyze not only numerical data as we did in these examples, but also categorical data.
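As a sketch of the feature selection idea mentioned in the list above, this is how you could build the full (non-symmetric) matrix of PPS values over all ordered pairs of columns. Here computePPS(x, y) is a hypothetical helper that trains a model of y on x and returns the score, for example using one of the networks shown earlier:

```javascript
// Sketch of a PPS matrix for feature selection: every ordered pair of columns
// gets its own score, because the PPS is not symmetric.
// computePPS(x, y) is a hypothetical helper that trains a model predicting y
// from x (e.g. one of the networks above) and returns the resulting PPS.

function ppsMatrix(dataset, columnNames, computePPS) {
  const matrix = {};
  for (const predictor of columnNames) {
    matrix[predictor] = {};
    for (const target of columnNames) {
      matrix[predictor][target] = predictor === target
        ? 1                                            // a variable predicts itself perfectly
        : computePPS(dataset[predictor], dataset[target]);
    }
  }
  return matrix; // e.g. feed this to any heatmap plotting library
}
```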
References
I got into the PPS through this blog post by Florian Wetschoreck here on Medium:
I also consulted these other resources:
Last, if you're interested in Python rather than JavaScript, here's a library that calculates PPS in this language: