Finding Order in Chaos with Polynomial Chaos Expansion, using uncertainpy and chaospy

Three years ago I moved from Rome, Italy to Cincinnati, Ohio, USA, after a PhD offer from the University of Cincinnati. There were (and still are) a lot of things that I miss about my city: the food, the weather, the beauty of the eternal city. One thing that I absolutely don't miss about my city is the insane traffic.
A good friend of mine texted me the other day and said
"Piero, today the traffic is so bad and the city is a total chaos".
Now, obviously, I didn't correct him (especially knowing what the traffic is like in Rome), but the term chaos has a totally different meaning in math and physics than it does in our everyday lives.
A popular definition of chaos in math is that of a system governed by deterministic equations whose evolution is extremely sensitive to the initial conditions. This means that even an extremely small change in the initial conditions can make the evolution of the system incredibly different. To say it in the words of Lorenz¹, this means that:
"The present determines the future, but the approximate present does not approximately determine the future."
¹ http://mpe.dimacs.rutgers.edu/2013/03/17/chaos-in-an-atmosphere-hanging-on-a-wall/
This means that the only way we are able to predict the evolution of such a system is by considering it from a probabilistic point of view. Given the starting point of a process, we won't be able to predict exactly the arrival point of the system, as it's chaotic, but we will be able to predict it probabilistically, in the sense that we will be able to get, for example, its mean and standard deviation.
This kind of chaos can be treated numerically, for example using Python. In this blog post, we will describe Polynomial Chaos Expansion (PCE), starting from the abstract Random Walk and arriving at an application on a real case: the temperature of our coffee ☕️
Let's get started!
1. Random Walk
The random walk is something that is well known to all the mathematicians and physicists who are reading. This model has been used pretty much everywhere, from finance to physics, and it is very simple. It is closely related to what the literature calls Brownian motion (its continuous-time limit) and it works like this:
- We start from a point x = 0
- With equal probability, we move from x = 0 to x = 1 or from x = 0 to x = -1. We define this new point as x_1
- Again, we change the value of x_1 by +1 or -1. We will define this point x_2.
- We repeat the previous step for N - 2 more times, obtaining x_3, x_4, ..., x_N
For once, I think that the pseudocode is even easier to understand than the explanation in words:
RandomWalk(N):
    x = 0
    i = 0
    while i < N:
        x = x + (+1 or -1, chosen with equal probability)
        i = i + 1
    return x
Now let's explore this, shall we?
1.1 Code
In this part of the post, we will describe the Random Walk using Python code. You will only need to import very basic libraries like numpy and matplotlib.pyplot.
This is the Random Walk code:
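Here is a minimal sketch of what such a function could look like (the name random_walk and the use of np.random.choice are illustrative choices, not necessarily the original implementation):

import numpy as np
import matplotlib.pyplot as plt

def random_walk(n_steps):
    # At each step, move +1 or -1 with equal probability
    steps = np.random.choice([-1, 1], size=n_steps)
    # The position is the cumulative sum of the steps, starting from x = 0
    return np.concatenate(([0], np.cumsum(steps)))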
If we run this, let's say, 100 times, we get the following paths:
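A quick sketch of how those paths could be generated and plotted, reusing the random_walk function above (100 walks of 1,000 steps each are illustrative choices):

n_walks, n_steps = 100, 1000
walks = [random_walk(n_steps) for _ in range(n_walks)]
for walk in walks:
    plt.plot(walk, alpha=0.3)   # plot each path with some transparency
plt.xlabel("step")
plt.ylabel("position")
plt.show()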
What's very interesting is that you can find the Gaussian distribution if you look at the distribution of the last step:
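Continuing the sketch above, the final positions of the walks can be inspected with a histogram, which approaches a bell shape as the number of walks grows:

# Collect the final position of each walk and look at their distribution
final_positions = [walk[-1] for walk in walks]
plt.hist(final_positions, bins=20)
plt.xlabel("final position")
plt.ylabel("count")
plt.show()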

Let's leave it here for now. I promise, we will use this.
2. Differential Equations
Now, all we know about life, wait, literally all we know about life, we know because of differential equations.
Differential equations are the tools that physicists use to describe the evolution of a system. My high school teacher explained it by saying:
"To describe the world you need two things: to differentiate and to integrate. It's very easy to differentiate, it's very hard to integrate."
For example, let's consider the location y of a squirrel climbing up a tree:

Let's say that the speed of the squirrel is v(t) = (t/60)², where t is in seconds. So our superhero starts with v(t=0) = 0 and after two minutes he gets to the speed v(120) = 2² = 4 m/s.
Given this information, what is the location of the super-squirrel?
What we need to do is to integrate the velocity equation and we get:
y(t) = ∫ v(t) dt = ∫ (t/60)² dt = t³/10800 + c
How do we get that c constant? We just set what happens at t = 0. We assume that our squirrel starts at height = 0 so c = 0.
So the location of our super squirrel that climbs the tree is the following:
y(t) = t³/10800
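As a quick sanity check, here is a minimal sketch (the time grid and the cumulative trapezoidal rule are illustrative choices) comparing the analytical solution with a numerical integration of the velocity:

import numpy as np

t = np.linspace(0, 120, 1201)      # two minutes, in seconds
v = (t / 60) ** 2                  # the squirrel's velocity
# Cumulative trapezoidal integration of the velocity
y_numerical = np.concatenate(([0.0], np.cumsum((v[1:] + v[:-1]) / 2 * np.diff(t))))
y_analytical = t ** 3 / 10800      # the solution found above, with c = 0
print(np.max(np.abs(y_numerical - y_analytical)))   # should be very close to zero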
In general, a certain solution y can be seen as the integration of another quantity, say x, plus a starting condition.
In this scheme we talked about:
- The time t, which is the independent variable (running from the start to the end of the experiment)
- x, which is the quantity that we are integrating (in the example above, x is the velocity)
- The solution y, which is what we obtain by integrating x (in the example above, y is the location of the super squirrel)
So in this case we can say that y(t) = integration of x(t).
There is more. Your system can also have some parameters that are fixed but that can change its evolution. So:
y(t, list of parameters) = integration of x(t, list of parameters)
For example. Let's talk about coffee. Coffee? Yes, coffee.
2.1 Newton's law of cooling
A guy that was pretty decent at Physics (lol) named Isaac Newton, among the many gifts he left us, explained how to describe the heat transfer of a hot body. In other words, he told us how things cool down.

The law of cooling proposed by Newton states that the rate of heat loss from the body to the outside is proportional to the difference between the temperature T of the body at time t and the temperature of the environment T_env, through a constant k that depends on the surface area and its heat transfer coefficient.
If we want to get the temperature T, we need to integrate its rate of change (dT/dt). This is the equation:
dT/dt = -k (T(t) - T_env)
To get the temperature T, given T_env and k (remember this!!!), we need to integrate dT/dt.
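To make this concrete, here is a minimal sketch of how the cooling equation could be integrated numerically; the values of T_0, T_env and k are assumptions for illustration, and scipy's solve_ivp is just one possible integrator:

import numpy as np
from scipy.integrate import solve_ivp

def newton_cooling(t, T, k, T_env):
    # Newton's law of cooling: dT/dt = -k * (T - T_env)
    return -k * (T - T_env)

# Assumed values: coffee starts at 90 °C, room at 20 °C, k = 0.05 (1/min)
T_0, T_env, k = 90.0, 20.0, 0.05
t_eval = np.linspace(0, 60, 200)       # one hour, in minutes
solution = solve_ivp(newton_cooling, (0, 60), [T_0], t_eval=t_eval, args=(k, T_env))
# solution.y[0] now holds the coffee temperature over time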
3. Wiener's Chaos!
Rarely, very rarely, are we able to find an analytical solution for a differential equation. That is why my high school professor said that it's very hard to integrate. We are far more likely to need a numerical integration, which means solving the differential equation in a numerical way (a.k.a. with an algorithm).
There are super well-known methods (algorithms) for integration, like the trapezoidal rule or the Riemann sum. They work, with their pros and cons, and they work efficiently. They are not the problem.
The real problem is in the parameters (e.g. T_env and k) of the differential equation. Let me elaborate.
Do you remember T_env and k from the equation above? We have no idea what they might actually be, and they can completely change the evolution of our system.
The beautiful mind of Norbert Wiener gave us a very elegant formulation of differential equations with additional random parameters. In particular, and now all our talk makes sense, differential equations with random parameters are defined as chaotic, and their solution can be expanded as a polynomial in random processes (ah-ha!) like our random walk. By doing this, we are able to understand the solution T(t) in a probabilistic way!
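As a first taste of what this looks like in code, here is a minimal sketch of how the uncertain parameters of the cooling problem could be described as random variables with chaospy (the uniform ranges are assumptions for illustration; the full PCE machinery comes in the next steps):

import chaospy as cp

# Treat the uncertain parameters as random variables instead of fixed numbers.
# The ranges below are assumptions, not values from the original post.
k_dist = cp.Uniform(0.025, 0.075)      # cooling constant k
T_env_dist = cp.Uniform(15, 25)        # environment temperature T_env
joint = cp.J(k_dist, T_env_dist)       # joint distribution of the parameters
print(joint.sample(5))                 # a few random (k, T_env) pairs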
I understand it can be confusing: let's do it step by step