Solving Inverse Problems With Physics-Informed DeepONet: A Practical Guide With Code Implementation

Author: Murphy  |  2025-03-23

In my previous blog, we delved into the concept of physics-informed DeepONet (PI-DeepONet) and explored why it is particularly suitable for operator learning, i.e., learning mappings from an input function to an output function. We also turned theory into code and implemented a PI-DeepONet that can accurately solve an ordinary differential equation (ODE) even with unseen input forcing profiles.

Figure 1. Operators transform one function into another, which is a concept frequently encountered in real-world dynamical systems. Operator learning essentially involves training a neural network model to approximate this underlying operator. A promising method to achieve that is DeepONet. (Image by author)

The ability to solve these forward problems with PI-DeepONet is certainly valuable. But is that all PI-DeepONet can do? Well, definitely not!

Another important problem category we frequently encounter in computational science and engineering is the so-called Inverse Problem. In essence, this type of problem reverses the flow of information: the output is observable, the input is unknown, and the task is to estimate the unknown input from the observed output.

Figure 2. In forward problems, the objective is to predict the outputs given the known inputs via the operator. In inverse problems, the process is reversed: known outputs are used to estimate the original, unknown inputs, often with only partial knowledge of the underlying operator. Both forward and inverse problems are commonly encountered in computational science and engineering. (Image by author)

As you might have guessed, PI-DeepONet can also be a super useful tool for tackling these types of problems. In this blog, we will take a close look at how that can be achieved. More concretely, we will address two case studies: one with parameter estimation, and the other one with input function calibrations.

This blog intends to be self-contained, with only a brief discussion on the basics of physics-informed (PI-) learning, DeepONet, as well as our main focus, PI-DeepONet. For a more comprehensive intro to those topics, feel free to check out my previous blog.

With that in mind, let's get started!

Table of Contents

· 1. PI-DeepONet: A refresher
· 2. Problem Statements
· 3. Problem 1: Parameter Estimation
  3.1 How it works
  3.2 Implementing a PI-DeepONet pipeline
  3.3 Results discussion
· 4. Problem 2: Input Function Estimation
  4.1 Solution strategies
  4.2 Optimization routine: TensorFlow
  4.3 Optimization routine: L-BFGS
· 5. Take-away
· Reference


1. PI-DeepONet: A refresher

As its name implies, PI-DeepONet is the combination of two concepts: physics-informed learning, and DeepONet.

Physics-informed learning is a relatively new paradigm of machine learning that has gained particular traction in the domain of dynamical system modeling. Its key idea is to bake the governing differential equations directly into the machine learning model, usually by introducing an additional loss term that accounts for the residuals of the governing equations. The premise of this approach is that a model built this way will respect known physical laws and offer better generalizability, interpretability, and trustworthiness.

DeepONet, on the other hand, resides in the traditional pure data-driven modeling domain. However, what's unique about it is that DeepONet is specifically designed for operator learning, i.e., learning the mapping from an input function to an output function. This situation is frequently encountered in many dynamical systems. For instance, in a simple mass-spring system, the time-varying driving force serves as an input function (of time), while the resultant mass displacement is the output function (of time as well).

DeepONet proposed a novel network architecture (as shown in Figure 3), where a branch net is used to transform the profile of the input function, and a trunk net is used to transform the temporal/spatial coordinates. The feature vectors output by these two nets are then merged via a dot product.

Figure 3. The architecture of DeepONet. The uniqueness of this method lies in its separation of branch and trunk nets to handle input function profiles and temporal/spatial coordinates, respectively. (Image by author)
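To make the architecture concrete, here is a minimal TensorFlow sketch of a DeepONet forward pass. The layer sizes, the number of sensor points `m`, and the latent dimension `p` are illustrative placeholders, not the exact settings used in the previous blog:

```python
import tensorflow as tf

m = 100   # number of fixed sensor points where u(t) is sampled (assumed)
p = 50    # latent dimension shared by branch and trunk nets (assumed)

# Branch net: transforms the input function profile u, sampled at m sensors
branch = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation="tanh", input_shape=(m,)),
    tf.keras.layers.Dense(p),
])

# Trunk net: transforms the temporal coordinate t
trunk = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation="tanh", input_shape=(1,)),
    tf.keras.layers.Dense(p, activation="tanh"),
])

def deeponet(u_sensors, t):
    """Predict s(t): merge branch and trunk features via a dot product."""
    b = branch(u_sensors)    # shape (batch, p)
    tr = trunk(t)            # shape (batch, p)
    return tf.reduce_sum(b * tr, axis=-1, keepdims=True)  # shape (batch, 1)
```

Note that the only DeepONet-specific ingredient is the final dot product; both sub-networks are otherwise ordinary fully connected nets.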

Now, if we layer the concept of physics-informed learning on top of the DeepONet, we obtain what is known as PI-DeepONet.

Figure 4. Compared to a DeepONet, a PI-DeepONet contains extra loss terms such as the ODE/PDE residual loss, as well as the initial condition loss (IC loss) and boundary condition loss (BC loss). The conventional data loss is optional for PI-DeepONet, as it can directly learn the operator of the underlying dynamical system solely from the associated governing equations. (Image by author)

Once a PI-DeepONet is trained, it can predict the profile of the output function for a given new input function profile in real-time, while ensuring that the predictions align with the governing equations. As you can imagine, this makes PI-DeepONet a potentially very powerful tool for a diverse range of dynamic system modeling tasks.
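In code, the extra physics-informed loss terms from Figure 4 can be sketched roughly as follows. For illustration we assume the simple ODE ds/dt = u(t) with s(0) = 0; substitute the actual governing equation as needed:

```python
import tensorflow as tf

def pi_deeponet_loss(model, u_sensors, t_colloc, u_colloc):
    """Physics-informed loss, illustrated for the assumed ODE ds/dt = u(t).

    u_sensors: (batch, m) input-function samples at fixed sensor points
    t_colloc:  (batch, 1) collocation times
    u_colloc:  (batch, 1) u evaluated at the collocation times
    """
    # ODE residual loss: penalize ds/dt - u(t) at the collocation points
    with tf.GradientTape() as tape:
        tape.watch(t_colloc)
        s = model(u_sensors, t_colloc)
    ds_dt = tape.gradient(s, t_colloc)
    loss_ode = tf.reduce_mean(tf.square(ds_dt - u_colloc))

    # Initial-condition loss: enforce s(0) = 0
    t0 = tf.zeros_like(t_colloc)
    loss_ic = tf.reduce_mean(tf.square(model(u_sensors, t0)))

    return loss_ode + loss_ic
```

For a boundary-value problem, a BC loss term would be added in the same way; for the forward problem, no data loss is needed at all.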

However, in many other system modeling scenarios, we may also need to perform the exact opposite operation: we observe the outputs and want to estimate the unknown inputs based on those observations and our prior knowledge of the system dynamics. Generally speaking, this type of scenario falls within the scope of inverse modeling. A question that naturally arises here is: can we also use PI-DeepONet to address inverse estimation problems?

Before we get into that, let's first more precisely formulate the problems we are aiming to solve.


2. Problem Statements

We will use the same ODE discussed in the previous blog as our base model. Previously, we investigated an initial value problem described by the following equation:

with an initial condition s(0) = 0. In the equation, u(t) is the input function that varies over time, and s(t) denotes the state of the system at time t. Our previous focus was on solving the forward problem, i.e., predicting s(·) given u(·). Now, we will shift our focus and consider solving two types of inverse problems:

1️⃣ Estimating unknown input parameters

Let's start with a straightforward inverse problem. Imagine our governing ODE has now evolved to be like this:

initial condition s(0) = 0, a and b are unknowns.

with a and b being the two unknown parameters. Our objective here is to estimate the values of a and b, given the observed u(·) and s(·) profiles.

This type of problem falls within the scope of parameter estimation, where unknown parameters of the system need to be identified from measured data. Typical examples include system identification in control engineering, estimation of material thermal coefficients in computational heat transfer, etc.

In our current case study, we will assume the true values for a and b are both 0.5.

2️⃣ Estimating the entire input function profile

For the second case study, we ramp up the problem complexity: Suppose we know the ODE perfectly (i.e., we know the exact values of a and b). However, while we have observed the s(·) profile, we don't know the u(·) profile that generated this observed output function. Consequently, our objective here is to estimate the u(·) profile, given the observed s(·) profile and the known ODE:

initial condition s(0) = 0, a=0.5, b=0.5.

Since we now aim to recover an entire input function instead of a small set of unknown parameters, this case study will be much more challenging than the first one. This type of problem is inherently ill-posed and requires strong regularization to help constrain the solution space. Nevertheless, such problems often arise in various fields, including environmental engineering (e.g., identifying the profile of pollutant sources), aerospace engineering (e.g., calibrating the applied loads on aircraft), and wind engineering (e.g., wind force estimation).
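One common way to tame this ill-posedness, sketched below under assumptions (the trained DeepONet `model`, the sensor count `m`, and the regularization weight `lam` are all placeholders), is to treat the sampled u(·) profile itself as a set of trainable variables and add a smoothness penalty to the misfit objective:

```python
import tensorflow as tf

# Treat the unknown u(·) profile, sampled at m sensor points, as trainable
m = 100
u_hat = tf.Variable(tf.zeros((1, m)))   # initial guess for the u profile

def inverse_loss(model, t_obs, s_obs, lam=1e-3):
    """Misfit between predicted and observed s(·), plus a smoothness
    regularizer on u_hat to counter ill-posedness.
    lam is a tunable regularization weight (assumed value)."""
    s_pred = model(u_hat, t_obs)
    misfit = tf.reduce_mean(tf.square(s_pred - s_obs))
    # First-difference (smoothness) penalty on the estimated profile
    smooth = tf.reduce_mean(tf.square(u_hat[:, 1:] - u_hat[:, :-1]))
    return misfit + lam * smooth
```

Minimizing this objective over `u_hat` (with the trained network's weights frozen) is exactly the kind of optimization routine we will set up in Section 4.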

In the following two sections, we will address these two case studies one by one.


3. Problem 1: Parameter Estimation

In this section, we tackle the first case study: estimating the unknown parameters in our target ODE. We will start with a brief discussion on how the general physics-informed Neural Networks can be used to solve this type of problem, followed by implementing a PI-DeepONet-based pipeline for parameter estimation. Afterward, we will apply it to our case study and discuss the obtained results.

3.1 How it works

In the original paper on physics-informed neural networks (PINNs), Raissi and co-authors outlined the strategy of using PINNs to solve inverse (calibration) problems: in essence, we can simply set the unknown parameters (in our current case, a and b) as trainable parameters in the neural network, and optimize them together with the weights and biases of the neural net to minimize the loss function.

Of course, the secret sauce lies in constructing the loss function: as a physics-informed learning approach, it not only contains a data mismatch term, which measures the discrepancy between the predicted output of the network and the observed data, but also a physics-informed regularization term, which calculates the residuals (i.e., the difference between the left and right-hand side of the differential equation) using the outputs of the neural network (and their derivatives) and the current estimates of the parameters a and b.

Now, when we perform this joint optimization, we're effectively searching for a and b values that lead to network outputs that simultaneously fit the observed data and satisfy the governing differential equation. When the loss function reaches its minimum value (i.e., the training has converged), the final values of a and b are the ones that have achieved this balance, and they thus constitute the estimates of the unknown parameters.
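The joint optimization described above can be sketched as follows. Note that the exact form of the modified ODE is shown as an equation image above, so the residual below uses a placeholder form ds/dt = a·u + b·s purely for illustration; the actual equation should be substituted in:

```python
import tensorflow as tf

# Unknown ODE parameters, set up as trainable variables with initial guesses
a = tf.Variable(1.0, dtype=tf.float32)
b = tf.Variable(1.0, dtype=tf.float32)

def total_loss(model, u_sensors, t_obs, s_obs, t_colloc, u_colloc):
    """Data mismatch + physics residual, evaluated with the
    current estimates of a and b."""
    # Data mismatch on the observed (t, s) pairs
    s_pred = model(u_sensors, t_obs)
    loss_data = tf.reduce_mean(tf.square(s_pred - s_obs))

    # Physics residual at the collocation points
    with tf.GradientTape() as tape:
        tape.watch(t_colloc)
        s_c = model(u_sensors, t_colloc)
    ds_dt = tape.gradient(s_c, t_colloc)
    residual = ds_dt - (a * u_colloc + b * s_c)   # placeholder ODE form
    loss_phys = tf.reduce_mean(tf.square(residual))

    return loss_data + loss_phys

# During training, a and b are updated jointly with the network weights,
# e.g., optimizer.apply_gradients over model.trainable_variables + [a, b]
```

Because a and b appear inside the residual term, the optimizer receives gradients for them just as it does for the network weights.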

Figure 5. When using PI-DeepONet, the unknown parameters a and b are jointly optimized with the weights and biases of the DeepONet model. When the training has converged, the a and b values we end up with constitute their estimates. (Image by author)

3.2 Implementing a PI-DeepONet pipeline

Enough about the theory; it's time to see some code!

Tags: Deep Dives Inverse Problem Machine Learning Neural Networks Physics Informed Learning
