Presenter: Filippo Maria Bianchi
Repository: github.com/FilippoMB/Physics-Informed-Neural-Networks-tutorial
What are PINNs?
Physics-Informed Neural Networks (PINNs) are neural networks trained to fit observed data while also satisfying a differential equation describing the underlying physics: the equation residual is added to the loss, so the network learns a solution consistent with both.
Limitations of PDE solvers
Classical numerical solvers require the problem to be fully specified (domain, coefficients, boundary and initial conditions), must be re-run from scratch whenever any of these change, and offer no natural way to incorporate observed data.
# Imports
import torch
from torch import nn
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from matplotlib import cm
Let's start by creating a simple neural network in PyTorch.
# Define a simple neural network for regression
class simple_NN(nn.Module):
    def __init__(self):
        super(simple_NN, self).__init__()
        self.linear_tanh_stack = nn.Sequential(
            nn.Linear(1, 16),
            nn.Tanh(),
            nn.Linear(16, 32),
            nn.Tanh(),
            nn.Linear(32, 16),
            nn.Tanh(),
            nn.Linear(16, 1),
        )

    def forward(self, x):
        out = self.linear_tanh_stack(x)
        return out
Then, we define a small training dataset and fit the network with a standard MSE regression loss.
# Define dataset
x_train = torch.tensor([[1.1437e-04],
                        [1.4676e-01],
                        [3.0233e-01],
                        [4.1702e-01],
                        [7.2032e-01]], dtype=torch.float32)
y_train = torch.tensor([[1.0000],
                        [1.0141],
                        [1.0456],
                        [1.0753],
                        [1.1565]], dtype=torch.float32)
# Initialize the model
model = simple_NN()
# define loss and optimizer
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
# Train
for ep in range(1000):
    # Compute prediction error
    pred = model(x_train)
    loss = loss_fn(pred, y_train)
    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if ep % 200 == 0:
        print(f"epoch: {ep}, loss: {loss.item():>7f}")
epoch: 0, loss: 0.865811
epoch: 200, loss: 0.000253
epoch: 400, loss: 0.000151
epoch: 600, loss: 0.000068
epoch: 800, loss: 0.000017
# evaluate the model on all data points in the domain
domain = [0.0, 1.5]
x_eval = torch.linspace(domain[0], domain[1], steps=100).reshape(-1, 1)
f_eval = model(x_eval)
# plotting
fig, ax = plt.subplots(figsize=(12, 5))
ax.scatter(x_train.detach().numpy(), y_train.detach().numpy(), label="Training data", color="blue")
ax.plot(x_eval.detach().numpy(), f_eval.detach().numpy(), label="NN approximation", color="black")
ax.set(title="Neural Network Regression", xlabel="$x$", ylabel="$y$")
ax.legend();
Advantages
Combine information from both observed data and physical models.
The logistic equation for modeling population growth:

$$ \frac{d f(t)}{d t} = R\,t\,(1-t) $$

# Growth rate and initial condition f(t=0)
R = 1.0
ft0 = 1.0
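Since the right-hand side of this ODE does not depend on $f$, it integrates in closed form to $f(t) = f(0) + R\,(t^2/2 - t^3/3)$. A minimal helper based on this (ours, not part of the original tutorial) is handy for sanity-checking the solutions below:

# Closed-form solution of df/dt = R*t*(1-t) with f(0) = ft0
# (illustrative helper, not part of the original tutorial)
def f_exact(t):
    return ft0 + R * (t**2 / 2 - t**3 / 3)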
def df(f: simple_NN, x: torch.Tensor = None, order: int = 1) -> torch.Tensor:
    """Compute neural network derivative with respect to input features using PyTorch autograd engine"""
    df_value = f(x)
    for _ in range(order):
        df_value = torch.autograd.grad(
            df_value,
            x,
            grad_outputs=torch.ones_like(x),
            create_graph=True,
            retain_graph=True,
        )[0]
    return df_value
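As a quick sanity check of df (any differentiable callable can stand in for the network; this snippet is illustrative and not part of the original code), the autograd derivative of sin should match cos:

# Illustrative check: d/dx sin(x) = cos(x)
x_test = torch.linspace(0.0, 3.0, steps=4, requires_grad=True).reshape(-1, 1)
print(torch.allclose(df(torch.sin, x_test), torch.cos(x_test)))  # True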
# Generate 10 evenly distributed collocation points
t = torch.linspace(domain[0], domain[1], steps=10, requires_grad=True).reshape(-1, 1)
The final loss is given by:

$$ \mathcal{L} = \mathcal{L}_\mathrm{PDE} + \mathcal{L}_\mathrm{BC} + \mathcal{L}_\mathrm{data} $$

# Wrap everything into a function
def compute_loss(nn: simple_NN,
                 t: torch.Tensor = None,
                 x: torch.Tensor = None,
                 y: torch.Tensor = None,
                 ) -> torch.Tensor:
    """Compute the full loss as PDE loss + boundary loss + data loss.

    This custom loss function is fully defined with differentiable tensors,
    therefore the .backward() method can be applied to it.
    """
    # PDE residual at the collocation points
    pde_loss = df(nn, t) - R * t * (1 - t)
    pde_loss = pde_loss.pow(2).mean()

    # boundary condition f(0) = ft0
    boundary = torch.Tensor([0.0])
    boundary.requires_grad = True
    bc_loss = nn(boundary) - ft0
    bc_loss = bc_loss.pow(2)

    # data loss on the observed points
    mse_loss = torch.nn.MSELoss()(nn(x), y)

    tot_loss = pde_loss + bc_loss + mse_loss
    return tot_loss
model = simple_NN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
# Train
for ep in range(2000):
    loss = compute_loss(model, t, x_train, y_train)
    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if ep % 200 == 0:
        print(f"epoch: {ep}, loss: {loss.item():>7f}")
epoch: 0, loss: 3.063274
epoch: 200, loss: 0.000948
epoch: 400, loss: 0.000451
epoch: 600, loss: 0.000110
epoch: 800, loss: 0.000090
epoch: 1000, loss: 0.000087
epoch: 1200, loss: 0.000085
epoch: 1400, loss: 0.000084
epoch: 1600, loss: 0.000083
epoch: 1800, loss: 0.000082
# numeric solution
def logistic_eq_fn(x, y):
    return R * x * (1 - x)

numeric_solution = solve_ivp(
    logistic_eq_fn, domain, [ft0], t_eval=x_eval.squeeze().detach().numpy()
)
f_colloc = solve_ivp(
    logistic_eq_fn, domain, [ft0], t_eval=t.squeeze().detach().numpy()
).y.T
# evaluation on the domain [0, 1.5]
f_eval = model(x_eval)
# plotting
fig, ax = plt.subplots(figsize=(12, 5))
ax.scatter(t.detach().numpy(), f_colloc, label="Collocation points", color="magenta", alpha=0.75)
ax.scatter(x_train.detach().numpy(), y_train.detach().numpy(), label="Observation data", color="blue")
ax.plot(x_eval.detach().numpy(), f_eval.detach().numpy(), label="NN solution", color="black")
ax.plot(x_eval.detach().numpy(), numeric_solution.y.T,
        label="Numeric solution", color="magenta", alpha=0.75)
ax.set(title="Logistic equation solved with NNs", xlabel="t", ylabel="f(t)")
ax.legend();
Let's now solve the 1D wave equation,

$$ \frac{\partial^2 f}{\partial x^2} = \frac{1}{C^2} \frac{\partial^2 f}{\partial t^2}, $$

where $C$ is the wave propagation speed. The solution $f(x,t)$ depends on two variables, so the network now takes two inputs.

class simple_NN2(nn.Module):
    def __init__(self):
        super(simple_NN2, self).__init__()
        self.linear_tanh_stack = nn.Sequential(
            nn.Linear(2, 16),  # <--- 2 input variables
            nn.Tanh(),
            nn.Linear(16, 32),
            nn.Tanh(),
            nn.Linear(32, 16),
            nn.Tanh(),
            nn.Linear(16, 1),
        )

    def forward(self, x, t):
        x_stack = torch.cat([x, t], dim=1)  # <--- concatenate x and t
        out = self.linear_tanh_stack(x_stack)
        return out
The function df() defined above computes a derivative of any order, but only with respect to a single input variable. We redefine it to take the network output and the input variable directly:

def df(output: torch.Tensor, input_var: torch.Tensor, order: int = 1) -> torch.Tensor:
    """Compute neural network derivative with respect to input features using PyTorch autograd engine"""
    df_value = output  # <-- we directly take the output of the NN
    for _ in range(order):
        df_value = torch.autograd.grad(
            df_value,
            input_var,
            grad_outputs=torch.ones_like(input_var),
            create_graph=True,
            retain_graph=True,
        )[0]
    return df_value
def dfdt(model: simple_NN2, x: torch.Tensor, t: torch.Tensor, order: int = 1):
    """Derivative with respect to the time variable, of arbitrary order"""
    f_value = model(x, t)
    return df(f_value, t, order=order)

def dfdx(model: simple_NN2, x: torch.Tensor, t: torch.Tensor, order: int = 1):
    """Derivative with respect to the spatial variable, of arbitrary order"""
    f_value = model(x, t)
    return df(f_value, x, order=order)
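Before plugging these helpers into the loss, we can exercise them on a few random points; a minimal illustrative check (m2, x_c, t_c are throwaway names of ours):

# Illustrative shape check of the derivative helpers
m2 = simple_NN2()
x_c = torch.rand(6, 1, requires_grad=True)
t_c = torch.rand(6, 1, requires_grad=True)
print(dfdt(m2, x_c, t_c, order=2).shape)  # torch.Size([6, 1])
print(dfdx(m2, x_c, t_c, order=1).shape)  # torch.Size([6, 1])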
The following loss penalizes departure from the desired initial condition on the first-order time derivative:

$$ \mathcal{L}_\mathrm{initDF} = \left( \frac{\partial f}{\partial t} \bigg\rvert_{t=0} \right)^2 $$

The total loss is given by:

$$ \mathcal{L} = \mathcal{L}_\mathrm{PDE} + \mathcal{L}_\mathrm{BC} + \mathcal{L}_\mathrm{initF} + \mathcal{L}_\mathrm{initDF} $$

def initial_condition(x) -> torch.Tensor:
    res = torch.sin(2 * np.pi * x).reshape(-1, 1) * 0.5
    return res
def compute_loss(
    model: simple_NN2,
    x: torch.Tensor = None,
    t: torch.Tensor = None,
    x_idx: torch.Tensor = None,
    t_idx: torch.Tensor = None,
    C: float = 1.0,
    device: str = None,
) -> torch.Tensor:
    # PDE residual
    pde_loss = dfdx(model, x, t, order=2) - (1 / C**2) * dfdt(model, x, t, order=2)

    # boundary conditions: f(x0, t) = f(x1, t) = 0
    boundary_x0 = torch.ones_like(t_idx, requires_grad=True).to(device) * x[0]
    boundary_loss_x0 = model(boundary_x0, t_idx)  # f(x0, t)
    boundary_x1 = torch.ones_like(t_idx, requires_grad=True).to(device) * x[-1]
    boundary_loss_x1 = model(boundary_x1, t_idx)  # f(x1, t)

    # initial conditions
    f_initial = initial_condition(x_idx)  # 0.5*sin(2*pi*x)
    t_initial = torch.zeros_like(x_idx)   # t0
    t_initial.requires_grad = True
    initial_loss_f = model(x_idx, t_initial) - f_initial      # L_initF
    initial_loss_df = dfdt(model, x_idx, t_initial, order=1)  # L_initDF

    # obtain the final loss by averaging each term and summing them up
    final_loss = \
        pde_loss.pow(2).mean() + \
        boundary_loss_x0.pow(2).mean() + \
        boundary_loss_x1.pow(2).mean() + \
        initial_loss_f.pow(2).mean() + \
        initial_loss_df.pow(2).mean()

    return final_loss
device = "cuda" if torch.cuda.is_available() else "cpu"
# generate the time-space meshgrid
x_domain = [0.0, 1.0]; n_points_x = 100
t_domain = [0.0, 1.0]; n_points_t = 150
x_idx = torch.linspace(x_domain[0], x_domain[1], steps=n_points_x, requires_grad=True)
t_idx = torch.linspace(t_domain[0], t_domain[1], steps=n_points_t, requires_grad=True)
grids = torch.meshgrid(x_idx, t_idx, indexing="ij")
x_idx, t_idx = x_idx.reshape(-1, 1).to(device), t_idx.reshape(-1, 1).to(device)
x, t = grids[0].flatten().reshape(-1, 1).to(device), grids[1].flatten().reshape(-1, 1).to(device)
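To make the bookkeeping explicit (an illustrative printout, not in the original code): x and t hold all 100 x 150 grid points flattened into column vectors for the PDE residual, while x_idx and t_idx hold the one-dimensional axes used for the initial and boundary conditions.

print(x.shape, t.shape)          # torch.Size([15000, 1]) torch.Size([15000, 1])
print(x_idx.shape, t_idx.shape)  # torch.Size([100, 1]) torch.Size([150, 1])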
# initialize the neural network model
model = simple_NN2().to(device)
# Train
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for ep in range(3000):
    loss = compute_loss(model, x=x, t=t, x_idx=x_idx, t_idx=t_idx, device=device)
    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if ep % 300 == 0:
        print(f"epoch: {ep}, loss: {loss.item():>7f}")
epoch: 0, loss: 0.133278
epoch: 300, loss: 0.058259
epoch: 600, loss: 0.033736
epoch: 900, loss: 0.026548
epoch: 1200, loss: 0.023956
epoch: 1500, loss: 0.076558
epoch: 1800, loss: 0.014723
epoch: 2100, loss: 0.012283
epoch: 2400, loss: 0.003159
epoch: 2700, loss: 0.001655
# Prediction
y = model(x, t)
y_np = y.reshape([100,-1]).to("cpu").detach().numpy()
# Plot
X, Y = np.meshgrid(np.linspace(0, 1, 150), np.linspace(0, 1, 100))
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.plot_surface(X, Y, y_np, linewidth=0, antialiased=False, cmap=cm.coolwarm)
ax.set_xlabel("t"); ax.set_ylabel("x"); ax.set_zlabel("f")
plt.show()
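As a closing sanity check (ours, not part of the original tutorial): with these particular boundary and initial conditions, the wave equation admits the closed-form standing-wave solution $f(x,t) = 0.5\,\sin(2\pi x)\,\cos(2\pi C t)$, so we can measure how far the trained network is from it:

# Closed-form standing wave satisfying the PDE, the boundary conditions
# f(0,t) = f(1,t) = 0, and the initial conditions f(x,0) = 0.5*sin(2*pi*x),
# df/dt(x,0) = 0 (assumes C = 1.0, the default used during training)
C = 1.0
f_exact_wave = 0.5 * torch.sin(2 * np.pi * x) * torch.cos(2 * np.pi * C * t)
print(f"max abs error: {(y - f_exact_wave).abs().max().item():.4f}")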
Next steps
[1] Raissi, Maziar, Paris Perdikaris, and George E. Karniadakis. "Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations." Journal of Computational Physics 378 (2019): 686-707.
[2] Raissi, Maziar, Paris Perdikaris, and George E. Karniadakis. "Physics Informed Deep Learning".
[3] Nascimento, R. G., K. Fricke, and F. A. Viana. "A tutorial on solving ordinary differential equations using Python and hybrid physics-informed neural network." Engineering Applications of Artificial Intelligence 96 (2020): 103996.
[4] Dagrada, Dario. "Introduction to Physics-informed Neural Networks" (code).
[5] Paialunga, Piero. "Physics and Artificial Intelligence: Introduction to Physics Informed Neural Networks".
[6] "Physics-Informed-Neural-Networks (PINNs)": implementations of PINNs in TensorFlow 2 and PyTorch for the Burgers' and Helmholtz PDEs.