#!/usr/bin/env python
# coding: utf-8

# # Physics Informed Neural Networks
# 
# **Presenter:** Filippo Maria Bianchi
# 
# **Repository:** [github.com/FilippoMB/Physics-Informed-Neural-Networks-tutorial](https://github.com/FilippoMB/Physics-Informed-Neural-Networks-tutorial)

# ## Introduction
# 
# What are PINNs?
# 
# - PINNs are Neural Networks used to learn a generic function $f$.
# - Like standard NNs, PINNs account for observation data $\{ x_i \}_{i=1}^N$ in learning $f$.
# - In addition, the optimization of $f$ is guided by a regularization term, which encourages $f$ to be the solution of a Partial Differential Equation (PDE).

# ### Traditional PDE solvers
# 
# - Simple problems can be solved analytically.
# - E.g., consider the velocity:
# 
# $$v(t) = \frac{d x}{d t} = \lim_{h \rightarrow 0} \frac{x(t+h) - x(t)}{h}$$
# 
# - Solution:
# 
# $$
# v(t) =
# \begin{cases}
# 3/2 & \text{if}\; t \in [0, 2) \\
# 0 & \text{if}\; t \in [2, 4) \\
# -1/3 & \text{if}\; t \in [4, 7]
# \end{cases}
# $$
# 
# - In most real-world problems, solutions cannot be found analytically.
# - Differential equations are solved numerically.
# - E.g., numerical solvers apply the definition of the derivative at *all* the points of the time domain.

# **Limitations of PDE solvers**
# 
# - Computationally expensive and scale badly to big data.
# - Integrating external data sources (e.g., from sensors) is problematic.

# ## Neural Networks
# 
# - Universal function approximators.
# - Can consume any kind of data $\boldsymbol{X}$.
# - Are trained to minimize a loss, e.g., the error between the predictions $\boldsymbol{\hat{y}}$ and the desired outputs $\boldsymbol{y}$.

# In[1]:


# Imports
import torch
from torch import nn
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from matplotlib import cm


# Let's start by creating a simple neural network in PyTorch.

# In[2]:


# Define a simple neural network for regression
class simple_NN(nn.Module):
    def __init__(self):
        super(simple_NN, self).__init__()
        self.linear_tanh_stack = nn.Sequential(
            nn.Linear(1, 16),
            nn.Tanh(),
            nn.Linear(16, 32),
            nn.Tanh(),
            nn.Linear(32, 16),
            nn.Tanh(),
            nn.Linear(16, 1),
        )

    def forward(self, x):
        out = self.linear_tanh_stack(x)
        return out


# Then,
# - Create a small dataset $\{x_i, y_i\}_{i=1, \dots, 5}$.
# - Use the NN to make predictions: $\hat{y}_i = \rm{NN}(x_i)$.
# - Train the NN by minimizing $\rm{MSE}(\boldsymbol{y}, \boldsymbol{\hat{y}})$.
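# - Here, the mean squared error is
# 
# $$ \rm{MSE}(\boldsymbol{y}, \boldsymbol{\hat{y}}) = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2, $$
# 
# - where $N$ is the number of training samples.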
# In[3]:


# Define dataset
x_train = torch.tensor([[1.1437e-04], [1.4676e-01], [3.0233e-01], [4.1702e-01], [7.2032e-01]], dtype=torch.float32)
y_train = torch.tensor([[1.0000], [1.0141], [1.0456], [1.0753], [1.1565]], dtype=torch.float32)


# In[4]:


# Initialize the model
model = simple_NN()

# define loss and optimizer
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train
for ep in range(1000):

    # Compute prediction error
    pred = model(x_train)
    loss = loss_fn(pred, y_train)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if ep % 200 == 0:
        print(f"epoch: {ep}, loss: {loss.item():>7f}")


# In[5]:


# evaluate the model on all data points in the domain
domain = [0.0, 1.5]
x_eval = torch.linspace(domain[0], domain[1], steps=100).reshape(-1, 1)
f_eval = model(x_eval)

# plotting
fig, ax = plt.subplots(figsize=(12, 5))
ax.scatter(x_train.detach().numpy(), y_train.detach().numpy(), label="Training data", color="blue")
ax.plot(x_eval.detach().numpy(), f_eval.detach().numpy(), label="NN approximation", color="black")
ax.set(title="Neural Network Regression", xlabel="$x$", ylabel="$y$")
ax.legend();


# - The NN does a good job of fitting the data samples.
# - However, it has no information about which function it should learn when $x>0.8$.

# ## Physics Informed NNs
# 
# - Use PDEs to adjust the NN output.
# - Train the model with an additional loss that penalizes the violation of the PDE.
# 
# $$ \mathcal{L}_{\text{tot}} = \mathcal{L}_{\text{data}} + \mathcal{L}_{\text{PDE}}$$
# 
# **Advantages**
# 
# Combine information from both data and physical models.
# - Compared to traditional NNs, $\mathcal{L}_{\text{PDE}}$ regularizes the model, limiting overfitting and improving generalization.
# - Compared to traditional PDE solvers, PINNs are more scalable and can consume any kind of data.

# ### Example I: population growth
# 
# Logistic equation for modeling the population growth:
# 
# $$ \frac{d f(t)}{d t} = Rt(1-t)$$
# 
# - $f(t)$ is the population growth over time $t$.
# - $R$ is the max growth rate.
# - To identify a solution, a boundary condition must be imposed, e.g., at $t=0$:
# 
# $$f(t=0)=1$$

# In[6]:


R = 1.0
ft0 = 1.0


# - Use the NN to model $f(t)$, i.e., $f(t) = \rm{NN}(t)$.
# - We can easily compute the derivative $\frac{d\rm{NN}(t)}{dt}$ thanks to the automatic differentiation provided by deep learning libraries.

# In[7]:


def df(f: simple_NN, x: torch.Tensor = None, order: int = 1) -> torch.Tensor:
    """Compute neural network derivative with respect to input features using PyTorch autograd engine"""
    df_value = f(x)
    for _ in range(order):
        df_value = torch.autograd.grad(
            df_value,
            x,
            grad_outputs=torch.ones_like(x),
            create_graph=True,
            retain_graph=True,
        )[0]
    return df_value


# - We want our NN to satisfy the following equation:
# 
# $$ \frac{d\rm{NN}(t)}{dt} - Rt(1-t) = 0 $$
# 
# - To do that, we add the following physics-informed regularization term to the loss:
# 
# $$ \mathcal{L}_\rm{PDE} = \frac{1}{N} \sum_{i=1}^N \left( \frac{d\rm{NN}}{dt} \bigg\rvert_{t_i} - R t_i (1-t_i) \right)^2 $$
# 
# - where $t_i$ are **collocation points**, i.e., a set of points from the domain where we evaluate the differential equation.

# In[8]:


# Generate 10 evenly distributed collocation points
t = torch.linspace(domain[0], domain[1], steps=10, requires_grad=True).reshape(-1, 1)


# - Minimizing $\mathcal{L}_\rm{PDE}$ alone does not ensure a unique solution.
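# - Indeed, integrating the right-hand side shows that every function of the form
# 
# $$ \rm{NN}(t) = R \left( \frac{t^2}{2} - \frac{t^3}{3} \right) + c $$
# 
# - satisfies the equation for any constant $c$.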
# - We must include the boundary condition by adding the following loss:
# 
# $$ \mathcal{L}_\rm{BC} = \left( \rm{NN}(t_0) - 1 \right)^2 $$
# 
# - This lets the NN converge to the desired solution among the infinitely many possible ones.

# The final loss is given by:
# 
# $$ \mathcal{L}_\rm{PDE} + \mathcal{L}_\rm{BC} + \mathcal{L}_\rm{data} $$

# In[9]:


# Wrap everything into a function
def compute_loss(nn: simple_NN,
                 t: torch.Tensor = None,
                 x: torch.Tensor = None,
                 y: torch.Tensor = None,
                 ) -> torch.float:
    """Compute the full loss function as pde loss + boundary loss + data (MSE) loss
    This custom loss function is fully defined with differentiable tensors, therefore
    the .backward() method can be applied to it
    """
    # PDE residual at the collocation points
    pde_loss = df(nn, t) - R * t * (1 - t)
    pde_loss = pde_loss.pow(2).mean()

    # boundary condition f(0) = ft0
    boundary = torch.Tensor([0.0])
    boundary.requires_grad = True
    bc_loss = nn(boundary) - ft0
    bc_loss = bc_loss.pow(2)

    # data (observation) loss
    mse_loss = torch.nn.MSELoss()(nn(x), y)

    tot_loss = pde_loss + bc_loss + mse_loss

    return tot_loss


# In[10]:


model = simple_NN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train
for ep in range(2000):

    loss = compute_loss(model, t, x_train, y_train)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if ep % 200 == 0:
        print(f"epoch: {ep}, loss: {loss.item():>7f}")


# In[11]:


# numeric solution
def logistic_eq_fn(x, y):
    return R * x * (1 - x)

numeric_solution = solve_ivp(
    logistic_eq_fn, domain, [ft0], t_eval=x_eval.squeeze().detach().numpy()
)
f_colloc = solve_ivp(
    logistic_eq_fn, domain, [ft0], t_eval=t.squeeze().detach().numpy()
).y.T


# In[12]:


# evaluation on the domain [0, 1.5]
f_eval = model(x_eval)

# plotting
fig, ax = plt.subplots(figsize=(12, 5))
ax.scatter(t.detach().numpy(), f_colloc, label="Collocation points", color="magenta", alpha=0.75)
ax.scatter(x_train.detach().numpy(), y_train.detach().numpy(), label="Observation data", color="blue")
ax.plot(x_eval.detach().numpy(), f_eval.detach().numpy(), label="NN solution", color="black")
ax.plot(x_eval.detach().numpy(), numeric_solution.y.T, label="Numeric solution", color="magenta", alpha=0.75)
ax.set(title="Logistic equation solved with NNs", xlabel="t", ylabel="f(t)")
ax.legend();


# ### Example II: 1d wave
# 
# - Now, we want our NN to learn a function $f(x,t)$ that satisfies the following $2^\rm{nd}$ order PDE:
# 
# $$\frac{\partial^2 f}{\partial x^2} = \frac{1}{C^2} \frac{\partial^2 f}{\partial t^2}$$
# 
# - where $C$ is a positive constant.
# - Differently from before, $f$ depends on two variables: space ($x$) and time ($t$).
# - We modify our neural network to accept two input variables.

# In[13]:


class simple_NN2(nn.Module):
    def __init__(self):
        super(simple_NN2, self).__init__()
        self.linear_tanh_stack = nn.Sequential(
            nn.Linear(2, 16),  # <--- 2 input variables
            nn.Tanh(),
            nn.Linear(16, 32),
            nn.Tanh(),
            nn.Linear(32, 16),
            nn.Tanh(),
            nn.Linear(16, 1),
        )

    def forward(self, x, t):
        x_stack = torch.cat([x, t], dim=1)  # <--- concatenate x and t
        out = self.linear_tanh_stack(x_stack)
        return out


# - The function we defined before, `df()`, computes derivatives of any order w.r.t. only one input variable.
# - We need to modify it slightly to differentiate w.r.t. both $x$ and $t$.
# In[14]:


def df(output: torch.Tensor, input_var: torch.Tensor, order: int = 1) -> torch.Tensor:
    """Compute neural network derivative with respect to input features using PyTorch autograd engine"""
    df_value = output  # <-- we directly take the output of the NN
    for _ in range(order):
        df_value = torch.autograd.grad(
            df_value,
            input_var,
            grad_outputs=torch.ones_like(input_var),
            create_graph=True,
            retain_graph=True,
        )[0]
    return df_value


def dfdt(model: simple_NN2, x: torch.Tensor, t: torch.Tensor, order: int = 1):
    """Derivative with respect to the time variable of arbitrary order"""
    f_value = model(x, t)
    return df(f_value, t, order=order)


def dfdx(model: simple_NN2, x: torch.Tensor, t: torch.Tensor, order: int = 1):
    """Derivative with respect to the spatial variable of arbitrary order"""
    f_value = model(x, t)
    return df(f_value, x, order=order)


# #### Loss definition
# 
# - For this example, we do not consider measurement data.
# - We train the NN with a loss that only accounts for the physical equations.
# - The first term of the loss encourages respecting the 1-dimensional wave equation:
# 
# $$\mathcal{L}_\rm{PDE} = \left( \frac{\partial^2 f}{\partial x^2} - \frac{1}{C^2} \frac{\partial^2 f}{\partial t^2} \right)^2 $$

# - As before, there are infinitely many solutions satisfying this equation.
# - We need to restrict the possible solutions by:
#     1. imposing boundary conditions that fix the solution to zero at the domain extrema.
#     2. imposing an initial condition on $f(x, t_0)$.
#     3. imposing an initial condition on $\frac{\partial f(x, t)}{\partial t} \bigg\rvert_{t=0}$.

# - We define the domain of $x$ as $[x_0, x_1]$.
# - In this example, $x_0 = 0$ and $x_1 = 1$, but they could be different values.
# 
# - The following loss penalizes the violation of the boundary conditions:
# 
# $$\mathcal{L}_\rm{BC} = f(x_0, t)^2 + f(x_1, t)^2$$

# - Next, we must define the initial condition on $f(x, t_0)$.
# 
# - The following loss penalizes departure from the desired initial condition:
# 
# $$\mathcal{L}_\rm{initF} = \left( f(x, t_0) - \frac{1}{2} \rm{sin}(2\pi x) \right)^2 $$

# - Finally, we must specify the initial condition on $\frac{\partial f(x, t)}{\partial t} \bigg\rvert_{t=0}$.
# 
# - The following loss penalizes departure from the desired initial condition of the 1st order derivative:
# 
# $$\mathcal{L}_\rm{initDF} = \left( \frac{\partial f}{\partial t} \bigg\rvert_{t=0} \right)^2 $$

# The total loss is given by:
# 
# $$\mathcal{L}_\rm{PDE} + \mathcal{L}_\rm{BC} + \mathcal{L}_\rm{initF} + \mathcal{L}_\rm{initDF}$$
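# - For these boundary and initial conditions, the wave equation admits the closed-form standing-wave solution
# 
# $$ f(x, t) = \frac{1}{2} \rm{sin}(2\pi x) \, \rm{cos}(2\pi C t), $$
# 
# - which can be verified by substitution and used as a reference for the NN solution obtained below.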
# In[15]:


def initial_condition(x) -> torch.Tensor:
    res = torch.sin(2 * np.pi * x).reshape(-1, 1) * 0.5
    return res


# In[16]:


def compute_loss(
    model: simple_NN2,
    x: torch.Tensor = None,
    t: torch.Tensor = None,
    x_idx: torch.Tensor = None,
    t_idx: torch.Tensor = None,
    C: float = 1.0,
    device: str = None,
) -> torch.float:

    # PDE residual
    pde_loss = dfdx(model, x, t, order=2) - (1/C**2) * dfdt(model, x, t, order=2)

    # boundary conditions
    boundary_x0 = torch.ones_like(t_idx, requires_grad=True).to(device) * x[0]
    boundary_loss_x0 = model(boundary_x0, t_idx)  # f(x0, t)
    boundary_x1 = torch.ones_like(t_idx, requires_grad=True).to(device) * x[-1]
    boundary_loss_x1 = model(boundary_x1, t_idx)  # f(x1, t)

    # initial conditions
    f_initial = initial_condition(x_idx)  # 0.5*sin(2*pi*x)
    t_initial = torch.zeros_like(x_idx)   # t0
    t_initial.requires_grad = True
    initial_loss_f = model(x_idx, t_initial) - f_initial      # L_initF
    initial_loss_df = dfdt(model, x_idx, t_initial, order=1)  # L_initDF

    # obtain the final loss by averaging each term and summing them up
    final_loss = \
        pde_loss.pow(2).mean() + \
        boundary_loss_x0.pow(2).mean() + \
        boundary_loss_x1.pow(2).mean() + \
        initial_loss_f.pow(2).mean() + \
        initial_loss_df.pow(2).mean()

    return final_loss


# In[17]:


device = "cuda" if torch.cuda.is_available() else "cpu"

# generate the time-space meshgrid
x_domain = [0.0, 1.0]; n_points_x = 100
t_domain = [0.0, 1.0]; n_points_t = 150
x_idx = torch.linspace(x_domain[0], x_domain[1], steps=n_points_x, requires_grad=True)
t_idx = torch.linspace(t_domain[0], t_domain[1], steps=n_points_t, requires_grad=True)
grids = torch.meshgrid(x_idx, t_idx, indexing="ij")
x_idx, t_idx = x_idx.reshape(-1, 1).to(device), t_idx.reshape(-1, 1).to(device)
x, t = grids[0].flatten().reshape(-1, 1).to(device), grids[1].flatten().reshape(-1, 1).to(device)

# initialize the neural network model
model = simple_NN2().to(device)


# In[18]:


# Train
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for ep in range(3000):

    loss = compute_loss(model, x=x, t=t, x_idx=x_idx, t_idx=t_idx, device=device)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if ep % 300 == 0:
        print(f"epoch: {ep}, loss: {loss.item():>7f}")


# In[19]:


# Prediction
y = model(x, t)
y_np = y.reshape([100, -1]).to("cpu").detach().numpy()

# Plot
X, Y = np.meshgrid(np.linspace(0, 1, 150), np.linspace(0, 1, 100))
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.plot_surface(X, Y, y_np, linewidth=0, antialiased=False, cmap=cm.coolwarm)
ax.set_xlabel("t"), ax.set_ylabel("x"), ax.set_zlabel("f")
plt.show()


# ## Conclusions
# 
# **Population growth**
# 
# - We saw the difference between:
#     - Fitting a NN only on observations.
#     - Adding a regularization term from a 1st order PDE.
# 
# **1d wave**
# 
# - We saw how to include:
#     - A 2nd order PDE.
#     - Multiple constraints on the boundary and initial conditions.
# 
# **Next steps**
# 
# - With more complex equations, convergence is not achieved so easily.
# - Also, for time-dependent problems, many useful tricks have been devised over the years, such as:
#     - Decomposing the solution domain into different parts, each solved by a different neural network.
#     - Smart weighting of the different loss contributions to avoid converging to trivial solutions.
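# As a minimal, hypothetical sketch of the loss-weighting idea (not part of the tutorial above): each residual is scaled by a hand-picked weight before summation, so that some terms are enforced more strongly than others. The weights below are illustrative hyperparameters, not recommended values.

# In[ ]:


def weighted_loss(
    pde_res: torch.Tensor,
    bc_res: torch.Tensor,
    init_res: torch.Tensor,
    w_pde: float = 1.0,    # illustrative weight for the PDE residual
    w_bc: float = 10.0,    # illustrative weight for the boundary residual
    w_init: float = 10.0,  # illustrative weight for the initial-condition residual
) -> torch.Tensor:
    """Weighted sum of mean squared residuals: larger weights enforce a term more strongly."""
    return (
        w_pde * pde_res.pow(2).mean()
        + w_bc * bc_res.pow(2).mean()
        + w_init * init_res.pow(2).mean()
    )


# For instance, the plain sum at the end of `compute_loss` above could be replaced by a weighted sum of this form.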
# ## References
# 
# [[1](https://www.sciencedirect.com/science/article/pii/S0021999118307125)] Raissi, Maziar, Paris Perdikaris, and George E. Karniadakis. "Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations." Journal of Computational Physics 378 (2019): 686-707.
# 
# [[2](https://maziarraissi.github.io/PINNs/)] Raissi, Maziar, Paris Perdikaris, and George E. Karniadakis. "Physics Informed Deep Learning".
# 
# [[3](https://www.sciencedirect.com/science/article/pii/S095219762030292X)] Nascimento, R. G., Fricke, K., and Viana, F. A. "A tutorial on solving ordinary differential equations using Python and hybrid physics-informed neural network." Engineering Applications of Artificial Intelligence 96 (2020): 103996.
# 
# [[4](https://towardsdatascience.com/solving-differential-equations-with-neural-networks-afdcf7b8bcc4)] Dagrada, Dario. "Introduction to Physics-informed Neural Networks" ([code](https://github.com/madagra/basic-pinn)).
# 
# [[5](https://towardsdatascience.com/physics-and-artificial-intelligence-introduction-to-physics-informed-neural-networks-24548438f2d5)] Paialunga, Piero. "Physics and Artificial Intelligence: Introduction to Physics Informed Neural Networks".
# 
# [[6](https://github.com/omniscientoctopus/Physics-Informed-Neural-Networks)] "Physics-Informed-Neural-Networks (PINNs)" - implementation of PINNs in TensorFlow 2 and PyTorch for the Burgers' and Helmholtz PDE.