UBC Theses and Dissertations

Neural networks through the lens of partial differential equations: from discrete layers to continuous dynamics
Zakariaei, Niloufar

Abstract

This thesis studies deep neural networks through the lens of partial differential equations (PDEs), showing how continuous-time and operator-based perspectives can improve the stability, interpretability, and scalability of modern learning systems. Rather than treating neural networks as purely discrete stacks of layers, this work views them as discretizations of underlying dynamical systems, enabling architectural and algorithmic design guided by physical principles.

We first develop the Reaction–Diffusion Neural Network (RDNN), an implicit–explicit (IMEX) architecture in which each layer combines an implicit diffusion step with an explicit reaction update. This construction yields unconditional stability, helping mitigate exploding and vanishing gradients, while also improving robustness to noise and domain shifts. It further provides an interpretable framework in which learned parameters correspond to physically meaningful quantities such as diffusion and reaction coefficients.

We then extend this idea by introducing the Advection–Diffusion–Reaction Neural Network (ADRNN), which incorporates a learnable transport mechanism to capture directional and long-range interactions. By combining advection, diffusion, and reaction within each layer, the resulting architecture unifies transport, smoothing, and nonlinear feature transformation in a single framework. Empirical results show that this approach improves stability and produces sharper long-horizon forecasts in spatio-temporal prediction tasks.

Finally, we address the challenge of efficient training at high resolution through a multiscale optimization framework inspired by multigrid methods. To overcome the unstable gradient behavior of standard convolutions under mesh refinement, we introduce Mesh-Free Convolutions (MFCs), a class of operators derived from differential operator theory whose behavior remains stable across discretizations. Coupled with a coarse-to-fine training strategy, this framework substantially reduces computational cost while maintaining or improving predictive accuracy.

Overall, this thesis demonstrates that PDE-based structure can serve not only as a tool for interpreting neural networks, but also as a principled foundation for designing models and training methods that are more stable, physically grounded, and computationally efficient.
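As an illustration of the layer structure the abstract describes (not the thesis' exact notation), a generic IMEX step for reaction–diffusion dynamics u_t = κΔu + f(u) can be written as

\[
\bigl(I - h\,\kappa\,\Delta_h\bigr)\,u^{n+1} \;=\; u^{n} + h\,f\!\bigl(u^{n};\,\theta_n\bigr),
\]

where h is a step size, Δ_h a discrete diffusion operator, κ a diffusivity, and f(·; θ_n) a learned pointwise reaction; all symbols here are illustrative placeholders. Treating diffusion implicitly makes the update stable for any step size, while the explicit reaction keeps each layer inexpensive; an advection–diffusion–reaction variant would additionally include a transport term with a learnable velocity field.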

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International