Neural Networks for Physical Modelling Synthesis

ISMIR 2025 Tutorial

Rodrigo Diaz

Queen Mary University of London

Why Neural Networks?

Traditional Physical Modelling:

  • Pros:
    • Extensive parameter control (e.g., excitation shape, string tension).
    • Physically interpretable.
    • Compact representation.
  • Cons:
    • Computationally expensive (often requires solving PDEs at audio rates).
    • Deriving model equations is a domain-specific, expert-driven task.
    • Difficult to estimate parameters from real-world audio.

Why Neural Networks?

Can we combine data-driven learning and physics-based priors?

  • Physics-Informed NNs: embed the PDE into the loss function [1].
  • Neural ODEs: learn the system's continuous-time dynamics [2].
  • Neural Operators: learn the solution operator of the PDE, mapping functions to functions [3].
  • Deep Koopman: learn a linear representation of the dynamics in an observable space [4].
  • Deep State Space Models: recurrent models with structured initialisation [5].

These approaches span a spectrum from physics-driven to data-driven.

Tutorial Roadmap

A brief tour

  1. Physics-Informed Neural Networks
  2. Neural Ordinary Differential Equations
  3. Neural Operators
  4. Deep Koopman Operator
  5. Deep State Space Models
  6. Conclusion

Getting the code

To get the code for part 3 of the tutorial, clone the GitHub repository and switch to the part3 branch:

git clone git@github.com:ismir-physical-modeling/tutorial-notebooks.git
cd tutorial-notebooks
git checkout part3

It is recommended to use uv (https://docs.astral.sh/uv/) to set up the environment:

uv sync

Otherwise, you can create a virtual environment (Python >= 3.11) and install the dependencies with pip:

pip install .

Download the data from the Google Drive link (https://tinyurl.com/nh4kutj4) and place it in the data/ folder, next to the notebooks.

Physics-Informed Neural Networks

Physics-Informed Neural Networks: Losses

Physics-Informed Neural Networks

Embed the physical law itself into the learning objective [1].

A neural network \(u_\theta(\mathbf{x}, t)\) directly approximates the solution to a PDE. The loss function forces the network’s output to satisfy two things:

  1. Data Fidelity: Match observed data points (initial/boundary conditions).
  2. Physics Fidelity: Obey the governing PDE at all points in the domain.

The physics term of the loss penalises the PDE residual over the domain: \[ \mathcal{L}_{\text{physics}} = || \mathcal{N} u_\theta - f ||^2_{\Omega} \] where \(\mathcal{N}\) is the differential operator and \(f\) the forcing term.

Example: Damped Harmonic Oscillator

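As a sketch of how such a PINN can be set up (assuming PyTorch; the oscillator parameters, network size, and training loop below are illustrative placeholders, not the tutorial's exact notebook code), the network \(u_\theta(t)\) takes time as input and the loss combines the ODE residual \(\ddot{u} + 2\zeta\omega_0\dot{u} + \omega_0^2 u = 0\) with the initial conditions:

import torch
import torch.nn as nn

# Damped harmonic oscillator u'' + 2*zeta*w0*u' + w0^2*u = 0 (placeholder parameters)
omega0, zeta = 2.0 * torch.pi * 5.0, 0.05

# Small MLP approximating u_theta(t)
net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def u_and_derivatives(t):
    # u, du/dt, d2u/dt2 at the points t, obtained via automatic differentiation
    t = t.requires_grad_(True)
    u = net(t)
    du = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, t, torch.ones_like(du), create_graph=True)[0]
    return u, du, d2u

for step in range(5000):
    t_col = torch.rand(256, 1)  # collocation points in [0, 1] s
    u, du, d2u = u_and_derivatives(t_col)
    # Physics fidelity: ODE residual at the collocation points
    loss_physics = ((d2u + 2 * zeta * omega0 * du + omega0 ** 2 * u) ** 2).mean()
    # Data fidelity: initial conditions u(0) = 1, u'(0) = 0
    u0, du0, _ = u_and_derivatives(torch.zeros(1, 1))
    loss_data = ((u0 - 1.0) ** 2 + du0 ** 2).sum()
    loss = loss_physics + loss_data
    opt.zero_grad(); loss.backward(); opt.step()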

Physics-Informed Neural Networks: Challenges & Summary

  • Core Idea: Embeds the governing PDE directly into the loss function using automatic differentiation.
  • Application: Powerful for enforcing physical constraints in data-limited, high-dimensional scenarios, e.g. sound-field reconstruction [6].
  • Challenges: Training is notoriously difficult [7] and often requires custom strategies [8]; reported gains over strong classical baselines are often overstated [9].

Neural Ordinary Differential Equations

Neural Ordinary Differential Equations

A Neural ODE (NODE) defines the continuous evolution of a state \(\mathbf{u}(t)\) by parameterizing its time derivative with a neural network \(f_\theta\):

\[ \frac{d\mathbf{u}(t)}{dt} = f_\theta(\mathbf{u}(t), t) \]

The network learns the dynamics, not the solution trajectory itself. The final state is found by integrating this equation with a numerical ODE solver [10].
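A minimal sketch of this idea (assuming PyTorch; the vector field, RK4 solver, step size, and placeholder data below are illustrative), where gradients flow through every solver step, i.e. the discretise-optimise route discussed next:

import torch
import torch.nn as nn

# Learnable vector field f_theta(u); the state u is e.g. (displacement, velocity)
f_theta = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))

def rk4_step(f, u, dt):
    # One classical Runge-Kutta step of du/dt = f(u)
    k1 = f(u)
    k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2)
    k4 = f(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(u0, n_steps, dt):
    # Roll out the learned dynamics; gradients flow through every step (BPTT)
    traj = [u0]
    for _ in range(n_steps):
        traj.append(rk4_step(f_theta, traj[-1], dt))
    return torch.stack(traj)

u_obs = torch.randn(101, 2)                    # placeholder observed trajectory
opt = torch.optim.Adam(f_theta.parameters(), lr=1e-3)
pred = integrate(u_obs[0], n_steps=100, dt=1e-2)
loss = ((pred - u_obs) ** 2).mean()
opt.zero_grad(); loss.backward(); opt.step()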

Neural Ordinary Differential Equations

  • Training: Gradients are propagated back through the solver’s operations, either by BPTT (discretise-optimise) or using the adjoint sensitivity method (optimise-discretise) [2].

  • Numerical Integration: We can leverage the vast literature in numerical integration for stability and efficiency.

  • Hybrid Modelling: We can combine known physics with learned components, jointly optimising physical parameters and neural network weights.

Neural ODEs: Time Integration

The discretisation scheme defines the architecture of the recurrent block.

For Störmer-Verlet:
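As a sketch, assuming the second-order form \(\ddot{\mathbf{q}} = f_\theta(\mathbf{q})\) with position \(\mathbf{q}\), velocity \(\mathbf{v}\) and step size \(h\), one Störmer-Verlet (leapfrog) step reads:

\[ \begin{aligned} \mathbf{v}_{n+1/2} &= \mathbf{v}_n + \tfrac{h}{2} f_\theta(\mathbf{q}_n) \\ \mathbf{q}_{n+1} &= \mathbf{q}_n + h \, \mathbf{v}_{n+1/2} \\ \mathbf{v}_{n+1} &= \mathbf{v}_{n+1/2} + \tfrac{h}{2} f_\theta(\mathbf{q}_{n+1}) \end{aligned} \]

The recurrent block therefore evaluates \(f_\theta\) twice per step, and the same update is unrolled over time; for conservative forces it inherits the favourable energy behaviour of symplectic integrators.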

Example: Non-Linear String

Neural ODEs: Challenges & Summary

  • Core Idea: Neural ODEs optimise continuous-time dynamics using discretisations based on well-known numerical schemes, and allow hybrid models that combine known physics with learned dynamics.
  • Application: Well suited to parameter estimation for known physical models [11], [12].
  • Challenges: Training relies on back-propagation through time (BPTT). The choice of ODE solver can significantly impact stability and accuracy.

Neural Operators

Neural Operators

For time-dependent problems, Neural Operators learn the operator that maps functions (initial conditions, forcing terms, and parameters) directly to the solution at a future time.

Given a time-dependent PDE with an initial state \(\mathbf{u}_0(\mathbf{x})\), parameters \(\mathbf{a}\), and a forcing function \(\mathbf{f}(\mathbf{x}, t)\):

\[ \frac{\partial \mathbf{u}}{\partial t} = \mathcal{N}_{\mathbf{a}}(\mathbf{u}) + \mathbf{f}(\mathbf{x}, t) \]

The neural operator approximates:

\[ \mathbf{u}(\cdot, T) \approx \mathcal{G}_\theta(\mathbf{u}_0, \mathbf{a}, \mathbf{f}) \]

Neural Operators: Fourier kernel

\[ u(x) = \int_{\Omega} \kappa(x, y)\, u_0(y)\, \mathrm{d} y \]

If the kernel is stationary, \(\kappa(x, y) = \kappa(x - y)\), this is a convolution, \(u = \kappa * u_0\). Using the convolution theorem, we have:

\[ \mathcal{F}(u) = \mathcal{F}(\kappa) \cdot \mathcal{F}(u_0) \quad \implies \quad u(x) = \mathcal{F}^{-1}(\mathcal{F}(\kappa) \cdot \mathcal{F}(u_0))(x) \]

The Fourier Neural Operator approximates this by replacing the transform of the kernel with learnable weights:

\[ u(x) \approx \mathcal{F}^{-1}(\mathcal{R}_{\theta} \cdot \mathcal{F}(u_0))(x) \]

where we optimise the (typically truncated) complex weights \(\mathcal{R}_{\theta}\) during training.
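A minimal 1-D sketch of such a Fourier layer (assuming PyTorch; the class name SpectralConv1d, channel counts, mode count, and initialisation below are placeholders, not the exact layer used in the notebooks):

import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    # Fourier layer: FFT -> multiply truncated modes by learnable complex weights -> inverse FFT
    def __init__(self, channels: int, n_modes: int):
        super().__init__()
        self.n_modes = n_modes
        scale = 1.0 / channels
        # R_theta: complex weights for the retained Fourier modes
        self.weights = nn.Parameter(
            scale * torch.randn(channels, channels, n_modes, dtype=torch.cfloat)
        )

    def forward(self, v):                       # v: (batch, channels, n_points)
        v_hat = torch.fft.rfft(v)               # (batch, channels, n_points // 2 + 1)
        out_hat = torch.zeros_like(v_hat)
        # Keep only the lowest n_modes and mix channels with the complex weights
        out_hat[..., : self.n_modes] = torch.einsum(
            "bim,iom->bom", v_hat[..., : self.n_modes], self.weights
        )
        return torch.fft.irfft(out_hat, n=v.shape[-1])

layer = SpectralConv1d(channels=32, n_modes=16)
u = layer(torch.randn(8, 32, 256))              # e.g. 8 strings sampled at 256 points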

Neural Operators: An Interpretation

For linear PDEs, the operator can be interpreted as a modal method [13]:

\[ u(x, t) \approx \mathcal{T}^{-1}(e^{\Lambda t} \cdot \mathcal{T}(u_0))(x) \]

where the forward and inverse transforms are defined as follows:

\[ \begin{aligned} \mathcal{T}\{u(x,t)\} &= \int_{V}\tilde{K}_{\mu}^{H}(x)u(x,t)dx && \text{(Forward Projection)} \\ \mathcal{T}^{-1}\{q_{\mu}(t)\} &= \sum_{\mu=0}^{\infty}\frac{{q}_{\mu}(t)K_{\mu}(x)}{||K_{\mu}||} && \text{(Back-Projection)} \end{aligned} \]

  • \(K_\mu\) are the eigenfunctions of the spatial operator
  • \(\Lambda\) is a diagonal matrix of the corresponding eigenvalues
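As a concrete instance of this interpretation, here is a modal-synthesis sketch (assuming an ideal string with fixed ends and sinusoidal eigenfunctions; the fundamental frequency, damping law, pluck and readout positions are placeholders):

import torch

# Ideal string, fixed ends: K_mu(x) = sin(mu*pi*x/L), lambda_mu = -sigma_mu + i*omega_mu
L, n_modes, sr, dur = 1.0, 40, 48_000, 1.0
x = torch.linspace(0.0, L, 200)
t = torch.arange(int(sr * dur)) / sr
mu = torch.arange(1, n_modes + 1).float()

omega = 2 * torch.pi * 110.0 * mu            # harmonic partials of a 110 Hz string (placeholder)
sigma = 0.5 + 0.02 * mu ** 2                 # frequency-dependent damping (placeholder)

# Forward projection: modal coordinates of a triangular "pluck" at x = 0.7 L
u0 = torch.minimum(x / 0.7, (L - x) / (L - 0.7))
K = torch.sin(mu[:, None] * torch.pi * x[None, :] / L)   # (modes, space)
q0 = (K @ u0) * (2.0 / x.numel())                         # Riemann-sum approximation of the projection

# Evolve each mode with exp(lambda_mu * t), then back-project at the listening point x = 0.3 L
lam = -sigma + 1j * omega
q_t = q0[:, None].to(torch.cfloat) * torch.exp(lam[:, None] * t[None, :])   # (modes, time)
out = (torch.sin(mu * torch.pi * 0.3).to(torch.cfloat) @ q_t).real          # audio-rate output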

Example: Non-Linear String

Neural Operators: Summary & Challenges

  • Core Idea: Learns the solution operator of a PDE, often implemented efficiently as a Fourier Neural Operator (FNO).
  • Application: It can be used to model the dynamics of vibrating systems (e.g., strings, membranes) for sound synthesis [13], [14].
  • Challenges: Models can be computationally complex and require large datasets. Not efficient for sample-by-sample processing.

Deep Koopman Networks

Linearizing the Non-Linear

From [15]

The Koopman Idea: Instead of analyzing the system in its native state space, lift it to a different, possibly infinite-dimensional observable space, where the dynamics become linear. The Koopman Operator \(K\) is the linear operator that evolves the observables forward in time \[g(\mathbf{u}_{k+1}) = (Kg)(\mathbf{u}_k)\]

Deep Koopman Networks (DKNs)

How do we find the observable functions \(g\)? The space of all possible observables is infinite-dimensional.



An autoencoder with dynamics constraints in the latent space: the encoder \(\varphi\) learns the observables, and a linear operator advances them in time.

Example

Losses:

\[ \begin{aligned} \mathcal{L}_{\text {consistency }} & =\sum_{k=1}^{L-1}\left\|\varphi\left(x_k\right)-\Lambda^k \varphi\left(x_0\right)\right\|_2^2 \\ \mathcal{L}_{\text {pred }} & =\sum_{k=1}^{L-1}\left\|x_k-\varphi^{-1}\left(\Lambda^k \varphi\left(x_0\right)\right)\right\|_2^2 \\ \mathcal{L}_{\text {enc }} & =\left\|x_0-\varphi^{-1}\left(\varphi\left(x_0\right)\right)\right\|_2^2 \\ \mathcal{L} & =\alpha_1 \mathcal{L}_{\text {pred }}+\alpha_2 \mathcal{L}_{\text {enc }}+\alpha_3 \mathcal{L}_{\text {consistency }} \end{aligned} \]
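A sketch of these losses (assuming PyTorch; the encoder phi, decoder phi_inv, latent operator Lambda, state dimension, and placeholder data below are illustrative stand-ins for \(\varphi\), \(\varphi^{-1}\) and \(\Lambda\)):

import torch
import torch.nn as nn

latent = 16
phi = nn.Sequential(nn.Linear(100, 128), nn.Tanh(), nn.Linear(128, latent))      # encoder
phi_inv = nn.Sequential(nn.Linear(latent, 128), nn.Tanh(), nn.Linear(128, 100))  # decoder
Lambda = nn.Linear(latent, latent, bias=False)                                   # linear latent dynamics

def koopman_losses(x):                   # x: (L, 100) snapshots x_0 ... x_{L-1}
    z0 = phi(x[0])
    z, loss_c, loss_p = z0, 0.0, 0.0
    for k in range(1, x.shape[0]):
        z = Lambda(z)                                          # Lambda^k phi(x_0), built up step by step
        loss_c = loss_c + ((phi(x[k]) - z) ** 2).sum()         # consistency in latent space
        loss_p = loss_p + ((x[k] - phi_inv(z)) ** 2).sum()     # prediction in state space
    loss_enc = ((x[0] - phi_inv(z0)) ** 2).sum()               # autoencoder reconstruction
    return loss_c, loss_p, loss_enc

x = torch.randn(32, 100)                  # placeholder trajectory
l_c, l_p, l_e = koopman_losses(x)
loss = 1.0 * l_p + 1.0 * l_e + 1.0 * l_c  # alpha_1, alpha_2, alpha_3 weights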

Example

Example: 2D Visualisation

Koopman and Modal Analysis

For Linear Systems (e.g., an ideal string)

  • The Koopman eigenfunctions (\(\varphi_i\)) are the spatial modes of the string.
  • The Koopman eigenvalues (\(\lambda_i\)) represent the modal frequencies and damping.

For Non-Linear Systems

We find a parameterisation of linear resonators that accounts for the coupling and pitch-glide effects.

Deep Koopman Networks: Summary & Challenges

  • Core Idea: Autoencoder that transforms non-linear dynamics into a linear evolution in a latent space.
  • Application: Modal-like synthesis for non-linear systems [16], [17]
  • Challenges: Requires large amounts of data and is sensitive to parameter initialisation.

Deep State Space Models

Deep State Space Models: The Core Idea

Representation for linear time-invariant (LTI) systems:

\[ \begin{aligned} \dot{\mathbf{x}}(t) &= \mathbf{A} \mathbf{x}(t) + \mathbf{B} \mathbf{u}(t) \\ \mathbf{y}(t) &= \mathbf{C} \mathbf{x}(t) + \mathbf{D} \mathbf{u}(t) \end{aligned} \]

Deep State Space Models: Recurrence and Convolution

  1. Recurrent Representation: After discretisation, the SSM is a linear RNN. This is very efficient for auto-regressive inference (generating one sample at a time).

\[ \begin{aligned} \mathbf{x}_k &= \bar{\mathbf{A}} \mathbf{x}_{k-1} + \bar{\mathbf{B}} \mathbf{u}_k \\ \mathbf{y}_k &= \bar{\mathbf{C}} \mathbf{x}_k \end{aligned} \]

  2. Convolutional Representation: Because the system is LTI, its output is the convolution of the input with a fixed impulse response (or kernel):

\[ \mathbf{h}_k = \bar{\mathbf{C}} \bar{\mathbf{A}}^k \bar{\mathbf{B}} \]

This allows us to use FFT to apply the convolution efficiently.
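A sketch of the two views for a single-input, single-output SSM with already-discretised matrices (assuming PyTorch; the random, stable \(\bar{\mathbf{A}}\), \(\bar{\mathbf{B}}\), \(\bar{\mathbf{C}}\) below are placeholders):

import torch

N, T = 64, 2048                                # state size, sequence length
Q, _ = torch.linalg.qr(torch.randn(N, N))
A = 0.99 * Q                                   # discretised, stable \bar{A} (placeholder)
B = torch.randn(N, 1)
C = torch.randn(1, N)
u = torch.randn(T)                             # input signal

# 1) Recurrent view: one state update per sample (auto-regressive inference)
x, y_rec = torch.zeros(N, 1), []
for k in range(T):
    x = A @ x + B * u[k]
    y_rec.append((C @ x).squeeze())
y_rec = torch.stack(y_rec)

# 2) Convolutional view: materialise h_k = C A^k B once, then convolve via FFT
h, AkB = [], B
for k in range(T):
    h.append((C @ AkB).squeeze())
    AkB = A @ AkB
h = torch.stack(h)

n_fft = 2 * T                                  # zero-pad so the circular convolution is linear
y_conv = torch.fft.irfft(torch.fft.rfft(u, n_fft) * torch.fft.rfft(h, n_fft), n_fft)[:T]

print(torch.allclose(y_rec, y_conv, atol=1e-2))  # the two views agree up to numerics

Deep SSMs such as S4 [5] impose structure on \(\bar{\mathbf{A}}\) (e.g. diagonal plus low-rank) so that this kernel can be computed far more efficiently than with the explicit matrix powers above.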

Deep State Space Models: Summary & Challenges

  • Core Idea: A sequence model based on a learnable state-space representation that can be viewed as both a recurrent system (for efficient inference) and a convolutional system (for fast, parallel training).
  • Application: Good at modelling long sequences, making them effective for tasks like unconditional raw audio generation [18] and physical modelling [19].
  • Primary Challenge: Performance is sensitive to the initialisation of parameters, and the large number of parameters can make these models difficult to train and deploy in real-time audio applications.

Comparison with the String

Comparison

| Method | Core Idea | Prior | Time | Advantage | Challenge |
|---|---|---|---|---|---|
| Neural ODE | Learn vector field | Weak-Strong | Continuous | Flexible/hybrid | Solver cost |
| PINN | PDE in loss | Strong | Continuous | Inverse problems | Hard optimisation |
| Neural Operator | Learn operator | Weak | Agnostic | Discretisation invariant | Computational cost |
| Koopman | Linear in latent | Weak | Discrete | Fast, interpretable | Data and initialisation |
| DSSM | Linear in latent (with non-linear blocks) | Weak | Continuous/Discrete | Fast | Large, black box |

Thank You & Questions

[1]
M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, pp. 686–707, Feb. 2019, doi: 10.1016/j.jcp.2018.10.045
[2]
R. T. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, “Neural ordinary differential equations,” Advances in neural information processing systems, vol. 31, 2018.
[3]
N. Kovachki et al., “Neural operator: Learning maps between function spaces with applications to PDEs,” J. Mach. Learn. Res., vol. 24, no. 1, pp. 89:4061–89:4157, Jan. 2023.
[4]
B. Lusch, J. N. Kutz, and S. L. Brunton, “Deep learning for universal linear embeddings of nonlinear dynamics,” Nature Communications, vol. 9, no. 1, p. 4950, Nov. 2018, doi: 10.1038/s41467-018-07210-0
[5]
A. Gu, K. Goel, and C. Re, “Efficiently Modeling Long Sequences with Structured State Spaces,” in International Conference on Learning Representations, Mar. 2022.
[6]
M. Olivieri, X. Karakonstantis, M. Pezzoli, F. Antonacci, A. Sarti, and E. Fernandez-Grande, “Physics-informed neural network for volumetric sound field reconstruction of speech signals,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 2024, no. 1, p. 42, Sep. 2024, doi: 10.1186/s13636-024-00366-2
[7]
P. Rathore, W. Lei, Z. Frangella, L. Lu, and M. Udell, “Challenges in training PINNs: A loss landscape perspective,” in Proceedings of the 41st International Conference on Machine Learning, in ICML’24, vol. 235. Vienna, Austria: JMLR.org, Jul. 2024, pp. 42159–42191.
[8]
S. Wang, S. Sankaran, H. Wang, and P. Perdikaris, “An Expert’s Guide to Training Physics-informed Neural Networks.” arXiv, Aug. 2023. doi: 10.48550/arXiv.2308.08468. Available: https://arxiv.org/abs/2308.08468. [Accessed: Aug. 20, 2025]
[9]
N. McGreivy and A. Hakim, “Weak baselines and reporting biases lead to overoptimism in machine learning for fluid-related partial differential equations,” Nature Machine Intelligence, vol. 6, no. 10, pp. 1256–1269, Oct. 2024, doi: 10.1038/s42256-024-00897-5
[10]
P. Kidger, “On Neural Differential Equations.” arXiv, Feb. 2022. doi: 10.48550/arXiv.2202.02435. Available: https://arxiv.org/abs/2202.02435. [Accessed: May 22, 2025]
[11]
R. Diaz and M. Sandler, “Fast Differentiable Modal Simulation of Non-linear Strings, Membranes, and Plates.” arXiv, May 2025. doi: 10.48550/arXiv.2505.05940. Available: https://arxiv.org/abs/2505.05940. [Accessed: Jun. 16, 2025]
[12]
V. Zheleznov, S. Bilbao, A. Wright, and S. King, “Learning Nonlinear Dynamics in Physical Modelling Synthesis using Neural Ordinary Differential Equations,” in Proc. 28th Int. Conf. Digital Audio Effects, 2025.
[13]
J. Parker, S. Schlecht, M. Schäfer, and R. Rabenstein, “Physical Modeling Using Recurrent Neural Networks with Fast Convolutional Layers,” in Proc. Int. Conf. On Digital Audio Effects (DAFx 22), 2022, pp. 138–145.
[14]
M. Middleton, D. T. Murphy, and L. Savioja, “The application of Fourier neural operator networks for solving the 2D linear acoustic wave equation,” in Proceedings of the 10th Convention of the European Acoustics Association, European Acoustics Association, Sep. 2023.
[15]
S. L. Brunton and J. N. Kutz, Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control, 2nd ed. Cambridge: Cambridge University Press, 2022. doi: 10.1017/9781009089517
[16]
V. Huhtala, L. Juvela, and S. J. Schlecht, “KLANN: Linearising Long-Term Dynamics in Nonlinear Audio Effects Using Koopman Networks,” IEEE Signal Processing Letters, vol. 31, pp. 1169–1173, 2024, doi: 10.1109/LSP.2024.3389465
[17]
R. Diaz, C. De La Vega Martin, and M. Sandler, “Towards efficient modelling of string dynamics: A comparison of state space and koopman based deep learning methods,” in Proc. Int. Conf. On Digital Audio Effects (DAFx 24), Guildford, UK, 2024.
[18]
K. Goel, A. Gu, C. Donahue, and C. Re, “It’s Raw! Audio Generation with State-Space Models,” in Proceedings of the 39th International Conference on Machine Learning, PMLR, Jun. 2022, pp. 7616–7633.
[19]
C. De La Vega Martin, R. Diaz, and M. Sandler, “Evaluation of Neural Surrogates for Physical Modelling Synthesis of Nonlinear Elastic Plates,” in Workshop on machine learning for audio at ICML25, 2025.