Flow matching, Score-based Generative Model, Schrödinger Bridge and Optimal Transport
Reference
Denoising Diffusion Probabilistic Models
Diffusion Schrödinger Bridge Matching
Simplified Diffusion Schrödinger Bridge
Adversarial Schrödinger Bridge Matching
Introduction
Generative models resonate with two deep principles: the thermodynamics of entropy and the mathematics of optimal transport.
Symbols and Preliminaries
Probability Space
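$(\Omega, \mathcal{F}, \mathbb{P})$ is a probability space. $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ is the measurable space.
- A random variable is a measurable map $X: \Omega \to \mathbb{R}^d$, i.e. $X^{-1}(B) \in \mathcal{F}$ for every $B \in \mathcal{B}(\mathbb{R}^d)$.
- The distribution (pushforward measure) of $X$ is $\mu_X = X_{\#}\mathbb{P}$, i.e. $\mu_X(B) = \mathbb{P}(X \in B)$.
- If $\mu_X$ is absolutely continuous w.r.t. the Lebesgue measure $\lambda$, then there exists a density function $p_X$ such that $\mu_X(B) = \int_B p_X(x)\,dx$.
- Notation: we write $X \sim \mu_X$; if $\mu_X$ admits a density, we often abbreviate $X \sim p_X$.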
Stochastic Process
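- A stochastic process is a time-parametrized family $(X_t)_{t \in [0,T]}$ of random variables $X_t: \Omega \to \mathbb{R}^d$.
- The marginal distribution at time $t$ is $\mu_t = (X_t)_{\#}\mathbb{P}$, with density $p_t$ when it exists.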
Standard Brownian Motion (Wiener Process)
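A $d$-dimensional standard Brownian motion $(W_t)_{t \ge 0}$ is a stochastic process such that
- $W_0 = 0$ almost surely;
- increments over disjoint time intervals are independent;
- $W_t - W_s \sim \mathcal{N}(0, (t-s) I_d)$ for $0 \le s < t$;
- the paths $t \mapsto W_t$ are almost surely continuous.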
Itô Integral and Itô’s Lemma
Quadratic Variation.
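For a continuous semimartingale $X_t$, the quadratic variation is the limit in probability
$$[X]_t = \lim_{|\Pi| \to 0} \sum_{i} \left(X_{t_{i+1}} - X_{t_i}\right)^2,$$
taken over partitions $\Pi = \{0 = t_0 < t_1 < \dots < t_n = t\}$ whose mesh $|\Pi|$ tends to zero.
For one-dimensional Brownian motion $W_t$, $[W]_t = t$, which is written informally as $(dW_t)^2 = dt$.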
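As a numerical illustration of the quadratic variation of Brownian motion, the sketch below simulates one path on a fine uniform grid and sums the squared increments; the sum should land close to the horizon $T$. The helper names `brownian_path` and `realized_quadratic_variation`, the seed, and the grid size are choices made for this example only.

```python
import numpy as np

def brownian_path(T, n_steps, rng):
    """Sample W on a uniform grid of [0, T] from independent N(0, dt) increments."""
    dt = T / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    return np.concatenate([[0.0], np.cumsum(increments)])

def realized_quadratic_variation(path):
    """Sum of squared increments along the grid."""
    return float(np.sum(np.diff(path) ** 2))

rng = np.random.default_rng(0)
T = 2.0
W = brownian_path(T, n_steps=200_000, rng=rng)
qv = realized_quadratic_variation(W)
print(qv)  # close to T for a fine grid
```

Refining the grid shrinks the fluctuation of `qv` around $T$, whereas for a smooth path the same sum would tend to zero.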
Itô Integral.
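Let $H_t$ be an adapted process with $\mathbb{E}\int_0^T H_t^2\,dt < \infty$. The Itô integral is the $L^2$ limit of left-endpoint Riemann sums,
$$\int_0^T H_t\,dW_t = \lim_{|\Pi| \to 0} \sum_i H_{t_i}\left(W_{t_{i+1}} - W_{t_i}\right).$$
It has zero mean and satisfies the Itô isometry $\mathbb{E}\big[(\int_0^T H_t\,dW_t)^2\big] = \mathbb{E}\int_0^T H_t^2\,dt$.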
Itô’s Lemma (Itô formula, one-dimensional).
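Suppose $X_t$ satisfies $dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t$ and $f(x, t)$ is twice continuously differentiable in $x$ and once in $t$. Then
$$df(X_t, t) = \left(\partial_t f + a\,\partial_x f + \tfrac{1}{2} b^2\,\partial_{xx} f\right)dt + b\,\partial_x f\,dW_t.$$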
Multidimensional Itô’s Lemma.
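If $X_t \in \mathbb{R}^d$ satisfies $dX_t = a(X_t, t)\,dt + B(X_t, t)\,dW_t$ with $W_t$ an $m$-dimensional Brownian motion and $B \in \mathbb{R}^{d \times m}$, and $f(x, t)$ is twice continuously differentiable in $x$ and once in $t$, then
$$df(X_t, t) = \left(\partial_t f + a^\top \nabla_x f + \tfrac{1}{2}\operatorname{tr}\!\left(B B^\top \nabla_x^2 f\right)\right)dt + (\nabla_x f)^\top B\,dW_t.$$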
Remark.
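- The term $\tfrac{1}{2} b^2\,\partial_{xx} f\,dt$ (in general $\tfrac{1}{2}\operatorname{tr}(B B^\top \nabla_x^2 f)\,dt$, where $b$ and $B$ denote the diffusion coefficients of $X_t$) comes from the non-zero quadratic variation of Brownian motion, $(dW_t)^2 = dt$.
- This correction term is what distinguishes Itô calculus from classical calculus and is fundamental in stochastic analysis.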
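To see the Itô correction numerically, take $f(x) = x^2$ and $X_t = W_t$, for which Itô's lemma gives $W_T^2 = 2\int_0^T W_t\,dW_t + T$. The sketch below approximates the Itô integral with left-endpoint sums and compares both sides; the variable names, seed, and step count are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_steps = 1.0, 100_000
dt = T / n_steps
dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
W = np.concatenate([[0.0], np.cumsum(dW)])

# Ito integral of W dW: the integrand is evaluated at the LEFT endpoint of each step.
ito_integral = float(np.sum(W[:-1] * dW))

lhs = float(W[-1] ** 2)
rhs_ito = 2.0 * ito_integral + T       # Ito's lemma, including the correction term T
rhs_classical = 2.0 * ito_integral     # what the ordinary chain rule would predict

print(lhs - rhs_ito)        # small discretization error
print(lhs - rhs_classical)  # approximately T
```

The gap between `lhs` and `rhs_classical` is exactly the realized quadratic variation $\sum_i (\Delta W_i)^2$, which is why it concentrates around $T$.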
Other
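The Kullback–Leibler (KL) divergence between two probability measures $\mu \ll \nu$ is
$$D_{\mathrm{KL}}(\mu \,\|\, \nu) = \int \log\frac{d\mu}{d\nu}\,d\mu,$$
which for densities $p$ and $q$ reads $\int p(x) \log\frac{p(x)}{q(x)}\,dx$.
For the KL divergence between two Gaussian distributions on $\mathbb{R}^d$, we have
$$D_{\mathrm{KL}}\big(\mathcal{N}(\mu_1, \Sigma_1) \,\|\, \mathcal{N}(\mu_2, \Sigma_2)\big) = \frac{1}{2}\left[\operatorname{tr}(\Sigma_2^{-1}\Sigma_1) + (\mu_2 - \mu_1)^\top \Sigma_2^{-1} (\mu_2 - \mu_1) - d + \log\frac{\det\Sigma_2}{\det\Sigma_1}\right].$$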
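The closed-form KL divergence between two multivariate Gaussians, $\tfrac{1}{2}[\operatorname{tr}(\Sigma_2^{-1}\Sigma_1) + (\mu_2-\mu_1)^\top\Sigma_2^{-1}(\mu_2-\mu_1) - d + \log(\det\Sigma_2/\det\Sigma_1)]$, can be sketched in a few lines of NumPy. The function name `kl_gaussians` is an assumption of this example, and the implementation favors readability over numerical robustness.

```python
import numpy as np

def kl_gaussians(mu1, cov1, mu2, cov2):
    """KL( N(mu1, cov1) || N(mu2, cov2) ) for full-rank covariance matrices."""
    d = mu1.shape[0]
    cov2_inv = np.linalg.inv(cov2)
    diff = mu2 - mu1
    trace_term = np.trace(cov2_inv @ cov1)
    quad_term = diff @ cov2_inv @ diff
    # slogdet returns (sign, log|det|); the sign is +1 for valid covariances.
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    return 0.5 * (trace_term + quad_term - d + logdet2 - logdet1)

mu_a, mu_b = np.array([0.0, 0.0]), np.array([1.0, 2.0])
cov = 2.0 * np.eye(2)
kl_same_cov = kl_gaussians(mu_a, cov, mu_b, cov)
print(kl_same_cov)  # equals ||mu_a - mu_b||^2 / (2 * 2.0) when covariances coincide
```

With equal isotropic covariances $\sigma^2 I$ the formula collapses to $\|\mu_1 - \mu_2\|^2 / (2\sigma^2)$, which is the form used when comparing Gaussian transition kernels with a shared variance.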
Diffusion Models
DDPM
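How to sample from an unknown data distribution $p_{\mathrm{data}}$, given only samples $x_0 \sim p_{\mathrm{data}}$?
We define a Markov stochastic process $x_0, x_1, \dots, x_T$ with a variance schedule $\beta_1, \dots, \beta_T \in (0, 1)$:
$$q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1 - \beta_t}\,x_{t-1},\ \beta_t I\big).$$
We call $q$ the forward (noising) process.
By induction, we know that, writing $\alpha_t = 1 - \beta_t$ and $\bar\alpha_t = \prod_{s=1}^{t} \alpha_s$,
$$q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1 - \bar\alpha_t) I\big).$$
So we can sample $x_t$ directly from $x_0$ in one step: $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1 - \bar\alpha_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$.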
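A minimal sketch of one-shot forward sampling, assuming the standard DDPM kernel $q(x_t \mid x_{t-1}) = \mathcal{N}(\sqrt{1-\beta_t}\,x_{t-1}, \beta_t I)$ and its closed form $q(x_t \mid x_0) = \mathcal{N}(\sqrt{\bar\alpha_t}\,x_0, (1-\bar\alpha_t) I)$. The linear schedule endpoints and the helper names `make_schedule`, `q_sample`, and `chain_moments` are illustrative assumptions. `chain_moments` replays the induction step by step so that the closed form can be checked against it.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (endpoints are illustrative assumptions)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # alpha_bars[t-1] = prod_{s<=t} alpha_s
    return betas, alphas, alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0) in one shot; t is 1-indexed."""
    eps = rng.standard_normal(x0.shape)
    a_bar = alpha_bars[t - 1]
    xt = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    return xt, eps

def chain_moments(betas):
    """Propagate the mean coefficient and variance of x_t | x_0 one step at a time."""
    mean_coef, var = 1.0, 0.0
    for beta in betas:
        mean_coef = np.sqrt(1.0 - beta) * mean_coef
        var = (1.0 - beta) * var + beta
    return mean_coef, var

betas, alphas, alpha_bars = make_schedule()
mean_coef, var = chain_moments(betas)
rng = np.random.default_rng(0)
xT, _ = q_sample(np.array([3.0, -1.0]), t=len(betas), alpha_bars=alpha_bars, rng=rng)
```

Under this schedule `alpha_bars[-1]` is very small, so `xT` carries almost no information about `x0`.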
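Assume $\bar\alpha_T \approx 0$, so that $x_T$ is approximately distributed as $\mathcal{N}(0, I)$.
Now given $x_t$, we want to sample $x_{t-1}$.
Calculate the posterior probability density:
$$q(x_{t-1} \mid x_t) = \frac{q(x_t \mid x_{t-1})\,q(x_{t-1})}{q(x_t)},$$
which is intractable because the marginals $q(x_{t-1})$ and $q(x_t)$ depend on the unknown data distribution.
Instead we consider the posterior conditioned on $x_0$, which is Gaussian:
$$q(x_{t-1} \mid x_t, x_0) = \mathcal{N}\big(x_{t-1};\ \tilde\mu_t(x_t, x_0),\ \tilde\beta_t I\big), \quad \tilde\mu_t = \frac{\sqrt{\bar\alpha_{t-1}}\,\beta_t}{1 - \bar\alpha_t}\,x_0 + \frac{\sqrt{\alpha_t}\,(1 - \bar\alpha_{t-1})}{1 - \bar\alpha_t}\,x_t, \quad \tilde\beta_t = \frac{1 - \bar\alpha_{t-1}}{1 - \bar\alpha_t}\,\beta_t,$$
where $\alpha_t = 1 - \beta_t$ and $\bar\alpha_t = \prod_{s=1}^{t} \alpha_s$.
Note that $x_0 = \big(x_t - \sqrt{1 - \bar\alpha_t}\,\epsilon\big) / \sqrt{\bar\alpha_t}$, so the posterior mean can be written in terms of the noise:
$$\tilde\mu_t = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1 - \bar\alpha_t}}\,\epsilon\right).$$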
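The two standard forms of the DDPM posterior mean (one in terms of $x_0$, one in terms of the noise $\epsilon$) can be checked against each other numerically. The sketch below assumes the usual definitions $\alpha_t = 1 - \beta_t$, $\bar\alpha_t = \prod_{s \le t} \alpha_s$, and $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$; the schedule and the function names are assumptions made for this example.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)  # illustrative schedule (assumption)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def posterior_from_x0(x0, xt, t):
    """Mean and variance of q(x_{t-1} | x_t, x_0); t is 1-indexed and t >= 2."""
    a_bar_t, a_bar_prev = alpha_bars[t - 1], alpha_bars[t - 2]
    beta_t, alpha_t = betas[t - 1], alphas[t - 1]
    coef_x0 = np.sqrt(a_bar_prev) * beta_t / (1.0 - a_bar_t)
    coef_xt = np.sqrt(alpha_t) * (1.0 - a_bar_prev) / (1.0 - a_bar_t)
    var = (1.0 - a_bar_prev) / (1.0 - a_bar_t) * beta_t
    return coef_x0 * x0 + coef_xt * xt, var

def posterior_mean_from_eps(xt, eps, t):
    """Same posterior mean, rewritten in terms of the noise that produced x_t."""
    a_bar_t = alpha_bars[t - 1]
    beta_t, alpha_t = betas[t - 1], alphas[t - 1]
    return (xt - beta_t / np.sqrt(1.0 - a_bar_t) * eps) / np.sqrt(alpha_t)

rng = np.random.default_rng(0)
t = 500
x0 = rng.standard_normal(4)
eps = rng.standard_normal(4)
xt = np.sqrt(alpha_bars[t - 1]) * x0 + np.sqrt(1.0 - alpha_bars[t - 1]) * eps
mean_a, var_t = posterior_from_x0(x0, xt, t)
mean_b = posterior_mean_from_eps(xt, eps, t)
```

The noise form is the one that motivates predicting $\epsilon$ with a network: everything else in the mean is known at sampling time.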
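Now, to train the reverse model $p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\big(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 I\big)$,
we minimize the KL divergence:
$$D_{\mathrm{KL}}\big(q(x_{t-1} \mid x_t, x_0) \,\|\, p_\theta(x_{t-1} \mid x_t)\big) = \frac{1}{2\sigma_t^2}\,\big\|\tilde\mu_t(x_t, x_0) - \mu_\theta(x_t, t)\big\|^2 + C,$$
where $\tilde\mu_t$ is the mean of $q(x_{t-1} \mid x_t, x_0)$ and $C$ does not depend on $\theta$. Parametrizing $\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}\big(x_t - \frac{\beta_t}{\sqrt{1 - \bar\alpha_t}}\,\epsilon_\theta(x_t, t)\big)$ and dropping the time-dependent weight reduces this to noise prediction.
And minimize
$$L_{\mathrm{simple}}(\theta) = \mathbb{E}_{t,\,x_0,\,\epsilon}\,\big\|\epsilon - \epsilon_\theta\big(\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1 - \bar\alpha_t}\,\epsilon,\ t\big)\big\|^2.$$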
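A hedged sketch of a Monte Carlo estimate of the simplified noise-prediction loss $\mathbb{E}\|\epsilon - \epsilon_\theta(x_t, t)\|^2$ with $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$. The `eps_model` argument stands in for a neural network and is hypothetical. The two toy "models" below only sanity-check the estimator; they are not trained networks.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)  # illustrative schedule (assumption)
alpha_bars = np.cumprod(1.0 - betas)

def ddpm_simple_loss(eps_model, x0_batch, alpha_bars, rng):
    """Estimate E || eps - eps_model(x_t, t) ||^2 with uniformly sampled timesteps."""
    batch = x0_batch.shape[0]
    t = rng.integers(1, len(alpha_bars) + 1, size=batch)  # 1-indexed timesteps
    a_bar = alpha_bars[t - 1][:, None]
    eps = rng.standard_normal(x0_batch.shape)
    xt = np.sqrt(a_bar) * x0_batch + np.sqrt(1.0 - a_bar) * eps
    pred = eps_model(xt, t)
    return float(np.mean(np.sum((pred - eps) ** 2, axis=1)))

def zero_model(xt, t):
    """Predicts no noise at all; its loss is E||eps||^2 = dimension."""
    return np.zeros_like(xt)

def oracle_for_zero_data(xt, t):
    """Exact noise when every data point is 0, since then x_t = sqrt(1 - a_bar) * eps."""
    return xt / np.sqrt(1.0 - alpha_bars[t - 1])[:, None]

x0_batch = np.zeros((4096, 2))
loss_zero = ddpm_simple_loss(zero_model, x0_batch, alpha_bars, np.random.default_rng(0))
loss_oracle = ddpm_simple_loss(oracle_for_zero_data, x0_batch, alpha_bars, np.random.default_rng(0))
```

In practice `eps_model` would be a trainable network and the loss would be minimized with stochastic gradient descent in an autodiff framework.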
Score-based Generative Model (SGM)
We view noise injection as an SDE and learn the score $\nabla_x \log p_t(x)$ of the noised marginals $p_t$.
Forward SDE
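Let $X_t \in \mathbb{R}^d$ solve
$$dX_t = f(X_t, t)\,dt + g(t)\,dW_t, \qquad X_0 \sim p_{\mathrm{data}}, \quad t \in [0, T],$$
with drift $f$ and scalar diffusion coefficient $g$, and let $p_t$ denote the law of $X_t$.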
DDPM as VP-SDE
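Set the Variance-Preserving (VP) SDE
$$dX_t = -\tfrac{1}{2}\beta(t)\,X_t\,dt + \sqrt{\beta(t)}\,dW_t.$$
This linear SDE has the explicit solution
$$X_t = \sqrt{\bar\alpha(t)}\,X_0 + \int_0^t \sqrt{\beta(s)}\,e^{-\frac{1}{2}\int_s^t \beta(r)\,dr}\,dW_s, \qquad \bar\alpha(t) = \exp\!\Big(-\int_0^t \beta(s)\,ds\Big),$$
so that $X_t \mid X_0 = x_0 \sim \mathcal{N}\big(\sqrt{\bar\alpha(t)}\,x_0,\ (1 - \bar\alpha(t)) I\big)$.
Hence the conditional marginal matches DDPM, with $\bar\alpha(t)$ playing the role of $\bar\alpha_t$.
Claim. The DDPM forward chain with $\beta_i = \beta(t_i)\,\Delta t$ converges to the VP-SDE as $\Delta t \to 0$: expanding $\sqrt{1 - \beta_i} = 1 - \tfrac{1}{2}\beta(t_i)\,\Delta t + O(\Delta t^2)$ gives $x_i - x_{i-1} \approx -\tfrac{1}{2}\beta(t_i)\,x_{i-1}\,\Delta t + \sqrt{\beta(t_i)\,\Delta t}\,\epsilon_i$, which is the Euler-Maruyama step of the VP-SDE.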
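One way to probe the correspondence between the DDPM chain and the VP-SDE numerically: with discrete steps $\beta_i = \beta(t_i)\,\Delta t$, the product $\prod_i (1 - \beta_i)$ should approach $\exp(-\int_0^1 \beta(s)\,ds)$ as the number of steps grows. The linear $\beta(t)$ and its endpoints below are illustrative assumptions, as are the function names.

```python
import numpy as np

BETA_MIN, BETA_MAX = 0.1, 20.0  # illustrative linear schedule on [0, 1] (assumption)

def beta_fn(t):
    return BETA_MIN + t * (BETA_MAX - BETA_MIN)

def alpha_bar_continuous(t):
    """exp(-int_0^t beta(s) ds), integrated in closed form for the linear schedule."""
    integral = BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t ** 2
    return np.exp(-integral)

def alpha_bar_discrete(num_steps):
    """Product of (1 - beta_i) for the DDPM chain with beta_i = beta(t_i) * dt."""
    dt = 1.0 / num_steps
    t_grid = dt * np.arange(1, num_steps + 1)
    return float(np.prod(1.0 - beta_fn(t_grid) * dt))

target = alpha_bar_continuous(1.0)
rel_err_coarse = abs(alpha_bar_discrete(1_000) - target) / target
rel_err_fine = abs(alpha_bar_discrete(100_000) - target) / target
print(rel_err_coarse, rel_err_fine)  # the error shrinks as the grid is refined
```

With 1000 steps this schedule yields per-step values between roughly $10^{-4}$ and $0.02$, which is why a 1000-step DDPM chain can be read as a coarse discretization of this SDE.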
Variance-Exploding SDE (VE-SDE)
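Alternatively, set $f = 0$ and $g(t) = \sqrt{d\sigma^2(t)/dt}$ for an increasing noise scale $\sigma(t)$:
$$dX_t = \sqrt{\frac{d\sigma^2(t)}{dt}}\,dW_t, \qquad X_t \mid X_0 = x_0 \sim \mathcal{N}\big(x_0,\ (\sigma^2(t) - \sigma^2(0)) I\big).$$
The signal is not attenuated while the variance keeps growing, hence "variance exploding".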
Reverse-time SDE and Probability Flow ODE
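For a forward SDE $dX_t = f(X_t, t)\,dt + g(t)\,dW_t$ with marginals $p_t$, the reverse SDE (Anderson, 1982) is
$$dX_t = \big[f(X_t, t) - g(t)^2\,\nabla_x \log p_t(X_t)\big]\,dt + g(t)\,d\bar W_t,$$
where time runs backward from $T$ to $0$ and $\bar W_t$ is a Brownian motion in reverse time.
The equivalent deterministic probability flow ODE is
$$\frac{dX_t}{dt} = f(X_t, t) - \tfrac{1}{2}\,g(t)^2\,\nabla_x \log p_t(X_t),$$
which shares the same marginals $p_t$ as the SDE.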
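A toy check of the probability flow ODE $\dot x = f - \tfrac{1}{2} g^2 \nabla_x \log p_t$, in a setting where the score is known in closed form: one-dimensional data $x_0 \sim \mathcal{N}(0, s^2)$ under a VP-SDE, so that $p_t = \mathcal{N}(0, v(t))$ with $v(t) = \bar\alpha(t)\,s^2 + 1 - \bar\alpha(t)$. The schedule, the data scale `DATA_STD`, and the plain Euler integrator are assumptions made for this sketch. In this linear case the exact flow map is $x(0) = x(1)\sqrt{v(0)/v(1)}$.

```python
import numpy as np

BETA_MIN, BETA_MAX = 0.1, 20.0  # illustrative linear schedule (assumption)
DATA_STD = 2.0                  # toy data: x_0 ~ N(0, DATA_STD^2)

def beta(t):
    return BETA_MIN + t * (BETA_MAX - BETA_MIN)

def alpha_bar(t):
    return np.exp(-(BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t ** 2))

def marginal_var(t):
    a = alpha_bar(t)
    return a * DATA_STD ** 2 + (1.0 - a)

def score(x, t):
    """Exact score of the Gaussian marginal p_t = N(0, marginal_var(t))."""
    return -x / marginal_var(t)

def pf_ode_drift(x, t):
    """VP drift f = -beta/2 * x and g^2 = beta, plugged into the probability flow ODE."""
    return -0.5 * beta(t) * x - 0.5 * beta(t) * score(x, t)

def integrate_backward(x1, n_steps=2000):
    """Explicit Euler from t = 1 down to t = 0."""
    dt = 1.0 / n_steps
    x = np.array(x1, dtype=float)
    for k in range(n_steps, 0, -1):
        x = x - dt * pf_ode_drift(x, k * dt)
    return x

x1 = np.array([1.0, -0.5])
x0_numeric = integrate_backward(x1)
x0_exact = x1 * np.sqrt(marginal_var(0.0) / marginal_var(1.0))
```

Starting from approximately standard normal values at $t = 1$, the flow stretches them by about `DATA_STD`, recovering the data scale without injecting any noise along the way.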
Noise Prediction vs. Score Prediction
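For VP, $x_t = \sqrt{\bar\alpha(t)}\,x_0 + \sqrt{1 - \bar\alpha(t)}\,\epsilon$ with $\bar\alpha(t) = \exp(-\int_0^t \beta(s)\,ds)$, so the conditional score is
$$\nabla_{x_t} \log p_t(x_t \mid x_0) = -\frac{x_t - \sqrt{\bar\alpha(t)}\,x_0}{1 - \bar\alpha(t)} = -\frac{\epsilon}{\sqrt{1 - \bar\alpha(t)}}.$$
A noise predictor $\epsilon_\theta$ and a score model $s_\theta$ are therefore related by $s_\theta(x_t, t) = -\epsilon_\theta(x_t, t) / \sqrt{1 - \bar\alpha(t)}$.
An alternative data predictor:
$$\hat x_0(x_t, t) = \frac{x_t - \sqrt{1 - \bar\alpha(t)}\,\epsilon_\theta(x_t, t)}{\sqrt{\bar\alpha(t)}} = \frac{x_t + (1 - \bar\alpha(t))\,s_\theta(x_t, t)}{\sqrt{\bar\alpha(t)}}.$$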
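The conversions between noise, score, and data predictions for a VP process can be collected in a few helper functions. This is a sketch assuming $x_t = \sqrt{\bar\alpha}\,x_0 + \sqrt{1-\bar\alpha}\,\epsilon$, where `alpha_bar` denotes $\bar\alpha$ at the current time; the function names are hypothetical.

```python
import numpy as np

def eps_to_score(eps, alpha_bar):
    """score = -eps / sigma, with sigma = sqrt(1 - alpha_bar)."""
    return -eps / np.sqrt(1.0 - alpha_bar)

def score_to_eps(score, alpha_bar):
    return -np.sqrt(1.0 - alpha_bar) * score

def eps_to_x0(xt, eps, alpha_bar):
    """Invert x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps for x0."""
    return (xt - np.sqrt(1.0 - alpha_bar) * eps) / np.sqrt(alpha_bar)

def score_to_x0(xt, score, alpha_bar):
    """Tweedie-style denoised estimate expressed through the score."""
    return (xt + (1.0 - alpha_bar) * score) / np.sqrt(alpha_bar)

rng = np.random.default_rng(0)
alpha_bar = 0.3
x0 = rng.standard_normal(5)
eps = rng.standard_normal(5)
xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

cond_score = -(xt - np.sqrt(alpha_bar) * x0) / (1.0 - alpha_bar)
x0_from_eps = eps_to_x0(xt, eps, alpha_bar)
x0_from_score = score_to_x0(xt, eps_to_score(eps, alpha_bar), alpha_bar)
```

When the true noise is plugged in, both routes recover `x0` exactly; with a learned predictor they give the same estimate, so the choice of parametrization mainly affects loss weighting and numerical conditioning.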
Denoising Score Matching (DSM)
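Training objective:
$$L_{\mathrm{DSM}}(\theta) = \mathbb{E}_{t \sim \mathcal{U}(0, T)}\,\lambda(t)\,\mathbb{E}_{x_0}\,\mathbb{E}_{x_t \mid x_0}\,\big\|s_\theta(x_t, t) - \nabla_{x_t} \log p_t(x_t \mid x_0)\big\|^2,$$
where $\lambda(t) > 0$ is a weighting function.
For VP, choosing $\lambda(t) = 1 - \bar\alpha(t)$, the conditional variance with $\bar\alpha(t) = \exp(-\int_0^t \beta(s)\,ds)$, and writing $s_\theta = -\epsilon_\theta / \sqrt{1 - \bar\alpha(t)}$ gives
$$L_{\mathrm{DSM}}(\theta) = \mathbb{E}_{t,\,x_0,\,\epsilon}\,\big\|\epsilon_\theta(x_t, t) - \epsilon\big\|^2,$$
which is the DDPM noise-prediction loss.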
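The equivalence between the variance-weighted DSM loss and noise-prediction MSE for a VP process can be verified sample by sample. The sketch below fixes a single noise level `alpha_bar` and uses toy data $x_0 \sim \mathcal{N}(0, I)$, for which the VP marginal stays $\mathcal{N}(0, I)$ and the exact marginal score is $-x$. The function name `dsm_loss_vp` and the toy score functions are assumptions for this example.

```python
import numpy as np

def dsm_loss_vp(score_model, x0, alpha_bar, rng):
    """Monte Carlo DSM loss at one noise level, weighted by lambda = 1 - alpha_bar.

    Also returns the noise-prediction MSE of eps_pred = -sigma * score_model(x_t).
    """
    sigma = np.sqrt(1.0 - alpha_bar)
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar) * x0 + sigma * eps
    target = -eps / sigma  # conditional score of p(x_t | x_0)
    sq_err = np.sum((score_model(xt) - target) ** 2, axis=1)
    weighted = float(np.mean(sigma ** 2 * sq_err))
    eps_pred = -sigma * score_model(xt)
    noise_mse = float(np.mean(np.sum((eps_pred - eps) ** 2, axis=1)))
    return weighted, noise_mse

def exact_score(x):
    return -x        # score of N(0, I)

def wrong_score(x):
    return -0.5 * x  # deliberately mis-scaled

alpha_bar = 0.5
x0 = np.random.default_rng(0).standard_normal((20_000, 2))
loss_exact, mse_exact = dsm_loss_vp(exact_score, x0, alpha_bar, np.random.default_rng(1))
loss_wrong, _ = dsm_loss_vp(wrong_score, x0, alpha_bar, np.random.default_rng(1))
```

Note that the DSM loss of the exact score is not zero: the regression target is the conditional score, and the minimizer is its conditional expectation, the marginal score.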
Derivation and Intuition
TODO: Weak form.
Flow Matching Models
See Flow Matching
Schrödinger Bridge Models
Relation with Entropy-Regularized Optimal Transport
IPF (DSB, S-DSB)
IMF
D-IMF (ADSB)
CDSB
TODO.