Variational Calculus in VAE

Given a probability distribution $p(x)$, we want to find an encoder $q_{\theta}(z|x)$ that approximates the true posterior $p(z|x)$. And we resort to minimize the Kullback-Leibler divergence between them:

\[ \min D_{KL}\left(q_{\theta}(z|x) \| p(z|x)\right) \]

And $D_{KL}\left(q_{\theta}(z|x) \| p(z|x)\right)$ can be computed as:

\[ \begin{align*} D_{KL}\left(q_{\theta}(z|x) \| p(z|x)\right) &= \mathbb{E}_{q_{\theta}(z|x)}\left[\log\frac{q_{\theta}(z|x)}{p(z|x)}\right]\\ &=\int q_{\theta}(z|x) \log\frac{q_{\theta}(z|x)}{p(z|x)} dz\\ &= J[q]\\ s.t. \int q_{\theta}(z|x) dz = 1 \end{align*} \] We introduce a Lagrange multipler $\lambda$, and we can write the constrained optimization problem as:

$$ \[\begin{align*} S[q, \lambda] &= J[q] + \lambda\left(\int q_{\theta}(z|x) dz-1\right)\\ &= \int \left[ q(z) \log q(z) -q(z) \log p(z|x)\right] dz+ \lambda\left(\int q(z) dz- 1\right)\\ &= \int \left[ q(z) \log q(z) -q(z) \log p(z|x) + \lambda q(z) \right] dz - \lambda \end{align*}\] $$

Use the Euler-Lagrange Equation w.r.t. $q$ \[ \frac{\partial L}{\partial q} -\frac{d}{dz} \frac{\partial L}{\partial \dot q} = 0 \]

Reduce to \[ \frac{\partial L}{\partial q} = 0 \] since $L$ is not explicitly dependent on $\dot q$.

Therefore \[ \log q(z) + 1 - \log p(z|x) + \lambda =0 \]

We can solve the above equations and get \[ q(z) = \exp\left(\log p(z|x) - 1 - \lambda\right) = C \cdot p(z|x) \] And with normalizing constraint, \[ \int q(z) dz = 1 \] we know that $q(z) = p(z|x)$.

And the next step is to decompose \[ \begin{align*} D_{KL}\left(q_{\theta}(z|x) \| p(z|x)\right) &= \mathbb{E}_{q_{\theta(z|x)}}\left[\log \frac{q_{\theta}(z|x)}{p(z)}\right] + \mathbb{E}_{q_{\theta(z|x)}}\left[\log \frac{p(z)}{p(z|x)}\right] \\ &= D_{KL}\left(q_{\theta}(z|x) \| p(z)\right) + \mathbb{E}_{q_\theta(z|x)}\left[\log p(x)\right]\\ &= D_{KL}\left(q_{\theta}(z|x) \| p(z)\right) + \log p(x) \end{align*} \]

which is equivalent to maximizing the ELBO:

\[ \mathcal{L}(q) = - D_{KL}\left(q_{\theta}(z|x) \| p(z)\right) =\mathbb {E}_{z\sim q(z|x)}\left [ \log\frac{p(x,z)}{q_{\theta}(z|x)}\right] \]

And ELBO can be expressed as:

\[ \mathcal{L}(q) = \mathbb{E}_{z\sim q(z|x)} [\log p_{\theta}(x|z)] - D_{KL}(q_{\theta}(z|x) \| p(z)) \] And $p_{\theta}(x|z)$ represent the decoder.

Machine Learning

#Generative Models #VAE #Calculus of Variations

Variational Calculus in VAE

https://notdesigned.github.io/2025/08/25/Variational-Calculus-in-VAE/

Author

Luocheng Liang

Posted on

August 25, 2025

Licensed under

Flow matching, Score-based Generative Model, Schrödinger Bridge and Optimal Transport Previous

Renormalization Group Flow as Optimal Transport Next