2025-06-07

Variational Autoencoder


With a plain autoencoder, the same latent variable \(z\) always gives the same output \(\hat{x}\) because the decoding is deterministic. VAEs introduce stochasticity so that new samples can be generated.

This is done by splitting the latent representation into a mean vector \(\mu\) and a standard deviation vector \(\sigma\). The encoder outputs \(\mu\) and \(\sigma\), and \(z\) is sampled from the resulting distribution \(N(\mu, \sigma^2)\).
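
Below is a minimal sketch of such an encoder in PyTorch. The layer sizes and names (Encoder, input_dim, latent_dim) are my own assumptions for illustration, not taken from any specific implementation; predicting the log-variance instead of \(\sigma\) directly is a common numerical convenience.

    import torch
    import torch.nn as nn

    class Encoder(nn.Module):
        """Maps an input x to the parameters (mu, log sigma^2) of q_phi(z|x)."""
        def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
            super().__init__()
            self.hidden = nn.Linear(input_dim, hidden_dim)
            self.mu = nn.Linear(hidden_dim, latent_dim)      # mean vector mu
            self.logvar = nn.Linear(hidden_dim, latent_dim)  # log variance, log(sigma^2)

        def forward(self, x):
            h = torch.relu(self.hidden(x))
            return self.mu(h), self.logvar(h)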

1. VAE Loss

vae_optimization-20230316153957.png

Figure 1: VAE optimization

The loss function \(L(\phi, \theta, x)\) is the sum of a reconstruction loss and a regularization term.

  • Encoder computes \(q_{\phi}(z|x)\), i.e. the distribution of the latent representation given the input image
  • Decoder computes \(p_{\theta}(x|z)\), i.e. the distribution of images given the latent representation
  • Reconstruction loss: the expected negative log-likelihood \(-\mathbb{E}_{q_{\phi}(z|x)}[\log p_{\theta}(x|z)]\); with a Gaussian decoder this reduces to the squared error \(||x-\hat{x}||^2\) (see the sketch after this list)
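
A small sketch of this reconstruction term, assuming a Gaussian decoder with fixed variance so that the negative log-likelihood reduces (up to a constant) to a sum of squared errors; the function name is mine:

    import torch.nn.functional as F

    def reconstruction_loss(x_hat, x):
        # Negative log-likelihood of a fixed-variance Gaussian decoder,
        # up to a constant: the squared error ||x - x_hat||^2 summed over pixels.
        return F.mse_loss(x_hat, x, reduction="sum")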

Regularization Loss:

The regularization loss \(D(q_{\phi}(z|x)\ ||\ p(z))\) measures the divergence between the two probability distributions, usually taken to be the KL divergence (a closed-form sketch for the Gaussian case follows the list below).

  • \(q_{\phi}(z|x)\) is inferred latent distribution
  • \(p(z)\) is a prior distribution on the latent space
  • A common choice for the prior is a standard normal (Gaussian) distribution
    • Encourages encodings to distribute evenly around the center of the latent space
    • Penalizes the network when it tries to "cheat" by clustering points in specific regions (i.e. by memorizing the data)
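
For a diagonal Gaussian \(q_{\phi}(z|x) = N(\mu, \sigma^2)\) and a standard normal prior \(p(z) = N(0, I)\), the KL divergence has a simple closed form. A minimal sketch (the function name is mine):

    import torch

    def kl_divergence(mu, logvar):
        # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions:
        # -1/2 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
        return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())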

We use the regularization term so that:

  • The latent space is continuous: nearby points in latent space decode to similar content
  • The latent space is complete: sampling from the latent space gives meaningful content

If regularization is not enforced:

  • variances can shrink towards zero, so each input maps to an almost deterministic point, and
  • means may be spread far apart, leaving regions in between that do not decode to meaningful content

However, stronger regularization can adversely affect the reconstruction, so a balance between the two terms is needed.

2. Optimization

Backpropagation cannot be done through a sampling operation. So we use a clever idea: reparametrize the sampling layer \(z \sim N(\mu, \sigma^2)\) as \(z = \mu + \sigma \odot \epsilon\), where \(\epsilon \sim N(0, I)\) is sampled stochastically. The randomness is now confined to \(\epsilon\), so gradients can flow through \(\mu\) and \(\sigma\) (a sketch follows Figure 2).

reparametization_of_sampling_layer-20230316155303.png

Figure 2: Reparametrization of the Sampling Layer
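
A minimal sketch of the reparametrization trick, continuing the hypothetical encoder example above (the function name and log-variance parameterization are my own choices):

    import torch

    def reparametrize(mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I); the randomness lives in eps,
        # so gradients can flow back through mu and sigma.
        std = torch.exp(0.5 * logvar)  # sigma = exp(log(sigma^2) / 2)
        eps = torch.randn_like(std)    # stochastic noise, not trained
        return mu + std * eps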

3. Disentanglement: \(\beta\)-VAEs

We want latent variables that are uncorrelated with each other. \(\beta\)-VAEs achieve this by weighting the regularization term with a factor \(\beta > 1\), which, together with the diagonal (factorized) prior on the latent variables, encourages independence (see the sketch after Figure 3).

beta_vaes-20230316155613.png

Figure 3: \(\beta\)-VAEs
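
A sketch of how the pieces combine into the full objective, with a weight \(\beta\) on the regularization term: \(\beta = 1\) recovers the standard VAE, while \(\beta > 1\) puts more pressure on matching the prior. It reuses the hypothetical reconstruction_loss and kl_divergence sketches above:

    def vae_loss(x, x_hat, mu, logvar, beta=1.0):
        # Total loss = reconstruction + beta * KL regularization (beta-VAE for beta > 1).
        return reconstruction_loss(x_hat, x) + beta * kl_divergence(mu, logvar)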

4. Resources

