All projects Generative Models

PixelCNN

Exact-likelihood image modeling, one pixel at a time.

Study & implementation 2022 AutoregressiveExact likelihoodMasked convolutions
PixelCNN diagram

Motivation

Autoregressive models factorize a joint distribution into a product of conditionals via the chain rule of probability. For images this means predicting each pixel from the pixels already seen — giving a tractable, exact likelihood at the cost of sequential sampling.

Method

PixelCNN enforces the autoregressive ordering with masked convolutions, so each output depends only on pixels above and to the left. Colour channels are handled with a further conditional factorization within each pixel.

\[ p(x) = \prod_{i=1}^{n^2} p\!\left(x_i \mid x_1,\dots,x_{i-1}\right) \]

Mathematical core

The chain rule turns density estimation into a supervised classification problem over pixel intensities. This exact-likelihood viewpoint is a useful contrast to the implicit likelihoods of GANs and the variational bounds of VAEs.