# Gaussian Process for Any Neural Architecture: Reference Implementations

This repo is a companion to the paper linked below, which shows that Gaussian process behavior arises in wide, randomly initialized neural networks regardless of architecture.

[Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes](https://arxiv.org/abs/1910.12478)

Despite what the title suggests, this repo does not implement the infinite-width GP kernel for every architecture, but rather demonstrates the derivation and implementation for a few select architectures:

- Simple RNN
- GRU
- Transformer
- Batchnorm+ReLU Fully-Connected Network
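
As a quick numerical illustration of the claim (a minimal sketch, not the repo's own code), the snippet below treats the simplest case, a fully-connected ReLU network: it computes the infinite-width GP kernel via the arccosine-kernel recursion of Cho & Saul (2009) and compares it against the empirical output covariance over many wide, randomly initialized networks. The width, depth, and variance scaling here are illustrative assumptions.

```python
import numpy as np


def relu_gp_kernel(X, depth=3, sigma_w=np.sqrt(2.0)):
    """Infinite-width GP kernel of a depth-`depth` ReLU MLP (arccos recursion)."""
    # Kernel of the first-layer preactivations.
    K = sigma_w**2 * (X @ X.T) / X.shape[1]
    for _ in range(depth - 1):
        d = np.sqrt(np.diag(K))
        cos_t = np.clip(K / np.outer(d, d), -1.0, 1.0)
        theta = np.arccos(cos_t)
        # E[relu(u) relu(v)] for a centered Gaussian (u, v) with covariance K.
        J = (np.sin(theta) + (np.pi - theta) * cos_t) / (2 * np.pi)
        K = sigma_w**2 * np.outer(d, d) * J
    return K


def wide_net_outputs(X, depth=3, width=1024, n_nets=500, sigma_w=np.sqrt(2.0)):
    """Scalar outputs of `n_nets` independent random ReLU MLPs at the rows of X."""
    rng = np.random.default_rng(0)
    outs = np.empty((n_nets, X.shape[0]))
    for s in range(n_nets):
        h, fan_in = X, X.shape[1]
        for _ in range(depth - 1):
            W = rng.normal(0.0, sigma_w / np.sqrt(fan_in), size=(fan_in, width))
            h, fan_in = np.maximum(h @ W, 0.0), width
        v = rng.normal(0.0, sigma_w / np.sqrt(fan_in), size=fan_in)
        outs[s] = h @ v
    return outs


X = np.random.default_rng(1).normal(size=(3, 10))
print(np.round(relu_gp_kernel(X), 3))              # theoretical kernel
print(np.round(np.cov(wide_net_outputs(X).T), 3))  # empirical; approaches theory as width grows
```

The empirical covariance converges to the theoretical kernel as `width` grows, which is exactly the GP limit the paper establishes; the notebooks in this repo carry out the analogous derivations for the architectures listed above.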