6. Kernels

When pre-training on broad data and fine tuning later: Fix , find linear function on few fine tuning data.

Kernel functions must be symmetric and the kernel matrix PSD

Problem: Some problems are only linearly separable in very high dimensional spaces (simple ex.: circle of one color in other). Let be the func that maps to that higher dimension space.

Minimizer:

Predictor:

We define .

Gaussian & Laplacian Kernels

How prevent overfitting?