6. Kernels
When pre-training on broad data and fine tuning later: Fix , find linear function on few fine tuning data.
Kernel functions must be symmetric and the kernel matrix PSD
Problem: Some problems are only linearly separable in very high dimensional spaces (simple ex.: circle of one color in other). Let be the func that maps to that higher dimension space.
Minimizer:
Predictor:
We define .
Gaussian & Laplacian Kernels
How prevent overfitting?