5. Classification
The magnitude of the weights $w$ is irrelevant b/c the prediction $\hat{y} = \operatorname{sign}(w^\top x)$ only takes the sign.
We can trade off false pos against false neg by thresholding the score at something other than 0.
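The two points above can be sketched in a few lines. This is a minimal illustration, assuming a linear score w·x; the function name `predict`, the `threshold` parameter, and the numbers are made up for the example.

```python
def predict(w, x, threshold=0.0):
    """Return +1 if the score w·x exceeds the threshold, else -1."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score > threshold else -1

w = [2.0, -1.0]          # illustrative weights
x = [1.0, 1.5]           # score = 2*1 - 1*1.5 = 0.5

# Scaling w by a positive constant does not change the prediction at threshold 0.
scaled = [10 * wi for wi in w]
print(predict(w, x), predict(scaled, x))   # 1 1

# A higher threshold makes positive predictions rarer
# (fewer false positives, more false negatives).
print(predict(w, x, threshold=1.0))        # -1
```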
Max-Margin Solution
Too many perfect classifiers → choose the weights with max distance between the boundary and the data points closest to it.
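Boundary to datapoint dist: $\frac{|w^\top x|}{\|w\|_2}$ (projecting $x$ onto $w$, then normalizing by $\|w\|_2$)
Since only $w$'s direction matters, we fix $\|w\|_2 = 1$, so the dist becomes $|w^\top x|$.
If correctly classified: $y_i w^\top x_i = |w^\top x_i|$. Since picking from perfect ones: $\text{margin}(w) = \min_i y_i w^\top x_i$
Max-margin classifier: $w^* = \arg\max_{\|w\|_2 = 1} \min_i y_i w^\top x_i$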
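A minimal sketch of comparing perfect classifiers by their margin, assuming a boundary through the origin and labels in {+1, -1}. The function name `margin` and the toy data are illustrative; for this particular data the direction (1, 1) attains the largest margin, 3/sqrt(2).

```python
import math

def margin(w, X, y):
    """Smallest signed distance y_i * (w·x_i) / ||w|| over the dataset."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        yi * sum(wi * xi for wi, xi in zip(w, x)) / norm
        for x, yi in zip(X, y)
    )

# Toy data: two positive points and their mirror images as negatives.
X = [[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]

# All three candidates classify every point correctly (margin > 0),
# but they differ in how far the closest point is from the boundary.
for w in ([1.0, 0.0], [2.0, 1.0], [1.0, 1.0]):
    print(w, round(margin(w, X, y), 4))
```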
Support Vector Machine (SVM)
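Since the max-min problem is difficult to solve → minimize the norm of $w$ subject to every point having (unnormalized) distance at least 1 from the boundary: $\min_w \|w\|_2 \ \text{s.t.}\ y_i w^\top x_i \ge 1 \ \forall i$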
Support vectors: the points sitting exactly 1 unit away from the boundary, i.e. $y_i w^\top x_i = 1$. If you removed any other point, the solution wouldn’t change.
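A small sketch of how the norm-minimizing form relates to the margin, and of how support vectors can be read off. It does not solve the optimization problem: it assumes the max-margin direction is already known (here (1, 1), which holds for this toy data only) and rescales it so the closest point has y_i * (w·x_i) = 1. The helper names are hypothetical. Note that the geometric distance of the support vectors is then 1 / ||w||.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def rescale_to_unit_margin(u, X, y):
    """Scale a perfect classifier u so the closest point has y_i * (w·x_i) = 1."""
    m = min(yi * dot(u, x) for x, yi in zip(X, y))
    if m <= 0:
        raise ValueError("u does not classify every point correctly")
    return [ui / m for ui in u]

def support_vectors(w, X, y, tol=1e-9):
    """Points whose constraint y_i * (w·x_i) >= 1 holds with equality."""
    return [x for x, yi in zip(X, y) if abs(yi * dot(w, x) - 1.0) <= tol]

# Toy data; (3, 3) and (-4, -3) lie far from the boundary.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0],
     [-1.0, -2.0], [-2.0, -1.0], [-4.0, -3.0]]
y = [1, 1, 1, -1, -1, -1]

w = rescale_to_unit_margin([1.0, 1.0], X, y)
sv = support_vectors(w, X, y)
print("w =", w)
print("support vectors:", sv)
print("geometric margin 1/||w|| =", 1 / math.sqrt(dot(w, w)))
```

The far-away points are not in `sv`: their constraints are slack, which is why dropping them leaves the solution unchanged.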
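Implicit Bias of GD: on linearly separable data, GD w/ logistic loss converges in direction to the max-margin solution (recently proven): $\lim_{t \to \infty} \frac{w_t}{\|w_t\|_2} = \frac{w_{\text{SVM}}}{\|w_{\text{SVM}}\|_2}$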
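The effect can be observed on a toy problem. The sketch below runs plain GD on the mean logistic loss, assuming no bias term and labels in {+1, -1}, and tracks the normalized margin of the iterates. The data, step size, and iteration count are illustrative choices. For this data the max-margin direction is (1, 1) with margin 3/sqrt(2). Convergence in direction is slow, so expect the margin to approach the optimum without reaching it.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def normalized_margin(w, X, y):
    """min_i y_i * (w·x_i) / ||w||: distance of the closest point to the boundary."""
    return min(yi * dot(w, x) for x, yi in zip(X, y)) / math.sqrt(dot(w, w))

def logistic_gd(X, y, lr=0.1, steps=20000):
    """Plain gradient descent on the mean logistic loss log(1 + exp(-y * w·x))."""
    w = [0.0] * len(X[0])
    history = []
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, yi in zip(X, y):
            # d/dw log(1 + exp(-m)) = -y * x / (1 + exp(m)), with m = y * w·x
            coeff = -yi / (1.0 + math.exp(yi * dot(w, x)))
            for j, xj in enumerate(x):
                grad[j] += coeff * xj / len(X)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
        history.append(list(w))
    return history

# The point (5, 1) pulls early iterates away from the max-margin direction.
X = [[1.0, 2.0], [2.0, 1.0], [5.0, 1.0],
     [-1.0, -2.0], [-2.0, -1.0], [-5.0, -1.0]]
y = [1, 1, 1, -1, -1, -1]

history = logistic_gd(X, y)
best = 3 / math.sqrt(2)  # margin of the direction (1, 1) on this data
for t in (1, 100, 20000):
    print(t, round(normalized_margin(history[t - 1], X, y), 4), "optimum", round(best, 4))
```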