Conceptually, for two vectors x and y, x.y is defined as
- magnitude of x multiplied by the projection of y onto x (think of the projection as the shadow y casts onto x)
- if x and y are at right angles (orthogonal), x.y is zero, regardless of the length of either vector (see the sketch after this list)
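A minimal NumPy sketch of both points, using made-up 2-D vectors purely for illustration:

```python
import numpy as np

x = np.array([3.0, 0.0])
y = np.array([2.0, 2.0])

# x.y = |x| * (length of y's projection onto x)
dot = np.dot(x, y)                      # 6.0
proj_len = dot / np.linalg.norm(x)      # length of y's "shadow" on x
assert np.isclose(np.linalg.norm(x) * proj_len, dot)

# Orthogonal vectors: dot product is zero no matter how long they are
a = np.array([1.0, 1.0])
b = np.array([5.0, -5.0])               # perpendicular to a
print(np.dot(a, b))                     # 0.0
```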
Complex stuff
- The set of weights in a neuron is nothing but a vector (w1, w2, ...)
- That weight vector is orthogonal to the separating hyperplane (a line in 2-D)
- The dot product of an input with the weight vector tells us two things: 1. how far the input is from the separating hyperplane (after dividing by the weight vector's length), 2. which side of that hyperplane it lies on (the sign)
- Adding a bias term is like moving the hyperplane away from the origin without changing its orientation (see the sketch after this list)
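A small sketch of those ideas, assuming a hypothetical 2-D weight vector and bias (not values from the book):

```python
import numpy as np

w = np.array([2.0, 1.0])   # weight vector; orthogonal to the separating line
b = -4.0                   # bias shifts the hyperplane away from the origin

def side_and_distance(x):
    """Which side of the hyperplane w.x + b = 0 the point x is on,
    and its distance from that hyperplane."""
    score = np.dot(w, x) + b
    side = np.sign(score)                      # +1 one side, -1 the other, 0 exactly on it
    distance = abs(score) / np.linalg.norm(w)  # divide by |w| to get an actual distance
    return side, distance

print(side_and_distance(np.array([3.0, 0.0])))  # (1.0, ~0.89)  -> positive side
print(side_and_distance(np.array([0.0, 0.0])))  # (-1.0, ~1.79) -> negative side
```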
Primary resources
- Chapter 2, Why Machines Learn by Anil Ananthaswamy (Pages 38-42)