Conceptually, for two vectors x and y, x.y is defined as
- magnitude of x multiplied by the projection of y onto x (think of the projection as the shadow y casts onto x)
- if x and y are at right angles (orthogonal), x.y is zero, regardless of the length of either vector (see the sketch after this list)
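A minimal NumPy sketch of both points, using made-up 2-D vectors purely for illustration:

```python
import numpy as np

x = np.array([3.0, 0.0])
y = np.array([2.0, 2.0])

# x.y = |x| * (length of y's projection onto x)
dot = np.dot(x, y)                      # 6.0
proj_len = dot / np.linalg.norm(x)      # length of y's "shadow" on x
assert np.isclose(np.linalg.norm(x) * proj_len, dot)

# Orthogonal vectors: dot product is zero no matter how long they are
a = np.array([1.0, 1.0])
b = np.array([5.0, -5.0])               # perpendicular to a
print(np.dot(a, b))                     # 0.0
```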
Complex stuff
- The set of weights in a neuron is nothing but a vector (w1, w2, ...)
- That weight vector is orthogonal to the separating hyperplane (a line in 2-D)
- The dot product of an input with the weight vector tells us two things: 1. how far the input is from the separating hyperplane (after dividing by the weight vector's length), 2. which side of that hyperplane it lies on (the sign)
- Adding a bias term is like moving the hyperplane away from the origin without changing its orientation (see the sketch after this list)
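A small sketch of those ideas, assuming a hypothetical 2-D weight vector and bias (not values from the book):

```python
import numpy as np

w = np.array([2.0, 1.0])   # weight vector; orthogonal to the separating line
b = -4.0                   # bias shifts the hyperplane away from the origin

def side_and_distance(x):
    """Which side of the hyperplane w.x + b = 0 the point x is on,
    and its distance from that hyperplane."""
    score = np.dot(w, x) + b
    side = np.sign(score)                      # +1 one side, -1 the other, 0 exactly on it
    distance = abs(score) / np.linalg.norm(w)  # divide by |w| to get an actual distance
    return side, distance

print(side_and_distance(np.array([3.0, 0.0])))  # (1.0, ~0.89)  -> positive side
print(side_and_distance(np.array([0.0, 0.0])))  # (-1.0, ~1.79) -> negative side
```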
Primary resources
- Chapter 2, Why Machines Learn by Anil Ananthaswamy (Pages 38-42)