Theorem 1 Given any vectors and a parameter , there exists a mapping , where , such that for any pair of vectors , we have
The lemma is quite fascinating; it states that if one is interested only in pairwise distances then one can reduce the dimension by possibly an exponential factor. This, for instance, has huge consequences in nearest neighbor methods used in data mining and machine learning.
Today, one can think of the proof of the above lemma as an advanced exercise in a probability class after one has proved the Chernoff-Hoeffding bounds. The crux of the proof lies in the following lemma. Let be a matrix where each entry is a normal random variable with mean and variance . Let be any vector in , and let .
The lemma implies that if we pick , then for any set of vectors , their corresponding images satisfy w.h.p. the property that any pairwise distance is not distorted by much. This gives the mapping in the JL theorem.
Proof: For simplicity, let’s assume is a unit vector, and let . Observe that each coordinate since it’s a linear combination of Gaussians whose squared sum is . Therefore, the squared length of
has the chi-squared distribution with degrees of freedom. At this point one could open a reference to get .
Therefore, the JL lemma boils down to a tail bound on the chi-squared distribution. This can be proven either directly (that is, via an integration since the cdf of the chi-squared distribution is exactly known), or similar to how the standard Chernoff bound is proven. Let us do one direction
Given a normal random variable , and a parameter , one can do an integration exercise to get . Putting this in, we get for any ,
Setting , we get