)
Random Features are a trick for learning non-parameteric classifiers and regressors given huge (millions of data points) data sets.
There are many ways to construct random features. In a recent paper, we suggest deriving random features that approximate the positive definite kernels that used in Kernel Machines. This lets us efficiently leverage the performance of state of the art Kernel Machines on very large data sets.
![]() |
Kernel machines (such as the Kernelized Support Vector Machine) are state-of-the-art architectures for classifying data, but training them is slow. They can rarely be trained with more than ~50,000 training examples in practice. To scale training to millions of examples, we utilize a linear version of the machine, wich can easily scale to millions of examples. To do this, we pass the data through random features, and train a linear machine on the featurized data. The random features are designed to ensure that the classifier generated in this way is approximately the same as the classifierd returned by the original (slow-to-train) kernel machine.
While the kernel trick lets us do linear things in
infinite-dimensional space without an infinite amount of work, it
still incurs a high computational cost. Random Features provide a
finite-dimensional alternative to the Kernel Trick by instead
mapping the data to an equivalent randomized feature space. For a
given shift-invariant kernel
we can find a
-dimensional random function
so that
The following table compares our results agains the Core Vector Machine
|
Random Features for Large-Scale Kernel Machines,
Ali Rahimi, Ben Recht,
to appear in Neural Information Processing Systems (NIPS) 2007.
(pdf)
Here is MATLAB code for Random Features. And some datasets and sample runs that appear in the above paper (many thanks to the authors of the Core Vector Machine for supplying formatted versions of many of these). There may be bugs in the code. If you have trouble getting state-of-the-art classification/regression results with this code, please drop me a note and I'll help debug.